-
| Name | Previous Value | Current Value |
| Type | Question | Improvement |
-
-
This is expected behavior, as explained in OD-82.
If the code is changed to keep the original encoding, you will run into the following issue:
- Commit a file containing only English words. The file encoding will be detected as ISO-8859-1.
- Now edit the file to add some Chinese characters. The file content will be converted to ISO-8859-1 bytes, and the Chinese characters will be corrupted.
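The two steps above can be reproduced outside OneDev. The snippet below is only an illustration in Python, not OneDev code; the sample strings are made up, and `errors="replace"` is used here to mimic a runtime that substitutes unmappable characters instead of raising an error.

```python
# Illustration only: force text containing Chinese into ISO-8859-1,
# which has no code points for those characters.
original = "int x = 1;  // counter"
edited = "int x = 1;  // 计数器"

# Pure ASCII text survives an ISO-8859-1 round trip unchanged.
assert original.encode("iso-8859-1").decode("iso-8859-1") == original

# Chinese characters cannot be represented; each one is replaced by "?"
# and the original text cannot be recovered afterwards.
lossy = edited.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
print(lossy)

# UTF-8 can represent the same text without loss.
utf8_round_trip = edited.encode("utf-8").decode("utf-8")
```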
-
PR OD-55 has a new commit that addresses this issue. It sets the encoding of a file containing only English words to UTF-8 instead of ISO-8859-1, so that Chinese characters can be added when editing.
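The idea described for this commit can be sketched as follows. This is not the code from the PR; `choose_edit_encoding` and its parameters are hypothetical names used only to show the rule "pure ASCII content is treated as UTF-8, anything else keeps its detected encoding".

```python
def choose_edit_encoding(raw: bytes, detected: str) -> str:
    """Hypothetical helper: pick the encoding used when saving an edited file.

    raw      -- the bytes of the file as currently stored
    detected -- whatever the charset detector reported for those bytes
    """
    # Pure ASCII bytes are valid UTF-8, so UTF-8 is a safe choice that
    # also leaves room for non-ASCII characters added later.
    if raw.isascii():
        return "utf-8"
    # Otherwise keep the detected encoding.
    return detected
```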
-
An all-English file may already exist in the repository, so this fix is not complete. In fact, I do not think a complete fix is possible. So once a file is edited from the UI, OneDev will always use UTF-8 for encoding.
-
This fix treats an all-English file as UTF-8 encoded instead of ISO-8859-1. Since our team uses OneDev for some Visual Studio projects, this fix may be useful for managing VS projects and editing GBK-encoded files. Please try the fix and check whether it introduces any other bugs.
As we know, GBK is the default encoding for VS projects, and changing a file's encoding can cause unexpected issues.
-
This fix still has a problem:
- Assume a file test.c is already committed to the repository and contains only ASCII characters.
- When it is edited from the UI, the initial encoding will be detected as ISO-8859-1 (even if it was originally saved as UTF-8, because the bytes are identical in both encodings for pure ASCII content).
- When some Chinese characters are added and saved, sticking to the initial encoding (ISO-8859-1) will corrupt the file.
As mentioned before, the approach of keeping the original encoding will not work. Please either avoid editing online, or change your encoding to UTF-8, which is the default for the majority of code editors.
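The ambiguity in the second step can be checked directly. The snippet below is an illustrative Python check, not part of OneDev, and the sample bytes are made up: pure ASCII content decodes to the same text and re-encodes to the same bytes under every ASCII-compatible encoding, so the stored bytes alone cannot reveal which one the author intended.

```python
# Illustration only: a pure ASCII file looks identical in several encodings.
data = b"int main(void) { return 0; }"

candidates = ("ascii", "utf-8", "iso-8859-1", "gb18030")
decoded = {name: data.decode(name) for name in candidates}

# Every candidate yields the same text ...
same_text = len(set(decoded.values())) == 1
# ... and encoding that text again yields the same bytes.
same_bytes = all(text.encode(name) == data for name, text in decoded.items())
```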
-
Sorry, I don't fully understand your point. With the fixed code, the test results we obtained are:
- Take a file test.c that is already committed to the repository and contains only ASCII characters.
- When it is edited from the UI, the initial encoding is detected as UTF-8 instead of ISO-8859-1. (We changed the default encoding here.)
- When some Chinese characters are added and saved, sticking to the initial encoding (UTF-8) does NOT corrupt the file.
Please have a look at the test result below. 20240902_001442.mp4
-
I did not realize you are changing the default encoding from ISO-8859-1 to UTF-8 in UniversalEncodingListener.java. This may cause backward compatibility issues, as not every ISO-8859-1 byte sequence is valid UTF-8.
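A small Python illustration of this compatibility concern (not OneDev code; the sample word is arbitrary): a non-ASCII ISO-8859-1 character is stored as a single byte that is not a valid UTF-8 sequence on its own.

```python
# Illustration only: ISO-8859-1 bytes are not necessarily valid UTF-8.
latin1_bytes = "café".encode("iso-8859-1")  # the "é" becomes the single byte 0xE9

try:
    latin1_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    # 0xE9 announces a multi-byte UTF-8 sequence, but no continuation bytes follow.
    decodes_as_utf8 = False
```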
-
A new commit has been pushed that only preserves the encoding of GBK-encoded files. Please review the update.
This fix adapts OneDev to Windows Visual Studio projects, in which source code files are encoded in GBK.
-
| Previous Value | Current Value |
| Open | Closed |
-
Sorry, the workaround of handling GBK alone is not acceptable. I am closing the issue.
-
Git doesn't really care about encoding, as it only stores binary data, but it works best with UTF-8: if Git detects a file to be text, it assumes UTF-8 by default.
If you use a tool that cannot produce UTF-8 files, then you should tell Git the file encoding using a .gitattributes file and its working-tree-encoding option. Git will then convert back and forth between the specified custom encoding and UTF-8 (used to store the data).
If OneDev honored an existing .gitattributes file, it could convert the string received from the browser to the encoding specified in .gitattributes before committing.
It is important to understand that ALL Git client applications you use would then need to understand working-tree-encoding. If you use any Git client that does not understand working-tree-encoding, you will mess up the encoding.
/.gitattributes:
*.c working-tree-encoding=GB18030
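As a rough sketch of what honoring .gitattributes could look like, the Python below looks up working-tree-encoding for a file name and encodes the editor text accordingly. It is not OneDev or Git code: `working_tree_encoding` is a hypothetical helper, and it matches patterns with `fnmatch` against a bare file name, which is a deliberate simplification of Git's real attribute matching rules.

```python
from fnmatch import fnmatch


def working_tree_encoding(path: str, gitattributes: str, default: str = "utf-8") -> str:
    """Hypothetical helper: return the working-tree-encoding set for `path`.

    Later matching lines override earlier ones. Pattern matching is
    simplified to fnmatch on the given name.
    """
    encoding = default
    for line in gitattributes.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        for attr in attrs:
            key, _, value = attr.partition("=")
            if key == "working-tree-encoding" and value and fnmatch(path, pattern):
                encoding = value
    return encoding


# Usage with the example rule from the comment above.
attrs = "*.c working-tree-encoding=GB18030\n"
enc = working_tree_encoding("test.c", attrs)
saved = "// 注释\n".encode(enc)  # text from the editor, encoded as configured
```

Files that match no rule fall back to the default, so the behavior for repositories without a .gitattributes file stays unchanged in this sketch.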
| Type | Improvement |
| Priority | Normal |
| Assignee | |
| Labels | No labels |
The original file encoding is GB18030, but after editing the file in the online editor, its encoding changes to UTF-8. Is there any way for the backend to write the file back in its original encoding?