Will not open file (Charset Exception when reading file) #8870
Comments
Hi, thanks for the report. This seems to be a problem with determining the right charset for reading. Can you check which charset Emacs shows for the file? JabRef always tries to guess the charset and ideally uses UTF-8. Can you send us the file so we can use it for debugging purposes?
That was it! When I saved the file as UTF-8 from Emacs, JabRef was able to open it.
1. I have no idea why a file that JabRef was previously happy to open suddenly acquired this problem.
2. My bad. I always thought that saving as “undecided-unix” from Emacs simply gave ASCII files. Evidently that does not do the trick.
Thank you,
Sherwin
This problem goes both ways: Emacs could have changed its "undecided-unix" export. What happens if you import or open the non-working file with older versions of JabRef?
More info needed. Since you used OneDrive, a file transfer could have corrupted parts of your file; old HDDs can wear out and cause bits and bytes to flip; and so on. There are many reasons why a file can stop working.
* Could you please provide the faulty .bib file?
* What Emacs version are you working with?
I installed Emacs 26.3 to debug this and was not able to change the encoding from UTF-8 (the default) to something else. How did you do it? I tried the multi-language settings, but it seems I have to enter some commands. I also could not find the "undecided-unix" option. From the JabRef side: I remember that the UTF-8 comment was removed in one of the recent versions, possibly 5.5 or 5.6.
I have attached the file that 5.3 will open, but 5.6 and 5.7 will not open.
I’m using emacs 28.1, but the following saving procedure has not changed.
Click on Options -> Multilingual Environment -> Set Coding System -> For Saving This Buffer.
This puts you into a one-line window at the bottom of the frame, where you can type “undecided-unix” or “utf-8”. If you press Tab while typing, it auto-completes as much as possible. There are many variations of each system; for example, after entering utf-8, keep pressing Tab and it will show options such as utf-8, utf-8-auto, … I just used utf-8.
After that you must save the file, either from the File menu or by typing “control-x control-s”.
After I saved the troublesome bib file as utf-8 in this manner, jabref 5.6 and 5.7 could open the file.
I have no idea where I picked up the “undecided-unix” option, but it has been my go-to for converting Microsoft files with a control character at the end of the line to (what I thought was) a standard unix ascii file.
Thanks again for your help,
Sherwin
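For readers who would rather not go through the Emacs menus, the same re-save can be sketched in a few lines of Python. This is only an illustration, not something proposed in the thread: the function name `resave_as_utf8` is hypothetical, and the `cp1252` fallback is an assumption about where the non-UTF-8 bytes came from.

```python
from pathlib import Path


def resave_as_utf8(src: Path, dst: Path, fallback: str = "cp1252") -> str:
    """Rewrite src as UTF-8 with LF line endings.

    Returns the name of the encoding that was used to read src.
    """
    raw = src.read_bytes()
    try:
        # Strict decoding: succeeds only if every byte sequence is valid UTF-8.
        text = raw.decode("utf-8")
        used = "utf-8"
    except UnicodeDecodeError:
        # Assumption: the file comes from a legacy single-byte encoding.
        # cp1252 is only a guess; bytes it cannot map become U+FFFD.
        text = raw.decode(fallback, errors="replace")
        used = fallback
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    dst.write_bytes(text.encode("utf-8"))
    return used
```

Writing to a separate `dst` keeps the original file untouched, which matters if the fallback guess turns out to be wrong.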
Thank you, that helped a lot. Two things:
I will have to email the file to [email protected]. GitHub will not accept a .bib file.
I will make the subject line “attn ThiloteE”.
You can simply append .txt or something similar; e.g., a file ending in .bib.txt will be accepted by GitHub.
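So I was able to reproduce: I could not open your file. At which line do you think the error is?
Btw.:
"The coding systems unix, dos, and mac are aliases for undecided-unix, undecided-dos, and undecided-mac, respectively. These coding systems specify only the end-of-line conversion, and leave the character code conversion to be deduced from the text itself."
Source: https://www.gnu.org/software/emacs/manual/html_node/emacs/Coding-Systems.html
Since JabRef does not do any deduction but assumes an encoding (at least I assume it does), it could be that your file had some non-UTF-8 characters in it, which produced the incompatibility.
Btw., my text editor (xed 3.2.2) also was not able to detect the encoding of the file.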
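According to the Emacs manual page linked above, the unix/dos/mac coding systems only specify end-of-line conversion. A small Python sketch shows why that alone cannot fix a charset problem: converting line endings leaves every other byte as it was. The helper names and the sample entry line are made up for illustration, and the byte-level replacement assumes an ASCII-compatible encoding (it would not be safe for UTF-16).

```python
def to_unix_eol(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF, touching nothing else."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical line from a .bib file: 0xF6 is "ö" in Latin-1 / cp1252,
# but it is not a valid byte sequence in UTF-8.
sample = b"author = {R\xf6hrig},\r\n"
converted = to_unix_eol(sample)

print(converted)                 # line ending is now LF, 0xF6 is unchanged
print(converted.isascii())       # False: the file is not plain ASCII
print(is_valid_utf8(converted))  # False: a strict UTF-8 reader still fails
```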
Were you able to reproduce the problem opening this file with 5.6, 5.7? And no problem with 5.3?
As far as identifying the line, this is as far as I got. (After saving as utf-8 made the file readable by 5.6, 5.7, I lost all motivation.)
If I deleted the entry with the key “RohrigLaioTantaloParrinelloPetronzio2006” and all following entries, 5.6 would read the remaining file. However, if I deleted that one entry alone, 5.6 still had a problem. There must be other incompatibilities in the file.
Thanks,
Sherwin
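Instead of bisecting the file by hand, the offending lines can be listed directly. The following is a minimal sketch, assuming the file is meant to be UTF-8; `find_non_utf8_lines` is a hypothetical helper, not part of JabRef or Emacs. It may also explain the observation above: if several entries contain such bytes, removing a single entry will not be enough.

```python
def find_non_utf8_lines(data: bytes) -> list[tuple[int, bytes]]:
    """Return (line number, raw line) for every line that is not valid UTF-8.

    Splitting on LF is safe here because the byte 0x0A never occurs
    inside a multi-byte UTF-8 sequence.
    """
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad
```

A possible use is `find_non_utf8_lines(open("library.bib", "rb").read())`, where `library.bib` stands in for the real file name.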
JabRef behaviour as explained in #8895 (comment) might be the cause of this issue?
Another cause could be: JabRef#75 (comment)
I dug into this and know where the problem comes from. We need to explicitly check for UTF-16BE in the charset detector on import.
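What an explicit UTF-16BE check can look like is sketched below in Python. This is not JabRef's code (JabRef is written in Java, and the commit list in this thread mentions icu4j); the function name `guess_charset`, the thresholds, and the `latin-1` fallback are all assumptions made for illustration. UTF-32 is deliberately ignored.

```python
import codecs


def guess_charset(data: bytes) -> str:
    """Guess a Python codec name for data. A heuristic sketch, not a detector."""
    # 1. A byte order mark settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"  # this codec reads the BOM and picks the byte order

    # 2. BOM-less UTF-16 of mostly-ASCII text has a NUL in every other byte:
    #    at even offsets for big-endian, at odd offsets for little-endian.
    sample = data[:4096]
    half = len(sample) // 2
    if half > 0:
        even_nuls = sample[0::2].count(0)
        odd_nuls = sample[1::2].count(0)
        if even_nuls > 0.5 * half and odd_nuls < 0.1 * half:
            return "utf-16-be"
        if odd_nuls > 0.5 * half and even_nuls < 0.1 * half:
            return "utf-16-le"

    # 3. Strict UTF-8 (which also covers plain ASCII).
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to a single-byte encoding that never fails.
        return "latin-1"
```

The NUL check runs before the UTF-8 attempt because NUL bytes are themselves valid UTF-8, so BOM-less UTF-16 text consisting of ASCII characters would otherwise pass as UTF-8 in this sketch.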
* Fix charset detection with utf16 and others
  Fixes #8895
  Fixes #8870
* checkstyöe
* Fix typo in method names
* change newlines
* get bytes
* Set newline character to LF
* Revert "get bytes"
  This reverts commit 1082f8a.
* progress
* switch line sep to LF
* Please work
* Try jitpack
* Add manual build of icu4j
* Check if we have ascii in the list of charsets
* fix checkstyle
* Update external-libraries.md
* Enocde with UTF-16BE
* Fix umlaut
* Hack to get test running
* Also compare meta data
* Add enforced ignorance of malformed characters
  Source: http://biercoff.com/malformedinputexception-input-length-1-exception-solution-for-scala-and-java/
  Co-authored-by: Christoph <[email protected]>
* checkstyle
* IntelliJ now also renders the file correctly
* Add test
  Additionally
  - Replace unknown characters
  - Remove obsolete wrapping classes in test
* Refine CHANGELOG.md
* Remove non-working jpackage reference
Co-authored-by: Oliver Kopp <[email protected]>
Co-authored-by: Houssem Nasri <[email protected]>
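Two items in the commit list above concern tolerating malformed input rather than aborting: ignoring malformed characters and replacing unknown ones. The general idea can be sketched in Python as follows; this illustrates lenient decoding in general and is not the Java code from the fix. The helper name `decode_leniently` is hypothetical.

```python
def decode_leniently(data: bytes, charset: str = "utf-8") -> tuple[str, int]:
    """Decode data without raising on malformed bytes.

    Returns the decoded text and the number of U+FFFD replacement
    characters in it. The count can overstate the damage if the input
    already contained U+FFFD.
    """
    text = data.decode(charset, errors="replace")
    return text, text.count("\ufffd")
```

The trade-off is that the file opens, but the affected characters are lost and have to be corrected by hand afterwards, so a reader would normally want to report the count to the user.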
5.3 opens the old file successfully! 5.6 and 5.7 do not.
The old version last saw emacs several years ago, before I saw the light and switched to jabref. It had been saved by jabref many times.
The only time I saved the file from emacs using undecided-unix was the other day, (wrongly) thinking that would take care of character set issues. Now I know to save using utf-8.
Thanks for your help,
Sherwin
JabRef version
Latest development branch build (please note build date below)
Operating system
Windows
Details on version and operating system
JabRef 5.6 or 5.7 on Windows 10
Checked with the latest development build
Steps to reproduce the behaviour
I moved a bib file to a new location and suddenly the file will not load. I will attach the log file - it's some java error that I cannot decipher. I opened the bib file in emacs, and had emacs check the syntax, and it passed the test. I beautified the file in emacs and jabref still will not load the file.
If I remove two-thirds of the file, jabref will load the file. I thought there might be a malformed entry. By trial and error, I determined the exact stopping point. However, when I remove that entry, jabref still fails.
The problem arose with jabref 5.6. I installed jabref 5.7 and the same problem arose.
jabref opens another file without a problem.