Troubles with reading cyrillic chars #2

tsouvarev · 2014-12-01T07:19:03Z

Hello, first of all, thanks for this port of dbfread.

I have a DBF file with Cyrillic fields. When I try to read these fields, I get

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

at fields.py:213, which is: return value.rstrip(b' ').decode('utf-8')

When I change utf-8 to cp866 (the usual code page for Russian-language DBF files), everything works fine.
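A minimal sketch of the failure described above, using only the standard library. The name decode_field is hypothetical and merely mirrors the expression quoted from fields.py; the sample value "Анна" is an assumption, chosen because its first cp866 byte is 0x80, the byte named in the error.

```python
# Build a space-padded field value the way a cp866-encoded DBF might store it.
# "Анна" is an illustrative value; in cp866 the letter "А" is the byte 0x80.
raw = "Анна".encode("cp866").ljust(10, b" ")


def decode_field(value, encoding):
    """Hypothetical stand-in for the expression quoted from fields.py."""
    return value.rstrip(b" ").decode(encoding)


# Decoding as UTF-8 fails: 0x80 is a continuation byte, not a valid start byte.
try:
    decode_field(raw, "utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Decoding with the code page the file was written in succeeds.
text = decode_field(raw, "cp866")
print(utf8_error)
print(text)
```

The bytes are the same in both calls; only the codec differs, which is why switching the hard-coded encoding makes the error go away.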


phargogh · 2015-06-01T23:26:49Z

Hi @tsouvarev, Thanks for your feedback! I've changed a couple things in fields.py so that instead of trying to decode using utf-8, we decode with your system's encoding. This isn't really an ideal solution, but I think it might at least solve your issue for the moment.

Could you try a fresh clone and install and see if that fixes the issue for you?
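Roughly, the change described above could look like the sketch below. This is not the code from the commit; decode_field and its encoding parameter are hypothetical names used for illustration. The only library call relied on is locale.getpreferredencoding from the standard library.

```python
import locale


def decode_field(value, encoding=None):
    """Decode a space-padded DBF field value.

    Hypothetical helper: when no encoding is given, fall back to the
    encoding preferred by the current system locale instead of UTF-8.
    """
    if encoding is None:
        # do_setlocale=False avoids changing the process-wide locale.
        encoding = locale.getpreferredencoding(False)
    return value.rstrip(b" ").decode(encoding)


# ASCII-only values decode the same way under any common locale encoding.
print(decode_field(b"NAME      "))
# An explicit encoding still overrides the system default.
print(decode_field(b"\x80\xad\xad\xa0  ", "cp866"))
```

The caveat is that the result now depends on the machine reading the file rather than on the machine that wrote it, so a file created under one code page may still fail or decode incorrectly elsewhere.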

tsouvarev · 2015-06-02T17:44:42Z

@phargogh Unfortunately, I have no DBF files right now to check your fix. But feel free to close this issue if you think it is solved.

phargogh · 2015-06-04T18:46:23Z

No worries! I'll leave it open for the time being until I can verify that it will work as expected.

nmset · 2017-04-01T20:07:03Z

On Linux, I had to change 'locale.getpreferredencoding' to 'cp850' to fully import a DBF file created on Windows. Otherwise, fields with accented characters are dropped. Could it be made to look for a user-defined environment variable, something like 'DBFPY3_DECODE_FROM', that names the source encoding? If none is set, use 'locale.getpreferredencoding'. I know it's hacky and not smart; I just think it would be pragmatic. Thanks for this useful tool.
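The environment-variable fallback suggested above might be sketched as follows. DBFPY3_DECODE_FROM is the variable name proposed in the comment, not an existing option of the library, and source_encoding is a hypothetical helper.

```python
import codecs
import locale
import os


def source_encoding(environ=os.environ):
    """Pick the encoding used to decode DBF field values.

    Hypothetical helper: DBFPY3_DECODE_FROM is the variable proposed in
    this thread; if it is unset or empty, use the locale's preferred encoding.
    """
    name = environ.get("DBFPY3_DECODE_FROM")
    if name:
        # Fail early with LookupError if the codec name is unknown.
        codecs.lookup(name)
        return name
    return locale.getpreferredencoding(False)


# Simulate a user who exported DBFPY3_DECODE_FROM=cp850 before running.
encoding = source_encoding({"DBFPY3_DECODE_FROM": "cp850"})
print(b"caf\x82".decode(encoding))
```

Passing the mapping in as a parameter keeps the lookup easy to test without modifying the real process environment.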

phargogh added a commit that referenced this issue Jun 1, 2015

Using preferred encoding instead of UTF-8

823e2b3

Updates issue #2.
