Skip to content

Conversation

pyup-bot
Copy link
Collaborator

This PR updates charset-normalizer from 2.0.4 to 2.0.7.

Changelog

2.0.7

We arrived in a pretty stable state.

*Changes:*

- **Addition:**  :bento:  Add support for Kazakh (Cyrillic) language detection 109
- **Improvement:**  :sparkle: Further improve inferring the language from a given code page (single-byte) 112
- **Removed:**  :fire: Remove redundant logging entry about detected language(s) 115 
- **Miscellaneous:** :wrench:  Trying to leverage PEP263 when PEP3120 is not supported 116
- *While I do not think that this (116) will actually fix something, it will rather raise a `SyntaxError` (Not about ASCII decoding error) for those trying to install this package using a non-supported Python version*
- **Improvement:** :zap: Refactoring for potential performance improvements in loops 113 adbar 
- **Improvement:** :sparkles: Various detection improvement (MD+CD) 117
- **Bugfix:** :bug: Fix a minor inconsistency between Python 3.5 and other versions regarding language detection 117 102 

This version pushes forward the detection-coverage to 98%! https://github.com/Ousret/charset_normalizer/runs/3863881150
_The great filter (cannot be better than) shall be 99% in conjunction with the current dataset. In future releases._

2.0.6

*Changes:*

- **Bugfix:** :bug:  Unforeseen regression with the loss of the backward-compatibility with some older minor of Python 3.5.x 100 
- **Bugfix:** :bug: Fix CLI crash when using --minimal output in certain cases 103
- **Improvement:** :sparkles: Minor improvement to the detection efficiency (less than 1%) 106 101

2.0.5

*Changes:*

**Internal:** :art: The project now comply with: flake8, mypy, isort and black to ensure a better overall quality 81 
**Internal:** :art: The MANIFEST.in was not exhaustive 78 
**Improvement:** :sparkles: The BC-support with v1.x was improved, the old staticmethods are restored 82 
**Remove:** :fire: The project no longer raise warning on tiny content given for detection, will be simply logged as warning instead 92 
**Improvement:** :sparkles: The Unicode detection is slightly improved, see 93 
**Bugfix:** :bug: In some rare case, the chunks extractor could cut in the middle of a multi-byte character and could mislead the mess detection 95 
**Bugfix:** :bug: Some rare 'space' characters could trip up the `UnprintablePlugin`/Mess detection 96 
**Improvement:** :art: Add syntax sugar \_\_bool\_\_ for results `CharsetMatches` list-container see 91 

This release push further the detection coverage to 97 % !
Links

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant