Enable reading non-UTF-8 encodings for Java pom.xml files #2047
Merged
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
1ac86cf to 7566b29
spiffcs approved these changes Aug 22, 2023
kzantow reviewed Aug 22, 2023
…test unknown encoding Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
This was referenced Aug 28, 2023
Closed
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
* fix reading non utf8 encodings
  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* in cases where we cant tell the encoding use the UTF8 replacement char
  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* decompose the xml decoding func to get a valid utf8 reader first and test unknown encoding
  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>

---------

Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Fixes #2204
Currently the pom.xml decoding returns an error when a document has no encoding="..." attribute at the top of the XML document and contains non-UTF-8 characters. This PR adds encoding detection before the XML decoder is used, so that the reader can be wrapped with an adapter that transforms the input to UTF-8 during Read() calls.

Fixes: #2044
CC: @westonsteimel
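
For readers who want a concrete picture of the "get a valid UTF-8 reader first, then decode" idea, here is a minimal sketch using only the Go standard library. It is not the code from this PR. The names `pomProject`, `toUTF8Reader`, `decodePom`, and `decodePomBytes` are hypothetical, and the "detection" step is a deliberate simplification: any input that is not valid UTF-8 is assumed to be ISO-8859-1. It also buffers the whole input instead of converting during Read() calls.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"io"
	"unicode/utf8"
)

// pomProject holds only the fields this sketch reads.
// Hypothetical type for illustration, not syft's own model.
type pomProject struct {
	ArtifactID string `xml:"artifactId"`
	Name       string `xml:"name"`
}

// toUTF8Reader returns a reader whose content is valid UTF-8.
// Assumption: input that is not valid UTF-8 is treated as ISO-8859-1
// (Latin-1). This stands in for real encoding detection.
func toUTF8Reader(r io.Reader) (io.Reader, error) {
	raw, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	if utf8.Valid(raw) {
		// Already UTF-8: pass the bytes through unchanged.
		return bytes.NewReader(raw), nil
	}
	// Each Latin-1 byte maps to the Unicode code point of the same value,
	// so writing it as a rune re-encodes it as UTF-8.
	var buf bytes.Buffer
	for _, b := range raw {
		buf.WriteRune(rune(b))
	}
	return &buf, nil
}

// decodePom normalizes the input to UTF-8 first, then runs the XML decoder.
func decodePom(r io.Reader) (pomProject, error) {
	var p pomProject
	utf8Reader, err := toUTF8Reader(r)
	if err != nil {
		return p, err
	}
	err = xml.NewDecoder(utf8Reader).Decode(&p)
	return p, err
}

// decodePomBytes is a convenience wrapper for in-memory documents.
func decodePomBytes(raw []byte) (pomProject, error) {
	return decodePom(bytes.NewReader(raw))
}
```

Calling `decodePomBytes` on a document with no encoding attribute and a raw 0xE9 byte inside `<name>` should succeed under the Latin-1 assumption, whereas passing the same bytes straight to `xml.NewDecoder` fails with an invalid UTF-8 error. A document that declares a non-UTF-8 encoding in its XML header would additionally need the decoder's `CharsetReader` field set, which this sketch leaves out.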