-
Notifications
You must be signed in to change notification settings - Fork 445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
project_loader: handle invalid unicode chars #1941
Conversation
Why is that invalid in a summary/title/description? What is a valid summary/title/description? (I don't think summaries, titles nor descriptions can nor should be ascii-only) |
@chipaca See my note in the description. There's a bug in pyyaml's handling of the unicode spec. |
yes, that's what's brought me here: you're upgrading a "the yaml parser we chose to use does not handle this" to "this is not allowed". it's fine to say "not supported (use a \u escape sequence instead)", or even do the conversion ourselves ahead of time, but to say "not allowed" is a lie (you can work around it by using a \u escape), and not what we want either (these fields are for human-readable text, ie unicode). There is validation to be done on these fields: I don't think they should include any character from the private use area nor any control character. But that's separate. |
FWIW,
|
Seems like the Debian package was patched, so the version in pip behaves differently, causing the failure in CI. I changed the original test case now to use a code point that's actually invalid vs. parsed as invalid upstream, and also added a work-around so that emoji like in the bug report are considered valid regardless of version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, and sorry for the noise. 👍
the patch should be brought into the snap
@@ -47,6 +48,7 @@ parts: | |||
TRIPLET_PATH="$SNAPCRAFT_PART_INSTALL/usr/lib/$(gcc -print-multiarch)" | |||
LIBSODIUM=$(readlink -n $TRIPLET_PATH/libsodium.so.18) | |||
ln -s $LIBSODIUM $TRIPLET_PATH/libsodium.so | |||
patch -d $SNAPCRAFT_PART_INSTALL/lib/python3.6/site-packages -p1 < $SNAPCRAFT_STAGE/pyyaml-support-high-codepoints.diff | |||
after: [python, apt] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to add patches
to after here. It has python which has patches
in ITS after... but that will break if it ever changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the after for python includes this, so the chain is somewhat complete; I will sort of ignore this as we are going back to stage-packages
as soon as our SRUed deb gets in with the fixes for classic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch otherwise ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit in the YAML, but I'll leave my +1 here anyway.
tested the snap, also works. |
Note that the exception is currently raised with valid and invalid unicode chbaracters due to the upstream bug pyyaml#25. But we'll want to handle the error cleanly even if the upstream issue is fixed. This branch adds a patch to the PyYAML used by the Snapcraft snap to handle the unicode code points erroneously flagged as invalid, such as the hankey emoji. A Snapcraft snap built from this PR will successfully validate a summary or description making use of the hankey emoji. New test cases: tests.unit.project_loader.test_config.test_invalid_yaml_invalid_unicode_chars tests.integration.general.test_global_properties Note: The integration level test for the PyYAML work-around is skipped unless testing with a snap or Debian package (ie. SNAPCRAFT_FROM_SNAP=1 is set or SNAPCRAFT_FROM_DEB=1 is set), both of which are patched. I verified with a snap from the branch, and running the tests in a virtual environment respectively. LP: #1737571
./runtests.sh static
?./runtests.sh unit
?Fixes: LP: #1737571
Note that the exception is currently raised with valid and invalid unicode chbaracters due to the upstream bug pyyaml#25. But we'll want to handle the error cleanly even if the upstream issue is fixed.
Steps to reproduce:
This branch adds a patch to the PyYAML used by the Snapcraft snap to handle the unicode code points erroneously flagged as invalid, such as the 💩 emoji. A Snapcraft snap built from this PR will successfully validate a summary or description making use of the 💩 emoji.
New test cases:
Note: The integration level test for the PyYAML work-around is skipped unless testing with a snap or Debian package (ie.
SNAPCRAFT_FROM_SNAP=1
is set orSNAPCRAFT_FROM_DEB=1
is set), both of which are patched. I verified with a snap from the branch, and running the tests in a virtual environment respectively.