Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash when PARTNAME is invalid utf-8 #136

Merged
merged 1 commit into from
May 3, 2023

Conversation

ogayot
Copy link
Member

@ogayot ogayot commented Apr 27, 2023

In GPT partitions, the PartitionName field is stored as a utf-16 string. For some reason though, in udev, the value is not converted from utf-16 to utf-8. Instead, the MSB is dropped and the LSB is kept. e.g.,

  • letter A in UTF-16 is 0x0041. It becomes 0x41 (which is letter A)
  • e-ecute (é) in UTF-16 is 0x00E9. It becomes 0xE9
  • euro sign (€) in UTF-16 is 0x20AC. It becomes 0xAC
  • pound sign (£) in UTF-16 is 0x00A3. It becomes 0xA3

In the above examples, only the letter A results in a valid UTF-8 character.

When using pyudev, the values are all expected to be decodable using the system's default encoding.

In probert, it can result in a crash when iterating over all the properties of a block device, that are mapped by pyudev.

To work around the issue, we now stop iterating over all the values. We iterate over the keys and only access the value if we need it.

NOTE: pyudev deprecated the use of Device.__getitem__, Device.__iter__ and similar constructs to access a pyudev.Device property. Instead, one should use Device.properties.__getitem__, Device.properties.__iter__ and so on. In a future version of pyudev, the support for the deprecated constructs will be dropped so we need to start using the recommended alternative.

https://bugs.launchpad.net/subiquity/+bug/2017862

In GPT partitions, the PartitionName field is stored as a utf-16 string.
For some reason though, in udev, the value is not converted from utf-16
to utf-8. Instead, the MSB is dropped and the LSB is kept. e.g.,

 * letter A in UTF-16 is 0x0041. It becomes 0x41 (which is letter A)
 * e-ecute (é) in UTF-16 is 0x00E9. It becomes 0xE9
 * euro sign (€) in UTF-16 is 0x20AC. It becomes 0xAC
 * pound sign (£) in UTF-16 is 0x00A3. It becomes 0xA3

In the above examples, only the letter A results in a valid UTF-8
character.

When using pyudev, the values are all expected to be decodable using the
system's default encoding.

In probert, it can result in a crash when iterating over all the
properties of a block device, that are mapped by pyudev.

To work around the issue, we now stop iterating over all the values. We
iterate over the keys and only access the value if we need it.

Signed-off-by: Olivier Gayot <[email protected]>
@ogayot ogayot requested a review from dbungert April 27, 2023 08:42
@xnox
Copy link
Contributor

xnox commented May 1, 2023

I wonder if we have this bug inside udev/systemd. I bet it treats it as binary stream, and just dumps it, which probably corrupts the output of udevadm info -e and then bindings follow that. I wonder if udev should be decoding those things with proper conversion and replacement. Or at least declaring them as raw bytes, not as a string in bindings.

@ogayot
Copy link
Member Author

ogayot commented May 1, 2023

I wonder if we have this bug inside udev/systemd. I bet it treats it as binary stream, and just dumps it, which probably corrupts the output of udevadm info -e and then bindings follow that. I wonder if udev should be decoding those things with proper conversion and replacement. Or at least declaring them as raw bytes, not as a string in bindings.

@xnox I think we have the bug in udev too, yes. I've done some experiments with udevadm info -e in the bug I opened on pyudev (pyudev/pyudev#490).

Dan and I also looked into the kernel and believe https://elixir.bootlin.com/linux/v6.3/source/block/partitions/efi.c#L678 is the root cause.

@ogayot ogayot merged commit a205308 into canonical:master May 3, 2023
@ogayot ogayot deleted the fix-invalid-utf-8-causing-trouble branch May 4, 2023 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants