Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Display for CStr and BStr #534

Closed
ojeda opened this issue Oct 26, 2021 · 11 comments
Closed

Implement Display for CStr and BStr #534

ojeda opened this issue Oct 26, 2021 · 11 comments
Labels
good first issue Good for newcomers • lib Related to the `rust/` library.

Comments

@ojeda
Copy link
Member

ojeda commented Oct 26, 2021

In addition to #532, it would be nice to implement Display for CStr and BStr.

Be careful to avoid panicking if invalid UTF-8 is found (i.e. do not attempt to convert it into a &str). We could escape all non-printable ASCII bytes to avoid losing data (e.g. \x01). Another way (but lossy) is using a replacement character (e.g. the Unicode one).

Unit-testing is a plus!

[If there was a Display-like trait that would work for non-UTF-8, we could leave the escaping to Debug and just submit the bytes as-is for Display like printk() does for %s. But do we actually want this anyway? I don't know about a use case; in printk it is done like this because it is how printf works, I assume, and because we use macros to prefix the level etc.]

@ojeda ojeda added • lib Related to the `rust/` library. prio: normal good first issue Good for newcomers labels Oct 26, 2021
@Armavica
Copy link

Hi, I started implementing this and I have a couple of questions:

  • Should we print the final nul byte for CStr?
  • I didn't find a way to implement the Display trait for the BStr type alias. Apparently it has to do with the fact that BStr is actually just [u8], and both trait and type are defined outside of the module so rustc doesn't allow the implementation. Is there a way around that, or should I redefine BStr as a struct containing only [u8], in the same way as CStr?

@nbdd0121
Copy link
Member

  • Should we print the final nul byte for CStr?

No

  • I didn't find a way to implement the Display trait for the BStr type alias. Apparently it has to do with the fact that BStr is actually just [u8], and both trait and type are defined outside of the module so rustc doesn't allow the implementation. Is there a way around that, or should I redefine BStr as a struct containing only [u8], in the same way as CStr?

No, there are no way around that. I have considered defining BStr similar to CStr but I didn't have a compelling reason to do so at the time; implementing additional methods and traits would be a good motivation to make the change.


I wonder if implementing Display is the correct design, since CStr and BStr are not guaranteed to be valid UTF-8. Maybe we should use similar design to Path::display?

@ojeda
Copy link
Member Author

ojeda commented Nov 2, 2021

I wonder if implementing Display is the correct design, since CStr and BStr are not guaranteed to be valid UTF-8. Maybe we should use similar design to Path::display?

As far as I understand the docs, nothing says a type should try to interpret contents as UTF-8 if possible. Our use case is printing some bytes into the kernel log.

Whatever we do, we should avoid losing data. So escaping would be one way of doing that, but given printk() with %s dumps the bytes as-is, we should probably just do that (I have edited OP), and leave escaping for Debug (with quotes, e.g. something like b"my foo \x01" would be nice).

@nbdd0121
Copy link
Member

nbdd0121 commented Nov 2, 2021

Whatever we do, we should avoid losing data. So escaping would be one way of doing that, but given printk() with %s dumps the bytes as-is, we should probably just do that (I have edited OP), and leave escaping for Debug (with quotes, e.g. something like b"my foo \x01" would be nice).

There is no way to write bytes with formatter.

@ojeda
Copy link
Member Author

ojeda commented Nov 2, 2021

There is no way to write bytes with formatter.

Yeah, with the upstream formatter. But can it be extended to allow to emit to non-UTF-8 streams? After all, the formatting machinery seems like it could be reused without the target needing to be UTF-8 (I mean format_args!, Arguments, etc.).

For instance, could we have a DisplayRaw to avoid ToString etc.? I am wondering if format_args! could pick the types implementing DisplayRaw automatically without needing a different macro, e.g. marking the Arguments at compile-time such that it cannot be used with e.g. format!.

@bjorn3
Copy link
Member

bjorn3 commented Nov 2, 2021

But can it be extended to allow to emit to non-UTF-8 streams?

Not without a breaking change. You can call .write_fmt() on a value implementing core::fmt::Write. All methods .write_fmt() can use to write the core::fmt::Arguments require UTF-8 input. What you could do is make a format_args!() like proc macro allowing non-utf-8 data.

@ojeda
Copy link
Member Author

ojeda commented Nov 2, 2021

Not without a breaking change. You can call .write_fmt() on a value implementing core::fmt::Write. All methods .write_fmt() can use to write the core::fmt::Arguments require UTF-8 input. What you could do is make a format_args!() like proc macro allowing non-utf-8 data.

I am not sure I understand -- what I am proposing is to make format_args! work for more types (but restricting who can consume the result if not guaranteed to be UTF-8 because something used DisplayRaw; e.g. by returning a different type; which should be possible since it is a macro, no?). What kind of existing code would break?

If it is not doable (or reasonable, e.g. because we want always the same return type), then we would need something like format_args_raw!, yeah. I think it could be a reasonable addition either way.

@ojeda
Copy link
Member Author

ojeda commented Nov 2, 2021

Updated OP to leave this PR to the escaping case -- which seems reasonable and makes it a good "good first issue".

@KimWang906
Copy link

If this problem has not been solved yet, I would like to contribute to solving this problem.

@ohno418
Copy link

ohno418 commented Mar 17, 2024

Now we can close this issue as completed. (CStr, BStr)

@ojeda
Copy link
Member Author

ojeda commented Mar 17, 2024

Indeed, thanks Yutaro!

@ojeda ojeda closed this as completed Mar 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers • lib Related to the `rust/` library.
6 participants