Skip to content

Conversation

@svetli-n
Copy link
Contributor

@svetli-n svetli-n commented Feb 2, 2022

Closes #326

@svetli-n svetli-n marked this pull request as draft February 2, 2022 21:29
@svetli-n
Copy link
Contributor Author

svetli-n commented Feb 2, 2022

@Byron Related to the encoding discussion earlier, I return Err when encoding error, like the existing code, e.g. https://github.com/Byron/gitoxide/blob/main/git-config/src/values.rs#L332

WDYT?

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr: It's probably a good idea to read all comments in order, they 'add up' as my understanding improves as I write them. Here is a summary of what I know, hopefully that leads to a course of action.


I took a look and couldn't find any code that would actually test serialization. It appears that serde was originally intended to be used for serialization, but without tests I am not sure.

Here is what I know:

  • this crate has a complete parser and machinery for producing human-friendly errors
  • it can do zero-copy
  • it maintains enough information to theoretically write values back as it parsed them (leaving comments). However, we can't find tests for that.
  • everything is well documented, there are many doc-tests, which should help a lot in figuring out what's needed to get this crate over the finishing line.
  • There is a lot of complexity to support round-tripping, but it's seemingly not tested.
    • In that vein, I didn't see normalized values being denormalized when serialized, so probably serde isn't meant to be used for writing git-config files (or I am missing something, or it's not yet implemented)

For now it seems like denormalizing values eagerly is how it's done here (I would have preferred to do that lazily), so eager normalization is probably what would happen to paths as well.

I think it's worth focussing on tests that interpret strings as paths and experiment with an API that makes this somewhat comfortable. Maybe from there one gains enough knowledge to understand how to approach this crate and which steps to take next. Right now, my main issue is that there is a lot of complexity without the necessary tests/features to justify it. It's still possible I am missing something obvious here, and a way to deal with this would be a complete study/read of all 7k lines here.

@svetli-n
Copy link
Contributor Author

svetli-n commented Feb 5, 2022

Looked at git and it seems to me that its treating Path as possibly interpolated string: https://github.com/git/git/blob/master/config.c#L1253

hence these changes, WDYT?

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently gitoxide generally has the problem of having to convert BString/Bytes back to Paths, which goes through a UTF-8 decode step which isn't what git does, thus making it less compatible.

I will take a look and see how this can be solved more generally.

@Byron Byron mentioned this pull request Feb 8, 2022
24 tasks
@Byron
Copy link
Member

Byron commented Feb 8, 2022

Despite of what I said earlier, it's probably in our interest to leave git-url out of it for now. It doesn't support (and shouldn't support) %(prefix) expansion. Maybe it's possible to reuse some code later, so definitely something to keep an eye out, for later :).

@svetli-n
Copy link
Contributor Author

svetli-n commented Feb 9, 2022

Currently gitoxide generally has the problem of having to convert BString/Bytes back to Paths, which goes through a UTF-8 decode step which isn't what git does, thus making it less compatible.

I will take a look and see how this can be solved more generally.

Cool, I see #334

@Byron
Copy link
Member

Byron commented Feb 10, 2022

Cool, I see #334

Yes, and the short version is that functions in git_features::path:: should be used to do transformation between Path/OsStr and bytes and vice-versa. It's a fallible operation, and depending on the case one should either bubble it up, or call the corresponding *_or_panic_on_windows() to panic instead.

@svetli-n
Copy link
Contributor Author

Cool, I see #334

Yes, and the short version is that functions in git_features::path:: should be used to do transformation between Path/OsStr and bytes and vice-versa. It's a fallible operation, and depending on the case one should either bubble it up, or call the corresponding *_or_panic_on_windows() to panic instead.

Ok, I'll refactor to using git_features::path:: and quick_error

How about having something like
fn replace<'a>(path: impl Into<Cow<'a, [u8]>>, find: &[u8], replace: &[u8]) -> Cow<'a, [u8]> in pub mod convert to simplify interpolate?

@Byron
Copy link
Member

Byron commented Feb 12, 2022

How about having something like
fn replace<'a>(path: impl Into<Cow<'a, [u8]>>, find: &[u8], replace: &[u8]) -> Cow<'a, [u8]> in pub mod convert to simplify interpolate?

That's definitely something to do once there is another crate needing that functionality, that is, replacing a substring globally without another string. Right now that doesn't seem to be the case, so I'd hold off and keep that local and specific to git-config.

@svetli-n svetli-n force-pushed the path_value branch 2 times, most recently from 57eeb14 to fb51e76 Compare February 13, 2022 16:09
@svetli-n svetli-n marked this pull request as ready for review February 13, 2022 21:34
…eLabs#331)

Add a `Path` type to the `git_config::values` which
can be interpolated according to gits own path interpolation
rules.
Byron and others added 8 commits February 15, 2022 17:57
…bs#331)

The lack of context would make %(prefix)/… paths generally fail
which makes the API prone to misuse.

This probably means that another API needs to be added to facilitate
obtaining paths while providing a way to pass the context, too.

For now having the `Path` type as intermediary seems sufficient,
but also might point towards other a more approach.
This composes better with other errors, if necessary.
…f` (GitoxideLabs#331)

That way users will want to call it or something like it in order to
make use of the `Path`, which otherwise is just a bunch of bytes which
aren't very useful at all in standard Rust.

Keeping the `Path` wrapper type is probably the right choice as it
matches the current API and allows to get paths explicitly which can
then be interpolated in just another step.
@Byron
Copy link
Member

Byron commented Feb 20, 2022

Hi @svetli-n, thanks a lot for all the work! I took some time myself to dig in and came up with a few changes that are explained in detail in the respective commits.

Please let me know what you think (except that CI seems to be failing 😅), I will fix that later today. Thank you

@Byron Byron force-pushed the path_value branch 2 times, most recently from 70b346f to 8be471a Compare February 20, 2022 13:14
Always a good idea to run all tests beforehand.
This means we don't necessarily enforce slashes everywhere.

Intiially the implementation was meant to be as close to `git` as
possible, even though certain requirements there stem from the
assumption that the path separtor is `/`.

In Rust, however, we now return `std::path::Path` which deals with
that for us and allows to operate on paths more comfortably and safely.

We now leverage this fully even if that means that the paths we output
aren't entirely conformant with what git would use, but it's an
implementation detail.

Lastly, the interpolated path is a transformation step so there should
be no easy way to accidentally put that version of the path back
into the file, which would be undesirable due to the various
transformations.
)

Previously PathError was meant to be used primarily in `Path::try_from()`
so a case could be made to give it that name. Now the error is exclusive
to `Path`, so that should be reflected in the module location as well.

This helps with organizing the code as well, putting everything related
to Path into its own module, which helps refactoring later.
@Byron
Copy link
Member

Byron commented Feb 21, 2022

I had a little bit of an epiphany while continuing to refactor which reduced the complexity of the implementation quite a bit while containing it in its own module. The main realization was that git does what it does with slashes because it assumes slashes internally. Once there is an std::path::Path though, there is no need for doing this as it's handled for you - a type that Path::interpolate(…) now conveniently returns.

The learning here could be that these separator conversions only need to be used to assure paths are using slashes at rest to not break compatibility with git, while transforming them to std::path::Path right before usage.

@Byron
Copy link
Member

Byron commented Feb 21, 2022

Whohooo, great catch, thanks a lot!
Merged :).

@Byron Byron closed this Feb 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Path value to git-config

2 participants