Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

substitute_one_step: stop after first variable has been substituted #11

Closed
wants to merge 1 commit into from

Conversation

cre4ture
Copy link
Contributor

@cre4ture cre4ture commented Jan 7, 2024

This change provides the posibility to only replace the next variable in a given string, returning the position after the variable name.
This is usefull if the substitution is to be integrated into an existing string parsing statemachine.
E.g. I'm integrating in into a "shell-words" like funktionality needed by the uutils/coreutils-env app.

To avoid that we need to do a fork of your code, we want to bring it into your main branch.
Please tell me what you think.

Kind regards

@de-vri-es
Copy link
Contributor

Hey! Sounds interesting. But maybe we can make it return a tuple of 3 slices instead of a Option<(String, usize)>: the leading part of the string before the next variable, the substitute value, and the remainder of the unprocessed string.

@cre4ture cre4ture force-pushed the feature/substitute_one_step branch 2 times, most recently from 1f21449 to 06209a3 Compare January 7, 2024 17:26
@cre4ture cre4ture force-pushed the feature/substitute_one_step branch from 06209a3 to 3efe2a4 Compare January 7, 2024 17:26
@cre4ture
Copy link
Contributor Author

cre4ture commented Jan 7, 2024

You where right, the interface was a bit mixed up. I did refactorings and I think this fits now more to your idea.
What do you think?

@de-vri-es
Copy link
Contributor

I think we still need to prevent returning a String, since it will re-allocate every time a change is made now.

But I also wonder, do you still need substitute_one(...) if we have a parse(..) -> Template function? Or can you achieve your desired end result already if parsing and performing substitution is split?

Copy link
Contributor

@de-vri-es de-vri-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For unit tests, we use assert2 :)

Comment on lines +497 to +500
assert_eq!(
Ok("Hello cruel world!"),
substitute("Hello ${not_name:cruel $name}!", &map).as_deref()
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't replace these. The check!() and assert!() macros from assert2 give far better error messages than assert_eq!().

@@ -591,7 +670,7 @@ mod test {
let variables: &dyn VariableMap<Value = &String> = &variables;

let_assert!(Ok(expanded) = substitute("one ${aap}", variables));
assert!(expanded == "one noot");
assert_eq!(expanded, "one noot");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, don't change this. assert2::assert!() is better.

Comment on lines +697 to +700
assert_eq!(result.0, "subst");
assert_eq!(result.1.slice_before_ends, 6);
assert_eq!(result.1.slice_after_starts, 11);
assert_eq!(result.1.subst_type, SubstitutionType::Variable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_eq!(result.0, "subst");
assert_eq!(result.1.slice_before_ends, 6);
assert_eq!(result.1.slice_after_starts, 11);
assert_eq!(result.1.subst_type, SubstitutionType::Variable);
assert!(result.0 == "subst");
assert!(result.1.slice_before_ends == 6);
assert!(result.1.slice_after_starts == 11);
assert!(result.1.subst_type == SubstitutionType::Variable);

variables.insert(String::from("NAME"), String::from("subst"));

let source = r"hello $NAME. Nice\$to meet you $NAME.";
let result = substitute_one_step(source, &variables).unwrap().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let result = substitute_one_step(source, &variables).unwrap().unwrap();
let_assert!(Ok(Some(result)) = substitute_one_step(source, &variables));

Comment on lines +702 to +708
let result = substitute_one_step(source.get(result.1.slice_after_starts..).unwrap(), &variables)
.unwrap()
.unwrap();
assert_eq!(result.0, "$");
assert_eq!(result.1.slice_before_ends, 6);
assert_eq!(result.1.slice_after_starts, 8);
assert_eq!(result.1.subst_type, SubstitutionType::UnescapeOne);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let result = substitute_one_step(source.get(result.1.slice_after_starts..).unwrap(), &variables)
.unwrap()
.unwrap();
assert_eq!(result.0, "$");
assert_eq!(result.1.slice_before_ends, 6);
assert_eq!(result.1.slice_after_starts, 8);
assert_eq!(result.1.subst_type, SubstitutionType::UnescapeOne);
let_assert!(Ok(Some(result)) = substitute_one_step(source.get(result.1.slice_after_starts..).unwrap(), &variables));
assert!(result.0 == "$");
assert!(result.1.slice_before_ends == 6);
assert!(result.1.slice_after_starts == 8);
assert!(result.1.subst_type == SubstitutionType::UnescapeOne);

Comment on lines +716 to +717
let result = substitute_one_step(source, &variables).unwrap();
assert!(result.is_none());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let result = substitute_one_step(source, &variables).unwrap();
assert!(result.is_none());
assert!(let Ok(None) = substitute_one_step(source, &variables));

@de-vri-es de-vri-es self-assigned this Jan 8, 2024
@cre4ture
Copy link
Contributor Author

cre4ture commented Jan 8, 2024

But I also wonder, do you still need substitute_one(...) if we have a parse(..) -> Template function? Or can you achieve your desired end result already if parsing and performing substitution is split?

I would still need a funktion like parse_one_template_part as I only trigger the variable substitution when I detected that there is a $. The part of the input after that variable is again parsed with my own parser.

But we can go for the template based approach anyway if you like. Just let me know what you thing about #14

@de-vri-es
Copy link
Contributor

I would still need a funktion like parse_one_template_part as I only trigger the variable substitution when I detected that there is a $. The part of the input after that variable is again parsed with my own parser.

I think I'm not entirely seeing your use case clearly. It feels somewhat odd to perform only partial variable substitution. Wouldn't it make more sense to run your own parser, and then run the substitution on the relevant parts that your parser indicates?

Or maybe the opposite makes sense: parse a full template first, and then run your parser on the non-variable parts of the template?

Do you have a link to how you're using this maybe? Or otherwise could you explain with some pseudo-code how this is supposed to integrate with your parser?

But we can go for the template based approach anyway if you like. Just let me know what you thing about #14

I will! Probably not today though, as I'm on holiday :p So I have only limited time for reviews. But the comments about the unit tests apply to both PRs. If you have the time you could already process those.

@cre4ture
Copy link
Contributor Author

cre4ture commented Jan 9, 2024

This pull request contains (temporarily) the modified subst such that it works. If you like you can have a look at it:
uutils/coreutils#5801

I think the main problems I encountered when I tried to use the unmodified subst where these:

  • When running the subst befor or after my parser: Unescaping was done twice such that some tests where this is relevant where failing. E.g.\$NAME should not be substituted, but was because my parser needs to deal and remove the escape char for other cases anyway.
  • When running subst only on the variables: I would need to implement the detection of the end of the variable name myself. Only this way I can forward the exact relevant part.

With the "one-step" way, I just need to stop at an unescaped $ and call subst_one_step which tells me about the end of the variable name and the substituted value.

In the beginning, when I didn't know the subst source code that well, for me the easiest way to achieve my goal was to provide access to this single/one-substitution step. Now, as I know your code pretty well, I detected that maybe the only function that I need from subst is the parse_variable. Making that function public woult probably be enought.

@de-vri-es
Copy link
Contributor

Hmm, interesting. I still like the template approach, but I also like to unblock progress for uutils. I'll have a look at that PR.

@cre4ture cre4ture closed this Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants