raw/unescaped string literals? #246
Comments
Python's syntax is a bit weird though: (Copy/pasted from https://docs.python.org/2/reference/lexical_analysis.html):
I doubt we want that, so we probably shouldn't use the r"" syntax: it would be confusing to reuse the same syntax with a different meaning. I'm not sure what we should do about \u and escaping of " itself. The simplest option is to not allow any escaping at all, so if you want to put a " in a raw string, you have to concatenate it in using a non-raw string. Of course you can write r'foo' as well, but this doesn't help if you need to put both a ' and a " in the same string.
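For reference, the Python quirk presumably being referred to is that a raw string can contain \" but keeps the backslash, and cannot end in a single backslash. A minimal sketch that checks this in Python itself (the `compiles` helper is hypothetical, introduced only for this illustration):

```python
# In a raw string, \" does not end the literal, but the backslash
# stays in the value: r"\"" is two characters, a backslash and a quote.
raw_quote = r"\""

def compiles(source: str) -> bool:
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# The source text r"\" is rejected: the backslash "escapes" the closing
# quote, so the literal never terminates.
ends_in_one_backslash = compiles('r"\\"')

# The source text r"\\" is accepted and holds two backslashes.
ends_in_two_backslashes = compiles('r"\\\\"')
```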
Yow, I clearly didn't think that one all the way through. As for mixed quotes within a string, |
||| always has a terminating \n (assuming unix line endings in your Jsonnet file) which is probably not what you want. How about we do what C# does and have @"foo" and @'foo'. https://msdn.microsoft.com/en-us/library/aa691090(v=vs.71).aspx
The quote escape sequence means that you can say @'Simon''s Cat', for example.
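The doubled-quote rule can be sketched in a few lines. This is an illustration in Python, not the Jsonnet implementation; the function name `decode_verbatim` is hypothetical, and it assumes the only special sequence is the enclosing quote written twice, as in C#.

```python
def decode_verbatim(literal: str) -> str:
    """Decode a C#-style verbatim literal such as @'Simon''s Cat'."""
    if len(literal) < 3 or literal[0] != "@" or literal[1] not in "'\"":
        raise ValueError("not a verbatim string literal")
    quote = literal[1]
    if literal[-1] != quote:
        raise ValueError("unterminated verbatim string literal")

    body = literal[2:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote:
            # The enclosing quote may only appear doubled inside the body.
            if i + 1 < len(body) and body[i + 1] == quote:
                out.append(quote)
                i += 2
                continue
            raise ValueError("unescaped quote inside verbatim string")
        # Backslashes and every other character are taken literally.
        out.append(ch)
        i += 1
    return "".join(out)
```

With this rule, `@'Simon''s Cat'` decodes to `Simon's Cat`, and backslashes in a regex such as `@"^\nx\.y$"` pass through unchanged.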
That is pretty much exactly what I was looking for! Sounds great.
Would you like to implement it?
I'm rather unskilled at C++ (it's been like 12 years...), but I'm happy to attempt it.
It shouldn't be too hard, but there are a bunch of small things to do. In lexer.h/cpp you'd want to add 2 more token types for it. Those get converted during parsing into a LiteralString AST node, which also has an enum for the kind of quotes that you'd have to extend. The actual handling of escapes is in desugarer.cpp, see these lines:
After the desugarer the strings are truly raw sequences of Unicode codepoints, ready for actual execution. You'd want to add something there to interpret '' and "" as ' and " respectively. You'd also need to add support in formatter.cpp for printing it back out in the original quoting style. You probably also want to adjust EnforceStringStyle in formatter.cpp to ignore the raw strings (like that filter currently ignores |||). For bonus points, we could have the formatter convert strings to raw form if they only have \ escapes :)
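The "bonus points" idea could look roughly like the following. This is a Python sketch of the decision logic only, not code from formatter.cpp; the name `to_verbatim_if_simple` is hypothetical, and it reads "only have \ escapes" as "the only escape sequence used is an escaped backslash".

```python
def to_verbatim_if_simple(literal: str) -> str:
    """Rewrite "a\\\\b"-style literals as @"a\\b" when \\\\ is the only escape.

    `literal` is the source text of an ordinary quoted string, including
    its quotes. Anything that cannot be converted is returned unchanged.
    """
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        return literal
    quote = literal[0]
    body = literal[1:-1]

    out = []
    seen_backslash = False
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if body[i + 1:i + 2] != "\\":
                # Some other escape (\n, \", \u....): leave the literal alone.
                return literal
            out.append("\\")  # an escaped backslash becomes a single one
            seen_backslash = True
            i += 2
        else:
            out.append(body[i])
            i += 1

    if not seen_backslash:
        return literal  # nothing to gain from the verbatim form
    return "@" + quote + "".join(out) + quote
```

Under these assumptions the source text `"^\\nx\\.y$"` would be rewritten to `@"^\nx\.y$"`, while `"a\nb"` would be left untouched because it contains a real newline escape.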
Presumably we want to allow @"foo" and @'foo' in imports and field definitions, too.
Cool, this turned out to be easier than I imagined. I think I have it all implemented now except for converting to verbatim form if a string only has \ escapes. Still need to add tests and update the docs, then I'll open a pull request.
We sometimes need to write regexes in Jsonnet documents, and the double escaping can get a bit awkward, e.g.
"^\\nx\\.y\\.z\\.com\\n$"
A raw string literal syntax would be very nice, perhaps like Python's:
r"^\nx\.y\.z\.com\n$"
Desired JSON output for the above, of course:
"^\\nx\\.y\\.z\\.com\\n$"
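The relationship between the escaped form, the raw form, and the desired JSON output in the request can be checked in Python, whose r"" syntax the request borrows. This is only an illustration of the intended semantics, using the standard `json` module; it says nothing about how Jsonnet itself would implement it.

```python
import json

# Python raw string: backslashes are kept literally.
pattern = r"^\nx\.y\.z\.com\n$"

# The same value written with ordinary double escaping.
escaped = "^\\nx\\.y\\.z\\.com\\n$"

# Serialising to JSON re-escapes each backslash.
json_text = json.dumps(pattern)
print(json_text)
```

Both spellings denote the same string value, and the serialised JSON matches the desired output quoted in the request.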