[idl_parser] Add kTokenNumericConstant token #6432
src/idl_parser.cpp
Outdated
TD(IntegerConstant, 258, "integer constant") \
TD(FloatConstant, 259, "float constant") \
TD(Identifier, 260, "identifier")
TD(NumericConstant, 260, "nan, inf or function name (signed)") \
This will cause someone writing a schema field like `inf:string` to get a pretty confusing error? If they intended to use `inf` as short for "information" or whatever :)
Might it be better to keep it as Identifier and explicitly recognize the few identifiers we care about only when parsing values (not while parsing field names)?
> This will cause someone writing a schema field like inf:string to get a pretty confusing error?
No. In this case `inf` will be an identifier. If someone writes `+inf`, the parser will return an error saying that it expects an Identifier token. This case has been added to the tests.
I tried to find another solution but couldn't.
The ParseAnyValue method is recursive, so it cannot tell whether it is currently parsing an identifier or a value.
I tried to solve it with flags on the assignment expressions (`=` for schema and `:` for JSON), but it looks terrible.
Do you have an idea where these keywords could be matched as values?
I guess what might lead to slightly more readable code is to split up the float and function-name cases, i.e. `+-inf`/`nan` become kTokenFloatConstant (or kTokenSpecialFloat if we need to be able to distinguish), whereas `-sin(..` actually gets returned as the separate tokens `-` followed by kTokenIdentifier?
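As a rough illustration of the split suggested above, here is a minimal sketch of a lexer that emits a signed `nan`/`inf` as one float token and everything else as a sign followed by an identifier. The names `Tok`, `Token`, and `Lex` are hypothetical and are not part of the FlatBuffers parser; the sketch only handles signs and identifiers and skips every other character.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, for illustration only (not the FlatBuffers token set).
enum class Tok { Sign, Identifier, FloatConstant };

struct Token {
  Tok kind;
  std::string text;
};

// Returns true for characters that may appear inside an identifier.
static bool IsIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Hypothetical mini-lexer: a sign directly followed by "nan" or "inf" becomes
// one FloatConstant token; any other sign is emitted as a separate Sign token.
std::vector<Token> Lex(const std::string &src) {
  std::vector<Token> out;
  std::size_t i = 0;
  while (i < src.size()) {
    const char c = src[i];
    if (c == '+' || c == '-') {
      // Look ahead at the word that directly follows the sign.
      std::size_t j = i + 1;
      while (j < src.size() && IsIdentChar(src[j])) ++j;
      const std::string word = src.substr(i + 1, j - i - 1);
      if (word == "nan" || word == "inf") {
        out.push_back({Tok::FloatConstant, src.substr(i, j - i)});
        i = j;
      } else {
        out.push_back({Tok::Sign, std::string(1, c)});
        ++i;
      }
    } else if (IsIdentChar(c) && std::isdigit(static_cast<unsigned char>(c)) == 0) {
      // An unsigned word stays an identifier, so a field named inf is unaffected.
      std::size_t j = i;
      while (j < src.size() && IsIdentChar(src[j])) ++j;
      out.push_back({Tok::Identifier, src.substr(i, j - i)});
      i = j;
    } else {
      ++i;  // Skip anything this sketch does not model.
    }
  }
  return out;
}
```

With this sketch, `Lex("-inf")` yields a single float token, `Lex("-sin")` yields a sign followed by an identifier, and a bare `inf` remains an identifier, which keeps a field declaration such as `inf:string` valid.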
I removed the newly introduced kTokenNumericConstant.
Now all processing is located inside Parser::ParseSingleValue(). It looks much better and passes all tests.
Could you review this new solution?
04afd53 to 0ad06bb
This commit adds a new token for correct parsing of signed numeric constants.
Before this change, expressions such as `-nan` or `-inf` were treated as kTokenStringConstant.
This was ambiguous when a real string field was parsed.
For example, `{ "text_field" : -name }` was accepted by the parser as a valid JSON object.
Related oss-fuzz issue: 6200301176619008
0ad06bb to f375acc
The generated flatbuffers.pc probably should not be part of the repo.
6486d96 to edce648
Thanks :)