-
Notifications
You must be signed in to change notification settings - Fork 15.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conformance test: JSON parsing of unknown enum string values #9534
Conformance test: JSON parsing of unknown enum string values #9534
Conversation
As explained in #8358, I don't have permissions to add a label. |
f0e0866
to
675d781
Compare
Thanks. This is useful. I think we need a clear decision on the correct behavior in #7392 before we can proceed though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree that this is the most reasonable behavior for "ignore unknown". I will comment on the related issue on this...
I am working on trying to nail down all this stuff in a forthcoming spec, which should hopefully, help clarify things.
string input_json; | ||
}; | ||
const std::vector<TestCase> test_cases = { | ||
{"InOptionalField", R"({ "optional_nested_enum": "UNKNOWN_ENUM_VALUE" })"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mark these raw strings as R"json(
, and wrap them so that they span multiple lines like e.g.
R"json({
"foo": "bar"
})json"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL, thank you!
@mcy thank you for the approval! Due to a typo we need another
Did you end up commenting about this? I couldn't find your comment, please link :) |
@antongrbin @mcy Looks like the checks got stuck, would you be able to take a look? We're really looking forward to get this merged so we can move forward with apple/swift-protobuf#972 (comment). Thanks! |
I don't have permissions to trigger the checks (first time contributor here). @cassianomonteiro I am also looking forward to get this merged. My team is running a forked version of Python and SWIFT for the past few quarters and I am looking forward to contributing the fixes upstream and removing the fork. @mcy is there anything blocking us from merging this? |
Anton, Can you please rebase to the tip of main? That will allow us to qualify this against google internal code |
4e0a67a
to
d925ffa
Compare
@mkruskal-google thank you for your response! Rebased! |
@mkruskal-google Hi, would you be able to move this forward? Thanks! |
d925ffa
to
064b263
Compare
@mcy @mkruskal-google it's been 10 days since my last rebase so I'm rebasing again. GitHub automatically assigned @acozzette. Can we get clarity on the next steps here? |
+1! Thanks @antongrbin ! |
Hey all, Sorry for the delay! Both of us got sick and this fell through the cracks. I'll rerun tests and qualify it internally before submitting |
Awesome, thanks! |
Update: This has been imported and looks good to go. Just waiting on Miguel to confirm the internal changes look ok |
-- 064b263 by Anton Grbin <[email protected]>: squash and rebase COPYBARA_INTEGRATE_REVIEW=#9534 from noom:anton/MIN-2347/json_conformance_unknown_fields 064b263 PiperOrigin-RevId: 488667851
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInMapValue.ProtobufOutput | ||
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInRepeatedField.ProtobufOutput |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why were these added as failures? What's the expected behavior here? From the PR description it sounds like C++ and Java were implemented "correctly", so it's a little strange that two of the added tests are failing for C++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was committed already and is causing presubmits to fail since these indeed shouldn't be in the failure list: e.g. https://github.com/protocolbuffers/upb/actions/runs/3473202608/jobs/5804957297
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was, I'm trying to retroactively understand the goal here. We can roll it back if it's causing issues, but it looks like you solved it by bumping upb's protobuf version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sorry, i was actually wrong above -- the presubmit failure was for upb and wasn't from the cpp failure list)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the breakage!
Initially I was considering unknown enum string value in optional fields. That case works for C++ as expected, the unknown value is ignored.
When the unknown enum string value is encountered in a repeated field or a map, the C++ shows unexpected behavior too. It fails parsing the whole message although ignoreUnknownFields is set. This is inconsistent with Java, csharp, objc, but more importantly it breaks forward compatibility for enum evolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangskz, based on your latest message I suspect you already executed the following steps to debug upb
failure, but to over-communicate adding it here.
I can reproduce the failure you linked above link locally. The error message is:
These tests were listed in the failure list, but they don't exist:
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInMapValue.ProtobufOutput
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInRepeatedField.ProtobufOutput
It looks like the conformance test runner and the failure list fell out of sync:
- conformance test runner was was taken from d938afd (defined in
bazel/workspace_deps.bzl
) - failure list is taken from 064b263 (copied to
upb/conformance_upb_failures.txt
)
There is an open PR that is supposed to fix this: protocolbuffers/upb#938. That PR is held back due to a python test failure that looks unrelated to the conformance test that we added.
This updates all generated code to match the contents of the latest v22.0 release of Protobuf. This involved a couple of changes to the script that does the sync'ing: 1. The new Protobuf version no longer includes autoconf configuration and instead requires using bazel to build things. 2. The new Protobuf release does not have an artifact named "protobuf-all-${VERSION}.tar.gz", but the one named "protobuf-${VERSION}.tar.gz" has all of the sources and was sufficient for the regenerate.bash script to complete. This change does NOT regenerate the protos related to benchmarks. The Protobuf repo no longer includes benchmarks. The CL removing them says they are superceded by google/fleetbench: protocolbuffers/protobuf@83c499d But that project's proto benchmark files are very different: https://github.com/google/fleetbench/tree/main/fleetbench/proto So I commented out those steps in the generation code since the benchmarks will need some work to reconcile with fleetbench. This code adds known failing tests for conformance. New test cases were added in protocolbuffers/protobuf#9534, but the Go protojson package does not behave according to the new tests. There is an existing issue in GitHub about this: golang/protobuf#1208 Change-Id: Iad796ec7889bc2a74b58db5224facf850cd1a1cd Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/469255 Reviewed-by: Michael Stapelberg <[email protected]> Reviewed-by: Damien Neil <[email protected]>
Motivation
Different clients yield different results when JSON parsing unknown enum string values:
I am proposing a new conformance test with the hope to align the implementations. The expected behavior I am proposing is the one followed by cpp and java already. This is also aligned with binary serialization behavior for unknown enum values.
More details in the issue:
#7392
To keep the PR and discussion small, we leave the unknown enum integer values out of scope. For integer values, the difference between proto2 and proto3 comes into discussion, which makes it more complex.
Related PRs and issues
Related issues/PR in the official repo:
Related PRs in other external protobuf implementations: