Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conformance test: JSON parsing of unknown enum string values #9534

Conversation

antongrbin
Copy link
Contributor

@antongrbin antongrbin commented Feb 23, 2022

Motivation

Different clients yield different results when JSON parsing unknown enum string values:

  1. java, cpp will ignore the unknown enum string value when ignoreUnknown is set
  2. python, swift, c# will fail although ignoreUnknown is set

I am proposing a new conformance test with the hope to align the implementations. The expected behavior I am proposing is the one followed by cpp and java already. This is also aligned with binary serialization behavior for unknown enum values.

More details in the issue:
#7392

To keep the PR and discussion small, we leave the unknown enum integer values out of scope. For integer values, the difference between proto2 and proto3 comes into discussion, which makes it more complex.

Related PRs and issues

Related issues/PR in the official repo:

Related PRs in other external protobuf implementations:

@antongrbin
Copy link
Contributor Author

antongrbin commented Feb 23, 2022

As explained in #8358, I don't have permissions to add a label.
I would like to add label conformance tests.

@antongrbin antongrbin changed the title Conformance test for JSON behaviour when parsing unknown enum string values Conformance test: JSON behavior when parsing unknown enum string values Feb 23, 2022
@antongrbin antongrbin changed the title Conformance test: JSON behavior when parsing unknown enum string values Conformance test: JSON parsing of unknown enum string values Feb 23, 2022
@antongrbin antongrbin force-pushed the anton/MIN-2347/json_conformance_unknown_fields branch from f0e0866 to 675d781 Compare February 23, 2022 10:42
@elharo
Copy link
Contributor

elharo commented Feb 23, 2022

Thanks. This is useful. I think we need a clear decision on the correct behavior in #7392 before we can proceed though.

Copy link
Contributor

@mcy mcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree that this is the most reasonable behavior for "ignore unknown". I will comment on the related issue on this...

I am working on trying to nail down all this stuff in a forthcoming spec, which should hopefully, help clarify things.

string input_json;
};
const std::vector<TestCase> test_cases = {
{"InOptionalField", R"({ "optional_nested_enum": "UNKNOWN_ENUM_VALUE" })"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mark these raw strings as R"json(, and wrap them so that they span multiple lines like e.g.

R"json({
  "foo": "bar"
})json"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL, thank you!

@antongrbin
Copy link
Contributor Author

antongrbin commented Oct 14, 2022

@mcy thank you for the approval! Due to a typo we need another kokoro:run to see if any failure list needs updating.

I would agree that this is the most reasonable behavior for "ignore unknown". I will comment on the related issue on this...

Did you end up commenting about this? I couldn't find your comment, please link :)

@cassianomonteiro
Copy link

@antongrbin @mcy Looks like the checks got stuck, would you be able to take a look? We're really looking forward to get this merged so we can move forward with apple/swift-protobuf#972 (comment).

Thanks!

@antongrbin
Copy link
Contributor Author

I don't have permissions to trigger the checks (first time contributor here).

@cassianomonteiro I am also looking forward to get this merged. My team is running a forked version of Python and SWIFT for the past few quarters and I am looking forward to contributing the fixes upstream and removing the fork.

@mcy is there anything blocking us from merging this?

@mkruskal-google
Copy link
Member

Anton,

Can you please rebase to the tip of main? That will allow us to qualify this against google internal code

@antongrbin antongrbin force-pushed the anton/MIN-2347/json_conformance_unknown_fields branch from 4e0a67a to d925ffa Compare November 4, 2022 10:10
@antongrbin
Copy link
Contributor Author

Can you please rebase to the tip of main? That will allow us to qualify this against google internal code

@mkruskal-google thank you for your response!

Rebased!

@cassianomonteiro
Copy link

@mkruskal-google Hi, would you be able to move this forward? Thanks!

@antongrbin antongrbin force-pushed the anton/MIN-2347/json_conformance_unknown_fields branch from d925ffa to 064b263 Compare November 14, 2022 11:06
@antongrbin antongrbin requested a review from a team as a code owner November 14, 2022 11:06
@antongrbin antongrbin requested review from acozzette and removed request for a team November 14, 2022 11:06
@antongrbin
Copy link
Contributor Author

@mcy @mkruskal-google it's been 10 days since my last rebase so I'm rebasing again. GitHub automatically assigned @acozzette.

Can we get clarity on the next steps here?

@cassianomonteiro
Copy link

@mcy @mkruskal-google it's been 10 days since my last rebase so I'm rebasing again. GitHub automatically assigned @acozzette.

Can we get clarity on the next steps here?

+1!

Thanks @antongrbin !

@mkruskal-google
Copy link
Member

Hey all,

Sorry for the delay! Both of us got sick and this fell through the cracks. I'll rerun tests and qualify it internally before submitting

@cassianomonteiro
Copy link

Hey all,

Sorry for the delay! Both of us got sick and this fell through the cracks. I'll rerun tests and qualify it internally before submitting

Awesome, thanks!

@mkruskal-google
Copy link
Member

Update: This has been imported and looks good to go. Just waiting on Miguel to confirm the internal changes look ok

copybara-service bot pushed a commit that referenced this pull request Nov 15, 2022
--
064b263 by Anton Grbin <[email protected]>:

squash and rebase

COPYBARA_INTEGRATE_REVIEW=#9534 from noom:anton/MIN-2347/json_conformance_unknown_fields 064b263
PiperOrigin-RevId: 488667851
Comment on lines +20 to +21
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInMapValue.ProtobufOutput
Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInRepeatedField.ProtobufOutput
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were these added as failures? What's the expected behavior here? From the PR description it sounds like C++ and Java were implemented "correctly", so it's a little strange that two of the added tests are failing for C++

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was committed already and is causing presubmits to fail since these indeed shouldn't be in the failure list: e.g. https://github.com/protocolbuffers/upb/actions/runs/3473202608/jobs/5804957297

Copy link
Member

@mkruskal-google mkruskal-google Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was, I'm trying to retroactively understand the goal here. We can roll it back if it's causing issues, but it looks like you solved it by bumping upb's protobuf version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry, i was actually wrong above -- the presubmit failure was for upb and wasn't from the cpp failure list)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the breakage!

Initially I was considering unknown enum string value in optional fields. That case works for C++ as expected, the unknown value is ignored.

When the unknown enum string value is encountered in a repeated field or a map, the C++ shows unexpected behavior too. It fails parsing the whole message although ignoreUnknownFields is set. This is inconsistent with Java, csharp, objc, but more importantly it breaks forward compatibility for enum evolution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhangskz, based on your latest message I suspect you already executed the following steps to debug upb failure, but to over-communicate adding it here.

I can reproduce the failure you linked above link locally. The error message is:

These tests were listed in the failure list, but they don't exist:

  Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInMapValue.ProtobufOutput
  Recommended.Proto3.JsonInput.IgnoreUnknownEnumStringValueInRepeatedField.ProtobufOutput

It looks like the conformance test runner and the failure list fell out of sync:

There is an open PR that is supposed to fix this: protocolbuffers/upb#938. That PR is held back due to a python test failure that looks unrelated to the conformance test that we added.

gopherbot pushed a commit to protocolbuffers/protobuf-go that referenced this pull request Feb 22, 2023
This updates all generated code to match the contents of the latest
v22.0 release of Protobuf.

This involved a couple of changes to the script that does the sync'ing:
  1. The new Protobuf version no longer includes autoconf configuration
     and instead requires using bazel to build things.
  2. The new Protobuf release does not have an artifact named
     "protobuf-all-${VERSION}.tar.gz", but the one named
     "protobuf-${VERSION}.tar.gz" has all of the sources and was
     sufficient for the regenerate.bash script to complete.

This change does NOT regenerate the protos related to benchmarks.
The Protobuf repo no longer includes benchmarks. The CL removing them
says they are superceded by google/fleetbench:
    protocolbuffers/protobuf@83c499d
But that project's proto benchmark files are very different:
    https://github.com/google/fleetbench/tree/main/fleetbench/proto
So I commented out those steps in the generation code since the benchmarks
will need some work to reconcile with fleetbench.

This code adds known failing tests for conformance. New test cases were
added in protocolbuffers/protobuf#9534, but the
Go protojson package does not behave according to the new tests. There
is an existing issue in GitHub about this:
    golang/protobuf#1208

Change-Id: Iad796ec7889bc2a74b58db5224facf850cd1a1cd
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/469255
Reviewed-by: Michael Stapelberg <[email protected]>
Reviewed-by: Damien Neil <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants