Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python Schema] Fix redefined Record or Enum in Python schema #11595

Conversation

gaoran10
Copy link
Contributor

@gaoran10 gaoran10 commented Aug 8, 2021

Fixes #11533

Motivation

Refer to issue #11533 , currently, if users redefined the same Record or Enum in Record, the schema info isn't reused the defined name, this does not match the Avro schema info format.

Modifications

Add a new method schema_info(self, defined_names) in Record, Array, Map, and Enum, all defined names will be added in the parameter defined_names when users use a defined Record, or Enum, the schema info will use the name of the defined Record or Enum as the type.

Verifying this change

Add test to verify using same Record or Enum in a complex Record.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

For committer

For this PR, do we need to update docs?

  • If yes,

    • if you update docs in this PR, label this PR with the doc label.

    • if you plan to update docs later, label this PR with the doc-required label.

    • if you need help on updating docs, create a follow-up issue with the doc-required label.

  • If no, label this PR with the no-need-doc label and explain why.

@gaoran10 gaoran10 self-assigned this Aug 8, 2021
@codelipenghui codelipenghui added this to the 2.9.0 milestone Aug 8, 2021
@codelipenghui codelipenghui added release/2.8.1 type/bug The PR fixed a bug or issue reported a bug labels Aug 8, 2021
Copy link
Contributor

@BewareMyPower BewareMyPower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Block this PR to wait until #11557 is merged.

@Anonymitaet
Copy link
Member

Thanks for your contribution. For this PR, do we need to update docs?

(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@BewareMyPower
Copy link
Contributor

Could you rebase to master to make sure CI passed?

@BewareMyPower BewareMyPower added the doc-not-needed Your PR changes do not impact docs label Aug 9, 2021
@BewareMyPower
Copy link
Contributor

Thanks for your contribution. For this PR, do we need to update docs?

(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

I added the no-need-doc label. It's just a bug fix.

@Anonymitaet
Copy link
Member

@BewareMyPower thanks!

@gaoran10 gaoran10 force-pushed the gaoran/fix-python-schema-redefined-record branch from 2866080 to 9487b70 Compare August 9, 2021 02:27
@codelipenghui codelipenghui merged commit d66cdbd into apache:master Aug 9, 2021
LeBW pushed a commit to LeBW/pulsar that referenced this pull request Aug 9, 2021
…#11595)

Fixes apache#11533

### Motivation

Refer to issue apache#11533 , currently, if users redefined the same `Record` or `Enum` in `Record`, the schema info isn't reused the defined name, this does not match the Avro schema info format.

### Modifications

Add a new method `schema_info(self, defined_names)` in `Record`, `Array`, `Map`, and `Enum`, all defined names will be added in the parameter `defined_names` when users use a defined `Record`, or `Enum`, the schema info will use the name of the defined `Record` or `Enum` as the type.
@@ -393,6 +413,9 @@ def python_type(self):
return list

def validate_type(self, name, val):
if val is None:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should return the self.default(), not None automatically

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll fix this. Thanks

hangc0276 pushed a commit that referenced this pull request Aug 12, 2021
Fixes #11533

### Motivation

Refer to issue #11533 , currently, if users redefined the same `Record` or `Enum` in `Record`, the schema info isn't reused the defined name, this does not match the Avro schema info format.

### Modifications

Add a new method `schema_info(self, defined_names)` in `Record`, `Array`, `Map`, and `Enum`, all defined names will be added in the parameter `defined_names` when users use a defined `Record`, or `Enum`, the schema info will use the name of the defined `Record` or `Enum` as the type.

(cherry picked from commit d66cdbd)
@gaoran10 gaoran10 deleted the gaoran/fix-python-schema-redefined-record branch August 19, 2021 06:17
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…#11595)

Fixes apache#11533

### Motivation

Refer to issue apache#11533 , currently, if users redefined the same `Record` or `Enum` in `Record`, the schema info isn't reused the defined name, this does not match the Avro schema info format.

### Modifications

Add a new method `schema_info(self, defined_names)` in `Record`, `Array`, `Map`, and `Enum`, all defined names will be added in the parameter `defined_names` when users use a defined `Record`, or `Enum`, the schema info will use the name of the defined `Record` or `Enum` as the type.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.8 Archived: 2.8 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Python Schema] Redefined record schema problem
6 participants