
Avro Schema w/Hive integration #145

Open
rnpridgeon opened this issue Oct 28, 2016 · 1 comment

@rnpridgeon (Contributor)
Due to a known limitation in Hive, schema literals can only be stored in the serde properties field if they are shorter than 4000 characters. The biggest issue with this limitation is that table creation does not fail: instead, the schema is silently truncated and the rest of the operation succeeds as normal.

https://issues.apache.org/jira/browse/HIVE-9815
https://issues.apache.org/jira/browse/HIVE-12274
https://issues.apache.org/jira/browse/HIVE-12299

The workaround is to store the schema definition in a separate file and set the appropriate table property. Alternatively, you could redefine the data types within the Hive schema, but that seems like overkill.
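As a rough sketch of that workaround in Hive DDL (the table name, HDFS paths, and schema file location below are hypothetical, not from the issue):

```sql
-- Point the table at an external schema file via avro.schema.url,
-- so the 4000-character serde-property limit no longer applies.
CREATE EXTERNAL TABLE example_events
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/topics/example_events'
TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/example_events.avsc');

-- For an existing table, drop the (possibly truncated) literal
-- and switch to the external schema file instead:
ALTER TABLE example_events UNSET TBLPROPERTIES ('avro.schema.literal');
ALTER TABLE example_events SET TBLPROPERTIES
  ('avro.schema.url'='hdfs:///schemas/example_events.avsc');
```

Because `avro.schema.url` is resolved at read time, the full schema never passes through the metastore's length-limited serde property.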

Given that this is a known issue, and that Avro schemas are quite often in excess of 4000 characters, the SR should handle this more gracefully. When Hive integration is enabled, the schema should be written to a separate file and the appropriate table property should be set.

Thanks,
Ryan

@Vincent-Zeng

Hi.
With hive.integration=true, how can kafka-connect-sink use avro.schema.url instead of avro.schema.literal? Or do I need to alter the table manually in Hive?
