exception when using StringConverter for key and value converter #130
@blbradley I'm a little late to the party here, but can you clarify what happened? Mainly, I'm trying to understand how the data was brought into Kafka. Was it brought in via a connector that was using the AvroConverter, and then you tried to use the StringConverter on the other side with the HDFS Sink Connector?
I can! I used
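For context, the setup the issue title describes would look roughly like the standalone worker config below. This is a sketch, not the reporter's actual config: the converter class names are the standard Kafka Connect ones, but the broker address and offset file path are placeholders.

```properties
# Standalone worker using StringConverter for both key and value,
# as described in the issue title. Addresses/paths are placeholders.
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
offset.storage.file.filename=/tmp/connect.offsets
```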
@blbradley I think I understand what you're looking for, and we can definitely do better in this scenario. There's a larger philosophical question to answer here, though, because strings are inherently unstructured data, and we have to decide what it means to turn them into Avro data, which is structured. This is discussed a little on the mailing list. So I guess my question is: what's the best way to handle this for your use case? I see two scenarios:
What do you think accomplishes your goal best here? Both options probably require documenting what the behavior is, and both probably need to be implemented eventually, but it'd be good to get your input on which one hits the mark best for you at this point.
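As one concrete possibility for an "out-of-the-box unstructured format" (purely an illustration on my part, not a decision from this thread), each string could be wrapped in a single-field Avro record so that downstream tooling still sees a schema:

```avro
{
  "type": "record",
  "name": "UnstructuredString",
  "fields": [
    {"name": "value", "type": "string"}
  ]
}
```

The trade-off is that consumers must unwrap the `value` field, but Hive and other schema-aware readers would keep working.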
I don't plan on using #131. It was mostly an exercise for myself and for the community, to see if unstructured data could be used with Confluent Platform. I appreciate your detailed response and am very happy that my work got the conversation started. I'm very willing to discuss the issue. I think the first option you presented sounds best, even if it is more work up front. It sounds like you have a good idea of what the 'out-of-the-box unstructured format' should be. How will you proceed?
@blbradley I was thinking something like the SourceFormat that was included in a connector built off of this one would be a good place to start. However, if we start doing unstructured data, we break the Hive integration piece, so we have to be clear about the two different modes of the connector. Let me ask you this: do you have a need for Hive integration in your use case? Maybe it makes sense for the next step to be a bit of a product survey to see if most people are using that.
I don't require Hive integration at the moment.
That sounds great.
This was an error in how
@ewencp I think this means that string data will now work. Is that correct?
@blbradley Yes, that's correct. This problem was just a general problem for all primitive types (e.g., ints, strings, byte[], etc.). The test in the patch is for ints, but this should fix strings as well.
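To illustrate the class of bug being described (this is a simplified, self-contained sketch, not the actual Connect or HDFS connector code; the class name, method, and branching below are hypothetical stand-ins): a sink that assumes every record value is structured will throw on primitive values like strings and ints, and the fix is a dedicated primitive branch.

```java
import java.util.Map;

public class PrimitiveValueSketch {
    // Hypothetical stand-in for per-record value handling in a sink.
    // Before the fix, only the Map (structured) branch existed, so any
    // primitive value fell through to the exception.
    static String render(Object value) {
        if (value instanceof Map) {
            return "struct:" + value;            // structured records
        } else if (value instanceof String
                || value instanceof Integer
                || value instanceof byte[]) {
            return "primitive:" + value;         // the branch the fix adds
        }
        throw new IllegalArgumentException(
                "unsupported type: " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(render("hello"));     // primitive path for strings
        System.out.println(render(42));          // primitive path for ints
    }
}
```

The patch tests ints, but because all primitives share the same code path, strings are covered by the same change.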
Ok, great. Then this could clear up #74 for JSON with no defined schema.
I think #74 was originally a feature request for a JSON output format, so while this fix will let you take JSON input as a string and write it out, it doesn't do anything with the JSON; it's just a 1-1 copy. So while I would agree
* Follows from: confluentinc/kafka-connect-hdfs#130
* Equivalent to: confluentinc/kafka-connect-hdfs#176
I'm working to see if this is fixable.