Error when consuming: "'PartialMessage' object has no attribute 'validate_crc'" #672
It seems that the same messages always cause this problem: when the consumer reaches them, it fails with the above error instead of just stepping over them (assuming they are incorrect messages).
If I read the code correctly, the problem is that https://github.com/dpkp/kafka-python/blame/aefafd270a75b9a3d21c148eefba5ba56cbc622b/kafka/consumer/fetcher.py#L357 does not check for PartialMessage, but assumes that every msg has a validate_crc() method. I assume that fetcher.py needs handling of PartialMessage similar to what simple.py used to have (kafka-python/kafka/consumer/simple.py, line 416 in 116e634).
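To illustrate the kind of check described above, here is a minimal sketch. The `Message` and `PartialMessage` classes are hypothetical stand-ins written for this example, not kafka-python's own classes, and `iter_valid_messages` is an invented helper name; the real fetcher code is structured differently.

```python
class Message:
    """Hypothetical stand-in for a fully decoded message."""

    def __init__(self, value, crc_ok=True):
        self.value = value
        self._crc_ok = crc_ok

    def validate_crc(self):
        return self._crc_ok


class PartialMessage:
    """Hypothetical stand-in for a truncated message; it has no validate_crc()."""


def iter_valid_messages(entries):
    """Yield (offset, value) for complete messages and skip partial ones."""
    for offset, msg in entries:
        if isinstance(msg, PartialMessage):
            # A truncated message cannot be CRC-checked, so step over it.
            continue
        if not msg.validate_crc():
            raise ValueError("CRC check failed at offset %d" % offset)
        yield offset, msg.value
```

Without the `isinstance` check, calling `msg.validate_crc()` on the partial entry would raise the same `AttributeError` reported in this issue.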
strange: https://github.com/dpkp/kafka-python/blob/1.1.1/kafka/consumer/fetcher.py#L615 should guarantee that no PartialMessages are put on the internal _records queue, which is what fetcher consumes from when unpacking. Are you using compressed messages by chance?
Our Kafka cluster stores the data snappy compressed, but I am not sure whether this means that the messages themselves are compressed.
Interestingly, I have no problem with another topic on the same Kafka cluster; so far this issue is specific to one topic only.
I can see that you are checking for PartialMessages at https://github.com/dpkp/kafka-python/blob/1.1.1/kafka/consumer/fetcher.py#L615, but only as the last message.
And yes, I was seeing a few "This is a partial msg" all the way until I got stopped by another error:
So it seems I need some more verification of the messages.
PartialMessage should only ever be at the end of a message set. It is/was an artifact of the broker cutting off message sets strictly at the max_partition_fetch_bytes boundary and not "pruning" the dangling bytes from the message set before returning to the client. It should be impossible to get a partial message anywhere else. But if you are consuming compressed message sets then you should never ever see a partial message. What code is producing messages to your brokers? Are the messages sent w/ compression enabled?
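A rough sketch of why a byte-limit cutoff can only leave a partial entry at the tail of a message set. The framing here is deliberately simplified (an 8-byte offset and a 4-byte size followed by an opaque payload); a real Kafka message also carries a CRC, magic byte, attributes, key and value inside that payload. The `encode` and `decode` helpers are made up for this illustration.

```python
import struct

# Simplified entry header: 8-byte offset, 4-byte payload size (big-endian).
HEADER = struct.Struct(">qi")


def encode(offset, payload):
    """Frame one payload as offset + size + bytes."""
    return HEADER.pack(offset, len(payload)) + payload


def decode(buf):
    """Return (messages, partial); partial is True if the tail was cut off."""
    messages = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < HEADER.size:
            return messages, True  # not even a full header left
        offset, size = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        if start + size > len(buf):
            return messages, True  # header present, payload truncated
        messages.append((offset, buf[start:start + size]))
        pos = start + size
    return messages, False
```

Because entries are read front to back and the buffer is cut at a single byte boundary, every entry before the cut decodes completely and at most one dangling fragment remains at the end.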
We are using kafka-python to produce these messages to the brokers. It produces to two different topics, and I have no problem with the other topic. Yes, the producer is sending the messages with compression enabled (compression_type='snappy').
After skipping that problematic section of the topic (by manually moving the consumer offset forward), the issue does not come up anymore, so it seems it was some kind of problem with the topic data, probably related to the Kafka cluster and kafka-python upgrade. In either case, it seems this was just a one-time thing (multiple messages, but only in one section of the topic feed) and I can't reproduce it, so I am closing this.
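For anyone wanting to do the same, moving the offset forward could look roughly like the sketch below. `skip_ahead` is a hypothetical helper, not part of kafka-python; it only assumes the consumer object exposes `position(partition)` and `seek(partition, offset)`, which `KafkaConsumer` does. How many messages to skip is something you would have to determine for your own topic.

```python
def skip_ahead(consumer, partition, count):
    """Move the consumer's position on `partition` forward by `count` messages.

    `consumer` is assumed to offer position() and seek() like KafkaConsumer;
    `partition` would be a TopicPartition that is already assigned to it.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    current = consumer.position(partition)  # offset of the next record to fetch
    target = current + count
    consumer.seek(partition, target)  # next poll starts from `target`
    return target
```

Note that seeking only changes where the consumer reads next; the skipped messages stay in the topic and are simply not consumed.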
I seem to be following closely in @zoltan-fedor's footsteps in digging up obscure occasional compression-related bugs -- I am seeing this one today. I'll attempt to show something reproducible...
Looking back at this issue, I believe it is the same root issue as #718.
In the last few days I have upgraded to kafka 0.9 and kafka-python 1.1.1 and started receiving at times the following error when consuming from Kafka:
'PartialMessage' object has no attribute 'validate_crc'