undefined method 'key'? #1315
Here is the backtrace
|
You said v0.14.6 but your backtrace says you use v0.14.8.
Yeah, I recently updated to try and fix the error, but no luck.
Could you paste the entire log?
|
Thanks!
This line is weird: metadata is created with nil values. I will check why this happens.
This error happens immediately after restart.
There's a large number of buffers when I restart; this error blocks the upload of all of them, so we have a ton of buffers stuck there.
The odd thing is that everything was generated with version 14.6; we don't have any buffers left over from a previous or older version.
Is there a way to tell from the error which buffer file is causing the issue?
This is the chunk id, so you can search for which chunk has this problem. BTW, did you update fluent-plugin-kafka together with fluentd?
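Since fluentd's file buffer embeds the chunk id in the file name, a small script can locate the offending chunk. This is a minimal sketch, assuming the default naming convention (buffer.b<chunk_id>.log for staged chunks, buffer.q<chunk_id>.log for queued ones, plus matching .log.meta files); the directory path and chunk id in the usage comment are placeholders, not values from this issue:

```ruby
# Minimal sketch: find the buffer files whose name embeds a given chunk id.
# Assumes fluentd's file-buffer naming: buffer.b<id>.log / buffer.q<id>.log
# plus the companion .log.meta files.
def find_chunk_files(buffer_dir, chunk_id)
  Dir.glob(File.join(buffer_dir, "buffer.*#{chunk_id}*")).sort
end

# Hypothetical usage (paths are placeholders):
# find_chunk_files("/var/log/fluent/buffer", "58123abc")
```

Once the matching file is found, it can be inspected or moved aside while fluentd is stopped.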
I can reproduce this problem:
In this case, the same error happens.
It is a bug (or documentation issue) of fluent-plugin-kafka, right?
We can't judge it, because if dtboctor used a recent fluent-plugin-kafka version before,
When I encountered this problem I was using fluentd 0.14.6 and fluent-plugin-kafka 0.3.1. This combination was working correctly for a while and is currently what is running on most of our servers without an issue. This problem began while running with that setup. I tried upgrading to fluentd 0.14.8 and 0.4.0 of the plugin together, and the problem still occurred. I also tried just removing this file, and the next file came up with the same error.
I'm not sure. It depends on your buffer content.
Okay, I'm going to look into whether or not there was an issue where the plugins were not quite aligned with each other in terms of version. Is there a way to look at the buffer file and parse it to see what format or version it is using?
I tried backing down to version 0.1.3 of the plugin, which I knew we were using at one time...I get
|
Is there anything additional I can send over that would help in understanding what happened here? |
You can check your buffer file status with the following script:
require 'msgpack'
require 'fluent/msgpack_factory'

unpacker = Fluent::MessagePackFactory.unpacker

log_path = ARGV.first
meta_path = "#{log_path}.meta"

puts "buffer file: #{log_path}"

# The .meta file stores the chunk metadata; "variables" should not be nil
unpacker.feed(File.read(meta_path)).each { |data|
  puts "meta keys: #{data["variables"]}"
}

# Each entry in the buffer file itself is one event
unpacker.feed(File.read(log_path)).each { |data|
  puts "one event length: #{data.size}"
  break # inspecting the first event is enough
}
meta keys is empty and one event consists of 3 elements.
meta keys has
The latest kafka plugin can't flush old buffers. Check your buffer file structure.
One other question: we had a very large event come through our system that broke a lot of other parts. Is there any case where a large message could break fluentd? Like if it were very large in relation to the chunk size for the buffer?
What does "break" mean? fluentd can handle any size of record.
I tried to run the code you posted, and I ran into an interesting problem:
It seems like my files are named differently than what is expected.
I renamed my file to make the script work:
|
It seems like most of our buffer files are buffer.q*, but a few are buffer.b*, and a good number of the meta files are.... Why would that happen?
That's weird...
So in your buffer directory, the number of "b" meta files and the number of "q" buffer files are mismatched, right?
Yes.
And it seems that I have other mismatches in there, definitely more than one.
So we had some errors before this occurred, related to Kafka rejecting our messages due to their size. We eventually corrected that, but could there be a case where such an error causes a chunk to be only half set up? Not sure, but this seems to have happened a lot on this machine, though we have many other machines running this exact setup that have had no issues like this whatsoever.
Is there anything I can do to help diagnose what happened? Also, would just renaming all of these files to q fix the issue in the interim?
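To answer the mismatch question programmatically, here is a minimal sketch that lists unpaired chunk files. It assumes each buffer.*.log should have a companion .log.meta file, as in fluentd's file buffer; the directory path you pass in is a placeholder:

```ruby
# Minimal sketch: report buffer chunk files whose companion file is missing.
# Assumes fluentd's file-buffer pairing: buffer.<state><id>.log should have
# a matching buffer.<state><id>.log.meta. An unpaired file may indicate the
# half-written state discussed above.
def unpaired_chunks(buffer_dir)
  logs  = Dir.glob(File.join(buffer_dir, "buffer.*.log"))
  metas = Dir.glob(File.join(buffer_dir, "buffer.*.log.meta"))
  missing_meta = logs.reject  { |l| metas.include?("#{l}.meta") }
  missing_log  = metas.reject { |m| logs.include?(m.sub(/\.meta\z/, "")) }
  { missing_meta: missing_meta, missing_log: missing_log }
end
```

Running this over the buffer directory should show exactly which chunks are missing their metadata (or vice versa), which narrows down what to move aside or rename.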
I identified that "undefined method 'key' for nil" occurs in
Zulu time format? Any idea why this would be the case, or a way for me to see if I match that description? Not sure why we would be using that.
Sorry for the late reply, and I'm sorry too that I misunderstood the problem in this issue. #1319 is another thing, and not related to this issue.
Hey guys, I found the root cause of this error, so I thought I'd share. Basically, this happens if you send a message through fluentd that is bigger than the configured chunk size. It permanently breaks with this error until you clear out the problematic files. This seems like something that should be handled; once such a message comes through, fluentd basically stops sending to kafka and the buffer grows indefinitely.
It means you can reproduce this problem? If so, could you share your configuration and reproduction steps?
Yeah, I just set my buffer chunk limit to something pretty small, like 10k, on our production server, and when a >10k event came through, it started giving me this error. I'll see if I can get a more specific configuration to you.
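For reference, a reproduction configuration along these lines might look like the following sketch. This is not the reporter's actual config; the match pattern, broker address, topic, and buffer path are placeholders, and only the tiny chunk limit is taken from the description above:

```
<match app.**>
  @type kafka_buffered
  brokers localhost:9092          # placeholder broker
  default_topic test_topic        # placeholder topic
  buffer_type file
  buffer_path /var/log/fluent/kafka-buffer
  buffer_chunk_limit 10k          # deliberately tiny; a >10k event triggers the error
</match>
```

With this in place, sending a single event larger than 10k through fluentd should reproduce the stuck-buffer behavior described.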
An event came through fluentd that was over a megabyte (perhaps close to 2 megabytes) and it crashed with the above error.
Thanks for the configuration.
Hi, any update on this issue? I also have this problem.
stack:
config:
|
@aiwantaozi Does this problem happen with
So for fluentd 1.x, I should use kafka2 not kafka_buffered? Let me try. |
There are really two distinct problems here.
The second problem occurs because of the v0.12 compatibility layer. As to the first problem, the exact cause is not yet known. I have tried
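For fluentd 1.x, the switch to kafka2 suggested above might look like this minimal sketch, which avoids the v0.12 compatibility layer by using the native <buffer> section. The match pattern, broker address, topic, and buffer path are placeholders:

```
<match app.**>
  @type kafka2
  brokers localhost:9092          # placeholder broker
  default_topic test_topic        # placeholder topic
  <format>
    @type json
  </format>
  <buffer topic>
    @type file
    path /var/log/fluent/kafka2-buffer
    chunk_limit_size 8m
  </buffer>
</match>
```

The key difference from kafka_buffered is that buffering is configured through the standard <buffer> section rather than the old buffer_* parameters.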
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove the stale label or comment, or this issue will be closed in 30 days.
This issue was automatically closed after being stale for 30 days.
I'm having some issues with fluentd 0.14.6. It was running okay for a while, but a few days ago it started giving me the following error on startup:
This is on CentOS. I've found references to that issue in older versions; not sure why I am getting it. Is there anything I can send you guys that might explain what is happening?