Support sending logs to Kafka, instead of directly to logstash #258
Comments
The current protocol the forwarder uses allows load sharing and reliable transport. What would Kafka provide? |
Similar discussion on the rabbitmq ticket, actually. |
Related: #190 (rabbitmq) |
Aha! So you're suggesting that instead I could just point logstash-forwarder at logstash instances that only output to Kafka, with other logstash instances consuming from Kafka and doing the processing, and get the same result? (Some logstash instances may do double duty, I suppose.) |
Or, I suppose, you're proposing there's no need at all for kafka. |
Not sure if anyone still cares, but the main benefit of Kafka in this scenario is a durable and reliable buffer for those cases where your log traffic is exceptionally high. I've never stressed Logstash, but I can imagine that there may be cases where the amount of log traffic would be more than the current Logstash instances could handle. In those cases, I can see where the forwarder has to stop and allow Logstash to catch up. Kafka would allow all of the forwarders to continue to push log messages, and Logstash itself would process the messages at its own rate. I'm not saying that the forwarder needs to support it. After working with Kafka for several years, I felt I could provide a reason as to why it might be a good idea. NOTE: Not sure how lumberjack works exactly, so I may have mucked up a detail or two. If so, I apologize. |
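To make the buffering argument above concrete, here is a toy simulation, not a model of Kafka or Logstash internals. It assumes a fixed number of messages produced and consumed per tick; the function name `simulate` and all the numbers are made up for illustration. Without a buffer, the forwarder can only hand off what the indexer can take in that tick and must stall on the rest; with a buffer, everything is accepted and the backlog absorbs the difference.

```python
from collections import deque


def simulate(produce_per_tick, consume_per_tick, ticks, buffered):
    """Return (accepted, processed, stalled, backlog) after `ticks` steps.

    buffered=True  : an unbounded queue (standing in for a Kafka topic)
                     accepts everything the forwarders send.
    buffered=False : the forwarder may only hand off as many messages per
                     tick as the indexer can consume; the rest are stalled
                     (delayed at the source, not lost).
    """
    backlog = deque()
    accepted = processed = stalled = 0
    for _ in range(ticks):
        # Forwarders try to push this tick's messages.
        for _ in range(produce_per_tick):
            if buffered or len(backlog) < consume_per_tick:
                backlog.append(1)
                accepted += 1
            else:
                stalled += 1
        # The indexer drains at its own fixed rate.
        for _ in range(min(consume_per_tick, len(backlog))):
            backlog.popleft()
            processed += 1
    return accepted, processed, stalled, len(backlog)
```

With 5 messages produced and 2 consumed per tick over 3 ticks, the unbuffered case stalls 9 messages at the forwarders, while the buffered case accepts all 15 and leaves 9 waiting in the backlog for the indexer to catch up.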
As someone who has tried various combinations of logstash-forwarder and Kafka, I would like to add that it would be useful if ls-fwder supported Kafka as a transport. As @sybrandy mentioned, logstash and/or ES instances can get busy and be unable to handle the incoming traffic from ls-fwder. It would be nice to have a buffer+transport such as Kafka, especially if that infrastructure already exists (and it does in our case). In our case, ls-fwder overwhelmed our LS + ES, and we ended up writing a small local tool that consumes the local logs and pushes them into Kafka for transport. LS consumes data from Kafka at the other end, at its own pace. In the end, I had to drop ls-fwder completely. |
The lumberjack input plugin does not currently support the full lumberjack protocol. It is missing partial ACK support, which would allow backoff. As such, if LS cannot keep up with logs, LSF will lose its connection and resend, making the situation worse. That might be why it was overwhelmed. I think that without that problem, the need for a queue is for work distribution purposes only (dynamically adding LS worker nodes pulling from a queue). I eventually forked into Log Courier to rewrite the protocol with a backoff and have had no problems since with general workloads. I do have some logs going into Redis via an LS instance (courier in and Redis out, no filters), and that has a dynamic worker pool. Could be an option here. |
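The resend problem described above can be sketched with a toy model. This is not the lumberjack wire protocol or Log Courier's implementation; the function `deliver` and its parameters are hypothetical. It assumes the receiver can process only `per_timeout` events before the sender's timeout fires, and that without partial ACKs a timed-out sender resends the whole window from the start.

```python
def deliver(window, per_timeout, partial_acks, max_attempts=5):
    """Return (acked, handled, attempts) for one window of events.

    window       : number of events in the batch the sender transmits
    per_timeout  : events the receiver can process before the sender times out
    partial_acks : True if the receiver acknowledges progress mid-window
    max_attempts : give up after this many (re)sends, so the model terminates
    """
    if per_timeout <= 0:
        raise ValueError("per_timeout must be positive")
    acked = 0      # events the sender knows were received
    handled = 0    # total events the receiver had to process (incl. repeats)
    attempts = 0
    while acked < window and attempts < max_attempts:
        attempts += 1
        remaining = window - acked
        done = min(per_timeout, remaining)
        handled += done
        if partial_acks:
            # Progress is acknowledged, so the next send resumes from there.
            acked += done
        elif done == remaining:
            # The whole window finished in time and is acknowledged at once.
            acked = window
        # Otherwise: timeout with no ACK; the sender resends everything.
    return acked, handled, attempts
```

For a window of 10 events and a receiver that manages 4 per timeout, the partial-ACK case completes in 3 sends with 10 events handled. The all-or-nothing case never completes: after 5 attempts the receiver has processed 20 events, all repeats of the same first 4, which matches the "making the situation worse" observation.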
+1 any updates on this? |
Logstash has a Kafka output you can use for this purpose. At this time, the
|
Logstash requires deploying and running a JVM on every client. So it would be a win to have a lightweight client (like logstash-forwarder) as the mechanism to send to Logstash servers via Kafka. We have prototyped something using fluentd as the client, writing onto Kafka and being read by the logstash kafka input module. But this has the requirement of having to put Ruby on all our client servers. We're using Kafka for everything else. It would be nice to have it transport logs as well, and to have a lightweight client with no dependencies, like logstash-forwarder. If I had time I would take a stab at it. There are a couple of very nice Kafka libraries for Go: The basic but full-featured Go client for Kafka. Supports Avro and schema registry, built on top of Sarama: |
Have you looked at kafkacat? I was using lsf, but that got painful. https://github.com/edenhill/kafkacat I use kafkacat to read logfiles and send them to our kafka infrastructure. Tin
|
That’s pretty nice. Do you need to do anything special so the logstash kafka input module will consume the logs on the topic you use kafkacat to publish to? Could you share an example runit config file that does this? Do you have a config file per log, or do you tail a bunch of logs and let logstash sort it out? Thanks
|
Since my log files are already in json format, I did not have to do
Yes, one config file per log, as these logs can get fairly large (700GB to
Here is the meat of the kafkaput.sh script. Define the variable and you

TOPIC="log-event_nginx_json"
tail -F ${NGINXLOG} | kafkacat -P -b ${BROKER} -t ${TOPIC} -p ${PARTITIONS}

As for runit, just tell it to keep the script running, that's it.

#!/bin/bash

I used fpm to create a kafkaput rpm with the script and runit setup as the

Tin
|
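As a rough illustration of the "one config per log" approach with the tail and kafkacat pipeline quoted above, the sketch below builds one shell pipeline string per log file. It only constructs the command lines and does not run anything. The helper names, the broker address, the file paths, and the second topic are all hypothetical; only the `kafkacat -P -b ... -t ... -p ...` flags and the nginx topic name come from the thread.

```python
import shlex

# Hypothetical mapping of log file -> Kafka topic. Only the nginx topic name
# appears in the thread; the paths and the second topic are made up.
LOGS = {
    "/var/log/nginx/access.json": "log-event_nginx_json",
    "/var/log/app/events.json": "log-event_app_json",
}


def kafkacat_argv(broker, topic, partition=None):
    """Argument list for kafkacat in producer mode (-P)."""
    argv = ["kafkacat", "-P", "-b", broker, "-t", topic]
    if partition is not None:
        argv += ["-p", str(partition)]
    return argv


def pipeline(logfile, broker, topic, partition=None):
    """Shell pipeline that follows `logfile` and produces each line to `topic`."""
    tail = ["tail", "-F", logfile]
    producer = kafkacat_argv(broker, topic, partition)
    # shlex.join quotes any argument that needs it (e.g. paths with spaces).
    return " | ".join(shlex.join(cmd) for cmd in (tail, producer))


def pipelines(logs, broker):
    """One pipeline per log file, suitable for one supervised script each."""
    return [pipeline(path, broker, topic) for path, topic in logs.items()]
```

Each resulting string could be dropped into its own supervised run script, so a stall or restart on one large log does not affect the others.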
+1 |
+1 |
For those +1'ing this thread: I don't think it will happen anytime soon. Agree or not, Elastic still seems to see LS as the gateway to Kafka when using the ELK stack. With the death of LF, this is even more explicitly stated in:
https://github.com/elastic/libbeat/blob/master/CONTRIBUTING.md There are a few options in addition to the kafkacat already mentioned above, in particular: Mozilla's Heka (just be mindful that Heka's Kafka implementation, which uses Sarama async, is not durable) and some random dude's Kafka version of "lumberjack". |
For those that may not be following Beats closely, libbeat (and Filebeat) now support a Kafka output in master: elastic/beats#942 |
Closing this issue in favor of elastic/beats#943 so we can continue the discussion there and in the related issues. |
Hi Tin, |
@sunilmchaudhari It's been years since I last used kafkacat, but last I checked, it does support talking to kafka over SSL. |
It's useful to have a buffer for logstash indexers to pull from. (Redis, RabbitMQ, Kafka). logstash-forwarder should support sending data into Kafka.