
high memory usage using stream consumer and potentially memory leaking #179

Closed
terrywh opened this issue May 27, 2017 · 21 comments

@terrywh

terrywh commented May 27, 2017

Node: v7.10.0, node-rdkafka: v0.10.2
Messages are produced to the topic at about 500/s. Using the code below as a consumer, memory climbs all the way up to 1GB:

"use strict";

const Kafka = require("node-rdkafka"),
	Writable = require("stream").Writable;

var cc = new Kafka.KafkaConsumer({
  'group.id': 'memory-leak-detector',
  'metadata.broker.list': '11.22.33.44:9092,22.33.44.55:9092',
}, {});
cc.on("error", function(err) {
	console.log("consumer:", err);
}).on("ready", function() {
	console.log("ready");
});


var cs = cc.getReadStream(["cmd-onoff-n"], {waitInterval: 10});
cs.on("error", function(err) {
	console.log("consumer stream:", err);
});
cs.pipe(new Writable({
	objectMode: true,
	write: function(data, encoding, callback) {
		console.log(data);
		callback(null);
	},
}));

[screenshots taken at 12:23 and 12:44, 2017-05-27]

By setting 'enable.auto.commit': true, I got the following:
[screenshots taken at 14:30, 15:07, and 15:54, 2017-05-27]

Compared to, GO with confluent-kafka-go which is also based on librdkafka, and i got a average memory usage of 80MB.

@webmakersteve
Contributor

We run these producers all day and they handle a large number of messages per day and I am not seeing similar behavior in our own applications.

Node and Go are different beasts. Node.js has a garbage collector that runs when it needs to, and sometimes not a moment sooner.

I would reduce Node's heap size and see if the process ever exceeds it and crashes because it runs out of heap. Otherwise, the garbage collector may simply not be running because it doesn't need to.
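A quick way to run that check (a minimal sketch; the heap cap and interval below are arbitrary example values):

// run the consumer with a constrained heap, e.g.: node --max-old-space-size=256 consumer.js
// log heap vs. RSS periodically; if heapUsed stays flat while rss keeps climbing,
// the growth is outside the V8 heap (native allocations), not something GC can reclaim
setInterval(function() {
  var m = process.memoryUsage();
  console.log(
    "rss:", (m.rss / 1048576).toFixed(1) + "MB",
    "heapUsed:", (m.heapUsed / 1048576).toFixed(1) + "MB",
    "external:", ((m.external || 0) / 1048576).toFixed(1) + "MB"
  );
}, 10000);

If the process really is running out of heap, under --max-old-space-size it should eventually abort with a fatal "JavaScript heap out of memory" error, which points at the heap rather than native memory.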

@terrywh
Author

terrywh commented Jun 20, 2017

@webmakersteve the high memory usage is reproduced by the "consumer" in the code above, not a "producer". Maybe I didn't make that clear, or did I miss something?

@webmakersteve
Contributor

Was this fixed by changing node versions?

@hugebdu

hugebdu commented Aug 23, 2017

Same here. Node v6.11.0, using the standard consumer API (non-stream).
I have a consumer at a ~21k/m rate that has been running for 12 hours. RSS is 1.5G and counting; heap is steady.
Will try different node versions.
@webmakersteve any input is welcome :)
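For reference, node-rdkafka's standard (non-stream) consumption path looks roughly like this; the broker and topic names below are placeholders, not the actual setup:

"use strict";

const Kafka = require("node-rdkafka");

// sketch of the standard (non-stream) consumer API; broker/topic are placeholders
var consumer = new Kafka.KafkaConsumer({
  'group.id': 'memory-leak-detector',
  'metadata.broker.list': 'broker:9092',
}, {});

consumer.connect();

consumer.on('ready', function() {
  consumer.subscribe(['some-topic']);
  consumer.consume(); // flowing mode: a 'data' event is emitted per message
}).on('data', function(message) {
  // message.value is a Buffer (and may be null for tombstones)
  console.log(message.topic, message.partition, message.offset);
});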

@hugebdu

hugebdu commented Aug 23, 2017

Actually found out that I'm using the defaults for queued.max.messages.kbytes. Will try with a lower value and update.

@hugebdu

hugebdu commented Aug 23, 2017

Well, no luck with setting a lower queued.max.messages.kbytes. Non-heap memory still increases constantly:
[screenshot: 2017-08-23 12:26]

Will let it run to see whether it crashes.

@hugebdu

hugebdu commented Aug 29, 2017

UPD: tried the following and still see non-heap memory increasing endlessly:

  • node v8.4.0 (node crashed every ~minute, didn't check the cause)
  • node v6.10.3
  • node v6.11.0
  • w/o auto-commits
  • w/o event_cb

Giving up, switched to no-kafka

@webmakersteve
Contributor

Sorry to hear it isn't working for you. Were you using the same code linked above? Because that code is outdated.

In any event, the code you were using when you ran these tests would be helpful so I could confirm it isn't happening; if you wouldn't mind, link to it if it's easily isolated.

@hugebdu

hugebdu commented Aug 29, 2017

Stephen, thanks for following up.
I'll try to share the interesting parts of my code later today/tomorrow.

@webmakersteve
Contributor

@terrywh I adapted the code above to the current version of node-rdkafka (2.0.0) and I can't reproduce a memory leak after running for around 8 hours with a stream of data of about 100 messages per second.

Was testing on node v7.5.0

@webmakersteve
Contributor

Here is the code I was running:

"use strict";

const Kafka = require("./lib"),
    Writable = require("stream").Writable;

var cs = Kafka.createReadStream({
  'group.id': 'memory-leak-detector',
  'metadata.broker.list': 'broker',
  'enable.auto.commit': false
}, {}, {
  topics: ['topic'],
  waitInterval: 10
});

var cc = cs.consumer;

cc.on("error", function(err) {
    console.log("consumer:", err);
}).on("ready", function() {
    console.log("ready");
});


cs.on("error", function(err) {
    console.log("consumer stream:", err);
})
cs.pipe(new Writable({
    objectMode: true,
    write: function(data, encoding, callback) {
        console.log(data);
        callback(null);
    },
}));

@michallevin

@terrywh @hugebdu Were you able to solve your memory issue?

@hugebdu

hugebdu commented Jan 14, 2018

@michallevin After a long journey of trying various Kafka Node clients, I'm using no-kafka. So far so good, but it's an entirely Node.js implementation, not a native Kafka driver, for better and for worse.

good luck

@michallevin

@hugebdu Wow, surprising. We were thinking the problem might not be in Kafka at all. We are using Highland.js and uploading files to S3...

@webmakersteve
Contributor

@michallevin if you're having memory leak issues in your application, try to isolate it if you can. We've had memory leaks in the software before but I think we've resolved all of them. We run these consumers all day every day, but there still could be leaks in the consumer in edge cases that we may not be hitting.

But going to close this issue. Open a new one if you're having trouble!

@michallevin

@webmakersteve @hugebdu We were just able to solve our issue with queued.max.messages.kbytes. We realized the default is ~1GB per topic 😮
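In config terms, that just means setting the option explicitly in the consumer's global configuration. A sketch below; the group id, broker, and the 100,000 kB value are only examples:

// illustrative consumer config: cap locally queued, pre-fetched messages
// at ~100 MB instead of the ~1 GB default mentioned above
var consumer = new Kafka.KafkaConsumer({
  'group.id': 'my-group',
  'metadata.broker.list': 'broker:9092',
  'queued.max.messages.kbytes': 100000,
}, {});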

@hugebdu

hugebdu commented Jan 17, 2018

@michallevin I remember I did try playing with various config options and was still having issues. Also, it was crashing completely with Node 7 (yup, I know it wasn't LTS).

@senthil5053

@michallevin What would be a decent value for queued.max.messages.kbytes (1024?), considering GBs of data being consumed from a single topic?

@michallevin

michallevin commented May 24, 2018

@senthil5053 We use 100,000 kbytes, but for several topics and consumers. If you have a single consumer you should be able to use the default.

@senthil5053

@michallevin Thanks.

@yanqic

yanqic commented Feb 23, 2019

Did you fix the problem? I have the same problem. Node version: v8.15.0. Could you tell me how to solve it? Thanks.
