Prevent memory leaks (many locks stay in memory) #368

Closed · collimarco opened this issue Feb 8, 2019 · 7 comments

@collimarco

This is more a question than a real bug report. We were investigating high memory usage in Redis and found that more than 90% of the keys (~15M) looked like this:

uniquejobs:abcd...efgxyz:AVAILABLE

Most of these keys had no TTL. I can't say whether they were leaked recently or by an old version.

However, I have a question: in order to avoid these memory leaks in the future, is it enough to use the lock_expiration option?

We use this:

sidekiq_options queue: :low, lock: :until_executing, lock_expiration: 600

Can you confirm that in this case the lock and its expiration are set in the same transaction (e.g. MULTI), so that we are protected against memory leaks? If not, can you consider this a feature request?
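
To illustrate what I mean by "in the same transaction", here is a rough sketch in plain redis-rb (recent versions); the key name, value and TTL are only examples, and this is of course not the gem's actual implementation:

    require 'redis'

    redis = Redis.new
    lock_key = 'uniquejobs:abcd...efgxyz:AVAILABLE' # example key name

    # A single SET with NX + EX is atomic by itself, so the lock can never
    # end up in Redis without an expiration attached to it.
    redis.set(lock_key, 'token', nx: true, ex: 600)

    # If separate commands are unavoidable, MULTI/EXEC queues them so that
    # the lock and its TTL are applied together.
    redis.multi do |tx|
      tx.set(lock_key, 'token')
      tx.expire(lock_key, 600)
    end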

@collimarco (Author)

I have now deleted all the uniquejobs:* keys. What I've noticed is that, although I set lock_expiration (see the example above), the following keys are still created in Redis without a TTL:

uniquejobs:abcd...efgxyz:GRABBED

Is that normal? Could you consider setting an expiration (TTL) on those keys as well? I think that setting a TTL on all keys is good practice to protect against (expensive) memory leaks. BTW, thanks for the great work!
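
In case it helps, this is roughly how I check for lock keys that will never expire (just a diagnostic sketch with redis-rb; a TTL of -1 means the key exists but has no expiry set):

    require 'redis'

    redis = Redis.new

    # Walk the lock keys with SCAN and report the ones without a TTL.
    leaked = redis.scan_each(match: 'uniquejobs:*').select do |key|
      redis.ttl(key) == -1
    end

    puts "#{leaked.size} uniquejobs keys without a TTL"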

@mhenrixon (Owner)

I'll slap something together this weekend! Thanks for reporting!

@mhenrixon mhenrixon added the bug label Feb 8, 2019
@mhenrixon mhenrixon self-assigned this Feb 8, 2019
@ZhekaV commented Feb 13, 2019

I have the same issue.
Sometimes not all keys are removed from the digests list; after a couple of days there are thousands of keys in it.

As a temporary workaround, I added the following to my workers:

    def perform
      DailyKpi::Calculator.new.calculate
      # Remove the unique digest by hand once the job has finished.
      delete_from_unique_jobs_digests
    end

    private

    # Recompute the digest for this worker (same class, queue and args as
    # the enqueued job) and delete it from the SidekiqUniqueJobs digests.
    def delete_from_unique_jobs_digests
      digest = SidekiqUniqueJobs::UniqueArgs.digest(
        'class' => self.class,
        'queue' => sidekiq_options_hash['queue'],
        'args'  => []
      )
      SidekiqUniqueJobs::Digests.del(digest: digest)
    end

I'm not sure whether it will help in my case, but I'll keep an eye on the digests list now.

If somebody knows a better solution, please let me know.

@ZhekaV commented Feb 13, 2019

I think I know what the problem was in my case.
There was a queue with enqueued jobs in it; the queue was deleted, but the keys were not removed from SidekiqUniqueJobs::Digests.

If that's a possible and "normal" scenario, then sorry for the noise.
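
To make the scenario concrete, something along these lines is what seems to have happened on my side (the queue name is made up, and dropping every digest like this is only safe if you know the remaining ones are stale):

    require 'sidekiq/api'

    # Clearing a queue goes through Sidekiq's own API, so the unique-jobs
    # middleware never sees those jobs being removed...
    Sidekiq::Queue.new('daily_kpi').clear

    # ...and their digests stay behind. Manual cleanup, one digest at a time:
    SidekiqUniqueJobs::Digests.all.each do |digest|
      SidekiqUniqueJobs::Digests.del(digest: digest)
    end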

@mhenrixon (Owner)

@ZhekaV unfortunately I think it would be really hard to detect that, but I'll have a look at it. Thanks for letting me know. Those comments are gold for me.

@ZhekaV commented Feb 18, 2019

Something is definitely wrong here:

[1] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["71c224bcd97b:1:5d1fb941fb87", "processes", "stat:processed:2019-02-18", "stat:failed", "stat:processed", "stat:failed:2019-02-18", "queues"]
[2] pry(main)> date = Time.zone.today

Job enqueued:

[3] pry(main)> PageStats::CalculatorWorker.perform_async('777777', date)
=> "f4e05406a46786d8832e3f32"
[4] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["71c224bcd97b:1:5d1fb941fb87:workers",
 "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:EXISTS",
 "71c224bcd97b:1:5d1fb941fb87",
 "unique:keys",
 "processes",
 "stat:processed:2019-02-18",
 "stat:failed",
 "stat:processed",
 "stat:failed:2019-02-18",
 "throttled:PageStats::CalculatorWorker:concurrency.v2",
 "queues",
 "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:GRABBED"]
[5] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["71c224bcd97b:1:5d1fb941fb87:workers",
 "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:EXISTS",
 "71c224bcd97b:1:5d1fb941fb87",
 "processes",
 "stat:processed:2019-02-18",
 "stat:failed",
 "stat:processed",
 "stat:failed:2019-02-18",
 "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:AVAILABLE",
 "queues"]

Job is done:

mm_sidekiq       | 2019-02-18T14:24:31.468Z 1 TID-gsxhhqnal PageStats::CalculatorWorker JID-e48b43b8009417bec039c7e6 INFO: start
mm_sidekiq       | 2019-02-18T14:24:31.471Z 1 TID-gsxhhqnal PageStats::CalculatorWorker JID-e48b43b8009417bec039c7e6 INFO: done: 0.003 sec

But the key is still in Redis:

[6] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:EXISTS", 
"71c224bcd97b:1:5d1fb941fb87", 
"processes", 
"stat:processed:2019-02-18", 
"stat:failed", 
"stat:processed", 
"stat:failed:2019-02-18", 
"queues"]

But the digests list is blank in the Sidekiq UI and here:

[7] pry(main)> SidekiqUniqueJobs::Digests.all
=> []

Anyway, the job will not be enqueued anymore:

[8] pry(main)> PageStats::CalculatorWorker.perform_async('777777', date)
2019-02-18T13:59:59.224Z 88 TID-grq6dmnrs SidekiqUniqueJobs::Client::Middleware DIG-uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc INFO: skipping job with id (9aa5d1523dadd23efea559cc) because unique_digest: (uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc) already exists
=> "9aa5d1523dadd23efea559cc"

Then, if I try to delete this digest from the console:

[9] pry(main)> digest = 'uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc'
=> "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc"

[10] pry(main)> SidekiqUniqueJobs::Digests.del(digest: digest)
2019-02-18T14:10:53.686Z 88 TID-grq6dmnrs INFO: delete_by_digest(uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc) completed in 0.0ms
=> 0

[11] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["71c224bcd97b:1:5d1fb941fb87", 
"processes", 
"stat:processed:2019-02-18", 
"stat:failed", 
"stat:processed", 
"stat:failed:2019-02-18", 
"throttled:PageStats::CalculatorWorker:concurrency.v2", 
"queues"]

It works again:

[12] pry(main)> PageStats::CalculatorWorker.perform_async('777777', date)
=> "39aa7193847aaa67f6d68f55"

[13] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:EXISTS",
 "71c224bcd97b:1:5d1fb941fb87",
 "processes",
 "stat:processed:2019-02-18",
 "stat:failed",
 "stat:processed",
 "stat:failed:2019-02-18",
 "uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:AVAILABLE",
 "throttled:PageStats::CalculatorWorker:concurrency.v2",
 "queues"]

[14] pry(main)> REDIS_POOL.with {|c| p c.keys}
=> ["uniquejobs:9c367fce973b6f855656f1dc7ceaf2cc:EXISTS",
 "71c224bcd97b:1:5d1fb941fb87",
 "processes",
 "stat:processed:2019-02-18",
 "stat:failed",
 "stat:processed",
 "stat:failed:2019-02-18",
 "throttled:PageStats::CalculatorWorker:concurrency.v2",
 "queues"]

But not for long, because the key is there again.
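
For completeness, the whole check above boils down to roughly this (same worker and argument as in my console session; the sleep is just an arbitrary wait for the job to finish):

    # Enqueue the job, let it finish, then look at what stays behind.
    jid = PageStats::CalculatorWorker.perform_async('777777', Time.zone.today)
    puts "enqueued #{jid}"

    sleep 5 # the job itself completes in a few milliseconds

    REDIS_POOL.with do |redis|
      leftovers = redis.scan_each(match: 'uniquejobs:*').to_a
      puts "leftover lock keys: #{leftovers.inspect}"            # the :EXISTS key is still here
      puts "digests: #{SidekiqUniqueJobs::Digests.all.inspect}"  # => []
    end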

@mhenrixon (Owner)

Thanks for the reports, @ZhekaV and @collimarco, I will get this fixed tonight!
