Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manager failed: stack level too deep with v3.1 #396

Closed
pjg opened this issue Jul 5, 2017 · 29 comments · Fixed by #400
Closed

Manager failed: stack level too deep with v3.1 #396

pjg opened this issue Jul 5, 2017 · 29 comments · Fixed by #400

Comments

@pjg
Copy link

pjg commented Jul 5, 2017

I just bumped to 3.1 and now need to revert. Ruby 2.3.3p222.

2017-07-05T07:02:45Z 28438 TID-194sa8 INFO: Ready: 2, Busy: 0, Active Queues: [["...", 1]]
2017-07-05T07:02:48Z 28438 TID-ud0t8 ERROR: Manager failed: stack level too deep
2017-07-05T07:02:48Z 28438 TID-ud0t8 ERROR: /mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in
 synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/atomic/event.rb:49:in `set?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/ruby_executor_service.rb:67:in `ns_running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/abstract_executor_service.rb:42:in `block in running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/abstract_executor_service.rb:42:in `running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:28:in `stopped?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:32:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:35:in `dispatch'

And then the log goes on and on for megabytes of backtraces with no end in sight.

I suppose it's caused by this: #389

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

Hi @pjg

Could you share your shoryuken.yml? Did you change it with the 3.1.0 upgrade?

@pjg
Copy link
Author

pjg commented Jul 5, 2017

Nothing fancy on my end:

concurrency: 2
delay: 3
timeout: 30

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg did you get that error just by running Shoryuken? Or was that during stop?

@pjg
Copy link
Author

pjg commented Jul 5, 2017

I got it just by running. And it's been flooding my log. But it auto-restarted too, with 3.1

@pjg
Copy link
Author

pjg commented Jul 5, 2017

And I didn't change my config at all.

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg I couldn't reproduce it, but I believe Concurrent.global_io_executor.running? is the offender. TBH I think we are safe only with @done.true?, I added the running? checker just to be extra careful.

Is this error easy for you to reproduce? If so, can I do a quick change, for you to test?

@pjg
Copy link
Author

pjg commented Jul 5, 2017

Me and other developer in my team were not able to reproduce it on our Macbooks. I only saw the issue on a fairly old Ubuntu server, 12.04.4 LTS, 3.2.0-32 kernel. When I saw it I downgraded, so cannot reproduce it now.

FWIW the queue processing was working fine, just that the system would constantly broke down and restart itself (or via my monit setup, couldn't tell).

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg got it! I will test it on Ubuntu and let you know.

phstc pushed a commit that referenced this issue Jul 5, 2017
@phstc phstc closed this as completed in #399 Jul 5, 2017
phstc added a commit that referenced this issue Jul 5, 2017
@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

Hi @pjg

I couldn't reproduce that even in a Ubuntu box. Anyway, I did a change that I hope can fix the issue. Would you be able to test it before I release a new hotfix version?

gem 'shoryuken', git: 'git://github.com/phstc/shoryuken.git', ref: '484efc0'

@pjg
Copy link
Author

pjg commented Jul 5, 2017

I can reproduce it with two queues, one with priority 1 and the other with 5. It "crashes" after few minute of idling. Unfortunately, I can also reproduce when bundling the 484efc0 ref :/

Similar backtrace:

2017-07-05T18:35:28Z 20499 TID-2fvuc INFO: Received TERM, will shutdown down
2017-07-05T18:35:28Z 20499 TID-2fvuc INFO: Shutting down
2017-07-05T18:35:38Z 20987 TID-508os INFO: Starting
2017-07-05T18:45:24Z 20987 TID-ov2y558ec ERROR: Manager failed: stack level too deep
2017-07-05T18:45:24Z 20987 TID-ov2y558ec ERROR: /mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_list.rb:18:in `set'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:79:in `set'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:70:in `format'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:47:in `block in list'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `each'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `with_index'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `list'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:66:in `format'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:26:in `block in structure'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:23:in `each_pair'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:23:in `structure'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:16:in `apply'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:48:in `apply_params'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:42:in `build_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:26:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/user_agent.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/endpoint.rb:41:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/param_validator.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/raise_response_errors.rb:14:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/response_target.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/request.rb:70:in `send_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/base.rb:207:in `block (2 levels) in define_operation_methods'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/queue.rb:43:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/fetcher.rb:40:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/fetcher.rb:18:in `fetch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:83:in `dispatch_single_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:42:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg thanks! at least the backtrace is different now. I will try to reproduce it again.

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

I'm wondering since the error is the same (no matter the backtrace), if this sleep is too short.

@phstc phstc reopened this Jul 5, 2017
phstc pushed a commit that referenced this issue Jul 5, 2017
@phstc phstc mentioned this issue Jul 5, 2017
@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg if you have a chance to test before me new SHA is 40332a5.

But I don't know if that would help much TBH, that sleep is around since 3.x. Maybe a difference now is that we have multiple threads trying to use the AWS::SQS::Client at once.

@pjg
Copy link
Author

pjg commented Jul 5, 2017

Sorry, but it still breaks. Just takes a while longer to break (whole 40 minutes) of idling (I've also upgraded aws-sdk this time).

2017-07-05T19:16:08Z 22700 TID-4vbp4 INFO: Received TERM, will shutdown down
2017-07-05T19:16:08Z 22700 TID-4vbp4 INFO: Shutting down
2017-07-05T19:16:18Z 23211 TID-2bbi8 INFO: Starting
2017-07-05T19:58:36Z 23211 TID-ow2nl9u54 ERROR: Manager failed: stack level too deep
2017-07-05T19:58:36Z 23211 TID-ow2nl9u54 ERROR: /usr/lib/ruby/2.3.0/uri/generic.rb:196:in `initialize'
/usr/lib/ruby/2.3.0/net/http.rb:1051:in `new'
/usr/lib/ruby/2.3.0/net/http.rb:1051:in `proxy_uri'
/usr/lib/ruby/2.3.0/net/http.rb:1038:in `proxy?'
/usr/lib/ruby/2.3.0/net/http.rb:869:in `connect'
/usr/lib/ruby/2.3.0/net/http.rb:1476:in `begin_transport'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/patches.rb:25:in `new_transport_request'
/usr/lib/ruby/2.3.0/net/http.rb:1407:in `request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/connection_pool.rb:330:in `request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:72:in `block in transmit'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:121:in `block in session'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/connection_pool.rb:96:in `session_for'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:119:in `session'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:71:in `transmit'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:45:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/content_length.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/xml/error_handler.rb:8:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/request_signer.rb:88:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:10:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/retry_errors.rb:89:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/sqs_queue_urls.rb:13:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/query/handler.rb:27:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/user_agent.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/endpoint.rb:41:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/param_validator.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/raise_response_errors.rb:14:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/response_target.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/request.rb:70:in `send_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/base.rb:207:in `block (2 levels) in define_operation_methods'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/queue.rb:43:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/fetcher.rb:40:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/fetcher.rb:18:in `fetch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:83:in `dispatch_single_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:42:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'

@pjg
Copy link
Author

pjg commented Jul 5, 2017

I reminded myself that we also have a custom Worker class in our setup. Something along those lines:

class SubscribeContactWorker
  include Shoryuken::Worker

  shoryuken_options queue: -> { ... }
  shoryuken_options auto_delete: true, body_parser: self
 
  def self.parse(body)
    data = body.match(/.../)
    Contact.new(*data.captures) if data
  end

  def self.perform_now(body)
    new.subscribe(parse(body))
  end

  def perform(sqs_msg, contact)
    subscribe(contact)
  rescue => e
    sqs_msg.delete
  end

  def subscribe(contact)
     ...

And in the initializer:

class CustomWorkerRegistry < Shoryuken::DefaultWorkerRegistry
  def fetch_worker(queue, message)
    return SubscribeContactWorker.new if SubscribeContactWorker.parse(message.body)
    super
  rescue => _
    ...
  end
end

Shoryuken.configure_client do |config|
  config.sqs_client = Aws::SQS::Client.new
end

Shoryuken.configure_server do |config|
  config.sqs_client = Aws::SQS::Client.new

  Rails.logger = Shoryuken::Logging.logger
  Rails.logger.level = Rails.application.config.log_level
end

Shoryuken.worker_registry = CustomWorkerRegistry.new

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

@pjg what's your aws-sdk version? I'm wondering if it's an issue with not being thread-safe.

@pjg
Copy link
Author

pjg commented Jul 5, 2017

I tried with two versions. One latest, one little older. You can see the exact versions in the logs above.

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

You can see the exact versions in the logs above.

oops 🤦‍♂️

@phstc
Copy link
Collaborator

phstc commented Jul 5, 2017

I'm now able to reproduce it 🍻

Pablois debugging 🐛

@phstc phstc closed this as completed in #400 Jul 6, 2017
phstc added a commit that referenced this issue Jul 6, 2017
@phstc
Copy link
Collaborator

phstc commented Jul 6, 2017

@pjg took me a very long while, but it seems to be fixed now, I can no longer reproduce it. I will keep the process running during the night, let's see how it goes.

Would you mind test this SHA 27e0c448195af3e72e735d106cf89eae1e8ca32f, it should be good to go.

@pjg
Copy link
Author

pjg commented Jul 6, 2017

I've been running 27e0c4481 for about 12 hours straight now. No issues so far.

@phstc
Copy link
Collaborator

phstc commented Jul 6, 2017

yay awesome! Thanks for reporting and helping with the fix 🍻

OSS 🤘

@pjg
Copy link
Author

pjg commented Jul 7, 2017

Thanks for the fast fix!

@phstc
Copy link
Collaborator

phstc commented Jul 7, 2017

@pjg BTW 3.1.2 is out with the fix

@pjg
Copy link
Author

pjg commented Oct 4, 2017

I have now upgraded all my production servers to use Shoryuken 3.1.12 (from 3.0.7) and something is not right. The moment we upgraded, this graph started to look bad:

image

Our retention period is 5 days, that's why those messages accumulate on this graph. I think that some worker process just takes a job, marks it as being processed and then dies silently, without ever releasing it back.

I have a strong feeling that this is still a regression related to this issue 😞

==== UPDATE ====

Nevermind. It's actually the new exception handling for ActiveJob in the sentry-ruby gem. When it cannot find the serialized object (ActiveRecord), it will not be rescued properly and the message won't be deleted from the queue. Now trying to find a fix for that...

==== UPDATE TWO ====

Just found this guide... Should have read it first before jumping into the new version with native ActiveJob support:
https://github.com/phstc/shoryuken/wiki/Rails-Integration-Active-Job#active-job-support

@phstc
Copy link
Collaborator

phstc commented Oct 4, 2017

@pjg sorry for that. but I'm not sure if I understood the issue, is it because you have non-ActiveJob messages mixed in a queue also consumed by ActiveJob workers? Shoryuken does not force a message format, but ActiveJob does.

@pjg
Copy link
Author

pjg commented Oct 7, 2017

I got to the bottom of this. My issue is that I have ERROR: Processor failed: Couldn't find Site with 'id'=xxx errors in the logs. Those messages are not deleted from the queue when this exception is raised. I'm thinking I need to have some custom rescue somewhere. I believe Rails 5+ ignores those errors in ActiveJob by default.

@phstc
Copy link
Collaborator

phstc commented Oct 8, 2017

@pjg Shoryuken does not auto delete messages if an exception is raised.

Rescuing that error will do, but you can also:

def perform(site_id)
  return unless site = Site.where(id: site_id).first

  # ...
end

BTW are you using a DL queue? If not, have a look at them, I strongly recommend using DL queues.

@pjg
Copy link
Author

pjg commented Oct 8, 2017

I'm serializing whole objects (Site, not site_id) and ActiveJob takes care of handling the unique record identifier, but if record is no longer there, my job will fail and will stay in the queue. Dead letter queue might be the solution here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants