Manager failed: stack level too deep with v3.1 #396

pjg · 2017-07-05T08:43:19Z

I just bumped to 3.1 and now need to revert. Ruby 2.3.3p222.

2017-07-05T07:02:45Z 28438 TID-194sa8 INFO: Ready: 2, Busy: 0, Active Queues: [["...", 1]]
2017-07-05T07:02:48Z 28438 TID-ud0t8 ERROR: Manager failed: stack level too deep
2017-07-05T07:02:48Z 28438 TID-ud0t8 ERROR: /mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in
 synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/atomic/event.rb:49:in `set?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/ruby_executor_service.rb:67:in `ns_running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/abstract_executor_service.rb:42:in `block in running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/concurrent-ruby-1.0.5/lib/concurrent/executor/abstract_executor_service.rb:42:in `running?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:28:in `stopped?'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:32:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/shoryuken-3.1.0/lib/shoryuken/manager.rb:35:in `dispatch'

And then the log goes on and on for megabytes of backtraces with no end in sight.

I suppose it's caused by this: #389

The text was updated successfully, but these errors were encountered:

phstc · 2017-07-05T13:35:42Z

Hi @pjg

Could you share your shoryuken.yml? Did you change it with the 3.1.0 upgrade?

pjg · 2017-07-05T13:36:41Z

Nothing fancy on my end:

concurrency: 2
delay: 3
timeout: 30

phstc · 2017-07-05T13:37:22Z

@pjg did you get that error just by running Shoryuken? Or was that during stop?

pjg · 2017-07-05T13:39:29Z

I got it just by running. And it's been flooding my log. But it auto-restarted too, with 3.1

pjg · 2017-07-05T13:39:46Z

And I didn't change my config at all.

phstc · 2017-07-05T13:44:18Z

@pjg I couldn't reproduce it, but I believe Concurrent.global_io_executor.running? is the offender. TBH I think we are safe only with @done.true?, I added the running? checker just to be extra careful.

Is this error easy for you to reproduce? If so, can I do a quick change, for you to test?

pjg · 2017-07-05T14:13:41Z

Me and other developer in my team were not able to reproduce it on our Macbooks. I only saw the issue on a fairly old Ubuntu server, 12.04.4 LTS, 3.2.0-32 kernel. When I saw it I downgraded, so cannot reproduce it now.

FWIW the queue processing was working fine, just that the system would constantly broke down and restart itself (or via my monit setup, couldn't tell).

phstc · 2017-07-05T14:40:17Z

@pjg got it! I will test it on Ubuntu and let you know.

@done

Rely only on @done for running managers Fix #396

phstc · 2017-07-05T16:08:25Z

Hi @pjg

I couldn't reproduce that even in a Ubuntu box. Anyway, I did a change that I hope can fix the issue. Would you be able to test it before I release a new hotfix version?

gem 'shoryuken', git: 'git://github.com/phstc/shoryuken.git', ref: '484efc0'

pjg · 2017-07-05T18:56:31Z

I can reproduce it with two queues, one with priority 1 and the other with 5. It "crashes" after few minute of idling. Unfortunately, I can also reproduce when bundling the 484efc0 ref :/

Similar backtrace:

2017-07-05T18:35:28Z 20499 TID-2fvuc INFO: Received TERM, will shutdown down
2017-07-05T18:35:28Z 20499 TID-2fvuc INFO: Shutting down
2017-07-05T18:35:38Z 20987 TID-508os INFO: Starting
2017-07-05T18:45:24Z 20987 TID-ov2y558ec ERROR: Manager failed: stack level too deep
2017-07-05T18:45:24Z 20987 TID-ov2y558ec ERROR: /mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_list.rb:18:in `set'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:79:in `set'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:70:in `format'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:47:in `block in list'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `each'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `with_index'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:46:in `list'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:66:in `format'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:26:in `block in structure'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:23:in `each_pair'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:23:in `structure'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/param_builder.rb:16:in `apply'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:48:in `apply_params'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:42:in `build_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/query/handler.rb:26:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/user_agent.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/endpoint.rb:41:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/param_validator.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/raise_response_errors.rb:14:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/plugins/response_target.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/request.rb:70:in `send_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.9.19/lib/seahorse/client/base.rb:207:in `block (2 levels) in define_operation_methods'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/queue.rb:43:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/fetcher.rb:40:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/fetcher.rb:18:in `fetch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:83:in `dispatch_single_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:42:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-484efc04cb51/lib/shoryuken/manager.rb:49:in `dispatch_later'

phstc · 2017-07-05T19:05:19Z

@pjg thanks! at least the backtrace is different now. I will try to reproduce it again.

phstc · 2017-07-05T19:06:21Z

I'm wondering since the error is the same (no matter the backtrace), if this sleep is too short.

See #396 (comment)

phstc · 2017-07-05T19:10:10Z

@pjg if you have a chance to test before me new SHA is 40332a5.

But I don't know if that would help much TBH, that sleep is around since 3.x. Maybe a difference now is that we have multiple threads trying to use the AWS::SQS::Client at once.

pjg · 2017-07-05T20:01:07Z

Sorry, but it still breaks. Just takes a while longer to break (whole 40 minutes) of idling (I've also upgraded aws-sdk this time).

2017-07-05T19:16:08Z 22700 TID-4vbp4 INFO: Received TERM, will shutdown down
2017-07-05T19:16:08Z 22700 TID-4vbp4 INFO: Shutting down
2017-07-05T19:16:18Z 23211 TID-2bbi8 INFO: Starting
2017-07-05T19:58:36Z 23211 TID-ow2nl9u54 ERROR: Manager failed: stack level too deep
2017-07-05T19:58:36Z 23211 TID-ow2nl9u54 ERROR: /usr/lib/ruby/2.3.0/uri/generic.rb:196:in `initialize'
/usr/lib/ruby/2.3.0/net/http.rb:1051:in `new'
/usr/lib/ruby/2.3.0/net/http.rb:1051:in `proxy_uri'
/usr/lib/ruby/2.3.0/net/http.rb:1038:in `proxy?'
/usr/lib/ruby/2.3.0/net/http.rb:869:in `connect'
/usr/lib/ruby/2.3.0/net/http.rb:1476:in `begin_transport'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/patches.rb:25:in `new_transport_request'
/usr/lib/ruby/2.3.0/net/http.rb:1407:in `request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/connection_pool.rb:330:in `request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:72:in `block in transmit'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:121:in `block in session'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/connection_pool.rb:96:in `session_for'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:119:in `session'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:71:in `transmit'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/net_http/handler.rb:45:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/content_length.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/xml/error_handler.rb:8:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/request_signer.rb:88:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:10:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/retry_errors.rb:89:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/sqs_queue_urls.rb:13:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/query/handler.rb:27:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/user_agent.rb:12:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/endpoint.rb:41:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/param_validator.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/raise_response_errors.rb:14:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/plugins/response_target.rb:21:in `call'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/request.rb:70:in `send_request'
/mnt/deploy/shared/bundle/ruby/2.3.0/gems/aws-sdk-core-2.10.8/lib/seahorse/client/base.rb:207:in `block (2 levels) in define_operation_methods'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/queue.rb:43:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/fetcher.rb:40:in `receive_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/fetcher.rb:18:in `fetch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:83:in `dispatch_single_messages'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:42:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:44:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:35:in `dispatch'
/mnt/deploy/shared/bundle/ruby/2.3.0/bundler/gems/shoryuken-40332a5bea63/lib/shoryuken/manager.rb:49:in `dispatch_later'

pjg · 2017-07-05T20:22:34Z

I reminded myself that we also have a custom Worker class in our setup. Something along those lines:

class SubscribeContactWorker
  include Shoryuken::Worker

  shoryuken_options queue: -> { ... }
  shoryuken_options auto_delete: true, body_parser: self
 
  def self.parse(body)
    data = body.match(/.../)
    Contact.new(*data.captures) if data
  end

  def self.perform_now(body)
    new.subscribe(parse(body))
  end

  def perform(sqs_msg, contact)
    subscribe(contact)
  rescue => e
    sqs_msg.delete
  end

  def subscribe(contact)
     ...

And in the initializer:

class CustomWorkerRegistry < Shoryuken::DefaultWorkerRegistry
  def fetch_worker(queue, message)
    return SubscribeContactWorker.new if SubscribeContactWorker.parse(message.body)
    super
  rescue => _
    ...
  end
end

Shoryuken.configure_client do |config|
  config.sqs_client = Aws::SQS::Client.new
end

Shoryuken.configure_server do |config|
  config.sqs_client = Aws::SQS::Client.new

  Rails.logger = Shoryuken::Logging.logger
  Rails.logger.level = Rails.application.config.log_level
end

Shoryuken.worker_registry = CustomWorkerRegistry.new

phstc · 2017-07-05T20:24:43Z

@pjg what's your aws-sdk version? I'm wondering if it's an issue with not being thread-safe.

pjg · 2017-07-05T20:27:12Z

I tried with two versions. One latest, one little older. You can see the exact versions in the logs above.

phstc · 2017-07-05T20:31:07Z

You can see the exact versions in the logs above.

oops 🤦‍♂️

phstc · 2017-07-05T20:54:21Z

I'm now able to reproduce it 🍻

is debugging 🐛

Keep trying... fix #396

phstc · 2017-07-06T03:35:24Z

@pjg took me a very long while, but it seems to be fixed now, I can no longer reproduce it. I will keep the process running during the night, let's see how it goes.

Would you mind test this SHA 27e0c448195af3e72e735d106cf89eae1e8ca32f, it should be good to go.

pjg · 2017-07-06T19:23:56Z

I've been running 27e0c4481 for about 12 hours straight now. No issues so far.

phstc · 2017-07-06T19:25:32Z

yay awesome! Thanks for reporting and helping with the fix 🍻

OSS 🤘

pjg · 2017-07-07T09:40:31Z

Thanks for the fast fix!

phstc · 2017-07-07T12:17:03Z

@pjg BTW 3.1.2 is out with the fix

pjg · 2017-10-04T08:43:10Z

I have now upgraded all my production servers to use Shoryuken 3.1.12 (from 3.0.7) and something is not right. The moment we upgraded, this graph started to look bad:

Our retention period is 5 days, that's why those messages accumulate on this graph. I think that some worker process just takes a job, marks it as being processed and then dies silently, without ever releasing it back.

I have a strong feeling that this is still a regression related to this issue 😞

==== UPDATE ====

Nevermind. It's actually the new exception handling for ActiveJob in the sentry-ruby gem. When it cannot find the serialized object (ActiveRecord), it will not be rescued properly and the message won't be deleted from the queue. Now trying to find a fix for that...

==== UPDATE TWO ====

Just found this guide... Should have read it first before jumping into the new version with native ActiveJob support:
https://github.com/phstc/shoryuken/wiki/Rails-Integration-Active-Job#active-job-support

phstc · 2017-10-04T12:23:45Z

@pjg sorry for that. but I'm not sure if I understood the issue, is it because you have non-ActiveJob messages mixed in a queue also consumed by ActiveJob workers? Shoryuken does not force a message format, but ActiveJob does.

pjg · 2017-10-07T15:53:06Z

I got to the bottom of this. My issue is that I have ERROR: Processor failed: Couldn't find Site with 'id'=xxx errors in the logs. Those messages are not deleted from the queue when this exception is raised. I'm thinking I need to have some custom rescue somewhere. I believe Rails 5+ ignores those errors in ActiveJob by default.

phstc · 2017-10-08T00:34:14Z

@pjg Shoryuken does not auto delete messages if an exception is raised.

Rescuing that error will do, but you can also:

def perform(site_id)
  return unless site = Site.where(id: site_id).first

  # ...
end

BTW are you using a DL queue? If not, have a look at them, I strongly recommend using DL queues.

pjg · 2017-10-08T13:17:55Z

I'm serializing whole objects (Site, not site_id) and ActiveJob takes care of handling the unique record identifier, but if record is no longer there, my job will fail and will stay in the queue. Dead letter queue might be the solution here.

phstc pushed a commit that referenced this issue Jul 5, 2017

Fix #396

063b762

phstc mentioned this issue Jul 5, 2017

Rely only on @done for running managers #399

Merged

phstc closed this as completed in #399 Jul 5, 2017

phstc added a commit that referenced this issue Jul 5, 2017

Merge pull request #399 from phstc/hotfix/stack-level-too-deep-396

484efc0

Rely only on @done for running managers Fix #396

phstc reopened this Jul 5, 2017

phstc pushed a commit that referenced this issue Jul 5, 2017

Keep trying...

40332a5

See #396 (comment)

phstc mentioned this issue Jul 5, 2017

Keep trying... #400

Merged

phstc closed this as completed in #400 Jul 6, 2017

phstc added a commit that referenced this issue Jul 6, 2017

Merge pull request #400 from phstc/hotfix/stack-level-too-deep-396

27e0c44

Keep trying... fix #396

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Manager failed: stack level too deep with v3.1 #396

Manager failed: stack level too deep with v3.1 #396

pjg commented Jul 5, 2017 •

edited

Loading

phstc commented Jul 5, 2017 •

edited

Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017 •

edited

Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017 •

edited

Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 6, 2017

pjg commented Jul 6, 2017

phstc commented Jul 6, 2017

pjg commented Jul 7, 2017

phstc commented Jul 7, 2017

pjg commented Oct 4, 2017 •

edited

Loading

phstc commented Oct 4, 2017

pjg commented Oct 7, 2017

phstc commented Oct 8, 2017

pjg commented Oct 8, 2017 •

edited

Loading

Manager failed: stack level too deep with v3.1 #396

Manager failed: stack level too deep with v3.1 #396

Comments

pjg commented Jul 5, 2017 • edited Loading

phstc commented Jul 5, 2017 • edited Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017 • edited Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017 • edited Loading

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

pjg commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 5, 2017

phstc commented Jul 6, 2017

pjg commented Jul 6, 2017

phstc commented Jul 6, 2017

pjg commented Jul 7, 2017

phstc commented Jul 7, 2017

pjg commented Oct 4, 2017 • edited Loading

phstc commented Oct 4, 2017

pjg commented Oct 7, 2017

phstc commented Oct 8, 2017

pjg commented Oct 8, 2017 • edited Loading

pjg commented Jul 5, 2017 •

edited

Loading

phstc commented Jul 5, 2017 •

edited

Loading

phstc commented Jul 5, 2017 •

edited

Loading

pjg commented Jul 5, 2017 •

edited

Loading

pjg commented Oct 4, 2017 •

edited

Loading

pjg commented Oct 8, 2017 •

edited

Loading