Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System Hardening Phase 1: Catch errors at top level #639

Closed
wants to merge 4 commits into from

Conversation

johnnyshields
Copy link

@johnnyshields johnnyshields commented Nov 16, 2023

In PR #577, @grzuy asked me to break it into a series of smaller PRs.

My plan is as follows:

  • This PR:
    • Move error catching to top-level of Rack::Attack. This is mandatory to do, because we need to have Rack::Attack be aware of errors in order to implement advanced error handling/fault-tolerance mechanisms in subsequent PRs.
    • Remove the rescuing blocks from Dalli/RedisProxy classes.
    • Add "Rack::Attack.allowed_errors" config which defaults to Dalli::DalliError and Redis::BaseError. This config is useful if you are implementing a custom store class.

Future PRs (will raise the next one after this one is merged)

  • PR 2:
    • Add Rack::Attack.error_handler which takes a Proc for custom error handling.
    • Refactor Rack::Attack a bit so the error_handler can select whether to throttle or block.
  • PR 3: Add Rack::Attack.calling? so Rails Cache error hander can raise its error for Rack::Attack.
  • PR 4: Add Rack::Attack.failure_cooldown config. This temporarily disables Rack::Attack after an error occurs (including allowed errors), to prevent cache connection latency.

Note: This PR will affect the small number of users are directly using the Dalli/RedisProxy classes (or subclassing them, etc.) since the methods of these classes now raise Dalli::DalliError /Redis::BaseError rather than returning nil. It should probably be released as a major version for this reason.

…nd instead catches errors at the top level.
@johnnyshields
Copy link
Author

@grzuy ready for your review.

Copy link
Collaborator

@grzuy grzuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnnyshields Thanks for breaking it up and opening smaller PR.

Left some comments after first pass.

Lets please update PR title to something that explains what this does for end users. Would that be "Allows users to specify list of ignored errors" or something similar?


```ruby
# in initializers/rack_attack.rb
Rack::Attack.allowed_errors += [MyErrorClass, 'MyOtherErrorClass']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strong, thoughts on naming ignored_errors instead allowed_errors? 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grzuy the reason for "allowed" here is that when these errors occur, RackAttack "allows" the request. This naming will be made more obvious in the next PR, which allows you to add error handlers which throttle or block based on certain errors.

@@ -32,6 +32,7 @@ Gem::Specification.new do |s|
s.add_development_dependency "bundler", ">= 1.17", "< 3.0"
s.add_development_dependency 'minitest', "~> 5.11"
s.add_development_dependency "minitest-stub-const", "~> 0.6"
s.add_development_dependency 'rspec-mocks', '~> 3.11.0'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked at the in more detail but would be great if we wouldn't need to add extra dep.

Copy link
Author

@johnnyshields johnnyshields Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR need some form of mocking in our tests, and RSpec's mock library is widely-used and has minitest integration.

In the tests you can see I am stubbing certain methods to raise errors. There is no other way to do this in the specs b/c these error would only occur if certain infra such as Redis is down. The only alternative would be to write monkey patches in the tests, which is far worse/messy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, but maybe we can change where this gem is used for something similar to how exceptions are currently being forced in tests, and we avoid introducing a new dependency? Or are there other places where rspec-mocks is needed?

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please read my spec/acceptance/error_handling_spec.rb. I am using mocks to trigger various errors which do not normally occur and asserting how they are handled. rspec-mocks was designed exactly for this sort of use case (e.g. testing how novel errors propagate through a system.)

It is certainly possible to force certain errors by strange Redis configurations, however this ultimately is limited in what types of errors we can trigger. I will be using mocks further in subsequent PRs.

end

it 'can ignore error as String' do
Rack::Attack.allowed_errors = %w[RuntimeError]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just go with ["RuntimeError"] so it's more obvious what type it is?

Copy link
Author

@johnnyshields johnnyshields Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Rubocop style guide (according to default settings) says a one-element array can be either ['RuntimeError'] or %w[RuntimeError], but a two-element array must use %w[] e.g. %w[RuntimeError OtherError]

Since I'm using both 1 and 2 element arrays this test (as well as elsewhere), I think it is better to use %w[ ] consistently everywhere. Please confirm whether you'd like me to (a) keep as-is, (b) change only one-element arrays, (c) change both 1 and 2-element arrays.

See: https://docs.rubocop.org/rubocop/cops_style.html#stylewordarray

@grzuy
Copy link
Collaborator

grzuy commented Nov 18, 2023

@santib Would be good to have your thoughts on this one too 🙏 :-)

lib/rack/attack.rb Outdated Show resolved Hide resolved
Comment on lines +149 to +150
rescue StandardError => error
self.class.allow_error?(error) ? @app.call(request.env) : raise(error)
Copy link
Collaborator

@santib santib Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here we are rescuing StandardError so we can allow strings in the allowed_errors configuration, right? Is there any reason why to not do this?

rescue *self.allowed_errors => error
  @app.call(request.env)
end

because I prefer the simplicity of this alternative if possible

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way so we can define the allowed_errors as strings, and therefore not have a dependency on Dalli or Redis constants being loaded. Note the allow_error? method is as follows:

      def allow_error?(error)
        allowed_errors&.any? do |allowed_error|
          case allowed_error
          when String then error.class.ancestors.any? {|a| a.name == allowed_error }
          else error.is_a?(allowed_error)
          end
        end
      end

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not have a dependency on Dalli or Redis constants being loaded

Agree on this.

Not really sold on the allow_error? method, I'd prefer to use Ruby's exception handling instead, while finding other way to solve the need to not load Dalli/Redis constants.

I know you didn't like it before, but one way could be the Rack::Attack::CacheError class, but there could be others, for example having this class wrap and re-raise the original error, maybe? Or maybe having the defaults defined by each proxy class which also checks for defined?(Dalli) as they already do in other places?

I think this point is really important and we need to take the time to brainstorm/discuss it properly.

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@santib In order to invoke rescue *allowed_errors, the allowed_errors variable must contain constants, not strings. We could (could != should) do something like:

def allowed_errors_as_consts
  @cached_error_classes ||= {}
  allowed_errors.map do |err|
    if err.is_a?(String)
      @cached_error_classes[err] ||= Object.const_get(err) rescue NameError
    else
      err
    end
  end.compact
end

This would need to be done dynamically because Dalli/Redis could be loaded at anytime even after we started using Rack::Attack (and allowed_errors could also be modified.)

This is a lot of silly overhead just to achieve the rescue *list line. The code I have is reasonably fast (string comparison of class names) and low-complexity, so lets go with it for now and not let "perfect" become the enemy of "good enough." We can consider further once the full series of PRs have been merged.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to invoke rescue *allowed_errors, the allowed_errors variable must contain constants, not strings.

I know that. But why do we need to also allow Strings? If supporting only constants makes the code simpler, then that works for me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, there is precedent in other Ruby libs to reference potentially unloaded classes with strings, such as Rails where you can see this:

belongs_to :manager, class_name: "Employee"

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last thing: this is a moot point. Please see these line in the intended final state of this series of PRs:

https://github.com/rack/rack-attack/pull/577/files#diff-9114718ab0fa2d856234c04d73b07c3758a6aaab1dbfbe87401f4eb4e358f127R218-R225

Note I'm adding an additional error_handler here.

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rack::Attack::CacheError

I would really prefer that we do not obscure the errors coming from Redis/Dalli itself. It makes it harder to understand the reasons why Redis/Dalli are failing.

Copy link
Collaborator

@santib santib Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rack::Attack::CacheError

I would really prefer that we do not obscure the errors coming from Redis/Dalli itself. It makes it harder to understand the reasons why Redis/Dalli are failing.

it could be just a wrapper (as I said in option 2.), and then pass the original error to the application. Having that wrapper would solve the constant issue we are discussing here.

Copy link
Author

@johnnyshields johnnyshields Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's defer this discussion until I get my other PRs merged. There's no reason to bikeshed and introduce complexity at this point.


```ruby
# in initializers/rack_attack.rb
Rack::Attack.allowed_errors += [MyErrorClass, 'MyOtherErrorClass']
Copy link
Collaborator

@santib santib Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to better understand the need to make this configurable, apart from Redis::BaseConnectionError/Redis::BaseError, and Dalli::DalliError which other errors could potentially need to be handled? What was the exception that caused your issue in production?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some reasons:

  • If you have some complex logic inside your Rack::Attack handlers that raises errors.
  • If you implemented your own store proxy (non-Redis/Dalli)

That being said I'd estimate 98%+ of users are fine with the default Redis/Dalli error handling.

@johnnyshields
Copy link
Author

johnnyshields commented Nov 20, 2023

@santib I feel these discussions are getting lost in the weeds. Ultimately this configurable error handling is not that important.

For now, I am fine to remove the allowed_errors= configuration and just define allowed_errors as follows:

def allowed_errors
   arr = []
   arr << Dalli::DalliError if defined?(Dalli)
   arr << Redis::BaseError if defined?(Redis)
   arr
end

Would you accept this so we can move forward? I'd like to get the other PRs in the series and come back to this issue at the end.

@johnnyshields
Copy link
Author

johnnyshields commented Nov 20, 2023

Closing in favor of #641. We'll revisit making allowed_errors configurable later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants