rack · johnnyshields · Mar 20, 2022
diff --git a/README.md b/README.md
@@ -37,6 +37,12 @@ See the [Backing & Hacking blog post](https://www.kickstarter.com/backing-and-ha
 - [Customizing responses](#customizing-responses)
   - [RateLimit headers for well-behaved clients](#ratelimit-headers-for-well-behaved-clients)
 - [Logging & Instrumentation](#logging--instrumentation)
+- [Fault Tolerance & Error Handling](#fault-tolerance--error-handling)
+  - [Built-in error handling](#built-in-error-handling)
+  - [Expose Rails cache errors to Rack::Attack](#expose-rails-cache-errors-to-rackattack)
+  - [Configure cache timeout](#configure-cache-timeout)
+  - [Failure cooldown](#failure-cooldown)
+  - [Custom error handling](#custom-error-handling)
 - [Testing](#testing)
 - [How it works](#how-it-works)
   - [About Tracks](#about-tracks)
@@ -400,11 +406,133 @@ ActiveSupport::Notifications.subscribe(/rack_attack/) do |name, start, finish, r
 end
 ```
 
+## Fault Tolerance & Error Handling
+
+Rack::Attack has a mission-critical dependency on your [cache store](#cache-store-configuration).
+If the cache system experiences an outage, it may cause severe latency within Rack::Attack
+and lead to an overall application outage.
+
+Although Rack::Attack is designed to be "fault-tolerant by default", depending on your application
+setup, additional configuration may be required. Please **read this section carefully** to understand
+how to best protect your application.
+
+### Built-in error handling
+
+As a Rack middleware component, Rack::Attack wraps your application's request handling endpoint.
+When an error occurs either within Rack::Attack **or** within your application, by default:
+
+- If the error is a Redis or Dalli cache error, Rack::Attack logs the error then allows the request.
+- Otherwise, Rack::Attack raises the error. The request will fail.
+
+All errors will trigger a failure cooldown (see below), regardless of whether they are allowed or raised.
+
+### Expose Rails cache errors to Rack::Attack
+
+If you are using Rack::Attack with Rails cache, Rails cache will by default **suppress**
+cache errors, and Rack::Attack will not be able to handle them as described above.
+This can be dangerous: if your cache is timing out due to high request volume,
+for example, Rack::Attack will continue to blindly send requests to your cache and worsen the problem.
+
+To mitigate this:
+
+* When using Rails cache with `:redis_cache_store`, you'll need to expose errors to Rack::Attack
+with a custom error handler as follows:
+
+    ```ruby
+    # in your Rails config
+    config.cache_store = :redis_cache_store,
+                         { # ...
+                           error_handler: -> (method:, returning:, exception:) do
+                             raise exception if Rack::Attack.calling?
+                           end
+                         }
+    ```
+
+* Rails `:mem_cache_store` and `:dalli_store` suppress all Dalli errors. The recommended
+workaround is to set a [Rack::Attack-specific cache configuration](#cache-store-configuration).
+
+### Configure cache timeout
+
+In your application config, it is recommended to set your cache timeout to 0.1 seconds or lower.
+Please refer to the [Rails Guide](https://guides.rubyonrails.org/caching_with_rails.html).
+
+```ruby
+# Set 100 millisecond timeout on Redis
+config.cache_store = :redis_cache_store,
+                     { # ...
+                       connect_timeout: 0.1,
+                       read_timeout: 0.1,
+                       write_timeout: 0.1
+                     }
+```
+
+To use different timeout values specific to Rack::Attack, you may set a
+[Rack::Attack-specific cache configuration](#cache-store-configuration).
+
+### Failure cooldown
+
+When any error occurs, Rack::Attack becomes disabled for a 60-second "cooldown" period.
+This prevents a cache outage from adding timeout latency on each Rack::Attack request.
+All errors trigger the failure cooldown, regardless of whether they are allowed or handled.
+You can configure the cooldown period as follows:
+
+```ruby
+# in initializers/rack_attack.rb
+
+# Disable Rack::Attack for 5 minutes after any error occurs
+Rack::Attack.failure_cooldown = 300
+
+# Do not use failure cooldown
+Rack::Attack.failure_cooldown = nil
+```
+
+### Custom error handling
+
+For most use cases, it is not necessary to re-configure Rack::Attack's default error handling.
+However, there are several ways you may do so.
+
+First, you may specify the list of errors to allow as an array of Class and/or String values.
+
+```ruby
+# in initializers/rack_attack.rb
+Rack::Attack.allowed_errors += [MyErrorClass, 'MyOtherErrorClass']
+```
+
+Alternatively, you may define a custom error handler as a Proc. The error handler will receive all errors,
+regardless of whether they are on the allow list. Your handler should return either `:allow`, `:block`,
+or `:throttle`, or else re-raise the error; other returned values will allow the request.
+
+```ruby
+# Set a custom error handler which blocks allowed errors
+# and raises all others
+Rack::Attack.error_handler = -> (error, request) do
+  if Rack::Attack.allow_error?(error)
+    Rails.logger.warn("Blocking error: #{error.class.name} from IP #{request.ip}")
+    :block
+  else
+    raise(error)
+  end
+end
+```
+
+Lastly, you can set the error handler to a Symbol shortcut:
+
+```ruby
+# Handle all errors with block response
+Rack::Attack.error_handler = :block
+
+# Handle all errors with throttle response
+Rack::Attack.error_handler = :throttle
+
+# Handle all errors by allowing the request
+Rack::Attack.error_handler = :allow
+```
+
 ## Testing
 
-A note on developing and testing apps using Rack::Attack - if you are using throttling in particular, you will
-need to enable the cache in your development environment. See [Caching with Rails](http://guides.rubyonrails.org/caching_with_rails.html)
-for more on how to do this.
+When developing and testing apps using Rack::Attack, if you are using throttling in particular,
+you must enable the cache in your development environment. See
+[Caching with Rails](http://guides.rubyonrails.org/caching_with_rails.html) for how to do this.
 
 ### Disabling
 

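The decision flow documented in the README section above (allow list, Proc handler, Symbol shortcut) can be sketched in plain Ruby. This is a simplified standalone illustration, not the gem's code: `CacheDown`, `allowed_error?`, and `resolve_error` are hypothetical names, the handler is always called with both arguments, and the allow list is passed in explicitly instead of being read from `Rack::Attack.allowed_errors`.

```ruby
# Hypothetical stand-in for a cache client error (e.g. a Redis or Dalli error).
class CacheDown < StandardError; end

# Returns true if the error matches an allow-list entry.
# Entries may be classes or class-name strings; string entries are compared
# against the names of the error's ancestors, so the class need not be loaded.
def allowed_error?(error, allowed)
  allowed.any? do |entry|
    if entry.is_a?(String)
      error.class.ancestors.any? { |ancestor| ancestor.name == entry }
    else
      error.is_a?(entry)
    end
  end
end

# Resolves an error to :allow, :block, or :throttle, or re-raises it.
# handler may be nil, a Symbol, or a callable taking (error, request).
def resolve_error(error, request, handler: nil, allowed: [])
  if handler.nil?
    return :allow if allowed_error?(error, allowed)

    raise error
  end

  result = handler.respond_to?(:call) ? handler.call(error, request) : handler
  %i[block throttle].include?(result) ? result : :allow
end

request = { ip: '203.0.113.7' } # placeholder for a request object
error = CacheDown.new('connection timed out')

resolve_error(error, request, allowed: ['CacheDown'])              # :allow (matched by name)
resolve_error(error, request, handler: :throttle)                  # :throttle (Symbol shortcut)
resolve_error(error, request, handler: ->(_e, _r) { :unexpected }) # :allow (fallback)
```

Matching allow-list entries by name is what lets the defaults mention `Dalli::DalliError` and `Redis::BaseError` without requiring either client library to be installed.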
diff --git a/lib/rack/attack.rb b/lib/rack/attack.rb
@@ -32,8 +32,18 @@ class IncompatibleStoreError < Error; end
     autoload :Fail2Ban,             'rack/attack/fail2ban'
     autoload :Allow2Ban,            'rack/attack/allow2ban'
 
+    THREAD_CALLING_KEY = 'rack.attack.calling'
+    DEFAULT_FAILURE_COOLDOWN = 60
+    DEFAULT_ALLOWED_ERRORS = %w[Dalli::DalliError Redis::BaseError].freeze
+
     class << self
-      attr_accessor :enabled, :notifier, :throttle_discriminator_normalizer
+      attr_accessor :enabled,
+                    :notifier,
+                    :throttle_discriminator_normalizer,
+                    :error_handler,
+                    :allowed_errors,
+                    :failure_cooldown
+
       attr_reader :configuration
 
       def instrument(request)
@@ -59,6 +69,40 @@ def reset!
         cache.reset!
       end
 
+      def failed!
+        @last_failure_at = Time.now
+      end
+
+      def failure_cooldown?
+        return false unless @last_failure_at && failure_cooldown
+
+        Time.now < @last_failure_at + failure_cooldown
+      end
+
+      def allow_error?(error)
+        allowed_errors&.any? do |ignored_error|
+          case ignored_error
+          when String then error.class.ancestors.any? {|a| a.name == ignored_error }
+          else error.is_a?(ignored_error)
+          end
+        end
+      end
+
+      def calling?
+        !!thread_store[THREAD_CALLING_KEY]
+      end
+
+      def with_calling
+        thread_store[THREAD_CALLING_KEY] = true
+        yield
+      ensure
+        thread_store[THREAD_CALLING_KEY] = nil
+      end
+
+      def thread_store
+        defined?(RequestStore) ? RequestStore.store : Thread.current
+      end
+
       extend Forwardable
       def_delegators(
         :@configuration,
@@ -86,7 +130,11 @@ def reset!
       )
     end
 
-    # Set defaults
+    # Set class defaults
+    self.failure_cooldown = DEFAULT_FAILURE_COOLDOWN
+    self.allowed_errors = DEFAULT_ALLOWED_ERRORS.dup
+
+    # Set instance defaults
     @enabled = true
     @notifier = ActiveSupport::Notifications if defined?(ActiveSupport::Notifications)
     @throttle_discriminator_normalizer = lambda do |discriminator|
@@ -102,32 +150,89 @@ def initialize(app)
     end
 
     def call(env)
-      return @app.call(env) if !self.class.enabled || env["rack.attack.called"]
+      return @app.call(env) if !self.class.enabled || env["rack.attack.called"] || self.class.failure_cooldown?
 
       env["rack.attack.called"] = true
       env['PATH_INFO'] = PathNormalizer.normalize_path(env['PATH_INFO'])
       request = Rack::Attack::Request.new(env)
+      result = :allow
+
+      self.class.with_calling do
+        begin
+          result = get_result(request)
+        rescue StandardError => error
+          return do_error_response(error, request)
+        end
+      end
+
+      do_response(result, request)
+    end
+
+    private
 
+    def get_result(request)
       if configuration.safelisted?(request)
-        @app.call(env)
+        :allow
       elsif configuration.blocklisted?(request)
-        # Deprecated: Keeping blocklisted_response for backwards compatibility
-        if configuration.blocklisted_response
-          configuration.blocklisted_response.call(env)
-        else
-          configuration.blocklisted_responder.call(request)
-        end
+        :block
       elsif configuration.throttled?(request)
-        # Deprecated: Keeping throttled_response for backwards compatibility
-        if configuration.throttled_response
-          configuration.throttled_response.call(env)
-        else
-          configuration.throttled_responder.call(request)
-        end
+        :throttle
       else
         configuration.tracked?(request)
-        @app.call(env)
+        :allow
+      end
+    end
+
+    def do_response(result, request)
+      case result
+      when :block then do_block_response(request)
+      when :throttle then do_throttle_response(request)
+      else @app.call(request.env)
+      end
+    end
+
+    def do_block_response(request)
+      # Deprecated: Keeping blocklisted_response for backwards compatibility
+      if configuration.blocklisted_response
+        configuration.blocklisted_response.call(request.env)
+      else
+        configuration.blocklisted_responder.call(request)
       end
     end
+
+    def do_throttle_response(request)
+      # Deprecated: Keeping throttled_response for backwards compatibility
+      if configuration.throttled_response
+        configuration.throttled_response.call(request.env)
+      else
+        configuration.throttled_responder.call(request)
+      end
+    end
+
+    def do_error_response(error, request)
+      self.class.failed!
+      result = error_result(error, request)
+      result ? do_response(result, request) : raise(error)
+    end
+
+    def error_result(error, request)
+      handler = self.class.error_handler
+      if handler
+        error_handler_result(handler, error, request)
+      elsif self.class.allow_error?(error)
+        :allow
+      end
+    end
+
+    def error_handler_result(handler, error, request)
+      result = handler
+
+      if handler.is_a?(Proc)
+        args = [error, request].first(handler.arity.abs) # arity is negative for optional/splat params
+        result = handler.call(*args) # may raise error
+      end
+
+      %i[block throttle].include?(result) ? result : :allow
+    end
   end
 end
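The cooldown window introduced by `failed!` and `failure_cooldown?` can be exercised in isolation with a sketch like the one below. It is an illustration under assumptions, not the code in this diff: `CooldownTracker` and its `clock:` parameter are hypothetical, and where the diff uses `Time.now`, this sketch defaults to a monotonic clock, which is unaffected by wall-clock adjustments.

```ruby
# Hypothetical standalone tracker; the clock is injectable so the window
# can be exercised without sleeping.
class CooldownTracker
  def initialize(cooldown_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @cooldown_seconds = cooldown_seconds
    @clock = clock
    @last_failure_at = nil
  end

  # Record a failure at the current clock reading.
  def failed!
    @last_failure_at = @clock.call
  end

  # True while the most recent failure is less than cooldown_seconds old.
  # A nil cooldown disables the feature.
  def cooling_down?
    return false unless @last_failure_at && @cooldown_seconds

    @clock.call < @last_failure_at + @cooldown_seconds
  end
end

now = 0.0
tracker = CooldownTracker.new(60, clock: -> { now })

tracker.cooling_down? # false: nothing has failed yet
tracker.failed!
now = 59.0
tracker.cooling_down? # true: still inside the 60-second window
now = 60.0
tracker.cooling_down? # false: the window has elapsed
```

As in the diff, the failure timestamp lives in process memory, so each application process would enter and leave its cooldown independently.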
diff --git a/lib/rack/attack/store_proxy/dalli_proxy.rb b/lib/rack/attack/store_proxy/dalli_proxy.rb
@@ -24,34 +24,26 @@ def initialize(client)
         end
 
         def read(key)
-          rescuing do
-            with do |client|
-              client.get(key)
-            end
+          with do |client|
+            client.get(key)
           end
         end
 
         def write(key, value, options = {})
-          rescuing do
-            with do |client|
-              client.set(key, value, options.fetch(:expires_in, 0), raw: true)
-            end
+          with do |client|
+            client.set(key, value, options.fetch(:expires_in, 0), raw: true)
           end
         end
 
         def increment(key, amount, options = {})
-          rescuing do
-            with do |client|
-              client.incr(key, amount, options.fetch(:expires_in, 0), amount)
-            end
+          with do |client|
+            client.incr(key, amount, options.fetch(:expires_in, 0), amount)
           end
         end
 
         def delete(key)
-          rescuing do
-            with do |client|
-              client.delete(key)
-            end
+          with do |client|
+            client.delete(key)
           end
         end
 
@@ -66,12 +58,6 @@ def with
             end
           end
         end
-
-        def rescuing
-          yield
-        rescue Dalli::DalliError
-          nil
-        end
       end
     end
   end
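The effect of removing `rescuing` from the Dalli proxy can be illustrated with stand-in classes. This is a sketch under assumptions: `FakeCacheError`, `FailingClient`, `SwallowingProxy`, and `PropagatingProxy` are hypothetical names, and no memcached client is involved.

```ruby
# Hypothetical stand-in for a cache client error.
class FakeCacheError < StandardError; end

# Hypothetical client whose reads always fail.
class FailingClient
  def get(_key)
    raise FakeCacheError, 'server unreachable'
  end
end

# Previous behavior: the proxy rescued the client error and returned nil.
class SwallowingProxy
  def initialize(client)
    @client = client
  end

  def read(key)
    @client.get(key)
  rescue FakeCacheError
    nil
  end
end

# Behavior after this change: the error propagates to the caller.
class PropagatingProxy
  def initialize(client)
    @client = client
  end

  def read(key)
    @client.get(key)
  end
end

# A swallowed error is indistinguishable from a cache miss.
swallowed = SwallowingProxy.new(FailingClient.new).read('key')

# A propagated error lets the caller decide how to respond.
outcome =
  begin
    PropagatingProxy.new(FailingClient.new).read('key')
  rescue FakeCacheError => e
    "caller saw: #{e.message}"
  end
```

Letting the error propagate is what allows the middleware to record the failure and start the cooldown, instead of treating every failed read as a miss and retrying on the next request.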