Merged
1 change: 0 additions & 1 deletion Gemfile
@@ -22,7 +22,6 @@ gem 'ahoy_matey', '~> 3.0'
 gem 'autoprefixer-rails', '~> 10.0'
 gem 'aws-sdk-kms', '~> 1.4'
 gem 'aws-sdk-ses', '~> 1.6'
-gem 'aws-sdk-eventbridge'
 gem 'base32-crockford'
 gem 'blueprinter', '~> 0.25.3'
 gem 'device_detector'
4 changes: 0 additions & 4 deletions Gemfile.lock
@@ -130,9 +130,6 @@ GEM
       aws-partitions (~> 1, >= 1.239.0)
       aws-sigv4 (~> 1.1)
       jmespath (~> 1.0)
-    aws-sdk-eventbridge (1.18.0)
-      aws-sdk-core (~> 3, >= 3.109.0)
-      aws-sigv4 (~> 1.1)
     aws-sdk-kms (1.40.0)
       aws-sdk-core (~> 3, >= 3.109.0)
       aws-sigv4 (~> 1.1)
@@ -704,7 +701,6 @@ DEPENDENCIES
   ahoy_matey (~> 3.0)
   autoprefixer-rails (~> 10.0)
   aws-sdk-cloudwatchlogs
-  aws-sdk-eventbridge
   aws-sdk-kms (~> 1.4)
   aws-sdk-ses (~> 1.6)
   axe-core-rspec (~> 4.2)
62 changes: 62 additions & 0 deletions app/jobs/risc_delivery_job.rb
@@ -0,0 +1,62 @@
+class RiscDeliveryJob < ApplicationJob
+  queue_as :low
+
+  retry_on Faraday::TimeoutError, Faraday::ConnectionFailed, wait: :exponentially_longer
Contributor Author

https://edgeapi.rubyonrails.org/classes/ActiveJob/Exceptions/ClassMethods.html#method-i-retry_on

The default here is 5 attempts, and with this exponential config the delay grows from about 3s to a few minutes

I think this is OK for now? Open to suggestions

Contributor

seems good to me

Contributor

although the default retries is still zero (which I think is fine behavior for now)

Contributor Author

attempts: 5 is the default if I read that doc correctly?

Contributor

ohhh, I see. I misunderstood. How does it behave with the inline job adapter? Since we won't want it retrying for minutes quite yet 🙂

Contributor Author

Ah sorry I wasn't clear. Async worker jobs get up to 5 attempts. Inline preserves the current behavior of logging a warning and not retrying

https://github.com/18F/identity-idp/pull/5333/files#diff-0136d3b83a32c6efebf036634bad2428cef4f659538ae4d564b7b97019880795R34-R37
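
For reference, a rough sketch of the retry schedule discussed in this thread. It assumes that `wait: :exponentially_longer` uses a base delay of `(executions**4) + 2` seconds and that `attempts: 5` counts the first run; Rails may add random jitter on top, which is left out here. The helper name `base_retry_delay` is made up for illustration.

```ruby
# Base delay in seconds before retry number `executions`.
# Assumption: mirrors ActiveJob's :exponentially_longer without its random jitter.
def base_retry_delay(executions)
  (executions**4) + 2
end

# attempts: 5 is read here as one initial run plus up to four retries.
ATTEMPTS = 5
delays = (1...ATTEMPTS).map { |n| base_retry_delay(n) }

puts delays.inspect # [3, 18, 83, 258]
puts delays.sum     # 362
```

Under those assumptions the last retry waits a bit over four minutes, and the whole sequence spans roughly six minutes before the job gives up.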


+  def perform(
+    push_notification_url:,
+    jwt:,
+    event_type:,
+    issuer:
+  )
+    response = faraday.post(
+      push_notification_url,
+      jwt,
+      'Accept' => 'application/json',
+      'Content-Type' => 'application/secevent+jwt',
+    ) do |req|
+      req.options.context = {
+        service_name: inline? ? 'risc_http_push_direct' : 'risc_http_push_async',
+      }
+    end
+
+    unless response.success?
+      Rails.logger.warn(
+        {
+          event: 'http_push_error',
+          transport: inline? ? 'direct' : 'async',
+          event_type: event_type,
+          service_provider: issuer,
+          status: response.status,
+        }.to_json,
+      )
+    end
+  rescue Faraday::TimeoutError, Faraday::ConnectionFailed => err
+    raise err if !inline?
+
+    Rails.logger.warn(
+      {
+        event: 'http_push_error',
+        transport: 'direct',
+        event_type: event_type,
+        service_provider: issuer,
+        error: err.message,
+      }.to_json,
+    )
+  end
+
+  def faraday
+    Faraday.new do |f|
+      f.request :instrumentation, name: 'request_log.faraday'
+      f.adapter :net_http
+      f.options.timeout = 3
Contributor

what do you think of making these configurable?

Contributor Author

I'm open to it! Would we want all Faraday instances to share the same timeout configs, or would we want it different in different contexts?

Contributor

oh I missed this was merged 😛

it's probably fine for all RISC events to have the same timeout, yeah. we have some independent faraday configs for timeouts like outbound_connection_check_timeout, acuant_timeout, lexisnexis_timeout, etc. so we could follow that pattern?

Contributor Author

Contributor

thanks!
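
A minimal sketch of what a single configurable timeout could look like, assuming one value is shared by the open, read, and write timeouts as suggested in this thread. It uses `Net::HTTP` from the standard library in place of Faraday so it runs without extra gems. The constant `DEFAULT_RISC_TIMEOUT_SECONDS` and the helper `build_http` are hypothetical names, not part of this PR, and in the app the value would presumably come from a new `IdentityConfig` setting.

```ruby
require 'net/http'
require 'uri'

# Hypothetical default; stands in for a value read from application config.
DEFAULT_RISC_TIMEOUT_SECONDS = 3

# Builds an HTTP client (no connection is opened yet) with one timeout
# applied to connecting, reading, and writing.
def build_http(url, timeout: DEFAULT_RISC_TIMEOUT_SECONDS)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = uri.scheme == 'https'
  http.open_timeout = timeout
  http.read_timeout = timeout
  http.write_timeout = timeout
  http
end

http = build_http('https://push.example.gov')
puts http.read_timeout # 3
```

Keeping the three timeouts tied to one setting matches the "same timeout for all RISC events" idea; separate settings would only be needed if connect and read behavior had to differ.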

+      f.options.read_timeout = 3
+      f.options.open_timeout = 3
+      f.options.write_timeout = 3
+    end
+  end
+
+  def inline?
+    queue_adapter.is_a?(ActiveJob::QueueAdapters::InlineAdapter)
+  end
+end
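
The error handling in this job comes down to one branch: when running inline, log a warning and swallow the error; otherwise re-raise so that `retry_on` can reschedule the job. A stripped-down sketch of that control flow, with no Rails or Faraday involved, might look like the following. `FakeNetworkError` and `deliver` are hypothetical stand-ins, and the returned JSON string stands in for the `Rails.logger.warn` call.

```ruby
require 'json'

# Stand-in for Faraday::TimeoutError / Faraday::ConnectionFailed.
class FakeNetworkError < StandardError; end

# Hypothetical helper: runs the block. On a network error it either re-raises
# (so a retry policy can see the exception) or returns the JSON warning that
# would have been logged.
def deliver(inline:)
  yield
  nil
rescue FakeNetworkError => err
  raise err unless inline

  { event: 'http_push_error', transport: 'direct', error: err.message }.to_json
end

# Inline: the error is turned into a warning payload.
warning = deliver(inline: true) { raise FakeNetworkError, 'timed out' }
puts warning

# Async: the error propagates to the caller (ActiveJob, in the real job).
begin
  deliver(inline: false) { raise FakeNetworkError, 'timed out' }
rescue FakeNetworkError => err
  puts "re-raised for retry: #{err.message}"
end
```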
88 changes: 12 additions & 76 deletions app/services/push_notification/http_push.rb
@@ -38,72 +38,21 @@ def url_options
     def deliver_one(service_provider)
       deliver_local(service_provider) if IdentityConfig.store.risc_notifications_local_enabled

-      if IdentityConfig.store.risc_notifications_eventbridge_enabled
-        deliver_eventbridge(service_provider)
-      else
-        deliver_direct(service_provider)
-      end
-    end
-
-    def deliver_eventbridge(service_provider)
-      response = eventbridge_client.put_events(
-        entries: [
-          {
-            time: now,
-            source: service_provider.issuer,
-            detail_type: 'notification',
-            detail: { jwt: jwt(service_provider) }.to_json,
-            event_bus_name: "#{Identity::Hostdata.env}-risc-notifications",
-          },
-        ],
-      )
-
-      if response.failed_entry_count.to_i > 0
-        Rails.logger.warn(
-          {
-            event: 'http_push_error',
-            transport: 'eventbridge',
-            event_type: event.event_type,
-            service_provider: service_provider.issuer,
-            error: response.to_s,
-          }.to_json,
+      if IdentityConfig.store.risc_notifications_active_job_enabled
+        RiscDeliveryJob.perform_later(
+          push_notification_url: service_provider.push_notification_url,
+          jwt: jwt(service_provider),
+          event_type: event.event_type,
+          issuer: service_provider.issuer,
         )
-      end
-    end
-
-    def deliver_direct(service_provider)
-      response = faraday.post(
-        service_provider.push_notification_url,
-        jwt(service_provider),
-        'Accept' => 'application/json',
-        'Content-Type' => 'application/secevent+jwt',
-      ) do |req|
-        req.options.context = { service_name: 'http_push_direct' }
-      end
-
-      unless response.success?
-        Rails.logger.warn(
-          {
-            event: 'http_push_error',
-            transport: 'direct',
-            event_type: event.event_type,
-            service_provider: service_provider.issuer,
-            status: response.status,
-          }.to_json,
+      else
+        RiscDeliveryJob.perform_now(
+          push_notification_url: service_provider.push_notification_url,
+          jwt: jwt(service_provider),
+          event_type: event.event_type,
+          issuer: service_provider.issuer,
         )
       end
-    rescue Faraday::TimeoutError,
-           Faraday::ConnectionFailed,
-           PushNotification::PushNotificationError => err
-      Rails.logger.warn(
-        {
-          event: 'http_push_error',
-          transport: 'direct',
-          event_type: event.event_type,
-          service_provider: service_provider.issuer,
-          error: err.message,
-        }.to_json,
-      )
     end

     def deliver_local(service_provider)
@@ -134,13 +83,6 @@ def jwt_payload(service_provider)
       }
     end

-    def faraday
-      Faraday.new do |f|
-        f.request :instrumentation, name: 'request_log.faraday'
-        f.adapter :net_http
-      end
-    end
-
     def agency_uuid(service_provider)
       AgencyIdentity.find_by(
         user_id: event.user.id,
@@ -151,11 +93,5 @@ def agency_uuid(service_provider)
         service_provider: service_provider.issuer,
       )&.uuid
     end
-
-    def eventbridge_client
-      @eventbridge_client ||= Aws::EventBridge::Client.new(
-        region: Identity::Hostdata.aws_region,
-      )
-    end
   end
 end
4 changes: 0 additions & 4 deletions app/services/push_notification/push_notification_error.rb

This file was deleted.

2 changes: 1 addition & 1 deletion config/application.yml.default
@@ -177,7 +177,7 @@ requests_per_ip_track_only_mode: 'false'
 reset_password_email_max_attempts: '20'
 reset_password_email_window_in_minutes: '60'
 risc_notifications_local_enabled: 'false'
-risc_notifications_eventbridge_enabled: 'false'
+risc_notifications_active_job_enabled: 'false'
 ruby_workers_enabled: 'true'
 rules_of_use_horizon_years: '6'
 rules_of_use_updated_at: '2021-05-21T00:00:00Z'
2 changes: 1 addition & 1 deletion lib/identity_config.rb
@@ -244,7 +244,7 @@ def self.build_store(config_map)
     config.add(:reset_password_email_max_attempts, type: :integer)
     config.add(:reset_password_email_window_in_minutes, type: :integer)
     config.add(:risc_notifications_local_enabled, type: :boolean)
-    config.add(:risc_notifications_eventbridge_enabled, type: :boolean)
+    config.add(:risc_notifications_active_job_enabled, type: :boolean)
     config.add(:ruby_workers_enabled, type: :boolean)
     config.add(:rules_of_use_horizon_years, type: :integer)
     config.add(:rules_of_use_updated_at, type: :timestamp)
68 changes: 68 additions & 0 deletions spec/jobs/risc_delivery_job_spec.rb
@@ -0,0 +1,68 @@
+require 'rails_helper'
+
+RSpec.describe RiscDeliveryJob do
+  describe '#perform' do
+    let(:push_notification_url) { 'https://push.example.gov' }
+    let(:jwt) { JWT.encode({ foo: 'bar' }, 'a') }
+    let(:event_type) { PushNotification::IdentifierRecycledEvent::EVENT_TYPE }
+    let(:issuer) { 'issuer1' }
+    let(:transport) { 'ruby_worker' }
+
+    let(:job) { RiscDeliveryJob.new }
+    subject(:perform) do
+      job.perform(
+        push_notification_url: push_notification_url,
+        jwt: jwt,
+        event_type: event_type,
+        issuer: issuer,
+      )
+    end
+
+    it 'POSTs the jwt to the given URL' do
+      req = stub_request(:post, push_notification_url).
+        with(
+          body: jwt,
+          headers: {
+            'Content-Type' => 'application/secevent+jwt',
+            'Accept' => 'application/json',
+          },
+        )
+
+      perform
+
+      expect(req).to have_been_requested
+    end
+
+    context 'network errors' do
+      before do
+        stub_request(:post, push_notification_url).to_timeout
+      end
+
+      context 'when performed inline' do
+        it 'warns on timeouts' do
+          expect(Rails.logger).to receive(:warn) do |msg|
+            payload = JSON.parse(msg, symbolize_names: true)
+
+            expect(payload[:event]).to eq('http_push_error')
+            expect(payload[:transport]).to eq('direct')
+          end
+
+          expect { perform }.to_not raise_error
+        end
+      end
+
+      context 'when performed in a worker' do
+        before do
+          allow(job).to receive(:queue_adapter).
+            and_return(ActiveJob::QueueAdapters::GoodJobAdapter.new)
+        end
+
+        it 'raises on timeouts (and retries via ActiveJob)' do
+          expect(Rails.logger).to_not receive(:warn)
+
+          expect { perform }.to raise_error(Faraday::ConnectionFailed)
+        end
+      end
+    end
+  end
+end