Add integration error event#11615

Merged
Sgtpluck merged 14 commits into main from dmm/add-int-error-event
Dec 12, 2024

Conversation

@Sgtpluck
Contributor

🎫 Ticket

There is no ticket! This is part of a hackathon project. The goal of this project is to monitor authentication and logout requests for signs of broken integrations. This will allow us to be more proactive with partners who may be struggling with their integrations.

The existing error logging, however, was not really designed for pulling out specific details and notifying us about them, so a new event seemed like a good fit for this problem.

🛠 Summary of changes

This change:

  • Adds a new event called integration_errors_present
  • Wires that event into our existing code flow, whenever a request is made that can fail.

There is some duplicative code that is similar but not exactly the same. It could probably be pulled out into a shared module or something. Should I do that? Opinions welcome!

I also am happy to add more fields to the new event if they seem useful -- what do people think?

Contributor

in my experience, it's easier to filter for these in CW by having them be individual properties with true/false? So like:

Suggested change
- error_types: [:saml_request_errors],
+ error_types: { saml_request_errors: true },

but I know there are LIKE queries we can use with the array form as well
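
To illustrate the two shapes being compared, here is a minimal sketch in plain Ruby. The helper name `error_types_to_flags` is hypothetical and not part of this PR; it only shows how the array form could be turned into one boolean property per error type, which is the shape suggested above for easier filtering.

```ruby
# Hypothetical helper (not from the PR): converts the array form of
# error types into one boolean property per type.
def error_types_to_flags(error_types)
  error_types.to_h { |type| [type, true] }
end

array_form = [:saml_request_errors]
hash_form = error_types_to_flags(array_form)
# hash_form is now { saml_request_errors: true }
```

With the hash form, each error type becomes its own field in the logged payload, so a query can test that field directly instead of pattern-matching inside an array.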

Contributor Author

hmmm i don't feel super strongly about this, since just having an event for these kinds of errors is going to make searching for them much easier. so am happy to give that structure a go.

Contributor Author

updated in 647b717

@Sgtpluck Sgtpluck force-pushed the dmm/add-int-error-event branch from 48e27bd to 38bd17a December 11, 2024 18:32
@Sgtpluck Sgtpluck requested a review from a team December 11, 2024 19:51
analytics.integration_errors_present(
**result.
to_h[:integration_errors].
merge({ event: :oidc_logout_submitted }),
Contributor

supernit for style! you can drop the optional curly braces

Suggested change
- merge({ event: :oidc_logout_submitted }),
+ merge(event: :oidc_logout_submitted),
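
As a quick check that the style nit does not change behavior, the sketch below compares both call styles on a made-up hash. The sample data is an assumption for illustration; only the `merge` call and the `:oidc_logout_submitted` symbol come from the diff.

```ruby
# Made-up sample data standing in for result.to_h[:integration_errors].
integration_errors = { error_details: ['Issuer is missing'] }

# Hash#merge receives the same single hash argument either way, because
# bare keyword-style arguments are collected into a hash for it.
with_braces = integration_errors.merge({ event: :oidc_logout_submitted })
without_braces = integration_errors.merge(event: :oidc_logout_submitted)

same_result = with_braces == without_braces
```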

Contributor Author

updated in d011de2

)
end

# @param [Array] error_details Full messages of the errors
Contributor

suggestion: Add a description that explains what this event is for

Contributor Author

updated in d011de2

capture_analytics
track_integration_errors(
event: :saml_auth_request,
errors: result.errors.values.flatten,
Contributor

question: Is there any chance that result.errors will be nil or empty? If so, does it make sense for this event to still get logged?

Contributor Author

no chance -- return if result.success? indicates that any flow that does not include an error will return before getting here.
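
The guard described in this reply can be sketched roughly as follows. `SamlResultSketch` and `track_errors_for` are hypothetical stand-ins, not the controller code from the PR, and the sketch assumes (as the reply states) that a failed result always carries at least one error.

```ruby
# Hypothetical stand-in for the result object used in the controller.
SamlResultSketch = Struct.new(:success, :errors, keyword_init: true) do
  def success?
    success
  end
end

# Hypothetical method showing the early-return guard.
def track_errors_for(result, tracked)
  # Successful results return here, so the line below only sees failures.
  return if result.success?

  tracked << result.errors.values.flatten
end

tracked = []
track_errors_for(SamlResultSketch.new(success: true, errors: {}), tracked)
track_errors_for(
  SamlResultSketch.new(success: false, errors: { issuer: ['is invalid'] }),
  tracked,
)
# Only the failed result was recorded in tracked.
```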

# @param [Symbol] event What part of the workflow the error occurred in
# @param [Boolean] integration_exists Whether the requesting issuer maps to an SP
# @param [String] request_issuer The issuer in the request
def integration_errors_present(
Contributor

suggestion: Consider prefixing the method with something like sp_ or idp_. As we have integrations with other vendors as well as registered service_provider integrations, this current name is a bit confusing without a lot more context.

Contributor Author

updated in d011de2

Comment on lines +28 to +29
else
if result.extra[:integration_errors].present?
Contributor

nit: Consider extracting this to a private method that takes result and the value for :event

Contributor Author

updated in d011de2 -- the number of private methods we have in that controller now that are taking result as an arg is a smell that makes me suspect we are doing some work in the wrong place. but, refactoring that is outside the scope of this PR!
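
For readers following the thread, the extraction suggested above might look something like this sketch. `FakeAnalytics`, `LogoutResultSketch`, and `LogoutControllerSketch` are hypothetical stand-ins, and the actual helper added in d011de2 may differ. The real code uses `.present?` from ActiveSupport; the sketch uses plain Ruby checks so it runs on its own.

```ruby
# Hypothetical analytics double that records what would be logged.
class FakeAnalytics
  attr_reader :events

  def initialize
    @events = []
  end

  def integration_errors_present(**attributes)
    @events << attributes
  end
end

# Hypothetical stand-in for the form result.
LogoutResultSketch = Struct.new(:extra, keyword_init: true)

class LogoutControllerSketch
  def initialize(analytics)
    @analytics = analytics
  end

  def handle(result)
    track_integration_errors(result: result, event: :oidc_logout_submitted)
  end

  private

  # Takes the result and the value for :event, as the review suggests.
  def track_integration_errors(result:, event:)
    integration_errors = result.extra[:integration_errors]
    return if integration_errors.nil? || integration_errors.empty?

    @analytics.integration_errors_present(**integration_errors.merge(event: event))
  end
end
```

Each action that can fail would then call the helper with its own `event` value, which keeps the nil/empty check in one place.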

Contributor

@lmgeorge lmgeorge left a comment

This mostly looks good and my suggestions are pretty minor.

The only significant issue I saw was around the event property in the integration_errors_present body. Depending on how you plan to use the integration_errors_present event, the current implementation will not support proper correlation between it and the workflow event named in the event field. Most of the workflow events referenced in this changeset still use an arbitrary string, not the method symbol, as the actual event name.

analytics.integration_errors_present(
**result.
to_h[:integration_errors].
merge({ event: :oidc_logout_requested }),
Contributor

issue: The method AnalyticsEvents#oidc_logout_requested tracks the 'OIDC Logout Requested' event. If there is any need to correlate these two events in CloudWatch, this won't work as is.

Contributor Author

ah great observation -- this event field wasn't intended to correlate in cloudwatch, it was just an indicator to the IEs about where the error is occurring! it felt easiest to use the method name of the event that is firing for discoverability.

i think if we find in our usage that we'd like to correlate, i can update in later PRs. thanks for planting that seed!
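
If correlation does become necessary later, one possible approach is a lookup from the method symbol to the logged event name. The sketch below is purely illustrative: `EVENT_NAMES` and `logged_event_name` are hypothetical and not part of this PR, and the two event names are the ones quoted in the review comments.

```ruby
# Hypothetical lookup table (not in the PR). Keys are AnalyticsEvents
# method names; values are the event names quoted in the review.
EVENT_NAMES = {
  oidc_logout_requested: 'OIDC Logout Requested',
  oidc_logout_submitted: 'OIDC Logout Submitted',
}.freeze

# Returns the logged name for a method symbol, falling back to the
# symbol's own string form when no mapping is listed.
def logged_event_name(method_symbol)
  EVENT_NAMES.fetch(method_symbol) { method_symbol.to_s }
end

name = logged_event_name(:oidc_logout_requested)
# name is 'OIDC Logout Requested'
```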

if result.success? && redirect_uri
handle_logout(result, redirect_uri)
else
if result.extra[:integration_errors].present?
Contributor

issue: The method AnalyticsEvents#oidc_logout_submitted tracks the 'OIDC Logout Submitted' event. If there was any need to correlate events in CloudWatch, this won't work as is.

@Sgtpluck Sgtpluck merged commit acd5125 into main Dec 12, 2024
@Sgtpluck Sgtpluck deleted the dmm/add-int-error-event branch December 12, 2024 14:14
3 participants