Replace NewRelic browser instrumentation with custom error handler by aduth · Pull Request #8950 · 18F/identity-idp

aduth · 2023-08-07T20:17:25Z

🛠 Summary of changes

Implements a frontend error logger, replacing the NewRelic browser instrumentation.

Why?

Removes the large NewRelic loader snippet from every page
- The snippet exceeds 50% of the markup size of the Sign In page (15kb of 27kb, before compression)
- The snippet is out-of-date, in part because the latest version of the snippet is even larger in size
- The snippet is rendered inline and is therefore not cacheable
- The snippet is render-blocking
- The practical impact here is a reduction of ~~98.2%, from 6.84kb of uncacheable gzip-compressed JavaScript, to 124 bytes of cacheable brotli-compressed JavaScript.†~~ Edit (after revisions): 99.0%, from 6.84kb to 67 bytes gzipped
Consolidates logging
Reduces the number of third-party JavaScript scripts present in the page, to reduce vulnerability surface area
Allows for greater control over how errors are logged, e.g. excluding errors occurring from browser extensions
We don't use most of the frontend NewRelic features, and are largely interested in uncaught errors
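
As a rough illustration of the filtering mentioned above (excluding errors that come from browser extensions), a same-host check might look like the sketch below. The names `isSameHostScriptError`, `ErrorEventLike`, and `pageHost` are assumptions made for this example; this is not the PR's actual `isTrackableErrorEvent` implementation.

```typescript
// Minimal shape of the fields read from a browser ErrorEvent (assumed subset).
interface ErrorEventLike {
  filename?: string;
}

/**
 * Returns true when the failing script was served from `pageHost`.
 * Scripts injected by browser extensions report filenames such as
 * "chrome-extension://..." and are therefore rejected.
 */
function isSameHostScriptError(event: ErrorEventLike, pageHost: string): boolean {
  if (!event.filename) {
    // Cross-origin or inline errors often carry no filename at all.
    return false;
  }
  try {
    return new URL(event.filename).host === pageHost;
  } catch {
    // Filename could not be parsed as a URL, so treat it as untrackable.
    return false;
  }
}
```

In a browser this check would presumably run inside a `window` `error` listener, with `window.location.host` passed as `pageHost`.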

📜 Testing Plan

Go to http://localhost:3000
tail -n1 log/events.log
See "Frontend error" log result with name, message, and stack properties

app/javascript/packages/analytics/index.ts

app/controllers/frontend_log_controller.rb

app/views/layouts/base.html.erb

app/javascript/packs/track-errors.ts

zachmargolis · 2023-08-08T18:25:15Z

> We don't use most of the frontend NewRelic features, and are largely interested in uncaught errors

I'll admit I don't look at it often, but I do like knowing we have access to some of the page load times, as reported by the browser. I worry about the number of logs we'd create if we tried to log that ourselves, but some of that could be nice (loading time, time to first paint, etc etc)

aduth · 2023-08-08T18:31:37Z

> > We don't use most of the frontend NewRelic features, and are largely interested in uncaught errors
>
> I'll admit I don't look at it often, but I do like knowing we have access to some of the page load times, as reported by the browser. I worry about the number of logs we'd create if we tried to log that ourselves, but some of that could be nice (loading time, time to first paint, etc etc)

It's a paradox where NewRelic's tooling to diagnose slow page loads would inevitably lead us to the conclusion that NewRelic's own tooling contributes most significantly to it? 😄

For this, I wasn't sure how much value we'd get out of this data vs. something we could debug or simulate locally, since I wouldn't really expect a lot of variability between usage in the wild. Alternatively, I'd wondered if the synthetic monitors would give us similar data in NewRelic, but act externally without needing to be part of each user's request.

zachmargolis · 2023-08-08T18:36:18Z

> For this, I wasn't sure how much value we'd get out of this data vs. something we could debug or simulate locally, since I wouldn't really expect a lot of variability between usage in the wild. Alternatively, I'd wondered if the synthetic monitors would give us similar data in NewRelic, but act externally without needing to be part of each user's request.

My understanding is that people's internet connections (which vary a lot) would definitely contribute to things like asset loading, etc etc, so we'd get much more value of the overall sense (and the long tail) by measuring real user usage. Maybe we could randomly sample 1/1000 pageviews and send that data so we get something but not an avalanche of everything
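
A minimal sketch of the sampling idea, assuming a 1-in-1000 rate and a couple of derived metrics. The helper names `shouldSample` and `summarizeTiming` and the `NavigationTimingLike` shape are hypothetical; nothing like this is part of the PR.

```typescript
// Subset of PerformanceNavigationTiming fields used here (all in milliseconds).
interface NavigationTimingLike {
  startTime: number;
  responseStart: number;
  loadEventEnd: number;
}

/**
 * Decides whether this pageview should report timing data.
 * `random` is injectable so the decision can be tested deterministically.
 */
function shouldSample(rate: number, random: () => number = Math.random): boolean {
  return random() < rate;
}

/** Reduces a navigation timing entry to the few numbers worth logging. */
function summarizeTiming(entry: NavigationTimingLike) {
  return {
    timeToFirstByte: entry.responseStart - entry.startTime,
    loadTime: entry.loadEventEnd - entry.startTime,
  };
}

// Example: only about one pageview in a thousand would produce a log entry.
const SAMPLE_RATE = 1 / 1000;
if (shouldSample(SAMPLE_RATE)) {
  const summary = summarizeTiming({ startTime: 0, responseStart: 120, loadEventEnd: 900 });
  console.log('would send timing sample', summary);
}
```

In a browser, the entry could come from `performance.getEntriesByType('navigation')[0]` once the `load` event has fired.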

aduth · 2023-08-08T18:40:41Z

> My understanding is that people's internet connections (which vary a lot) would definitely contribute to things like asset loading, etc etc, so we'd get much more value of the overall sense (and the long tail) by measuring real user usage. Maybe we could randomly sample 1/1000 pageviews and send that data so we get something but not an avalanche of everything

I'm sure it's not 100% accurate, but we can simulate network conditions and even CPU conditions with Chrome's tooling:

Looks like it's even built-in to the new "Performance insights" tab:

zachmargolis · 2023-08-29T20:00:04Z

app/controllers/frontend_log_controller.rb

+    if error_event?
+      NewRelic::Agent.notice_error(FrontendError.new, custom_params: log_params[:payload].to_h)
+    else
+      frontend_logger.track_event(log_params[:event], log_params[:payload].to_h)
+    end


what if we brought back the "symbol or proc" logic we used to have before #7110 in EVENT_MAP and then supply a proc for this event?

> what if we brought back the "symbol or proc" logic we used to have before #7110 in EVENT_MAP and then supply a proc for this event?

Ooh, that's a very interesting idea! Let me give that a shot.

I had fun and gave it a shot here #9117
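
For readers unfamiliar with the "symbol or proc" pattern discussed in this thread, here is a TypeScript analog of the dispatch shape. The real controller and `EVENT_MAP` are Ruby, so everything below (`dispatch`, `Handler`, `Tracker`, the event names) is an assumption for illustration and not the code from #9117.

```typescript
type Payload = Record<string, unknown>;
type Tracker = Record<string, (payload: Payload) => void>;

// A handler is either the name of a tracker method (the "symbol" case)
// or a function called directly with the payload (the "proc" case).
type Handler = string | ((payload: Payload) => void);

/** Routes an event to its handler; returns false when nothing handled it. */
function dispatch(
  eventMap: Record<string, Handler>,
  tracker: Tracker,
  event: string,
  payload: Payload,
): boolean {
  const handler = eventMap[event];
  if (handler === undefined) {
    return false;
  }
  if (typeof handler === 'function') {
    // "proc" case: the map entry decides what happens, e.g. error reporting.
    handler(payload);
    return true;
  }
  // "symbol" case: look up the named method on the tracker.
  const method = tracker[handler];
  if (!method) {
    return false;
  }
  method(payload);
  return true;
}
```

The appeal of this shape is that the controller no longer needs an `if error_event?` branch: the special case lives in the map entry itself.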

aduth · 2023-08-29T20:41:14Z

app/javascript/packages/analytics/index.ts

-interface NewRelicGlobals {
-  newrelic?: NewRelicAgent;
-}
+export { default as isTrackableErrorEvent } from './is-trackable-error-event';


Follow-up task: Would like to move trackEvent and trackError to dedicated files track-event.ts and track-error.ts so this index.ts can actually be an index.

aduth · 2023-08-29T20:42:00Z

app/views/layouts/base.html.erb

        false,
      ) %>
  <%= javascript_packs_tag_once('application', prepend: true) %>
+  <%= javascript_packs_tag_once('track-errors') if BrowserSupport.supported?(request.user_agent) %>


Question: Can we defer this?

> Question: Can we defer this?

I think we can and should make this <script async defer>, with the benefit being that the page can finish being considered "loaded" without being blocked by the tracker.

I still need to get some clarification about async script load order when combined with non-async scripts (related Slack discussion), and this'll require some revisions to ScriptHelper / AssetSources to make sure the extra attributes are carried across to the rendered script tag.

I'll plan to explore this as a follow-on task.

aduth · 2023-09-05T18:46:11Z

I think this is in mostly good shape as-is. One idea I'd been contemplating is checking to see if it's feasible enough to directly call to NewRelic's browser error logging endpoints, i.e. treat this as a tiny drop-in replacement for their existing browser agent for the error logging in particular. That's quite a departure from the initial direction and not sure it's worth pursuing, but figured I should at least check.

zachmargolis · 2023-09-05T20:37:15Z

> One idea I'd been contemplating is checking to see if it's feasible enough to directly call to NewRelic's browser error logging endpoints, i.e. treat this as a tiny drop-in replacement for their existing browser agent for the error logging in particular.

I think that would be a great follow-up PR!

Honestly I think it would be worth considering having this log to events.log only first, just to understand how many of these we'd be seeing/sending, before sending to NewRelic

Or maybe set expected: true (see api docs) so that we don't immediately affect monitoring, etc

aduth · 2023-09-13T17:00:06Z

I investigated how JavaScript error logging is currently handled in the NewRelic client, and it does seem pretty straightforward as far as POST-ing to a URL with the application ID and an embedded payload of error details, but it might take some sleuthing through the internals of their browser agent code to fully understand what all of the parameters mean.

I think to minimize the changes necessary to get this merged, I'll go ahead and add the expected: true argument so that we will log this to NewRelic APM, but avoid any unnecessary alerting until we get a handle on how expected these truly are.

aduth · 2023-09-14T15:28:33Z

I tested this using @mitchellhenke's personal environment and verified that the errors are being logged as expected errors to NewRelic with the custom parameters https://onenr.io/08wogAPAbwx

zachmargolis

LGTM

> I tested this using @mitchellhenke's personal environment and verified that the errors are being logged as expected errors to NewRelic with the custom parameters https://onenr.io/08wogAPAbwx

This didn't quite match my expectations:

- name and message are the same?
- the thing I expected to be message was inside stack?

Error: Example error at https://..../packs/js/password_toggle_component-fba98070.digested.js:1:360

Either way, I think it's good enough for now, and we can always iterate and improve later

aduth · 2023-09-14T18:40:42Z

> This didn't quite match my expectations:

That is strange! What I'm seeing sent from the browser matches my expectations, so not sure where it's getting lost along the way. Maybe NewRelic is doing its own thing with those parameters, since I could see message being pretty common.

Example payload I see in testing:

message: "Example error"
name: "Error"
stack: "Error: Example error\n    at https://[...]/packs/js/track-errors-02b3022a.digested.js:1:456"
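
For context on why a payload like this has to be assembled explicitly: `name`, `message`, and `stack` are not enumerable properties of an `Error`, so serializing the error object directly with `JSON.stringify` yields `{}`. A minimal sketch of picking the fields by hand follows; `toErrorPayload` and `ErrorPayload` are hypothetical names, not the PR's code.

```typescript
interface ErrorPayload {
  name: string;
  message: string;
  stack?: string;
}

/** Copies the three fields of interest into a plain, serializable object. */
function toErrorPayload(error: Error): ErrorPayload {
  const { name, message, stack } = error;
  return { name, message, stack };
}

// The plain object serializes with all three fields present.
const examplePayload = toErrorPayload(new Error('Example error'));
console.log(JSON.stringify(examplePayload));
```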

aduth · 2023-09-14T19:14:02Z

I think it's definitely a naming conflict thing. Also, namespacing makes it easier to see the properties together anyway, so this should be improved now after 4bdf111. See https://onenr.io/0Owvg5MzaRv
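
The namespacing idea can be sketched as a simple key-prefixing step. The prefix `frontend_error` and the helper `namespaceKeys` are assumptions for illustration; the thread does not state the exact key names used in 4bdf111 or where in the stack the renaming happens.

```typescript
/**
 * Returns a copy of `payload` with every key prefixed, so that generic names
 * like "message" cannot collide with attributes the receiver already defines.
 */
function namespaceKeys(
  payload: Record<string, unknown>,
  prefix: string,
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(payload)) {
    result[`${prefix}_${key}`] = payload[key];
  }
  return result;
}

// Prefixed keys also sort next to each other in an alphabetized attribute list.
console.log(namespaceKeys({ name: 'Error', message: 'Example error' }, 'frontend_error'));
```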

aduth force-pushed the aduth-try-custom-error-logger branch from 39cc667 to 547c42b Compare August 29, 2023 19:22

zachmargolis mentioned this pull request Aug 30, 2023

Add custom proc support to FrontendLogger, simplify FrontendLogController #9117

Merged

mitchellhenke marked this pull request as ready for review September 13, 2023 17:12

mitchellhenke marked this pull request as draft September 13, 2023 17:12

aduth and others added 13 commits September 14, 2023 10:30

Try replacing NewRelic browser instrumentation with custom error handler

7812b78

changelog: Internal, Error Tracking, Implement replacement frontend error logger

Experiment with hand-minified inline snippet

6f46a45

Must go smaller

575e2b4

Smaller!

2b4d20e

Filter events to same-host script errors

914adef

Use javascript_tag for prelude script

01ad0e1

Previously trying to avoid CDATA, but more standard

Log Webpack script errors in development environment

6470b0b

Route frontend error events to NewRelic

223a63d

Revert "Log Webpack script errors in development environment"

164cb8a

This reverts commit bafdea8.

Remove demo error

106fd69

Update index.spec.ts

7ff899a

Add custom proc support to FrontendLogger, simplify FrontendLogContro…

da0edb2

…ller error reporting

Bring back parens around assignment inside a conditional

625de7e

zachmargolis and others added 9 commits September 14, 2023 10:30

Use #public_send on method name so that AnalyticsEventEnhancer can do…

e59f074

… its thing - bind_call calls a specific method implementation, so it doesn't go through the Enhancer implementation

dedicated class?

ff757a4

WIP: experimenting with less special-casing of Analytics class

f2f1da4

Revert "WIP: experimenting with less special-casing of Analytics class"

c45092f

This reverts commit ae086b4.

Slight simplification, support #call-able objects

80d18c6

Rename and document things for clarity

51f3e01

Lint is_a

33f4f68

Fix issue with IdV::AnalyticsEventsEnhancer override

36cb9d6

The method signature is lost if read directly from analytics instance, since it's overridden by AnalyticsEventsEnhancer. We need the original method reference

Log errors as expected

c9d605c

aduth force-pushed the aduth-try-custom-error-logger branch from 62f9721 to c9d605c Compare September 14, 2023 14:36

aduth marked this pull request as ready for review September 14, 2023 14:41

Temporary: Test error

13d961e

aduth added 3 commits September 14, 2023 11:29

Alphabetize analytics events

1f7ee55

Update spec expectations

cffb342

Revert "Temporary: Test error"

4431b94

This reverts commit 13d961e.

zachmargolis approved these changes Sep 14, 2023

Try namespacing logged error

4bdf111

Easier to see grouped in alphabetical order, avoid conflicting names

aduth merged commit a7ed678 into main Sep 14, 2023

aduth deleted the aduth-try-custom-error-logger branch September 14, 2023 20:05

aduth changed the title ~~Try replacing NewRelic browser instrumentation with custom error handler~~ Replace NewRelic browser instrumentation with custom error handler Sep 14, 2023

aduth mentioned this pull request Sep 15, 2023

Remove NewRelic domains from CSP #9216

Merged

aduth added the performance label Sep 15, 2023

mitchellhenke mentioned this pull request Sep 18, 2023

Deploy RC 315 to Prod #9225

Merged

aduth mentioned this pull request Oct 6, 2023

Update guidance for frontend error logging #9330

Merged

aduth mentioned this pull request Jan 31, 2024

Load error tracking script asynchronously #10013

Merged

aduth mentioned this pull request Nov 8, 2024

Add identifier for explicit frontend error logging #11481

Merged
