Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handle errors in Notifiers #273

Merged
merged 13 commits into from
Jun 21, 2023
Merged

Conversation

Cruikshanks
Copy link
Member

@Cruikshanks Cruikshanks commented Jun 19, 2023

We spotted this issue when working in another service where we have copied the app/lib/base-notifier.lib.js and app/lib/global-notifier.lib.js across. At the points we wanted to log an error we were calling global.GlobalNotifier.omfg('import.charge-versions: errored', error). When we got an error though, this is what we saw.

In the logs, it would just be

0|import  | [15:44:40.837] ERROR (8616):
0|import  |     message: "import.charge-versions: errored"

Errbit would get the notification but if you tried to view it you'd get an error.

The main issue with the logs was we were treating instances of Error as if they were just a POJO when they are not.

const myError = new Error('boom')

const myObject = {
  ...myError
}

console.log(myObject) // {}

We also went back to the pino docs and learnt that the log packet we were generating no longer matched what pino was expecting. Specifically, we refreshed our understanding of the mergingObject and realised that we would be better placed to keep any data that needs to be logged separately from the message.

We also learnt that pino knows how to log errors, including custom ones with additional properties as long as either

  • the error instance is passed into the log call as the mergingObject
  • we mimic what pino does and wrap the error instance in a new object with the property err:

For Airbrake and Errbit uncaught errors were coming through and displaying fine. It was those caught in our logic and passed through as the data argument to omfg(). The key difference was app/plugins/airbrake.plugin.js when calling notify() passes in the error instance as the error: property, not session:.

// When Hapi emits a request event with an error we capture the details and use Airbrake to send a request to our
    // Errbit instance
    server.events.on({ name: 'request', channels: 'error' }, (request, event, _tags) => {
      server.app.airbrake
        .notify({
          error: event.error,
          session: {
            route: request.route.path,
            method: request.method,
            url: request.url.href
          }
        })

If we did the same then the error would always display in Errbit. We also found the grouping of errors vastly improved.

All this means we are making the following changes

  • to ensure things are passed through to the logger and errbit correctly, formatting of the arguments for both is moved to BaseNotifier rather than forcing extending classes to do their own thing
  • callers still have the flexibility to just set a message for both omg() and omfg(). But behind the scenes we always generate an Error instance and pass this to the Pino and Airbrake to ensure consistency

The result is this. Given the following calls to omg() and omfg()

const myError = new ErrorWithDate('Thing go pop')
const myData = { thingyId: 'ece18f30-1a73-4579-b752-5ca2607671bf', status: 'snazzy' }

global.GlobalNotifier.omg('Keep calm. I am an example of using omg()', myData)
global.GlobalNotifier.omfg('Panic! I am an example of using omfg()', myData, myError)

We see the following log output (pretty-printed local version)

[15:57:10.672] INFO (10292): Keep calm. I am an example of using omg()
    thingyId: "ece18f30-1a73-4579-b752-5ca2607671bf"
    status: "snazzy"
[15:57:10.673] ERROR (10292): Panic! I am an example of using omfg()
    thingyId: "ece18f30-1a73-4579-b752-5ca2607671bf"
    status: "snazzy"
    err: {
      "type": "ErrorWithDate",
      "message": "Thing go pop",
      "stack":
          Error: Thing go pop
              at index (/home/repos/water-abstraction-system/app/controllers/root.controller.js:16:19)
              at exports.Manager.execute (/home/repos/water-abstraction-system/node_modules/@hapi/hapi/lib/toolkit.js:57:29)
              at Object.internals.handler (/home/repos/water-abstraction-system/node_modules/@hapi/hapi/lib/handler.js:46:48)
              at exports.execute (/home/repos/water-abstraction-system/node_modules/@hapi/hapi/lib/handler.js:31:36)
              at Request._lifecycle (/home/repos/water-abstraction-system/node_modules/@hapi/hapi/lib/request.js:370:68)
              at processTicksAndRejections (node:internal/process/task_queues:96:5)
              at async Request._execute (/home/repos/water-abstraction-system/node_modules/@hapi/hapi/lib/request.js:280:9)
      "dateItHappened": "2023-06-20T15:57:10.672Z"
    }

In Errbit all errors are viewable. Also, it better groups the notifications.

Screenshot of app errors page Screenshot 2023-06-20 at 17 00 33
Screenshot of error summary page Screenshot 2023-06-20 at 17 03 15
Screenshot of error session page Screenshot 2023-06-20 at 17 05 39

We spotted this issue when [working in another service](DEFRA/water-abstraction-import#661) where we have copied the `app/lib/base-notifier.lib.js` and `app/lib/global-notifier.lib.js` across. At the points we wanted to log an error we were calling `global.GlobalNotifier.omfg('import.charge-versions: errored', error)`. When we got an error though, this is what we saw.

In the logs it would just be

```text
0|import  | [15:44:40.837] ERROR (8616):
0|import  |     message: "import.charge-versions: errored"
```

Errbit would get the notification but if you tried to view it you'd get an error. Looking into this last one it was to do with how the Airbrake package was handling an instance of [Error](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Error/Error) being passed as the `session:` property in the call to `notify()`. Simple errors were fine but custom ones appeared to break Errbit.

Simply, we were treating instances of `Error` as if they were just a POJO when they are not.

```javascript
const myError = new Error('boom')

const myObject = {
  ...myError
}

console.log(myObject) // {}
```

For Airbrake and Errbit uncaught errors were coming through and displaying fine. The key difference is `app/plugins/airbrake.plugin.js` when calling `notify()` passes in the error instance as the `error:` property.

This change updates our notifier modules to better calls to `omfg()` that pass in an `Error` instance. In the logs we will get

```text
0|import  | [16:42:39.829] ERROR (8786):
0|import  |     message: "import.charge-versions: errored"
0|import  |     error: {
0|import  |       "stack":
0|import  |           error: null value in column "start_date" violates not-null constraint
0|import  |               at Parser.parseErrorMessage (/home/repos/water-abstraction-import/node_modules/pg-protocol/dist/parser.js:287:98)
0|import  |               at Parser.handlePacket (/home/repos/water-abstraction-import/node_modules/pg-protocol/dist/parser.js:126:29)
0|import  |               at Parser.parse (/home/repos/water-abstraction-import/node_modules/pg-protocol/dist/parser.js:39:38)
0|import  |               at Socket.<anonymous> (/home/repos/water-abstraction-import/node_modules/pg-protocol/dist/index.js:11:42)
0|import  |               at Socket.emit (node:events:513:28)
0|import  |               at Socket.emit (node:domain:489:12)
0|import  |               at addChunk (node:internal/streams/readable:315:12)
0|import  |               at readableAddChunk (node:internal/streams/readable:289:9)
0|import  |               at Socket.Readable.push (node:internal/streams/readable:228:10)
0|import  |               at TCP.onStreamRead (node:internal/stream_base_commons:190:23)
0|import  |       "message": "null value in column \"start_date\" violates not-null constraint",
0|import  |       "length": 307,
0|import  |       "name": "error",
0|import  |       "severity": "ERROR",
0|import  |       "code": "23502",
0|import  |       "detail": "Failing row contains (4:10024704:licenceStart-licenceEnd, 1, null, 2018-12-31, current, t, 2023-06-16 16:42:39.787071+00, null).",
0|import  |       "schema": "water_import",
0|import  |       "table": "charge_versions_metadata",
0|import  |       "column": "start_date",
0|import  |       "file": "execMain.c",
0|import  |       "line": "1948",
0|import  |       "routine": "ExecConstraints"
0|import  |     }
```

In Errbit all errors will be visible. Also, it will be able to determine the error class and therefore better group notifications.
@Cruikshanks Cruikshanks added the enhancement New feature or request label Jun 19, 2023
@Cruikshanks Cruikshanks self-assigned this Jun 19, 2023
hapi-pino wants a POJO in order to output something to the logs. But as demonstrated we can't just use `{ ...myError }` to do that.

We used the responses to [Is it not possible to stringify an Error using JSON.stringify?](https://stackoverflow.com/q/18391212) to implement a solution that uses `Object.getOwnPropertyNames()` to identify the properties directly on the error instance. So, `message:` and `stack:` will be there as all errors inherit these. But if a custom error adds additional properties then we'll get those too.

That comes back as an array, which we iterate to extract the values off the error and create a new POJO from.
First, we'd forgotten that `omg()` calls `.info()` and `omfg()` calls `.error()` on the logger.

So, we went back to the docs to double check there wasn't something we were missing that is different about these methods.

The docs confirmed the signature is the same. What we'd overlooked (either originally or they've changed on us!) is that the log methods expect any data value to be passed through first. When you have a value to log, the second arg should be the message. We were formatting a single object with a `message:` property.

The first arg is referred to as the [mergingObject](https://github.com/pinojs/pino/blob/master/docs/api.md#mergingobject-object)

> Each enumerable key and value of the `mergingObject` is copied into the JSON log line.

We could have made our current implementation work if we amended the `message:` property to be `msg:`. Then pino would know to use that as the message for the log entry.

But if the `mergingObject` is an error and no `message` arg is provided then pino will use the errors `message:` property instead. We want to be able to log an error correctly _and_ still have our own message. For this you must provide both `mergingObject` and `message` args to pino. So, it makes sense at this stage to always be passing both args to pino and stick with that convention.

The second key change is we're scraping the idea of forcing notifiers that extend the base to implement their own formatters. Experience has shown we're just copying the same format logic in them so we may as well leave that in the base as a default implementation.

This change updates BaseNotifier with a default implementation for `_formatLogPacket()`. We then update the tests to cover both changes.
So, we started thinking we had this licked. But that refresher of the Pino docs and the move to have the BaseNotifier implement the default formatters open up a different perspective.

That led to what is implemented here. Essentially, when logging errors we _always_ log an error, even if we have to generate one. For notifications we always send an error to Errbit. Again, it helps keep things consistent and supports Errbit with its grouping.

We also tweaked what to do when things go wrong. If Airbrake fails we no longer use `_formatLogPacket()`. We just call the `logger.error()` directly to avoid issues.

Having tested with real errors we know this delivers the results we want now. Hopefully, the unit tests express the behaviour we want.
Remove format packet overrides and unit tests accordingly.
This is currently the one notifier we have that was overriding what BaseNotifier was doing in the _format() methods (beyond the copy & paste default implementation used by the others).

Now BaseNotifier holds our default implementation, we simply want to inject our behaviour in the process. The RequestNotifier needs to add the request ID to the data `mergeObject` pino will log and the session object Airbrake sends.

We now override the base methods to create a new decorated object based on what is passed in. We then forward that to the BaseNotifier methods using [super](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/super).
@Cruikshanks Cruikshanks marked this pull request as ready for review June 20, 2023 21:23
Copy link
Contributor

@Jozzey Jozzey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cruikshanks Cruikshanks merged commit 86e2e2e into main Jun 21, 2023
@Cruikshanks Cruikshanks deleted the better-handle-errors-in-notifiers branch June 21, 2023 10:50
Cruikshanks added a commit to DEFRA/water-abstraction-import that referenced this pull request Jun 21, 2023
When working on [Move ChargeVersionsMetadataImport out of NALD](#661) we encountered an error. That was expected; we were mid-implementation. What we didn't expect was

- the lack of detail in the log
- that attempting to view the error in Errbit caused an error

Suffice to say, we realised our notifier implementation was wrong. We were passing things incorrectly to both [pino](https://github.com/pinojs/pino) and [errbit](https://github.com/errbit/errbit). [Better handle errors in Notifiers](DEFRA/water-abstraction-system#273) goes into more detail about the issue and what we did to resolve it.

That is our main repo and the place we copied the current implementation of `GlobalNotifier` and `BaseNotifier` from. So, we did the original fixes there. This change updates the copies of those notifiers in this repo to include the fixes.
Cruikshanks added a commit to DEFRA/water-abstraction-import that referenced this pull request Jun 21, 2023
When working on [Move ChargeVersionsMetadataImport out of NALD](#661) we encountered an error. That was expected; we were mid-implementation. What we didn't expect was

- the lack of detail in the log
- that attempting to view the error in Errbit caused an error

Suffice it to say, we realised our notifier implementation was wrong. We were passing things incorrectly to both [pino](https://github.com/pinojs/pino) and [errbit](https://github.com/errbit/errbit). [Better handle errors in Notifiers](DEFRA/water-abstraction-system#273) goes into more detail about the issue and what we did to resolve it.

That is our main repo and the place we copied the current implementation of `GlobalNotifier` and `BaseNotifier`. So, we did the original fixes there. This change updates the copies of those notifiers in this repo to include the fixes.
Cruikshanks added a commit that referenced this pull request Jul 6, 2023
With the changes we made in [Better handle errors in Notifiers](#273) calls to `omfg()` should pass the error separately from any data to be included.

Whilst working on the code we spotted an instance we'd overlooked when updating all our `omfg()` calls.
Cruikshanks added a commit that referenced this pull request Jul 6, 2023
With the changes we made in [Better handle errors in Notifiers](#273) calls to `omfg()` should pass the error separately from any data to be included.

Whilst working on the code we spotted an instance we'd overlooked when updating all our `omfg()` calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants