reduce verbosity of *.{err, skip} logs #278

Open
missinglink opened this issue Nov 1, 2021 · 0 comments

A planet-wide interpolation build tends to encounter a lot of bad/invalid OpenAddresses data.

The logging for this is quite verbose: for each bad row, the *.err logger writes both the offending row in JSON format and a stack trace. This results in massive log files which are mostly redundant and end up being significantly larger than the actual database itself.

21.7 GiB address.db.gz
63.9 GiB conflate_oa.err
533.5 KiB conflate_oa.out
29.6 GiB conflate_oa.skip

This issue is to consider ways of reducing the logging verbosity. Possible solutions:

  • remove the stack traces if not useful
  • allow the log stream to be compressed, or attached to a compressor
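
The two options above could be combined roughly as follows. This is only a sketch, not the project's actual logger: `formatErrorLine`, `openCompressedLog`, the `includeStack` option and the output file name are all hypothetical, and it assumes the log is written as one JSON object per line. It relies only on Node's built-in `fs` and `zlib` modules.

```javascript
const fs = require('fs');
const zlib = require('zlib');

// Hypothetical helper: serialise one failed row as a single JSON line.
// The stack trace is only included when explicitly requested.
function formatErrorLine(row, err, options = {}) {
  const entry = { message: err.message, row };
  if (options.includeStack) {
    entry.stack = err.stack;
  }
  return JSON.stringify(entry) + '\n';
}

// Hypothetical helper: open a gzip-compressed log file for writing.
// Everything written to the returned stream is compressed before it
// reaches the file on disk.
function openCompressedLog(path) {
  const gzip = zlib.createGzip();
  gzip.pipe(fs.createWriteStream(path));
  return gzip;
}

// Usage sketch (the file name and row fields are illustrative only):
// const log = openCompressedLog('conflate_oa.err.gz');
// log.write(formatErrorLine({ number: '', street: 'Main St' },
//                           new Error('invalid house number')));
// log.end();
```

Since the log lines are described as mostly redundant, they would likely compress well, but the actual saving would need to be measured on a real build.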

As somewhat of an aside, try/catch here incurs a fair cost, mainly due to V8 having to produce a stack trace for each Error that gets thrown.
It may be possible to 'return errors as values' here instead? Doing so would distinguish actual errors from validation issues, and would also improve performance by avoiding try/catch and the instantiation of Error instances.
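
A minimal sketch of what 'errors as values' might look like, assuming a per-row validation step. The `validateRow` and `processRows` functions, the `number`/`street` field names and the validation rules are invented for illustration and are not taken from the codebase.

```javascript
// Hypothetical validator: returns a result object instead of throwing,
// so no Error instance (and no stack trace) is created for bad rows.
function validateRow(row) {
  if (typeof row.street !== 'string' || row.street.trim() === '') {
    return { ok: false, reason: 'missing street' };
  }
  if (!/^\d+/.test(String(row.number))) {
    return { ok: false, reason: 'invalid house number' };
  }
  return { ok: true, value: row };
}

// Hypothetical caller: validation failures are collected as plain data,
// leaving try/catch free to handle genuinely unexpected errors.
function processRows(rows) {
  const accepted = [];
  const skipped = [];
  for (const row of rows) {
    const result = validateRow(row);
    if (result.ok) {
      accepted.push(result.value);
    } else {
      skipped.push({ reason: result.reason, row });
    }
  }
  return { accepted, skipped };
}
```

With this shape, only the short `reason` string and the row itself would need to be logged for a skipped record.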

@missinglink changed the title from "reduce verbosity of .err logs" to "reduce verbosity of .{err, skip} logs" on Nov 1, 2021
@missinglink changed the title from "reduce verbosity of .{err, skip} logs" to "reduce verbosity of *.{err, skip} logs" on Nov 1, 2021