feat: db connection analytics#13346
feat: db connection analytics#13346eschutho wants to merge 4 commits intoapache:db-connection-logsfrom
Conversation
e11c306 to
eabd614
Compare
There was a problem hiding this comment.
engine wasn't defined if database was None, so I put it under this block.
Codecov Report
@@ Coverage Diff @@
## db-connection-logs #13346 +/- ##
=====================================================
Coverage ? 54.61%
=====================================================
Files ? 478
Lines ? 15982
Branches ? 4123
=====================================================
Hits ? 8729
Misses ? 7253
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
betodealmeida
left a comment
There was a problem hiding this comment.
LGTM, my only concern is that in theory we might fail to rollback a transaction. I also left a comment on using log instead of log_context.
There was a problem hiding this comment.
This worries me a bit — if there's an error in the context manager we won't rollback the transaction.
I think it would be better to just log, instead of using a context manager:
try:
event_logger.log(action=f"db_connection_failed.{database.db_engine_spec.__name__}")
finally:
db.session.rollback()To ensure that rollback is always called.
There was a problem hiding this comment.
Maybe log should be pessimistic / guaranteed to never raise(?) so that we don't have to use caution every time we call it? That would mean wrapping the content of log in a try ?
There was a problem hiding this comment.
@betodealmeida and I were talking about this earlier, and most statements before the yield are pretty safe, but to be sure, I moved what I could below the yield statement. I don't think there's much risk of this failing now:
start_time = time.time()
payload_override = {}
There was a problem hiding this comment.
lmk if you think this works as a solution @betodealmeida @mistercrunch
There was a problem hiding this comment.
We might want to just use log in places like this. The main difference between log and log_context is that the former, in addition to being a context manager, logs the duration automatically, together with some dashboard/chart metadata that is irrelevant here.
There was a problem hiding this comment.
true... it also logs the user_id and referrer, so maybe it's still useful?
There was a problem hiding this comment.
We may want to either 1) refactor log to do some of the magic in log_context maybe add a bool flag enrich_with_request_context=False, or 2) can we just call the context manager with the with block? Would that just work? or 3) refactor log_context to NOT be be context manager and create a new one say log_context_manager (or better name) that's used only when we want duration / with block
There was a problem hiding this comment.
One problem is that adding the flag enrich_with_request_context would be a breaking change (our event logger, for example, would have to be updated).
I think the best way here is:
- Create a new method in
AbstractEventLoggercalledlog_with_context, that does all the job thatlog_contextdoes but is a function, not a context manager. The method callslogat the end, with all the extracted context. - Change
AbstractEventLogger.log_contextto uselog_with_contextfor the enrichment, removing the shared code between them. - In this PR, call the new
log_with_context. Since it delegates the actual logging tologit will work with all existing event loggers, and is not a breaking change..
Eventually I think we should have the event logger itself be a context manager by implementing __enter__ and __exit__ methods, so we could call:
with event_logger(action="foo"):
do_something()Since we can do this by adding methods to AbstractEventLogger it would also not be a breaking change.
There was a problem hiding this comment.
Quick clarification that enrich_with_request_context would be false by default to protect backwards compatibity. But maybe function composition is better than a parameter here, I'm ok with with either pattern.
There's a need to really clarify/improve the method in AbstractEventLogger with good/clear names here between functions we can call, content manager(s) and decorator(s)
There was a problem hiding this comment.
since we're merging this branch into @hughhhh's feature branch rather than master, @betodealmeida said that he could do the log_context refactor in a separate PR.
There was a problem hiding this comment.
I'm working on the refactor right now.
@mistercrunch you're right, I forgot that log takes a **kwargs that would swallow the new argument.
|
I feel like we may want to add Eventually I'd like to add the I think you can just pass as a keyword as in I made a related comment here: #13223 (comment) |
There was a problem hiding this comment.
@mistercrunch I added in the engine here as a kwarg. Also, just as an FYI, the strings that we're passing in these two files aren't exactly 1:1. Here we have to use the engine spec name, and in the test_connection file we pass the engine+driver combo. But the information is essentially the same.. any downstream integrations can still see the engine from the string that is passed in.
f6d8ea1 to
4fe1895
Compare
|
rerun tests |
fb68366 to
d4588f6
Compare
4538e2f to
bf0ff7b
Compare
bf0ff7b to
0ceb8f7
Compare
| except DAOCreateFailedError as ex: | ||
| logger.exception(ex.exception) | ||
| raise DatabaseCreateFailedError() | ||
| with event_logger.log_context(action=f"db_creation_failed.{ex.exception}"): |
There was a problem hiding this comment.
@hughhhh I saw this earlier today, that we're currently logging the entire exception. Can we change this on line 71 and 72 to log just the class? I originally thought since we were already logging it, it would be fine to pass to the event logger, but I think we should actually clean this up as well.
|
@hughhhh is taking this effort over in https://github.com/apache/superset/pull/13468/files |
SUMMARY
adds analytics logs to the db connection success/failures
TEST PLAN
unit tests
ADDITIONAL INFORMATION