-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle errors in execution #176
Conversation
mkaruza
commented
Sep 13, 2024
- Unify reporting of WARNING with log
- Throw exception in functions that are used inside duckdb execution
- Report WARNING for functions that are part of "normal" postgres execution
- Cleanup duckdb_state before reporting error in DuckDBNode (clean up)
bca95d8
to
57ff4fd
Compare
Should address #93 |
1c1109c
to
9dbbcf6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these efforts, I don't think #93 can be fully closed because it's something we have to be conscientiously aware of going forward.
Also I wonder if there are methods that are used both during duckdb execution and outside of it, which could be a problem if one or the other is assumed
I certainly don't want #93 to be open forever 😂 but we can maybe add something to CONTRIBUTING.md as a note as well. We should try to move towards issues being specific rather than generic "fix all the things" e.g. with either a specific reproducible test case or ideally pointing to something specific in the code. Of course this can be hard with issues like memory leaks. If there's still specific things you think we should add or change, I'd suggest we open new issue(s) for those. |
77c2698
to
236d928
Compare
std::optional<T> | ||
PostgresFunctionGuard(FuncType postgres_function, FuncArgs... args) { | ||
T return_value; | ||
bool error = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be marked volatile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it seems this is probably not necessary. I was thinking about the problem described here, but that does not apply since error
is not changed in the PG_TRY section: https://github.com/postgres/postgres/blob/master/src/include/utils/elog.h#L369-L374
236d928
to
2b614c8
Compare
} | ||
PG_CATCH(); | ||
{ | ||
error = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it would probably be better to throw an exception here already. That way we can preserve the original Postgres error message for much better debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure is it correct to throw from this. I would prefer that this function outputs error message but still caller needs to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean with "correct" in this case?
Do you mean is it allowed to throw an exception in PG_CATCH? In which case, I think it's fine, but indeed I'm also not 100% sure. But if you throw an exception after the PG_CATCH if error
is true
, that should definitely be fine.
Or do you mean is it what the caller should do? I think in at least 90% of the cases the caller will simply want to propagate the error message in the form of an exception. In the last 10% maybe they want more control, in which case returning the error message is maybe indeed what we would want. But I think we should make it as easy as possible to write correct code, so I think we should also have a helper that simply throws an exception. And honestly, maybe throwing an exception in all cases might be fine too, since a caller can always catch
it if it needs to do something special.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JelteF guard function will now throw InternalException with error message constructed from postgresql log
CONTRIBUTING.md
Outdated
* *Exception* should be used in code that is executed inside duckdb engine. | ||
* Use PostgreSQL *elog* API to report messages back to user. Using *ERROR* is strictly forbiden to use as it can brake execution and lead to | ||
unexpected memory problems - *ERROR* is only used in duckdb node to early bail out execution in case of query preparation error or execution interruption. | ||
* Calling PostgreSQL native functions needs to be used with *PostgresFunctionGuard*. *PostgresFunctionGuard* will handle correctly *ERROR* log messages that could be emmited from these functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* *Exception* should be used in code that is executed inside duckdb engine. | |
* Use PostgreSQL *elog* API to report messages back to user. Using *ERROR* is strictly forbiden to use as it can brake execution and lead to | |
unexpected memory problems - *ERROR* is only used in duckdb node to early bail out execution in case of query preparation error or execution interruption. | |
* Calling PostgreSQL native functions needs to be used with *PostgresFunctionGuard*. *PostgresFunctionGuard* will handle correctly *ERROR* log messages that could be emmited from these functions. | |
* There are two distinct parts of the code where error handling is done very differently: The code that executes before we enter DuckDB execution engine (e.g. initial part of the planner hook) and the part that gets executed inside the duckdb execution engine. Below are rules for how to handle errors in both parts of the code. Not following these guidelines can cause crashes, memory leaks and other unexpected problems. | |
* Before we enter the DuckDB exection engine no exceptions should ever be thrown here. In cases where you would want to throw an exception here, use `elog(ERROR, ...)`. Any C++ code that might throw an exception is also problematic. Since C++ throws exceptions on allocation failures, this covers lots of C++ APIs. So try to use Postgres datastructures instead of C++ ones whenever possible (e.g. use `List` instead of `Vec`) | |
* Inside the duckdb execution engine the opposite is true. `elog(ERROR, ...)` should never be used there, use exceptions instead. | |
* Use PostgreSQL *elog* API can be used to report non-fatal messages back to user. Using *ERROR* is strictly forbiden to use in code that is executed inside the duckdb engine as | |
* Calling PostgreSQL native functions from within DuckDB execution needs **extreme care**. Pretty much non of these functions are thread-safe, and they might throw errors using `elog(ERROR, ...)`. If you've solved the thread-safety issue by taking a lock (or by carefully asserting that the actual code is thread safe), then you can use *PostgresFunctionGuard* to solve the `elog(ERROR, ...) problem. *PostgresFunctionGuard* will correctly handle *ERROR* log messages that could be emmited from these functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed according to suggestion.
@@ -19,4 +23,44 @@ TokenizeString(char *str, const char delimiter) { | |||
return v; | |||
}; | |||
|
|||
template <typename T, typename FuncType, typename... FuncArgs> | |||
std::optional<T> | |||
PostgresFunctionGuard(FuncType postgres_function, FuncArgs... args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a similar function for the other way around, i.e. where we call DuckDB C++ APIs from Postgres C code.
src/pgduckdb_detoast.cpp
Outdated
toastrel = table_open(toast_pointer.va_toastrelid, AccessShareLock); | ||
table_relation_fetch_toast_slice(toastrel, toast_pointer.va_valueid, attrsize, 0, attrsize, result); | ||
table_close(toastrel, AccessShareLock); | ||
toast_rel = try_table_open(toast_pointer.va_toastrelid, AccessShareLock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try_table_open
might allocate memory an throw an ERROR
that way. So this needs a PostgresFunctionGuard
too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guarded.
src/pgduckdb_detoast.cpp
Outdated
error_fetch_toast = true; | ||
} | ||
|
||
table_close(toast_rel, AccessShareLock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even table_close
can throw some ERROR, see this quote from the PG source code where they consider memory context reset callbacks possibly throwing an error.
/*
* It's not entirely clear whether 'tis better to do this before or after
* delinking the context; but an error in a callback will likely result in
* leaking the whole context (if it's not a root context) if we do it
* after, so let's do it before.
*/
MemoryContextCallResetCallbacks(context);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guarded.
Totally agreed. But I think there's two more things we need to do at minimum:
I think neither necessarily needs to be part of this initial PR. I personally would prefer merging this PR as soon as we agree on nice to use Guard APIs. Because then we can use those Guard APIs for new code, without having to wait for the problems to be addressed in all existing code. Finally, this is an issue where it's easy to miss one place. So I definitely want to take another full pass myself over the codebase, after the initial push, to try and find some missed callsites before closing #93. |
As a side-note to this: It would be great if we could make it harder to make mistakes here. One idea I had was to not directly import duckdb C++ headers into postgres code. And just as well not directly import postgres headers into C++ code. But instead create dedicated files from where we create safe to use versions with the correct guards. That would make it easy to spot mistakes during review, because it's not expected that you import Edit: To be clear, please don't do this in this PR. This is more a refactoring that I think could makes sense once we have the Guard APIs to find a few of the missed function calls, and make sure we keep following our own guidelines for this in future PRs. |
* Unify reporting of WARNING with log * Throw exception in functions that are used inside duckdb execution * Report WARNING for functions that are part of "normal" postgres execution * Cleanup duckdb_state before reporting error in DuckDBNode (clean up)
* Added `PostgresFunctionGuard` and `PostgresVoidFunctionGuard` that handle call to Postgres function that can throw ERROR.
* Add few informations how we should handle errors inside code. * Use same function name `PostgresFunctionGuard` for both void and non-void PG function calls.
3ac5926
to
8dd72df
Compare
src/pgduckdb_utils.cpp
Outdated
PG_END_TRY(); | ||
// clang-format on | ||
if (error) { | ||
throw duckdb::InternalException("(PGDuckDB/PostgresFunctionGuard) %s", edata->message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to add this prefix here (and in the other guard), the error didn't really originate here, we are simply transforming it, so having a reference to the guard won't help you find where it originated. Especially if we'll also add conversion the other way around I can see many prefixes being added to a simple error message.
throw duckdb::InternalException("(PGDuckDB/PostgresFunctionGuard) %s", edata->message); | |
throw duckdb::InternalException("%s", edata->message); |
src/pgduckdb_utils.cpp
Outdated
PG_END_TRY(); | ||
// clang-format on | ||
if (error) { | ||
throw duckdb::InternalException("(PGDuckDB/PostgresFunctionGuard) %s", edata->message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think throwing an InternalException is acceptable for now, so that we can start using these guards in new code ASAP. But I think it would be great, if we would save not only the error message, but also the code/detail/context and maybe more in the extra_data
of the Exception
. Or probably easier: just store the whole edata
(which we'd then probably have to allocate in TopMemoryContext
).
That way we could convert such an Exception
back to the equivalent ereport
when surfacing the error to the user.
In the spirit of this PR, I am going to add some generic "ScopedPostgresResource" class so we can make sure never to leak postgres resources like |
src/pgduckdb_node.cpp
Outdated
// Delete the scan state | ||
CleanupDuckdbScanState(state); | ||
// Process the interrupt on the Postgres side | ||
ProcessInterrupts(); | ||
elog(ERROR, "Query cancelled"); | ||
elog(ERROR, "(PGDuckDB/ExecuteQuery) Query cancelled"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the prefix really adds anything useful here.
elog(ERROR, "(PGDuckDB/ExecuteQuery) Query cancelled"); | |
elog(ERROR, "Query cancelled"); |
Side-note: We should be throwing the expected error code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think this is pretty much in a merge-able state now.
I'm not sure I'm a fan of all the (PGDuckDB/ReplaceView)
type prefixes, but that might just be a matter of me not being used to them. I left two comments on where I think we shouldn't add them for sure though.
Apart from that: Tests are failing. Which should obviously be fixed before merging.
* Guard function throw `duckdb::Exception(EXECUTOR,...)` * Updated CONTRIBUTING.md according to review suggestion * Guarded other suggested pg functions
8dd72df
to
466d77e
Compare
@JelteF will merge this PR to have some foundation to start with, but i would imagine that there is going to be changes in future until we are satisfied how to handle guarding / error logging / error handling in project. |
Yeah, sounds like a good plan. I think this is a good foundation to build on top of. |