Skip to content

Add invoked functions to query completed event#18980

Merged
zacw7 merged 1 commit intoprestodb:masterfrom
zacw7:log-function-signatures
Feb 3, 2023
Merged

Add invoked functions to query completed event#18980
zacw7 merged 1 commit intoprestodb:masterfrom
zacw7:log-function-signatures

Conversation

@zacw7
Copy link
Member

@zacw7 zacw7 commented Jan 26, 2023

Add the names of the functions invoked by the query to the completed event when session property log_invoked_function_names_enabled or system config log-invoked-function-names-enabled is enabled.

Check com.facebook.presto.hive.TestHiveIntegrationSmokeTest#testInvokedFunctionNamesLog for details.

== RELEASE NOTES ==

General Changes
* Add the names of the functions invoked by the query to the completed event.
This can be enabled with the ``log_invoked_function_names_enabled`` session property or the ``log-invoked-function-names-enabled`` configuration property.

@zacw7 zacw7 force-pushed the log-function-signatures branch 2 times, most recently from 175c9e2 to b3533d1 Compare January 26, 2023 00:40
@zacw7 zacw7 changed the title Add used function signatures to query completed event Add invoked functions to query completed event Feb 1, 2023
@zacw7 zacw7 force-pushed the log-function-signatures branch 3 times, most recently from c8722f0 to 51fc072 Compare February 1, 2023 22:59
@zacw7 zacw7 marked this pull request as ready for review February 1, 2023 23:04
@zacw7 zacw7 requested a review from a team as a code owner February 1, 2023 23:04
@zacw7 zacw7 force-pushed the log-function-signatures branch 2 times, most recently from 56266eb to ab3919e Compare February 2, 2023 00:22
@zacw7 zacw7 requested a review from NikhilCollooru February 2, 2023 17:19
@NikhilCollooru
Copy link
Contributor

NikhilCollooru commented Feb 2, 2023

The changes looks good.
Do we want to limit the size of the the list of functions. like can there be a query with 1000 function names? ..if so it would blow up the size of the event object. wondering if it makes sense to limit the size. like may be we log only 20 function names or so ?

@zacw7
Copy link
Member Author

zacw7 commented Feb 2, 2023

The changes looks good. Do we want to limit the size of the the list of functions. like can there be a query with 1000 function names? ..if so it would blow up the size of the event object. wondering if it makes sense to limit the size. like may be we log only 20 function names or so ?

Thanks for reviewing the PR! I think it should fine.

The longest function I found so far is an internal UDF which has 44 characters, let's add the prefix presto.default and make it 59 characters long. Assuming 1000 such functions are used in one query, it still only takes 59K characters in total while another field runtime_stats can be as long as 790K (e.g. 20230202_133039_02539_hw725). I tried SHOW FUNCTIONS on a verifier and there were only 998 functions and a lot of them actually share the same name (with different parameters) and hence would be counted only once. So I think it's very unlikely for the size to be larger than even 20K.

That said, we can always have the feature tested on testing clusters first, populate some logs with this new added column and check if the size needs to be limited.

@NikhilCollooru
Copy link
Contributor

The changes looks good. Do we want to limit the size of the the list of functions. like can there be a query with 1000 function names? ..if so it would blow up the size of the event object. wondering if it makes sense to limit the size. like may be we log only 20 function names or so ?

Thanks for reviewing the PR! I think it should fine.

The longest function I found so far is an internal UDF which has 44 characters, let's add the prefix presto.default and make it 59 characters long. Assuming 1000 such functions are used in one query, it still only takes 59K characters in total while another field runtime_stats can be as long as 790K (e.g. 20230202_133039_02539_hw725). I tried SHOW FUNCTIONS on a verifier and there were only 998 functions and a lot of them actually share the same name (with different parameters) and hence would be counted only once. So I think it's very unlikely for the size to be larger than even 20K.

That said, we can always have the feature tested on testing clusters first, populate some logs with this new added column and check if the size needs to be limited.

Got it. sounds good.

@zacw7 zacw7 force-pushed the log-function-signatures branch from ab3919e to 2feac6b Compare February 2, 2023 22:04
@zacw7 zacw7 merged commit 2a9bbcb into prestodb:master Feb 3, 2023
@zacw7 zacw7 deleted the log-function-signatures branch February 3, 2023 00:39
@wanglinsong wanglinsong mentioned this pull request Feb 25, 2023
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants