Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Select more efficient Sorting Key in Tinybird data sources #8393

Closed
gnzjgo opened this issue Nov 7, 2024 · 2 comments
Closed

Select more efficient Sorting Key in Tinybird data sources #8393

gnzjgo opened this issue Nov 7, 2024 · 2 comments
Assignees

Comments

@gnzjgo
Copy link

gnzjgo commented Nov 7, 2024

Scope & Context

Change Materialized Views Sorting Keys to make API Endpoints more performant. Context here.

Current behavior

Example packages/twenty-tinybird/datasources/webhookEventMV.datasource has the following SK:

ENGINE_SORTING_KEY timestamp, workspaceId

And when consumed in packages/twenty-tinybird/pipes/getWebhookAnalytics.pipe it is always fiktered by webhookId, workspaceId, and timestamp

    FROM webhookEventMV
    WHERE
        true
        AND webhookId = {{ String(webhookId, '90f12aed-0276-4bea-bcaa-c21ea2763d7d', required=True) }}
        AND workspaceId
        ={{ String(workspaceId, '20202020-1c25-4d02-bf25-6aeccf7ea419', required=True) }}
        AND timestamp >= {{ DateTime(start, '2024-10-22 00:00:00') }}
        AND timestamp < {{ DateTime(end, '2024-10-23 00:00:00') }}

Expected behavior

Following best practices:

Some good rules of thumb for setting Sorting Keys:

  • Order matters: Data will be stored based on the Sort Key order.
  • Between 3 and 5 columns is good enough. More will probably penalize.
  • timestamp is often a bad candidate for being the first element of the SK.
  • If you have a multi-tenant app, customer_id is a great candidate for being the first element of the SK.

I'd go for this change:

- ENGINE_SORTING_KEY timestamp, workspaceId
+ ENGINE_SORTING_KEY workspaceId, webhookId, timestamp

PS: same applies for packages/twenty-tinybird/datasources/serverlessFunctionEventMV.datasource

Technical inputs

If using git integration, check this example.
If still prototyping and fine using UI and CLI in the workspace, go for this other one.

@FelixMalfait
Copy link
Member

Thanks a lot @gnzjgo!! cc @anamarn I think you might have covered that in your ongoing PR already?

@anamarn
Copy link
Contributor

anamarn commented Nov 12, 2024

Yes it's already been covered on PR #8253 :)

@anamarn anamarn closed this as completed Nov 12, 2024
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Product development ✅ Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

3 participants