Users sessions is very slow to process #58

Open
1 of 6 tasks
drphrozen opened this issue Jun 17, 2024 · 4 comments
Labels
status:needs_triage Needs maintainer triage. type:bug Bugs or weaknesses. The issue has to contain steps to reproduce.

Comments

@drphrozen

Describe the bug

In our dbt run of the unified package, the snowplow_unified_users_sessions_this_run model takes 29742s out of 31658s (~94% of the total run time).
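For anyone who wants to check the same ratio on their own run, here is a minimal sketch that ranks nodes by their share of the summed execution time. It assumes the standard `target/run_results.json` artifact that dbt writes after a run, with a `results` list whose entries carry `unique_id` and `execution_time`. The sample data below is hypothetical: it reuses the two figures above, the `unique_id` values are made up for illustration, and it treats 31658s as the sum of node times.

```python
import json


def load_run_results(path="target/run_results.json"):
    """Load the dbt run_results artifact from disk (path is an assumption)."""
    with open(path) as handle:
        return json.load(handle)


def slowest_models(run_results, top=5):
    """Return (unique_id, seconds, share_of_total) for the slowest nodes."""
    timings = [
        (result["unique_id"], float(result["execution_time"]))
        for result in run_results["results"]
    ]
    total = sum(seconds for _, seconds in timings)
    ranked = sorted(timings, key=lambda item: item[1], reverse=True)[:top]
    return [
        (unique_id, seconds, seconds / total if total else 0.0)
        for unique_id, seconds in ranked
    ]


# Hypothetical sample shaped like run_results.json, using the numbers above.
sample = {
    "results": [
        {
            "unique_id": "model.snowplow_unified.snowplow_unified_users_sessions_this_run",
            "execution_time": 29742.0,
        },
        # Placeholder node standing in for every other model in the run.
        {"unique_id": "model.example.everything_else", "execution_time": 1916.0},
    ]
}

for unique_id, seconds, share in slowest_models(sample):
    print(f"{unique_id}: {seconds:.0f}s ({share:.0%})")
```

On a real project, `slowest_models(load_run_results())` gives the same ranking for the most recent run.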

Steps to reproduce

In our case, add a lot of users and run the unified model.

Expected results

That the performance matches the web dbt package :)

Actual results

It does not.

Screenshots and log output

dbt-output.txt

System information

The contents of your packages.yml file:

# add dependencies. these will get pulled during the `dbt deps` process.
---
packages:
  # https://hub.getdbt.com/dbt-labs/dbt_utils/latest/
  - package: dbt-labs/dbt_utils
    version: [">=1.1.1", "<2.0.0"]

  - package: snowplow/snowplow_unified
    version: 0.4.0

  - package: snowplow/snowplow_ecommerce
    version: 0.8.2

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • databricks
  • other (specify: ____________)

The output of dbt --version:

1.8.2

The operating system you're using:
N/A

The output of python --version:
N/A

Additional context

In the dbt web package, it looks like the start/end times were calculated incrementally, whereas the unified package now appears to calculate them directly whenever they are needed.
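To illustrate the difference I mean, here is a minimal sketch in Python. It is not the SQL that either package runs; the function names and the `(user_id, timestamp)` event shape are assumptions made up for the example. It only shows why folding a new batch into stored bounds is cheaper than rescanning all history, while giving the same result.

```python
from datetime import datetime


def full_recompute(all_events):
    """Scan every event to derive each user's first and last timestamp."""
    bounds = {}
    for user_id, ts in all_events:
        start, end = bounds.get(user_id, (ts, ts))
        bounds[user_id] = (min(start, ts), max(end, ts))
    return bounds


def incremental_update(stored_bounds, new_events):
    """Fold only the new batch into previously stored bounds."""
    updated = dict(stored_bounds)
    for user_id, ts in new_events:
        start, end = updated.get(user_id, (ts, ts))
        updated[user_id] = (min(start, ts), max(end, ts))
    return updated


# Hypothetical events: (user_id, event timestamp).
history = [
    ("u1", datetime(2024, 6, 1, 9, 0)),
    ("u1", datetime(2024, 6, 1, 9, 30)),
    ("u2", datetime(2024, 6, 2, 12, 0)),
]
new_batch = [
    ("u1", datetime(2024, 6, 17, 8, 0)),
    ("u3", datetime(2024, 6, 17, 8, 5)),
]

stored = full_recompute(history)                 # done once, then persisted
incremental = incremental_update(stored, new_batch)  # touches only the new batch
direct = full_recompute(history + new_batch)     # rescans everything each run

print(incremental == direct)
```

The incremental path does work proportional to the size of the new batch, while the direct path grows with the full history, which would match the slowdown we see as the number of users grows.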

Are you interested in contributing towards the fix?

Yes, if feedback is needed

@drphrozen drphrozen added the type:bug Bugs or weaknesses. The issue has to contain steps to reproduce. label Jun 17, 2024
@github-actions github-actions bot added the status:needs_triage Needs maintainer triage. label Jun 17, 2024
@agnessnowplow
Collaborator

Thanks @drphrozen for raising this. We have just released v0.4.3, where we restructured the users_sessions_this_run model a bit in case it helps the query optimizer. If this doesn't help, it's best to raise a support ticket if you can, as it needs further investigation and may be related to your warehouse setup.

@drphrozen
Author

I deployed 0.4.3 yesterday and the run completed this morning. The issue remains, so I'll reach out to support and reference this issue.

@agnessnowplow
Collaborator

Thanks. For comparison, it would be great to know which version of the web package you used. Was it v1.0.1?

@drphrozen
Author

It's version 1.0.0
