Improve the Performance Characteristics of add_test_edges() #10950

Open
1 task done
peterallenwebb opened this issue Oct 30, 2024 · 0 comments

Housekeeping

  • I am a maintainer of dbt-core

Short description

The add_test_edges() function is called during the dbt build command. It inserts edges into the execution graph to ensure that models downstream of a node do not run until all the tests on that node have passed.

In certain projects the function is slow enough that it often shows up in performance profiles, and recent data from the field show that it inflates the number of edges in the graph by a factor of six. Memory consumption is the bigger problem: memory use is high enough to cause out-of-memory (OOM) crashes.
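To illustrate how this kind of edge insertion can inflate a graph, here is a toy sketch in plain Python. It is not the dbt-core implementation: the dict-of-sets graph, the helper names (`descendants`, `edge_count`, `add_test_edges_naive`), and the assumption that each test depends on exactly one model are all invented for this example.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node -> set of direct successors


def descendants(graph: Graph, start: str) -> Set[str]:
    """Return every node reachable from `start` in an acyclic graph."""
    seen: Set[str] = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def edge_count(graph: Graph) -> int:
    """Count the directed edges in the graph."""
    return sum(len(successors) for successors in graph.values())


def add_test_edges_naive(graph: Graph, tests: Dict[str, str]) -> None:
    """Add an edge from each test to every non-test descendant of its model.

    `tests` maps a test name to the single model it tests (toy assumption).
    """
    for test, model in tests.items():
        for node in descendants(graph, model):
            if node not in tests:
                graph.setdefault(test, set()).add(node)


# A chain of four models, each with one test attached.
graph: Graph = {
    "model_a": {"model_b", "test_a"},
    "model_b": {"model_c", "test_b"},
    "model_c": {"model_d", "test_c"},
    "model_d": {"test_d"},
}
tests = {
    "test_a": "model_a",
    "test_b": "model_b",
    "test_c": "model_c",
    "test_d": "model_d",
}

before = edge_count(graph)
add_test_edges_naive(graph, tests)
after = edge_count(graph)
print(before, after)  # 7 13
```

In this toy, the number of added edges grows with the number of (test, downstream model) pairs, so deep graphs with many tests pay the most. Whether the real function grows in the same way is something to confirm by profiling.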

Acceptance criteria

  1. If possible, implement a new version of this function that achieves the desired test-dependency behavior while inserting fewer edges and running more quickly.
  2. Add a new behavior flag which causes the new function to be used, while retaining the old function on the default code path.
  3. Follow up by gathering data about the relative performance of the two implementations and monitoring for regressions.
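The second criterion could look roughly like the sketch below: the legacy implementation stays on the default path, and the new one is chosen only when a flag is explicitly enabled. The flag name `use_optimized_test_edges`, the `select_edge_adder` helper, and the two stand-in functions are hypothetical and are not taken from dbt-core's actual behavior-flag machinery.

```python
from typing import Callable, Dict, Set

Graph = Dict[str, Set[str]]  # node -> set of direct successors
EdgeAdder = Callable[[Graph], None]

# Hypothetical flag name, chosen for this sketch only.
FLAG_NAME = "use_optimized_test_edges"


def legacy_add_test_edges(graph: Graph) -> None:
    """Stand-in for the existing implementation (body omitted)."""


def optimized_add_test_edges(graph: Graph) -> None:
    """Stand-in for the new implementation (body omitted)."""


def select_edge_adder(
    flags: Dict[str, bool], legacy: EdgeAdder, optimized: EdgeAdder
) -> EdgeAdder:
    """Return the optimized implementation only when the flag is explicitly on."""
    return optimized if flags.get(FLAG_NAME, False) else legacy


# With no flag set, the legacy function remains the default.
default_impl = select_edge_adder({}, legacy_add_test_edges, optimized_add_test_edges)
opted_in_impl = select_edge_adder(
    {FLAG_NAME: True}, legacy_add_test_edges, optimized_add_test_edges
)
```

Keeping both functions behind a single selection point also makes the third criterion easier, since the two implementations can be timed against the same graph.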

Suggested Tests

Existing tests should suffice, but we should add additional tests to reduce the risks associated with the new implementation.
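One possible shape for such a test is a check that does not depend on how edges were inserted: after edges are added, every model downstream of a tested node should be reachable from each test on that node. The sketch below uses a toy dict-of-sets graph; `reachable` and `missing_gates` are hypothetical helper names, and it assumes each test depends on a single model.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, Set[str]]  # node -> set of direct successors


def reachable(graph: Graph, start: str) -> Set[str]:
    """Return every node reachable from `start` in an acyclic graph."""
    seen: Set[str] = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def missing_gates(
    original: Graph, augmented: Graph, tests: Dict[str, str]
) -> Set[Tuple[str, str]]:
    """Return (test, node) pairs where `node` is downstream of the tested
    model in `original` but is not forced to wait for `test` in `augmented`."""
    missing: Set[Tuple[str, str]] = set()
    for test, model in tests.items():
        downstream = reachable(original, model) - set(tests)
        gated = reachable(augmented, test)
        missing |= {(test, node) for node in downstream - gated}
    return missing


original: Graph = {
    "model_a": {"model_b", "test_a"},
    "model_b": {"model_c"},
}
tests = {"test_a": "model_a"}

# Candidate output: a single edge test_a -> model_b, which also gates
# model_c through the existing model_b -> model_c edge.
augmented: Graph = {node: set(succ) for node, succ in original.items()}
augmented.setdefault("test_a", set()).add("model_b")

ok = missing_gates(original, augmented, tests)
broken = missing_gates(original, original, tests)
```

Here `ok` is empty, while `broken` lists both downstream models, because the unmodified graph has no edges leaving `test_a`. A fuller test would also assert that the augmented graph remains acyclic.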

Impact to Other Teams

None.

Will backports be required?

No.

Context

No response

@peterallenwebb added the user docs, triage, and performance labels and removed the user docs label on Oct 30, 2024
@peterallenwebb peterallenwebb self-assigned this Oct 30, 2024