Make dbt CLI cold start load up to 8% faster with lazy loading of task modules and agate #9744
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9744      +/-   ##
==========================================
- Coverage   88.18%   88.11%   -0.08%
==========================================
  Files         178      178
  Lines       22480    22487       +7
==========================================
- Hits        19825    19814      -11
- Misses       2655     2673      +18
It looks like this PR needs a I also don't know what to do about the coverage requirement? Mypy covers the
@dwreeves Thank you for making dbt better! We hope to have even more improvements to start-up time coming your way soon.
Thanks @MichelleArk and @peterallenwebb! The associated
The dbt CLI takes a while to load, especially before `.pyc` files are compiled. This PR speeds up start-up by making sure a handful of resources only get loaded when they are needed.
In isolation, this PR speeds things up by about 5%. The additional 3% requires the following change to `dbt-adapters`: dbt-labs/dbt-adapters#126
Lazy loading task modules
The first change was already discussed, but issue #4627 was closed without ever being fully resolved:
Implementing this speeds up the CLI by about 5% on a cold start.
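To illustrate the idea (this is a minimal sketch, not the actual diff in this PR), lazy loading here means moving an import from the top of the CLI module into the body of the command that needs it. The command names below are hypothetical, and the standard-library `colorsys` module stands in for a heavy dbt task module so that the example runs anywhere:

```python
import sys


def help_command() -> str:
    # A cheap command: it needs no task module, so it triggers no import.
    return "usage: tool [colors|help]"


def colors_command() -> tuple:
    # Deferred import: the module is loaded the first time this command
    # runs, instead of when the CLI module itself is imported.
    # `colorsys` is a stand-in (assumption) for a dbt task module.
    import colorsys

    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


def is_loaded(module_name: str) -> bool:
    # Helper for checking whether a module has been imported so far.
    return module_name in sys.modules
```

With this layout, invoking `help_command()` never pays the import cost of the stand-in module; only `colors_command()` does, and Python caches the module in `sys.modules` after the first call.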
Use `TYPE_CHECKING` to import `agate`
When I run `PYTHONPROFILEIMPORTTIME=1 dbt --help` to see where some additional free speed-ups may be, `agate` sticks out as taking a while to load; specifically, it takes up about 3.5% of the total load time for the command `dbt --help`. `agate` appears to be used in only three contexts across dbt (correct me if I am wrong): dbt unit tests, dbt docs generate, and dbt seeds.
Most `agate` imports are actually just for type annotations, meaning invocations such as `dbt run --select foo` that do not require `agate` at all still end up loading it.
Important note: Gating `import agate` behind `if TYPE_CHECKING:` also requires making this change in the `dbt-adapters` library. Once both `dbt-core` and `dbt-adapters` use `if TYPE_CHECKING` to import agate in those contexts, this change should shave another 3.5% off load times, and the speed improvement will affect not just `dbt --help` but all dbt command invocations that don't require `agate`.
Before and after
I use the following script to test speeds before and after the changes. Note that this only tests the lazy loading of modules, not `if TYPE_CHECKING: import agate`, for the reasons stated above.
In my testing, the before and after difference is about 5%.
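For reference, the `if TYPE_CHECKING: import agate` gating that this measurement leaves out can be sketched roughly as follows. This is an illustrative sketch rather than the code in this PR or in `dbt-adapters`: the function names are hypothetical, and the annotation is written as a string so that `agate` is never imported just to define the function.

```python
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:
    # Executed only by static type checkers such as mypy; at runtime
    # TYPE_CHECKING is False, so agate is not imported here.
    import agate


def describe_table(table: "agate.Table") -> List[str]:
    # Hypothetical helper: uses agate only in the annotation, so it
    # works at runtime without the module having been loaded.
    return list(table.column_names)


def load_seed(path: str) -> "agate.Table":
    # Hypothetical helper: code that really needs agate at runtime
    # imports it locally, so only this code path pays the import cost.
    import agate

    return agate.Table.from_csv(path)
```

Commands that only pass tables through annotated signatures then never trigger the import, while the few code paths that construct tables still load `agate` on demand.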
If we compare the before and after outputs of the `PYTHONPROFILEIMPORTTIME=1` diagnostics, we can see which modules are no longer loaded and therefore account for the 5% speed-up:
Output: