Cache package-lock.yml file #1086

Merged
pankajastro merged 9 commits into main from cache-lockfile
Aug 8, 2024

Conversation

Contributor

@pankajastro pankajastro commented Jul 10, 2024

Description

This PR caches the package-lock.yml file in cache_dir/dbt_project.

Since dbt version 1.7.0, executing the dbt deps command results in the generation of a package-lock.yml file. This file pins the dependencies and their versions for the dbt project. dbt uses this file to install packages, ensuring predictable and consistent package installations across environments.

  • This feature is enabled only if the user checks package-lock.yml in to their dbt project. Also, I'm assuming that if package-lock.yml is present, their dbt-core version is >= 1.7.0, since the lock file is only available in dbt >= 1.7.0.
  • package-lock.yml also contains the sha1_hash of the packages. This PR uses that hash to check whether the cached package-lock.yml is outdated.
  • The cached package-lock.yml is finally copied from the cache path to the tmp project and used.
  • To update dependencies or versions, the user is expected to update package-lock.yml in their dbt project manually using the dbt deps command.
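
The flow in the bullets above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names (read_sha1_hash, is_cache_outdated, sync_lockfile) are hypothetical, and it assumes sha1_hash is a top-level key in package-lock.yml so that a simple regex can read it without a YAML parser.

```python
from __future__ import annotations

import re
import shutil
from pathlib import Path

LOCKFILE_NAME = "package-lock.yml"
# Assumption: sha1_hash is a top-level "sha1_hash: <value>" line in the lock file.
_SHA1_LINE = re.compile(r"^sha1_hash:\s*(\S+)\s*$", re.MULTILINE)


def read_sha1_hash(lockfile: Path) -> str | None:
    """Return the sha1_hash recorded in a lock file, or None if unavailable."""
    if not lockfile.is_file():
        return None
    match = _SHA1_LINE.search(lockfile.read_text())
    return match.group(1) if match else None


def is_cache_outdated(project_dir: Path, cache_dir: Path) -> bool:
    """The cache is stale if it is missing or its hash differs from the project's."""
    project_hash = read_sha1_hash(project_dir / LOCKFILE_NAME)
    cached_hash = read_sha1_hash(cache_dir / LOCKFILE_NAME)
    return cached_hash is None or cached_hash != project_hash


def sync_lockfile(project_dir: Path, cache_dir: Path, tmp_project_dir: Path) -> bool:
    """Refresh the cached lock file if needed, then copy it into the tmp project.

    Returns False (and does nothing) when the project has no checked-in lock
    file, mirroring the "enabled only if checked in" rule above.
    """
    source = project_dir / LOCKFILE_NAME
    if not source.is_file():
        return False

    cached = cache_dir / LOCKFILE_NAME
    if is_cache_outdated(project_dir, cache_dir):
        cache_dir.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, cached)

    tmp_project_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(cached, tmp_project_dir / LOCKFILE_NAME)
    return True
```

Comparing hashes rather than file modification times keeps the check independent of how the project files were deployed; the actual implementation in cosmos/cache.py may differ.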

Related Issue(s)

closes: #930

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works


netlify Bot commented Jul 10, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit 4559a42
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/668e0c141d9e710008156b88


netlify Bot commented Jul 10, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit af2a4ea
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/66b5296ae850850008045096

Comment thread cosmos/cache.py Outdated

codecov Bot commented Jul 11, 2024

Codecov Report

Attention: Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.53%. Comparing base (711bb7c) to head (af2a4ea).

Files Patch % Lines
cosmos/cache.py 97.22% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1086      +/-   ##
==========================================
+ Coverage   96.51%   96.53%   +0.02%     
==========================================
  Files          64       64              
  Lines        3325     3374      +49     
==========================================
+ Hits         3209     3257      +48     
- Misses        116      117       +1     


@pankajastro pankajastro changed the title from "WIP: Cache package-lock.yml file" to "Cache package-lock.yml file" Jul 14, 2024
@pankajastro pankajastro marked this pull request as ready for review July 15, 2024 06:37
@dosubot dosubot Bot added the size:L, area:dependencies, and dbt:deps labels Jul 15, 2024
@tatiana tatiana added this to the Cosmos 1.6.0 milestone Jul 18, 2024
Contributor

@pankajkoti pankajkoti left a comment

LGTM mostly, have a couple of questions inline

Comment thread cosmos/cache.py
Comment thread cosmos/cache.py Outdated
Comment thread cosmos/cache.py Outdated
@dosubot dosubot Bot added the lgtm label Aug 5, 2024
Comment thread cosmos/cache.py Outdated
Comment thread cosmos/cache.py
Contributor

@pankajkoti pankajkoti left a comment

LGTM, one minor suggestion inline

Comment thread cosmos/dbt/graph.py Outdated
@pankajastro pankajastro merged commit e847f19 into main Aug 8, 2024
@pankajastro pankajastro deleted the cache-lockfile branch August 8, 2024 22:53
tatiana pushed a commit that referenced this pull request Aug 14, 2024
@pankajkoti pankajkoti mentioned this pull request Aug 16, 2024
pankajkoti added a commit that referenced this pull request Aug 20, 2024
New Features

* Add support for loading manifest from cloud stores using Airflow
Object Storage by @pankajkoti in #1109
* Cache ``package-lock.yml`` file by @pankajastro in #1086
* Support persisting the ``LoadMode.VIRTUALENV`` directory by @tatiana
in #1079
* Add support to store and fetch ``dbt ls`` cache in remote stores by
@pankajkoti in #1147
* Add default source nodes rendering by @arojasb3 in #1107
* Add Teradata ``ProfileMapping`` by @sc250072 in #1077

Enhancements

* Add ``DatabricksOauthProfileMapping`` profile by @CorsettiS in #1091
* Use ``dbt ls`` as the default parser when ``profile_config`` is
provided by @pankajastro in #1101
* Add task owner to dbt operators by @wornjs in #1082
* Extend Cosmos custom selector to support + when using paths and tags
by @mvictoria in #1150
* Simplify logging by @dwreeves in #1108

Bug fixes

* Fix Teradata ``ProfileMapping`` target invalid issue by @sc250072 in
#1088
* Fix empty tag in case of custom parser by @pankajastro in #1100
* Fix ``dbt deps`` of ``LoadMode.DBT_LS`` should use
``ProjectConfig.dbt_vars`` by @tatiana in #1114
* Fix import handling by lazy loading hooks introduced in PR #1109 by
@dwreeves in #1132
* Fix Airflow 2.10 regression and add Airflow 2.10 in test matrix by
@pankajastro in #1162

Docs

* Fix typo in azure-container-instance docs by @pankajastro in #1106
* Use Airflow trademark as it has been registered by @pankajastro in
#1105

Others

* Run some example DAGs in Kubernetes execution mode in CI by
@pankajastro in #1127
* Install requirements.txt by default during dev env spin up by
@CorsettiS in #1099
* Remove ``DbtGraph.current_version`` dead code by @tatiana in #1111
* Disable test for Airflow-2.5 and Python-3.11 combination in CI by
@pankajastro in #1124
* Pre-commit hook updates in #1074, #1113, #1125, #1144, #1154, #1167

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

Labels

  • area:dependencies: Related to dependencies, like Python packages, library versions, etc
  • dbt:deps: Primarily related to dbt deps command or functionality
  • lgtm: This PR has been approved by a maintainer
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Cache dbt deps lock file

3 participants