3.0.0

@Ekrekr released this 21 Jun 11:10
0a9cc9b

TL;DR of What's Changed Since 2.9.0

dataform.json -> workflow_settings.yaml

workflow_settings.yaml has been introduced and will replace dataform.json in a later version. No immediate action is required: dataform.json files are still valid in projects using Dataform Core 3.0.0.

dataform.json is being deprecated in favor of workflow_settings.yaml. This means that:

  • Workflow settings are now strictly typed, in Protobuf format.
  • The Dataform Core version can be specified directly in the workflow_settings.yaml file. Note: to have more than just @dataform/core as a dependency, a package.json must still be used.

Example workflow_settings.yaml:

defaultProject: dataform-demos
defaultLocation: us
defaultDataset: dataform
defaultAssertionDataset: dataform_assertions
dataformCoreVersion: 3.0.0
vars:
    environmentName: "development"

The above is equivalent to the dataform.json file:

{
  "warehouse": "bigquery",
  "defaultDatabase": "dataform-demos",
  "defaultLocation": "us",
  "defaultSchema": "dataform",
  "assertionSchema": "dataform_assertions",
  "vars": {
    "environmentName": "development"
  }
}
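
The key renames implied by the two files above can be written down as a small script. The following is a minimal Python sketch, not an official migration tool: the helper name `convert_settings` and the rename table are assumptions inferred only from the example pair above, so any key that does not appear in that example is rejected rather than guessed at. The Dataform Core version line has no counterpart in dataform.json and is left to be added by hand.

```python
import json

# Key renames inferred from the example above; not an exhaustive or official list.
KEY_RENAMES = {
    "defaultDatabase": "defaultProject",
    "defaultSchema": "defaultDataset",
    "assertionSchema": "defaultAssertionDataset",
    "defaultLocation": "defaultLocation",
    "vars": "vars",
}

# Keys in the example dataform.json with no counterpart in the YAML example.
DROPPED_KEYS = {"warehouse"}


def convert_settings(dataform_json: dict) -> dict:
    """Hypothetical helper: rename dataform.json keys to workflow_settings.yaml keys."""
    unknown = set(dataform_json) - set(KEY_RENAMES) - DROPPED_KEYS
    if unknown:
        # The sketch only knows the keys shown in the example above.
        raise ValueError(f"no known mapping for: {sorted(unknown)}")
    return {
        KEY_RENAMES[key]: value
        for key, value in dataform_json.items()
        if key in KEY_RENAMES
    }


example = json.loads("""
{
  "warehouse": "bigquery",
  "defaultDatabase": "dataform-demos",
  "defaultLocation": "us",
  "defaultSchema": "dataform",
  "assertionSchema": "dataform_assertions",
  "vars": {"environmentName": "development"}
}
""")

converted = convert_settings(example)
```

Here `converted` holds the renamed settings as a plain dictionary. Writing it out as YAML is left to whichever YAML library the project already uses.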

Notebook Actions and actions.yaml

Notebooks as Dataform actions are on their way - but not quite yet! They're part of the compiled graph, and soon they'll be executable.

To support this, a new way of configuring actions through an actions.yaml file has been implemented.

An example of loading a notebook in Dataform can be seen at https://github.com/dataform-co/dataform/tree/main/examples/extreme_weather_programming.

Stateless Package Installation by @dataform/cli

Package installation by @dataform/cli is now stateless! The CLI will install NPM packages during compilation if the Dataform Core version is defined in the workflow_settings.yaml file.

This means no node_modules folder needs to appear in the project, and Dataform users no longer need to be familiar with NPM.

Compilation Output is Now Warehouse Agnostic

Previously, compilation by @dataform/core would insert warehouse-specific SQL into the compiled graph. Where possible, this has been removed, transferring the responsibility for inserting warehouse-specific SQL to whichever execution engine is running Dataform.

Additionally, support for non-BigQuery warehouses has been dropped. We're in discussions with Datashell for them to provide a warehouse-agnostic CLI execution engine based on Dataform compiled graphs. In the meantime, if you need support for a non-BigQuery warehouse, please continue using the latest 2.x.x version!

dependOnDependencyAssertions

An easier way to add the assertions of a dependency as dependencies has been introduced.

dependOnDependencyAssertions in config blocks can be used to add assertions from all dependencies of the action as dependencies.

config { 
    type: "view",
    dependOnDependencyAssertions: true,
    dependencies: ["some_table"]
}

select test from ${ref("some_other_table")}

Additionally, the includeDependentAssertions parameter can be set on individual dependencies, either in config.dependencies or in ref(), to add the assertions of just those dependencies as dependencies of the current action.

config { 
    type: "view",
    dependencies: [{name: "some_table", includeDependentAssertions: true}]
}

select test from ${ref({name: "some_other_table", includeDependentAssertions: true})}
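
The effect of both options can be modelled as a small graph computation. The sketch below is an illustrative Python model, not Dataform's implementation: the function `effective_dependencies`, the dictionary shapes, and the assertion names are all hypothetical, and dependencies declared through ref() are folded into the same list as config.dependencies for simplicity.

```python
def effective_dependencies(action: dict, assertions_by_target: dict) -> list:
    """Return the action's dependencies plus any assertions pulled in by the flags."""
    include_all = action.get("dependOnDependencyAssertions", False)
    resolved = []
    for dep in action["dependencies"]:
        # A dependency is either a plain name or a dict carrying a per-dependency flag.
        if isinstance(dep, dict):
            name = dep["name"]
            include_this = dep.get("includeDependentAssertions", False)
        else:
            name = dep
            include_this = False
        resolved.append(name)
        if include_all or include_this:
            # Add every assertion that targets this dependency.
            resolved.extend(assertions_by_target.get(name, []))
    return resolved


# Hypothetical assertions defined on each table.
assertions_by_target = {
    "some_table": ["some_table_assertion"],
    "some_other_table": ["some_other_table_assertion"],
}

# Action-level flag: assertions of every dependency are added.
all_flagged = effective_dependencies(
    {
        "dependOnDependencyAssertions": True,
        "dependencies": ["some_table", "some_other_table"],
    },
    assertions_by_target,
)

# Per-dependency flag: only the flagged dependency contributes its assertions.
one_flagged = effective_dependencies(
    {
        "dependencies": [
            {"name": "some_table", "includeDependentAssertions": True},
            "some_other_table",
        ],
    },
    assertions_by_target,
)
```

In this model, `all_flagged` contains both tables and both assertions, while `one_flagged` contains both tables but only the assertion on some_table.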

Full Changelog from 2.9.0: 2.9.0...3.0.0