Create an Expectation Suite #160

Open
jwestw opened this issue Jan 8, 2025 · 1 comment

Comments

@jwestw
Contributor

jwestw commented Jan 8, 2025

An Expectation Suite in Great Expectations is a collection of Expectations that apply to a specific dataset.

You can create a new suite programmatically. The name from your TOML's [data_asset] table can be used as the Expectation Suite name.
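As a minimal sketch of the naming step: the snippet below reads the suite name from a TOML string. It assumes the `[data_asset]` table stores the name under a key called `name`; that key, the sample TOML, and the helper `suite_name_from_toml` are all hypothetical and not taken from the project's actual config.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: the tomli backport has the same API
    import tomli as tomllib

# Hypothetical config; only the [data_asset] table name comes from the issue.
EXAMPLE_TOML = """
[data_asset]
name = "example_survey_results"
"""


def suite_name_from_toml(toml_text: str) -> str:
    """Return the value of [data_asset].name for use as the suite name."""
    config = tomllib.loads(toml_text)
    try:
        return config["data_asset"]["name"]
    except KeyError as err:
        raise KeyError("expected a 'name' key in the [data_asset] table") from err


suite_name = suite_name_from_toml(EXAMPLE_TOML)
print(suite_name)
```

The returned string can then be passed to whichever Great Expectations call creates the suite in the version you are using.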

@james-westwood
Collaborator

james-westwood commented Jan 15, 2025

The task is to build a function or method that constructs a Great Expectations Expectation Suite from its inputs, which are the dictionaries returned by the generator functions.

Each of the Expectation chooser functions will return a dictionary along these lines:

{
  "meta": {},
  "expectation_type": "example_expectation_name",
  "kwargs": {
    "column": "example_column_name"
  }
}
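For illustration, a chooser function that produces a dictionary of this shape might look like the sketch below. The names `choose_not_null` and `is_expectation_dict` are hypothetical; they are not part of this project or of Great Expectations.

```python
from typing import Any

REQUIRED_KEYS = {"meta", "expectation_type", "kwargs"}


def choose_not_null(column: str) -> dict[str, Any]:
    """Hypothetical chooser: describe a not-null check for one column."""
    return {
        "meta": {},
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": column},
    }


def is_expectation_dict(candidate: Any) -> bool:
    """Check that a chooser's output has the three expected keys and types."""
    return (
        isinstance(candidate, dict)
        and REQUIRED_KEYS.issubset(candidate)
        and isinstance(candidate["expectation_type"], str)
        and isinstance(candidate["kwargs"], dict)
        and isinstance(candidate["meta"], dict)
    )


expectation = choose_not_null("reference")
print(is_expectation_dict(expectation))
```

A shape check like this is cheap to run on every chooser output before anything is handed to the suite builder.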

You can add an expectation to the suite using:

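# 0.x API; ExpectationConfiguration is imported from great_expectations.core.expectation_configuration
suite.add_expectation(ExpectationConfiguration(expectation_type="expect_column_values_to_not_be_null", kwargs={"column": column}))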

This adds the "expect_column_values_to_not_be_null" expectation to the Great Expectations suite object, modifying the suite in place. The change exists only in memory until you explicitly save the suite with:

context.save_expectation_suite(suite, "my_expectation_suite")

Saving persists the suite, including the newly added expectation, to a file, typically in JSON format.

Ultimately, a saved Expectation Suite will look like this:

{
  "expectation_suite_name": "example_survey_results_expectations",
  "ge_cloud_id": null,
  "expectations": [
    {
      "meta": {},
      "expectation_type": "expect_table_columns_to_match_ordered_list",
      "kwargs": {
        "column_list": ["reference", "period", "survey", "date_col_example_1", "date_col_example_2", "integer_col_example"]
      }
    },
    {
      "meta": {},
      "expectation_type": "expect_column_values_to_not_be_null",
      "kwargs": {
        "column": "reference"
      }
    },
    {
      "meta": {},
      "expectation_type": "expect_column_value_lengths_to_be_between",
      "kwargs": {
        "column": "reference",
        "min_value": 1
      }
    },
    {
      "meta": {},
      "expectation_type": "expect_column_values_to_match_regex",
      "kwargs": {
        "column": "reference",
        "regex": "^\\d{6}$"
      }
    },
    {
      "meta": {},
      "expectation_type": "expect_column_values_to_be_unique",
      "kwargs": {
        "column": "reference"
      }
    }
  ],
  "data_asset_type": "Dataset",
  "meta": {}
}
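One possible shape for the requested builder is sketched below in plain Python, so it runs without Great Expectations installed. It assembles the JSON structure shown above from chooser dictionaries. The name `build_suite_dict` is hypothetical, and the fixed `ge_cloud_id` and `data_asset_type` values are simply copied from the example. A real implementation would more likely hand each dictionary to Great Expectations than write the JSON by hand.

```python
import json
from typing import Any, Iterable


def build_suite_dict(
    suite_name: str, expectation_dicts: Iterable[dict[str, Any]]
) -> dict[str, Any]:
    """Assemble a suite-shaped dictionary from chooser outputs."""
    expectations = []
    for number, item in enumerate(expectation_dicts, start=1):
        missing = {"expectation_type", "kwargs"} - set(item)
        if missing:
            raise ValueError(f"expectation {number} is missing {sorted(missing)}")
        expectations.append(
            {
                # Copy nested dicts so later edits to the inputs do not leak in.
                "meta": dict(item.get("meta", {})),
                "expectation_type": item["expectation_type"],
                "kwargs": dict(item["kwargs"]),
            }
        )
    return {
        "expectation_suite_name": suite_name,
        "ge_cloud_id": None,  # copied from the example above
        "expectations": expectations,
        "data_asset_type": "Dataset",  # copied from the example above
        "meta": {},
    }


chooser_outputs = [
    {
        "meta": {},
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": "reference"},
    },
    {
        "meta": {},
        "expectation_type": "expect_column_values_to_be_unique",
        "kwargs": {"column": "reference"},
    },
]

suite = build_suite_dict("example_survey_results_expectations", chooser_outputs)
print(json.dumps(suite, indent=2))
```

The order of the input dictionaries is preserved, and a malformed chooser output fails early with a `ValueError` rather than producing an incomplete suite.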
