feat: allow to set clustering and time partitioning options at table creation (#928)

* refactor: standardize bigquery options handling to manage more options
* feat: handle table partitioning, table clustering and more table options (expiration_timestamp, require_partition_filter, default_rounding_mode) via create_table dialect options
* fix: having clustering fields and partitioning exposed as table indexes leads to bad autogenerated version file

      def upgrade() -> None:
          # ### commands auto generated by Alembic - please adjust! ###
          op.drop_index('clustering', table_name='dataset.some_table')
          op.drop_index('partition', table_name='dataset.some_table')
          # ### end Alembic commands ###

      def downgrade() -> None:
          # ### commands auto generated by Alembic - please adjust! ###
          op.create_index('partition', 'dataset.some_table', ['createdAt'], unique=False)
          op.create_index('clustering', 'dataset.some_table', ['id', 'createdAt'], unique=False)
          # ### end Alembic commands ###

* docs: update README to describe how to create clustered and partitioned table as well as other newly supported table options
* test: adjust system tests since indexes are no longer populated from table partitions and clustering info
* test: alembic now supports creating partitioned tables
* test: run integration tests with all the new create_table options
* chore: rename variables to represent what it is a bit more clearly
* fix: assertions should not be used to validate user inputs
* refactor: extract process_option_value() from post_create_table() for improved readability
* docs: add docstring to post_create_table() and _process_option_value()
* test: increase code coverage by testing error cases
* refactor: better represent the distinction between the option value data type check and the transformation in SQL literal
* test: adding test cases for _validate_option_value_type() and _process_option_value()
* chore: coding style
* chore: reformat files with black
* test: typo in tests
* feat: change the option name for partitioning to leverage the TimePartitioning interface of the Python Client for Google BigQuery
* fix: TimePartitioning.field is optional
* chore: coding style
* test: fix system test with table option bigquery_require_partition_filter
* feat: add support for experimental range_partitioning option
* test: fix system test with new bigquery_time_partitioning table option
* docs: update README with time_partitioning and range_partitioning
* test: relevant comments in unit tests
* test: cover all error cases
* chore: no magic numbers
* chore: consistency in docstrings
* chore: no magic number
* chore: better error types
* chore: fix W605 invalid escape sequence

googleapis · Jan 10, 2024 · c2c2958
1 parent ac74a34
Showing 7 changed files with 799 additions and 67 deletions.
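
Note on the option handling mentioned in the commit message: the message separates checking an option value's data type from turning it into a SQL literal. The sketch below illustrates that idea with the standard library only. It is not the dialect's code; `render_option_literal` and `render_options_clause` are hypothetical names, and the set of accepted types is an assumption made for illustration.

```python
import datetime


def render_option_literal(value):
    """Render a Python value as a BigQuery-style SQL literal (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Escape backslashes first, then single quotes.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    if isinstance(value, datetime.datetime):
        return f"TIMESTAMP '{value.isoformat(sep=' ')}'"
    # Raise a real exception instead of using assert for user input.
    raise TypeError(f"unsupported option value type: {type(value).__name__}")


def render_options_clause(options):
    """Join name=literal pairs into an OPTIONS(...) clause (hypothetical helper)."""
    rendered = ", ".join(
        f"{name}={render_option_literal(value)}" for name, value in options.items()
    )
    return f"OPTIONS({rendered})"


clause = render_options_clause(
    {
        "description": "my table description",
        "require_partition_filter": True,
        "expiration_timestamp": datetime.datetime.fromisoformat(
            "2038-01-01T00:00:00+00:00"
        ),
    }
)
print(clause)
```

Raising `TypeError` rather than asserting mirrors the "assertions should not be used to validate user inputs" item above; the exception types the dialect actually raises may differ.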
diff --git a/README.rst b/README.rst
@@ -292,14 +292,65 @@ To add metadata to a table:
 
 .. code-block:: python
 
-    table = Table('mytable', ..., bigquery_description='my table description', bigquery_friendly_name='my table friendly name')
+    table = Table('mytable', ...,
+        bigquery_description='my table description',
+        bigquery_friendly_name='my table friendly name',
+        bigquery_default_rounding_mode="ROUND_HALF_EVEN",
+        bigquery_expiration_timestamp=datetime.datetime.fromisoformat("2038-01-01T00:00:00+00:00"),
+    )
 
 To add metadata to a column:
 
 .. code-block:: python
 
     Column('mycolumn', doc='my column description')
 
+To create a clustered table:
+
+.. code-block:: python
+
+    table = Table('mytable', ..., bigquery_clustering_fields=["a", "b", "c"])
+
+To create a time-unit column-partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(
+            field="mytimestamp",
+            type_="MONTH",
+            expiration_ms=1000 * 60 * 60 * 24 * 30 * 6,  # 6 months of 30 days (180 days)
+        ),
+        bigquery_require_partition_filter=True,
+    )
+
+To create an ingestion-time partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(),
+        bigquery_require_partition_filter=True,
+    )
+
+To create an integer-range partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_range_partitioning=bigquery.RangePartitioning(
+            field="zipcode",
+            range_=bigquery.PartitionRange(start=0, end=100000, interval=10),
+        ),
+        bigquery_require_partition_filter=True,
+    )
+
 
 Threading and Multiprocessing
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^