58 changes: 18 additions & 40 deletions README.rst
@@ -6,15 +6,25 @@ BigQuery DataFrames (BigFrames)
|GA| |pypi| |versions|

BigQuery DataFrames (also known as BigFrames) provides a Pythonic DataFrame
and machine learning (ML) API powered by the BigQuery engine.
and machine learning (ML) API powered by the BigQuery engine. It provides modules
for many use cases, including:

* `bigframes.pandas` provides a pandas API for analytics. Many workloads can be
* `bigframes.pandas <https://dataframes.bigquery.dev/reference/api/bigframes.pandas.html>`_
is a pandas API for analytics. Many workloads can be
migrated from pandas to bigframes by just changing a few imports.
* ``bigframes.ml`` provides a scikit-learn-like API for ML.
* `bigframes.ml <https://dataframes.bigquery.dev/reference/index.html#ml-apis>`_
is a scikit-learn-like API for ML.
* `bigframes.bigquery.ai <https://dataframes.bigquery.dev/reference/api/bigframes.bigquery.ai.html>`_
is a collection of AI methods powered by Gemini.

BigQuery DataFrames is an open-source package.
BigQuery DataFrames is an `open-source package <https://github.com/googleapis/python-bigquery-dataframes>`_.

**Version 2.0 introduces breaking changes for improved security and performance. See below for details.**
.. |GA| image:: https://img.shields.io/badge/support-GA-gold.svg
:target: https://github.com/googleapis/google-cloud-python/blob/main/README.rst#general-availability
.. |pypi| image:: https://img.shields.io/pypi/v/bigframes.svg
:target: https://pypi.org/project/bigframes/
.. |versions| image:: https://img.shields.io/pypi/pyversions/bigframes.svg
:target: https://pypi.org/project/bigframes/

Getting started with BigQuery DataFrames
----------------------------------------
@@ -38,7 +48,8 @@ To use BigFrames in your local development environment,

import bigframes.pandas as bpd

bpd.options.bigquery.project = your_gcp_project_id
bpd.options.bigquery.project = your_gcp_project_id # Optional in BQ Studio.
bpd.options.bigquery.ordering_mode = "partial" # Recommended for performance.
df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")
print(
df.groupby("name")
@@ -48,49 +59,16 @@ To use BigFrames in your local development environment,
.to_pandas()
)


Documentation
-------------

To learn more about BigQuery DataFrames, visit these pages:

* `Introduction to BigQuery DataFrames (BigFrames) <https://cloud.google.com/bigquery/docs/bigquery-dataframes-introduction>`_
* `Sample notebooks <https://github.com/googleapis/python-bigquery-dataframes/tree/main/notebooks>`_
* `API reference <https://cloud.google.com/python/docs/reference/bigframes/latest/summary_overview>`_
* `API reference <https://dataframes.bigquery.dev/>`_
* `Source code (GitHub) <https://github.com/googleapis/python-bigquery-dataframes>`_

⚠️ Warning: Breaking Changes in BigQuery DataFrames v2.0
--------------------------------------------------------

Version 2.0 introduces breaking changes for improved security and performance. Key default behaviors have changed, including:

* **Large Results (>10GB):** The default value for ``allow_large_results`` has changed to ``False``.
Methods like ``to_pandas()`` will now fail if the query result's compressed data size exceeds 10GB,
unless large results are explicitly permitted.
* **Remote Function Security:** The library no longer automatically uses the Compute Engine default service
account as the identity of the Cloud Run functions. To opt in to that behavior, pass
``cloud_function_service_account="default"``. Network ingress now defaults to ``"internal-only"``.
* **@remote_function Argument Passing:** Arguments other than ``input_types``, ``output_type``, and ``dataset``
to ``remote_function`` must now be passed using keyword syntax, as positional arguments are no longer supported.
* **@udf Argument Passing:** Arguments ``dataset`` and ``name`` to ``udf`` are now mandatory.
* **Endpoint Connections:** Automatic fallback to locational endpoints in certain regions is removed.
* **LLM Updates (Gemini Integration):** Integrations now default to the ``gemini-2.0-flash-001`` model.
PaLM2 support has been removed; please migrate any existing PaLM2 usage to Gemini. **Note:** The current default
model will be removed in Version 3.0.

**Important:** If you are not ready to adapt to these changes, please pin your dependency to a version less than 2.0
(e.g., ``bigframes==1.42.0``) to avoid disruption.

To learn about these changes and how to migrate to version 2.0, see the
`updated introduction guide <https://cloud.google.com/bigquery/docs/bigquery-dataframes-introduction>`_.
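
The pinning advice above can also be checked at runtime. The following is a minimal sketch, not part of the library: the helper names ``major_version``, ``is_pre_v2``, and ``installed_bigframes_version`` are hypothetical, and it assumes a plain ``MAJOR.MINOR.PATCH`` version string such as ``1.42.0``. It relies only on the standard library's ``importlib.metadata``.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading (major) component of a version string like '1.42.0'."""
    return int(version.split(".")[0])


def is_pre_v2(version: str) -> bool:
    """Return True when the version predates the 2.0 breaking changes."""
    return major_version(version) < 2


def installed_bigframes_version() -> Optional[str]:
    """Return the installed bigframes version, or None if it is not installed."""
    try:
        return metadata.version("bigframes")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_bigframes_version()
    if found is None:
        print("bigframes is not installed")
    elif is_pre_v2(found):
        print(f"bigframes {found}: the pre-2.0 defaults apply")
    else:
        print(f"bigframes {found}: the 2.0 defaults apply")
```

A check like this may be useful in a notebook or a startup script to confirm which set of defaults the environment is running under before relying on behavior such as large query results.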

.. |GA| image:: https://img.shields.io/badge/support-GA-gold.svg
:target: https://github.com/googleapis/google-cloud-python/blob/main/README.rst#general-availability
.. |pypi| image:: https://img.shields.io/pypi/v/bigframes.svg
:target: https://pypi.org/project/bigframes/
.. |versions| image:: https://img.shields.io/pypi/pyversions/bigframes.svg
:target: https://pypi.org/project/bigframes/

License
-------
