1.33.0 (2025-01-22)
- Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects (#1293) (aa2f73a)
- Add unix_seconds, unix_millis and unix_micros for timestamp series. (#1297) (e4b0c8d)
- DataFrame.join supports Series other (#1303) (ee37a0a)
- Support array output in remote_function (#1057) (bdee173)
- Dataframe sort_values Series input keyerror. (#1285) (5a2731b)
- Fix read_gbq_function issue in dataframe apply method (#1174) (0318764)
- Series sort_index and sort_values now raise when axis!=0 (#1294) (94bc2f2)
- Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial (#1271) (a687050)
- Update bigframes.pandas.Series docs (#1273) (0cac64f)
1.32.0 (2025-01-13)
- Add max_retries to TextEmbeddingGenerator and Claude3TextGenerator (#1259) (8077ff4)
- Bigframes.bigquery.parse_json (#1265) (27bbd80)
- Support DataFrame.astype(dict) (#1262) (5934f8e)
- Avoid global mutation in BigQueryOptions.client_endpoints_override (#1280) (788f6e9)
- Fix erroneous window bounds removal during compilation (#1163) (f91756a)
- Add BQ Studio links that allow users to generate Jupyter notebooks in BQ Studio with GitHub contents (#1266) (58f13cb)
- Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial (#1267) (3dcae2d)
- Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial (#1268) (059a564)
- Update bigframes.pandas.pandas docstrings (#1247) (c4bffc3)
- Use 002 model for better scalability in text generation (#1270) (bb7a850)
1.31.0 (2025-01-05)
- Raise if trying to change ordering_mode after session has started (#1252) (8cfaae8)
- Reduce the number of labels added to query jobs (#1245) (fdcdc18)
- Remove bq studio link (#1258) (dd4fd2e)
- Update bigframes.pandas.DatetimeMethods docstrings (#1246) (10f08da)
- Update semantic_operators.ipynb (#1260) (a2ed989)
1.30.0 (2024-12-30)
- Add GeoSeries.x and GeoSeries.y (#1126) (4c3548f)
- Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1190) (e13eca2)
- Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1222) (bcbc732)
- Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery (#371) (ed47ef1)
- Add client side retry to GeminiTextGenerator (#1242) (8193abe)
- Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 (#1208) (298fc73)
- Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features (#1228) (3068e19)
- Support dataframe where method (#1166) (71b4053)
- Arima model series input. (#1237) (f7d52d9)
- Json in struct destination type (#1187) (200c9bb)
- Throw an error message when setting is_row_processor=True to read a multi param function (#1160) (b2816a5)
- Add an "open in BQ Studio" link to all BigFrames sample notebooks (#1223) (e0a8288)
- Add bq studio link for a new ipynb file called "bq_dataframes_template.ipynb" (#1239) (840aaff)
- Add example for logistic regression (#1240) (4d854fd)
- Add examples for ml PCA and SimpleImputer (#1236) (0d84459)
- Add KMeans example (#1234) (d87ab97)
- Add linear model example (#1235) (2c3e1fd)
- Add ml.model_selection examples (#1238) (50648e4)
- Add python snippet for "Create the time series model" section of the Forecast a single time series with a univariate model tutorial (#1227) (20f3190)
1.29.0 (2024-12-12)
- Add Gemini 2.0 text gen sample notebook (#1211) (9596b66)
- Update bigframes.pandas.index docs return types (#1191) (c63e7da)
1.28.0 (2024-12-11)
- (Series | DataFrame).plot.bar (#1152) (0fae2e0)
- bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters (#1158) (131edc3)
- Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns (#1177) (05f8b4d)
- Add client_endpoints_override to bq options (#1167) (be74b99)
- Add support for temporal types in dataframe's describe() method (#1189) (2d564a6)
- Allow join-free alignment of analytic expressions (#1168) (daef4f0)
- Series.isin supports bigframes.Series arg (#1195) (0d8a16b)
- Update llm.TextEmbeddingGenerator to 005 (#1186) (3072d38)
- Fix error loading local dataframes into bigquery (#1165) (5b355ef)
- Fix null index join with 'on' arg (#1153) (9015c33)
- Fix series.isin using local path always (#1202) (a44eafd)
- Add a code sample using bpd.options.bigquery.ordering_mode = "partial" (#909) (f80d705)
- Add snippet for creating boosted tree model (#1142) (a972668)
- Add snippet for evaluating a boosted tree model (#1154) (9d8970a)
- Add snippet for predicting classifications using a boosted tree model (#1156) (e7b83f1)
- Add third party pandas.Index methods and docstrings (#1171) (a970294)
- Fix Bigframes.Pandas.General_Function missing docs (#1164) (de923d0)
- Update bigframes.pandas.Index docstrings (#1144) (557ab8d)
1.27.0 (2024-11-16)
- Dataframe fillna with scalar. (#1132) (37f8c32)
- Exclude index columns from model fitting processes. (#1138) (8d4da15)
- Unordered mode too many labels issue. (#1148) (7216b21)
1.26.0 (2024-11-12)
- Add basic geopandas functionality (#962) (3759c63)
- Support json_extract_string_array in the bigquery module (#1131) (4ef8bac)
- Fix Series.to_frame generating string label instead of int where name is None (#1118) (14e32b5)
- Update the API documentation with newly added rep (#1120) (72c228b)
- Reduce CURRENT_TIMESTAMP queries (#1114) (32274b1)
- Reduce dry runs from read_gbq with table (#1129) (f7e4354)
- Add file for Classification with a Boosted Tree Model and snippet for preparing sample data (#1135) (7ac6639)
- Add snippet for Linear Regression tutorial Predict Outcomes section (#1101) (108f4a9)
- Update DataFrame docstrings to include the errors section (#1127) (a38d4c4)
- Update GroupBy docstrings (#1103) (9867a78)
- Update Session docstrings to include exceptions (#1130) (a870421)
1.25.0 (2024-10-29)
- Add the ground_with_google_search option for GeminiTextGenerator predict (#1119) (ca02cd4)
- Add warning when user tries to access struct series fields with __getitem__ (#1082) (20e5c58)
- Allow fit to take additional eval data in linear and ensemble models (#1096) (254875c)
- Support context manager for bigframes session (#1107) (5f7b8b1)
1.24.0 (2024-10-24)
1.23.0 (2024-10-23)
- Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns (#1024) (863d694)
- Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. (#1105) (7094c85)
- Add support for pandas series & data frames as inputs for ml models. (#1088) (30c8883)
- Cleanup temp resources with session deletion (#1068) (1d5373d)
- Show possible correct key(s) in .__getitem__ KeyError message (#1097) (32fab96)
- Support uploading local geo data (#1036) (51cdd33)
- Escape ids more consistently in ml module (#1074) (103e998)
- Model.fit metric not collected issue. (#1085) (06cec00)
- Remove index requirement from some dataframe APIs (#1073) (2d16f6d)
- Update session metrics in read_gbq_query (#1084) (dced460)
- Speed up tree transforms during sql compile (#1071) (d73fe9d)
- Utilize ORDER BY LIMIT over ROW_NUMBER where possible (#1077) (7003d1a)
- Add ml tutorial for Evaluate the model (#1038) (a120bae)
- Show best practice of closing the session to cleanup resources in sample notebooks (#1095) (62a88e8)
- Update docstrings of Session and related files (#1087) (bf93e80)
1.22.0 (2024-10-09)
- Support regional endpoints for more bigquery locations (#1061) (45b672a)
- Update LLM generators to warn user about model name instead of raising error. (#1048) (650d80d)
- Access MATERIALIZED_VIEW with read_gbq (#1070) (601e984)
- Correct zero row count in DataFrame from table view (#1062) (b536070)
- Fix generic error message when entering an incorrect column name (#1031) (5ac217d)
- Make explode respect the index labels (#1064) (99ca0df)
- Make invalid location warning case-insensitive (#1044) (b6cd55a)
- Remove palm2 test case from llm load test (#1063) (575a10a)
- Show warning for unknown location set through .ctor (#1052) (02c2da7)
- Reduce schema tracking overhead (#1056) (1c3879d)
- Repr generates fewer queries (#1046) (d204603)
- Speedup internal tree comparisons (#1060) (4379438)
1.21.0 (2024-10-02)
- Add deprecation warning to PaLM2TextGenerator model (#1035) (1183b0f)
- Add DeprecationWarning for PaLM2TextEmbeddingGenerator (#1018) (4af5bbb)
- Add ml.model_selection.cross_validate support (#1020) (1a38063)
- Allow access of struct fields with dot operators on Series (#1019) (ef76f13)
- Ensure no double execution for to_pandas (#1032) (4992cc2)
- Remove pre-caching of remote function results (#1028) (0359bc8)
1.20.0 (2024-09-25)
- Add bigframes.bigquery.approx_top_count (#1010) (3263bd7)
- Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations (#955) (1930b4e)
- Allow multiple columns input for llm models (#998) (2fe5e48)
- Limit pypi notebook to 7 days and add more info about differences with partial ordering mode (#1013) (3c54399)
- Move and edit existing linear-regression tutorial snippet (#991) (4cb62fd)
1.19.0 (2024-09-24)
- Add ml.model_selection.KFold class (#1001) (952cab9)
- Support bool and bytes types in describe(include='all') (#994) (cc48f58)
- Support ingress settings in remote_function (#1011) (8e9919b)
1.18.0 (2024-09-18)
- Add "include" param to describe for string types (#973) (deac6d2)
- Add subset parameter to DataFrame.dropna to select which columns to consider (#981) (f7c03dc)
- DataFrameGroupby.agg now works with unnamed tuples (#985) (0f047b4)
- Fix a bug that raises exception when re-indexing columns with their original order (#988) (596b03b)
- Make the Series.apply outcome assignable to the original dataframe in partial ordering mode (#874) (c94ead9)
- Limit ibis-framework version to 9.2.0 (#989) (06c1b33)
- Update to ibis-framework 9.x and newer sqlglot (#827) (89ea44f)
1.17.0 (2024-09-11)
- Add __version__ alias to bigframes.pandas (#967) (9ce10b4)
- Add Gemini 1.5 stable models support (#945) (c1cde19)
- Allow setting table labels in to_gbq (#941) (cccc6ca)
- Define list accessor for bigframes Series (#946) (8e8279d)
- Enable read_csv() to process other files (#940) (3b35860)
- Include the bigframes package version alongside the feedback link in error messages (#936) (7b59b6d)
- Astype Decimal to Int64 conversion. (#957) (27764a6)
- Make read_gbq_function work for multi-param functions (#947) (c750be6)
- Support read_gbq_function for axis=1 application (#950) (86e54b1)
- Add docstring returns section to Options (#937) (a2640a2)
- Update title of pypi notebook example to reflect use of the PyPI public dataset (#952) (cd62e60)
1.16.0 (2024-09-04)
- Add DataFrame.struct.explode to add struct subfields to a DataFrame (#916) (ad2f75e)
- Implement bigframes.bigquery.json_extract_array (#910) (575a29e)
- Recover struct column from exploded Series (#904) (7dd304c)
- Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
- Improve Series.replace for dict input (#907) (4208044)
- NullIndex in ML model.predict error (#917) (612271d)
- Struct field non-nullable type issue. (#914) (149d5ff)
- Unordered mode errors in ml train_test_split (#925) (85d7c21)
- Re-introduce support for numpy 1.24.x (#931) (3d71913)
- Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 (#903) (7ed3962)
- Add Claude3 ML and RemoteFunc notebooks (#930) (cfd16c1)
- Create sample notebook to manipulate struct and array data (#883) (3031903)
- Update struct examples. (#953) (d632cd0)
- Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)
1.15.0 (2024-08-20)
- Add llm.TextEmbeddingGenerator to support new embedding models (#905) (6bc6a41)
- Add ml.llm.Claude3TextGenerator model (#901) (7050038)
- Add columns for "requires ordering/index" to supported APIs summary (#892) (d2fc51a)
- Remove duplicate description for kms_key_name (#898) (1053d56)
- Update embedding model notebooks (#906) (d9b8ef5)
1.14.0 (2024-08-14)
- Implement bigframes.bigquery.json_extract (#868) (3dbf84b)
- Implement Series.str.__getitem__ (#897) (e027b7e)
- Generate SQL with fewer CTEs (#877) (eb60804)
- Speed up compilation by reducing redundant type normalization (#896) (e0b11bc)
- Add streaming html docs (#884) (171da6c)
- Fix the DisplayOptions doc rendering (#893) (3eb6a17)
- Update streaming notebook (#887) (6e6f9df)
1.13.0 (2024-08-05)
- df.apply(axis=1) to support remote function with multiple params (#851) (2158818)
- Allow windowing in 'partial' ordering mode (#861) (ca26fe5)
- Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)
- Create sample notebook using ordering_mode="partial" (#880) (c415eb9)
- Update streaming notebook (#875) (e9b0557)
1.12.0 (2024-07-31)
- Add bigframes-mode label to query jobs (#832) (c9eaff0)
- Add config option to set partial ordering mode (#855) (823c0ce)
- Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
- Add streaming.StreamingDataFrame class (#864) (a7d7197)
- Allow DataFrame.join for self-join on Null index (#860) (e950533)
- Support remote function cleanup with session.close (#818) (ed06436)
- Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)
- Fewer relation joins from df self-operations (#823) (0d24f73)
- Fix 'sql' property for null index (#844) (1b6a556)
- Fix unordered mode using ordered path to print frame (#839) (93785cb)
- Reduce redundant remote_function deployments (#856) (cbf2d42)
- Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
- Make get_global_session/close_session/reset_session appear in the docs (#847) (01d6bbb)
1.11.1 (2024-07-08)
- Remove session and connection in llm notebook (#821) (74170da)
- Remove the experimental flask icon from the public docs (#820) (067ff17)
1.11.0 (2024-07-01)
- Add .agg support for size (#792) (87e6018)
- Add bigframes.bigquery.json_set (#782) (1b613e0)
- Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
- Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
- Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
- Add Series.peek to preview data efficiently (#727) (580e1b9)
- Expose gcf memory param in remote_function (#803) (014765c)
- More informative error when query plan too complex (#811) (136dc24)
1.10.0 (2024-06-21)
- Add dataframe.insert (#770) (e8bab68)
- Add groupby head API (#791) (44202bc)
- Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
- Bigframes.streaming module for continuous queries (#703) (0433a1c)
- Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
- Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
- Df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
- Ensure numpy version matches in remote_function deployment (#798) (324d93c)
- Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
- Self-join optimization doesn't needlessly invalidate caching (#797) (1b96b80)
1.9.0 (2024-06-10)
- Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
- Support bigquery.vector_search() (#736) (dad66fd)
- Support score() in GeminiTextGenerator (#740) (b2c7d8b)
- Support bytes type in remote_function (#761) (4915424)
- Support fit() in GeminiTextGenerator (#758) (d751f5c)
- ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
- Improve to_pandas_batches for large results (#746) (61f18cb)
- Resolve issue with unset thread-local options (#741) (d93dbaf)
- Fix ML.EVALUATE spelling (#749) (7899749)
- Remove LogisticRegression normal_equation strategy (#753) (ea5d367)
1.8.0 (2024-05-31)
- merge only generates a default index if both inputs already have an index (#733) (25d049c)
- Add +, - as unary ops, ^ binary op (#724) (968d825)
- Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
- Add DataFrame ~ operator (#721) (354abc1)
- Add GeminiText 1.5 Preview models (#737) (56cbd3b)
- Add slot_millis and add stats to session object (#725) (72e9583)
- Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
- Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
- Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
- Support ml.SimpleImputer in bigframes (#708) (4c4415f)
- Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
- Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
- Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
- Fix Null index assign series to column (#711) (ffb4b57)
- Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
- Warn and disable time travel for linked datasets (#712) (085fa9d)
1.7.0 (2024-05-20)
- read_gbq_query supports filters (9386373)
- read_gbq suggests a correct column name when one is not found (9386373)
- Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
- Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
- To_datetime supports utc=False for string inputs (#579) (adf9889)
- read_gbq_table respects primary keys even when filters are set (#689) (9386373)
- Fix type error in test_cluster (#698) (14d81c1)
- Improve escaping of literals and identifiers (#682) (da9b136)
- Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
- Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
- The imported samples error and use peek() (#688) (1a0b744)
- Don't run query immediately from read_gbq_table if filters is set (9386373)
- Use a LIMIT clause when max_results is set (9386373)
- Add code snippets for imported onnx tutorials (#684) (cb36e46)
- Add code snippets for imported tensorflow model (#679) (b02c401)
- Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)
1.6.0 (2024-05-13)
- Add DataFrame.__delitem__ (#673) (2218c21)
- Add Series.case_when() (#673) (2218c21)
- Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
- Add Series.combine (#680) (2fd1b81)
- Series.str.split (#675) (6eb19a7)
- Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
- Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
- Support gcf vpc connector in remote_function (#677) (9ca92d0)
- Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)
- Add jellyfish as a dependency for spelling correction (57ccabc)
- Add code snippets for llm text generation (#669) (93416ed)
- Add logistic regression samples (#673) (2218c21)
- Address lint errors in code samples (#665) (4fc8964)
- Document inlining of small data in read_* APIs (#670) (306953a)
1.5.0 (2024-05-07)
- bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
- Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
- Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
- Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
- Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
- Custom query labels for compute options (#638) (f561799)
- Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set (#631, #658) (2715d2b, 73064dd)
- Support index_col=False in read_csv and engine="bigquery" (73064dd)
- Support gcf max instance count in remote_function (#657) (36578ab)
- Don't raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
- Fix bug with na in the column labels in stack (#659) (4a34293)
- Use explicit session in PaLM2TextGenerator (#651) (e4f13c3)
- Add python code sample for multiple forecasting time series (#531) (16866d2)
- Fix the Palm2TextGenerator output token size (#649) (c67e501)
1.4.0 (2024-04-29)
- Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
- Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
- Allow single input type in remote_function (#641) (3aa643f)
- Expose gcf max timeout in remote_function (#639) (dfeaad0)
- Series binary ops compatible with more types (#618) (518d315)
- Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)
- Allow to_pandas to download more than 10GB (#637) (ce56495)
- Extend row hash to 128 bits to guarantee unique row id (#632) (9005c6e)
- Llm fine tuning tests (#627) (4724a1a)
- Llm palm score tests (#643) (cf4ec3a)
- Automatically condense internal expression representation (#516) (03c1b0d)
- Cache transpose to allow performant retranspose (#635) (44b738d)
- Add supported pandas apis on the main page (#628) (8d2a51c)
- Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
- Address more technical writers' feedback (#640) (1e7793c)
1.3.0 (2024-04-22)
- Add Series.struct.dtypes property (#599) (d924ec2)
- Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
- Add quantile statistic (#613) (bc82804)
- Expose max_batching_rows in remote_function (#622) (240a1ac)
- Support primary key(s) in read_gbq by using as the index_col by default (#625) (75bb240)
- Warn if location is set to unknown location (#609) (3706b4f)
- Address technical writers fb (#611) (9f8f181)
- Infer narrowest numeric type when combining numeric columns (#602) (8f9ece6)
- Use exact median implementation by default (#619) (9d205ae)
- Fix rendering of examples for multiple apis (#620) (9665e39)
- Set index_cols in read_gbq as a best practice (#624) (70015b7)
1.2.0 (2024-04-15)
- Add hasnans, combine_first, update to Series (#600) (86e0f38)
- Add MultiIndex subclass. (#596) (5d0f149)
- Add pivot_table for DataFrame. (#473) (5f1d670)
- Add Series.autocorr (#605) (4ec8034)
- Support list of numerics in pandas.cut (#580) (290f95d)
- Address more technical writers feedback (#581) (4b08d92)
- Error for object dtype on read_pandas (#570) (8702dcf)
- Inverting int now does bitwise inversion rather than sign flip (#574) (5f1db8b)
- Loc setitem dtype issue. (#603) (b94bae9)
- Toc menu missing plotting name (#591) (eed12c1)
- (Series|Dataframe).dtypes (#598) (edef48f)
- Add code samples for str accessor methods (#594) (a557ea2)
- Add docs for DataFrame and Series dunder methods (#562) (8fc26c4)
- Add examples for at/iat (#582) (3be4a2e)
1.1.0 (2024-04-04)
- (Series|DataFrame).explode (#556) (9e32f57)
- Add DataFrame.eval and DataFrame.query (#361) (5e28ebd)
- Add ColumnTransformer save/load (#541) (9d8cf67)
- Add ml.metrics.mean_squared_error (#559) (853c25e)
- Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
- Add transformers save/load (#552) (d805241)
- Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
- Expose DataFrame.bqclient to assist in integrations (#519) (0be8911)
- Read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
- Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator (#539) (1156c1e)
- Support max_columns in repr and make repr more efficient (#515) (54e49cf)
- Assign NaN scalar to column error. (#513) (0a4153c)
- Don't download 100gb onto local python machine in load test (#537) (082c58b)
- Exclude list-like s parameter in plot.scatter (#568) (1caac27)
- Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
- Fix error in Series.drop(0) (#575) (75dd786)
- Include all names in MultiIndex repr (#564) (b188146)
- Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
- Product operation produces float result for all input types (#501) (6873b30)
- Reloaded transformer .transform error (#569) (39fe474)
- Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
- Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
- Restore string to date/time type coercion (#565) (4ae0262)
- Sync the notebook with embedding changes (#550) (347f2dd)
- Use bytes limit on frame inlining rather than element count (#576) (659a161)
- bigframes.options.bigquery.project and location are optional in some circumstances (#548) (90bcec5)
- Add "Supported pandas APIs" reference to the documentation (#542) (74c3915)
- Add General Availability banner to README (#507) (262ff59)
- Add operations in API docs (#557) (ea95761)
- Add progress_bar code sample (#508) (92a1af3)
- Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
- Address more comments from technical writers to meet legal purposes (#571) (9084df3)
- Fix docs of ARIMAPlus.predict (#512) (3b80f95)
- Include Index in table-of-contents (#564) (b188146)
- Mark Gemini model as Pre-GA (#543) (769868b)
- Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)
1.0.0 (2024-03-25)
- rename model parameter min_rel_progress to tol
- early_stop setting no longer supported, always uses True
- rename model parameter n_parallell_trees to n_estimators
- rename class_weights to class_weight
- rename learn_rate to learning_rate
- PCA n_components supports float value and None, default to None
- rename various ml model parameters for consistency with sklearn (#491)
- Add configuration option to read_gbq (#401) (85cede2)
- Add ml ARIMAPlus model params (#488) (352cb85)
- Add ml KMeans model params (#477) (23a8d9a)
- Add ml LogisticRegression model params (#481) (f959b65)
- Add ml PCA model params (#474) (fb5d83b)
- Add params for LinearRegression model (#464) (21b2188)
- Add support for Python 3.12 (#231) (df2976f)
- Allow assigning directly to Series.name property (#495) (ad0e99e)
- Ensure Series.str.len() can get length of array columns (#497) (10c0446)
- Option to use bq connection without check (#460) (0b3f8e5)
- PCA n_components supports float value and None, default to None (65c6f47)
- Rename class_weights to class_weight (65c6f47)
- Rename learn_rate to learning_rate (65c6f47)
- Rename model parameter min_rel_progress to tol (65c6f47)
- Rename model parameter n_parallell_trees to n_estimators (65c6f47)
- Rename various ml model parameters for consistency with sklearn (#491) (65c6f47)
- Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
- Support dataframe.cov (#498) (c4beafd)
- Support Series.dt.floor (#493) (2dd01c2)
- Support Series.dt.normalize (#483) (0bf1e91)
- Update plot sample to 1000 rows (#458) (60d4a7b)
- early_stop setting no longer supported, always uses True (65c6f47)
- Fix -1 offset lookups failing (#463) (2dfb9c2)
- Plot.scatter c argument functionalities (#494) (d6ee994)
- Properly support format param for numerical input. (#486) (ae20c35)
- Re-enable to_csv and to_json related tests (#468) (2b9a01d)
- Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
- Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)
- Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
- Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
- Improve API documentation (#489) (751266e)
- Update bigquery connection documentation (#499) (4bfe094)
- Update LLM + K-means notebook to handle partial failures (#496) (97afad9)
0.26.0 (2024-03-20)
- exclude remote models for .register() (#465)
- (Series|DataFrame).plot (#438) (1c3e668)
- read_gbq_table supports LIKE as an operator in filters (#454) (d2d425a)
- Add DataFrame.pipe() method (#421) (95f5a6e)
- Set force=True by default in DataFrame.peek() (#469) (4e8e97d)
- Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
- Support Series.dt.strftime (#453) (8f6e955)
- Any() on empty set now correctly returns False (#471) (f55680c)
- Df.drop_na preserves columns dtype (#457) (3bab1a9)
- Disable to_json and to_csv related tests (#462) (874026d)
- Exclude remote models for .register() (#465) (73fe0f8)
- Fix broken link in covid notebook (#450) (adadb06)
- Fix broken multiindex loc cases (#467) (b519197)
- Fix grouping series on multiple other series (#455) (3971bd2)
- Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
- Raise ValueError when read_pandas() receives a bigframes DataFrame (#447) (b28f9fd)
- Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
- Warn when read_gbq/read_gbq_table uses the snapshot time cache (#441) (e16a8c0)
- Add code samples for ml.metrics.r2_score (#459) (85fefa2)
- Add the docs for loc and iloc indexers (#446) (14ab8d8)
- Add the pages for at and iat indexers (#456) (340f0b5)
- Add version information to bug template (#437) (91bd39e)
- Indicate that project and location are optional in example notebooks (#451) (1df0140)
0.25.0 (2024-03-14)
- (Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
- Support CMEK for remote_function cloud functions (#430) (2fd69f4)
0.24.0 (2024-03-12)
- read_parquet uses a "pandas" engine to parse files by default. Use engine="bigquery" for the previous behavior
- (Series|Dataframe).plot.hist() (#420) (4aadff4)
- Add detect_anomalies to ml ARIMAPlus and KMeans models (#426) (6df28ed)
- Add engine parameter to read_parquet (#413) (31325a1)
- Add ml PCA.detect_anomalies method (#422) (8d82945)
- Support BYOSA in remote_function (#407) (d92ced2)
- Support CMEK for BQ tables (#403) (9a678e3)
- Move third_party.bigframes_vendored to bigframes_vendored (#424) (763edeb)
- Only do row identity based joins when joining by index (#356) (76b252f)
- Read_pandas inline respects location (#412) (ae0e3ea)
- Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
- Document minimum IAM requirement (#416) (36173b0)
- Fix the note rendering for DataFrames methods: nlargest, nsmallest (#417) (38bd2ba)
0.23.0 (2024-03-05)
- Add ml.metrics.pairwise.euclidean_distance (#397) (1726588)
- Add TextEmbedding model version support (#394) (e0f1ab0)
- Code exception in remote_function now prevents retry and surfaces in the client (#387) (dd3643d)
- Docs link for metrics.pairwise (#400) (a60aba7)
0.22.0 (2024-02-27)
- rename cosine_similarity to paired_cosine_distances (#393)
- move model optional args to kwargs (#381)
- Add DataFrames.corr() method (#379) (67fd434)
- Add ml.metrics.pairwise.manhattan_distance (#392) (9d31865)
- Enable regional endpoints for me-central2 (#386) (469674d)
- Avoid ibis warning for "database" table() method argument (#390) (a0490a4)
- Correct the numeric literal dtype (#365) (93b02cd)
- Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)
- Add a code sample for creating a kmeans model (#267) (4291d65)
- Fix bigframes.pandas.concat documentation (#382) (234b61c)
0.21.0 (2024-02-13)
- Add Series.cov method (#368) (443db22)
- Add ml.llm.GeminiTextGenerator model (#370) (de1e0a4)
- Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
- Add XGBoostModel (#363) (d5518b2)
- Limited support of lambdas in Series.apply (#345) (208e081)
- Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
- Support read_gbq wildcard table path (#377) (90caf86)
0.20.1 (2024-02-06)
- Add a sample to demonstrate the evaluation results (#364) (cff0919)
- Fix the DataFrame.apply code sample (#366) (1866a26)
0.20.0 (2024-01-30)
- Add DataFrame.peek() as an efficient alternative to head() results preview (#318) (9c34d83)
- Add ARIMA_EVALUATE options in forecasting models (#336) (73e997b)
- Add Index constructor, repr, copy, get_level_values, to_series (#334) (e5d054e)
- Improve error message for drive based BQ table reads (#344) (0794788)
- Update cut to work without labels = False and show intervals as dict (#335) (4ff53db)
- Change default connection name in getting_started.ipynb (#347) (677f014)
- Series iteration correctly returns values instead of index (#339) (2c6af9b)
0.19.2 (2024-01-22)
- Read_gbq large response issue (#332) (b8178b9)
- Use object dtype for ARRAY columns in to_pandas() with pandas 1.x (#329) (374ddb5)
- Add DataFrame.applymap documentation (#326) (bd531a1)
- Add code samples for series methods (#323) (32cc6fa)
- Add remote model requirements (#333) (c91f70c)
0.19.1 (2024-01-17)
- Handle multi-level columns for df aggregates properly (#305) (5bb45ba)
- Update max_output_token limitation. (#308) (5cccd36)
0.19.0 (2024-01-09)
- Add 'columns' as an alias for 'col_order' (#298) (a01b271)
- Add Series dt.tz and dt.unit properties (#303) (2e1a403)
- Add to_gbq() method for LLM models (#299) (dafbc1b)
- Allow manually setting clustering_columns in dataframe.to_gbq (#302) (9c21323)
- Support assigning to columns like a property (#304) (f645c56)
- Support upcasting numeric columns in concat (#294) (e3a056a)
- DF.drop tuple input as multi-index (#301) (21391a9)
- Fix bug converting non-string labels to sql ids (#296) (a61c5fe)
0.18.0 (2024-01-02)
- Add dataframe.to_html (#259) (2cd6489)
- Add IntervalIndex support to bigframes.pandas.cut (#254) (6c1969a)
- Add replace method to DataFrame (#261) (5092215)
- Specific pyarrow mappings for decimal, bytes types (#283) (a1c0631)
- Dataframes to_gbq now creates dataset if it doesn't exist (#222) (bac62f7)
- Exclude pandas 2.2.0rc0 to unblock prerelease tests (#292) (ac1a745)
- Fix DataFrameGroupby.agg() issue with as_index=False (#273) (ab49350)
- Make Series.str.replace work for simple strings (#285) (ad67465)
- Update dataframe.to_gbq to dedup column names. (#286) (746115d)
- Use setuptools.find_namespace_packages (#246) (9ec352a)
- Add code snippets for explore query result page (#278) (7cbbb7d)
- Code samples for astype common to DataFrame and Series (#280) (95b673a)
- Code samples for DataFrame.copy and Series.copy (#290) (7cbc2b0)
- Code samples for drop and fillna (#284) (9c5012e)
- Code samples for isna, isnull, dropna, isin (#289) (ad51035)
- Code samples for rename, size (#293) (eb69f60)
- Code samples for reset_index and sort_values (#282) (acc0eb7)
- Code samples for sample, get, Series.round (#295) (c2b1892)
- Code samples for Series.{add, replace, unique, T, transpose} (#287) (0e1bbfc)
- Code samples for Series.{map, to_list, count} (#290) (7cbc2b0)
- Code samples for Series.{name, std, agg} (#293) (eb69f60)
- Code samples for Series.groupby and Series.{sum,mean,min,max} (#280) (95b673a)
- Code samples for DataFrame set_index, items (#295) (c2b1892)
- Fix the rendering for get_dummies (#291) (252f3a2)
0.17.0 (2023-12-14)
- Add filters argument to read_gbq for enhanced data querying (#198) (034f71f)
- Add module/class level api tracking (#272) (4f3db3d)
- Deprecate use_regional_endpoints (#199) (319a1f2)
- Increase recursion limit, cache compilation tree hashes (#184) (b54791c)
- Replaced raise NotImplementedError with return NotImplemented (#258) (a133822)
- Add code samples for values and value_counts (#249) (f247d95)
- Add sample for getting started with BQML (#141) (fb14f54)
0.16.0 (2023-12-12)
- Add ARIMAPlus.predict parameters (#264) (99598c7)
- Add DataFrame from_dict and from_records methods (#244) (8d81e24)
- Add DataFrame.select_dtypes method (#242) (1737acc)
- Add nunique method to Series/DataFrameGroupby (#256) (c8ec245)
- Support dataframe.loc with conditional columns selection (#233) (3febea9)
- Enforce pandas version requirement <2.1.4 (#265) (9dd63f6)
- Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests (b02fc2c)
- Fix value_counts column label for normalize=True (#245) (d3fa6f2)
- Migrate e2e tests to bigframes-load-testing project (8766ac6)
- Ml.sql logic (#262) (68c6fdf)
- Update the llm_kmeans notebook (#247) (66d1839)
- Add code samples for shape and head (#257) (5bdcc65)
- Add example for dataframe.melt, dataframe.pivot, dataframe.stac… (#252) (8c63697)
- Add example to dataframe.nlargest, dataframe.nsmallest, datafra… (#234) (e735412)
- Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod (#243) (0523a31)
- Add examples for dataframe.nunique, dataframe.diff, dataframe.a… (#251) (77074ec)
- Correct the docs for option_context (#263) (d21c6dd)
- Correct the params rendering for ml.remote and ml.ensemble modules (#248) (c2829e3)
- Fix return annotation in API docstrings (#253) (89a1c67)
0.15.0 (2023-11-29)
- model.predict returns all the columns (#204)
- Add info and memory_usage methods to dataframe (#219) (9d6613d)
- Add remote vertex model support (#237) (0bfc4fb)
- Add the recent api method for ML component (#225) (ed8876d)
- Model.predict returns all the columns (#204) (416171a)
- Send warnings on LLM prediction partial failures (#216) (81125f9)
- Add df snapshots lookup for read_gbq (#229) (d0d9b84)
- Avoid unnecessary row_number() on sort key for io (#211) (a18d40e)
- Dedup special character (#209) (dd78acb)
- Invalid JSON type of the notebook (#215) (a729831)
- Make to_pandas override enable_downsampling when sampling_method is manually set. (#200) (ae03756)
- Polish the llm+kmeans notebook (#208) (e8532b1)
- Update the llm+kmeans notebook with recent change (#236) (f8917ab)
- Use anonymous dataset to create remote_function (#205) (69b016e)
- Add code samples for index and column properties (#212) (c88d38e)
- Add code samples for df reshaping, function, merge, and join methods (#203) (010486c)
- Add examples for dataframe.kurt, dataframe.std, dataframe.count (#232) (f9c6e72)
- Add examples for dataframe.mean, dataframe.median, dataframe.va… (#228) (edd0522)
- Add examples for dataframe.min, dataframe.max and dataframe.sum (#227) (3a375e8)
- Code samples for Series.dot and DataFrame.dot (#226) (b62a07a)
- Code samples for Series.where and Series.mask (#217) (52dfad2)
- Code samples for dataframe.any, dataframe.all and dataframe.prod (#223) (d7957fa)
- Make the code samples reflect default bq connection usage (#206) (71844b0)
0.14.1 (2023-11-16)
0.14.0 (2023-11-14)
- Add 'cross' join support (#176) (765446a)
- Add 'index', 'pad', 'nearest' interpolate methods (#162) (6a28403)
- Add series.sample (identical to existing dataframe.sample) (#187) (37914a4)
- Add unordered sql compilation (#156) (58f420c)
- Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs (#145) (4ea33b7)
- Read_gbq creates order deterministically without table copy (#191) (8ab81de)
- Support date_series.astype("string[pyarrow]") to cast DATE to STRING (#186) (aee0e8e)
- Support series.at[row_label] = scalar (#173) (0c8bd33)
- Temporary resources no longer use BigQuery Sessions (#194) (4a02cac)
- All sort operations are now stable (#195) (3a2761f)
- Default to 7 days expiration for read_csv, read_json, read_parquet (#193) (03606cd)
- Deprecate the remote_service_type in llm model (#180) (a8a409a)
- For reset_index on unnamed multiindex, always use level_[n] label (#182) (f95000d)
- Match pandas behavior when assigning listlike to empty dfs (#172) (c1d1f42)
- Use anonymous dataset instead of session dataset for temp tables (#181) (800d44e)
- Use random table for read_pandas (#192) (741c75e)
- Use random table when loading data for read_csv, read_json, read_parquet (#175) (9d2e6dc)
- Add code samples for read_gbq_function using community UDFs (#188) (7506eab)
- Add docstring code samples for Series.apply and DataFrame.map (#185) (c816d84)
- Add llm kmeans notebook as an included example (#177) (d49ae42)
- Use head() to get top n results, not to preview results (#190) (87f84c9)
0.13.0 (2023-11-07)
- to_gbq without a destination table writes to a temporary table (#158) (e1817c9)
- Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods (#164) (c065071)
- Add Series.__iter__ method (#164) (c065071)
- Add interpolate() to series and dataframe (#157) (b9cb55c)
- Support 32k text-generation and multilingual embedding models (#161) (5f0ea37)
0.12.0 (2023-11-01)
- Add DataFrame.melt (#113) (4e4409c)
- Add DataFrame.to_pandas_batches() to download large DataFrame objects (#136) (3afd4a3)
- Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs (#133) (63c7919)
- Add pandas.qcut (#104) (8e44518)
- Add pd.get_dummies (#149) (d8baad5)
- Add unstack to series, add level param (#115) (5edcd19)
- Implement operator @ for DataFrame.dot (#139) (79a638e)
- Populate ibis version in user agent (#140) (c639a36)
- Don't override the global logging config (#138) (2ddbf74)
- Fix bug with column names under repeated column assignment (#150) (29032d0)
- Resolve plotly rendering issue by using ipython html for job pro… (#134) (39df43e)
- Use indexee's session for loc listlike cases (#152) (27c5725)
- Add arithmetic df sample code (#153) (ac44ccd)
- Fix indentation on read_gbq_function code sample (#163) (0801d96)
- Link to ML.EVALUATE BQML page for score() methods (#137) (45c617f)
0.11.0 (2023-10-26)
- Add back reset_session as an alias for close_session (#124) (694a85a)
- Change query parameter to query_or_table in read_gbq (#127) (f9bb3c4)
- Expose bigframes.pandas.reset_session as a public API (#128) (b17e1f4)
- Use series's own session in series.reindex listlike case (#135) (95bff3f)
- Add runnable code samples for DataFrames I/O methods and property (#129) (6fea8ef)
- Add runnable code samples for reading methods (#125) (a669919)
0.10.0 (2023-10-19)
0.9.0 (2023-10-18)
- rename bigframes.pandas.reset_session to close_session (#101)
- Add bigframes.options.bigquery.application_name for partner attribution (#117) (52d64ff)
- Add AtIndexer getitems (#107) (752b01f)
- Rename bigframes.pandas.reset_session to close_session (#101) (36693bf)
- Send BigQuery cancel request when canceling bigframes process (#103) (e325fbb)
- Support external packages in remote_function (#98) (ec10c4a)
- Use ArrowDtype for STRUCT columns in to_pandas (#85) (9238fad)
- Add documentation for Series.struct.field and Series.struct.explode (#114) (a6dab9c)
- Add open-source link in API doc (#106) (db51fe3)
- Update ML overview API doc (#105) (1b3f3a5)
0.8.0 (2023-10-12)
- The default behavior of to_parquet is changing from no compression to 'snappy' compression.
- Support compression in to_parquet (a8c286f)
0.7.0 (2023-10-11)
- Add aliases for several series properties (#80) (c0efec8)
- Add equals methods to series/dataframe (#76) (636a209)
- Add iat and iloc accessing by tuples of integers (#90) (228aeba)
- Add level param to DataFrame.stack (#88) (97b8bec)
- Allow df.drop to take an index object (#68) (740c451)
- Use default session connection (#87) (4ae4ef9)
0.6.0 (2023-10-04)
- Add df.unstack (#63) (4a84714)
- Add idxmin, idxmax to series, dataframe (#74) (781307e)
- Add ml.preprocessing.KBinsDiscretizer (#81) (24c6256)
- Add multi-column dataframe merge (#73) (c9fa85c)
- Add update and align methods to dataframe (#57) (bf050cf)
- Support STRUCT data type with Series.struct.field to extract child fields (#71) (17afac9)
- Avoid 403 response too large to return error with read_gbq and large query results (#77) (8f3b5b2)
- Change return type of Series.loc[scalar] (#40) (fff3d45)
- Fix df/series.iloc by list with multiindex (#79) (971d091)
0.5.0 (2023-09-28)
- Add DataFrame.kurtosis / DF.kurt method (c1900c2)
- Add DataFrame.rolling and DataFrame.expanding methods (c1900c2)
- Add items, apply methods to DataFrame. (#43) (3adc1b3)
- Add axis param to simple df aggregations (#52) (9cf9972)
- Add index dtype, astype, drop, fillna, aggregate attributes. (#38) (1a254a4)
- Add ml.preprocessing.LabelEncoder (#50) (2510461)
- Add ml.preprocessing.MaxAbsScaler (#56) (14b262b)
- Add ml.preprocessing.MinMaxScaler (#64) (392113b)
- Add more index methods (#54) (a6e32aa)
- Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support class_weights="balanced" in LogisticRegression model (c1900c2)
- Support df[column_name] = df_only_one_column (c1900c2)
- Support early_stop parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
- Support casting string to integer or float (#59) (3502f83)
- Fix header skipping logic in read_csv (#49) (d56258c)
- Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
- LabelEncoder params consistent with Sklearn (#60) (632caec)
- Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
- Add ability to cache dataframe and series to session table (#51) (416d7cb)
- Inline small Series and DataFrames in query text (#45) (5e199ec)
- Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
- Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
- Link to Remote Functions code samples from README and API reference (c1900c2)
0.4.0 (2023-09-16)
- Add axis parameter to droplevel and reorder_levels (7c6b0dd)
- Add bfill and ffill to DataFrame and Series (7c6b0dd)
- Add DataFrame.combine and DataFrame.combine_first (#27) (7c6b0dd)
- Add DataFrame.nlargest, nsmallest (7c6b0dd)
- Add DataFrame.pct_change and Series.pct_change (7c6b0dd)
- Add DataFrame.skew and GroupBy.skew (7c6b0dd)
- Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc (7c6b0dd)
- Add diff method to DataFrame and GroupBy (7c6b0dd)
- Add filter and reindex to Series and DataFrame (7c6b0dd)
- Add reindex_like to DataFrame and Series (7c6b0dd)
- Add swaplevel to DataFrame and Series (7c6b0dd)
- Add partial support for Series.replace (7c6b0dd)
- Support DataFrame.loc[bool_series, column] = scalar (7c6b0dd)
- Support a persistent name in remote_function (7c6b0dd)
- remote_function uses same credentials as other APIs (7c6b0dd)
- Add type hints to models (7c6b0dd)
- Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
- Remove transforms parameter in model.fit (breaking change) (7c6b0dd)
- Support column joins with "None indexer" (7c6b0dd)
- Use Int64Dtype for literals in cut (7c6b0dd)
- Use lowercase strings for parameter literals in bigframes.ml (breaking change) (7c6b0dd)
- bigframes-api label to I/O query jobs (7c6b0dd)
- Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
- Document region logic in README (7c6b0dd)
- Fix OneHotEncoder sample (7c6b0dd)
0.3.2 (2023-09-06)
0.3.1 (2023-09-05)
0.3.0 (2023-09-02)
- Add bigframes.get_global_session() and bigframes.reset_session() aliases (a32b747)
- Add bigframes.pandas.read_pickle function (a32b747)
- Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA (89b9503)
- Add fit_transform to bigquery.ml transformers (a32b747)
- Add Series.dropna and DataFrame.fillna (8fab755)
- Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center (a32b747)
- Support bigframes.pandas.merge() (8fab755)
- Support DataFrame.isin with list and dict inputs (8fab755)
- Support DataFrame.pivot (a32b747)
- Support DataFrame.stack (89b9503)
- Support DataFrame-DataFrame binary operations (8fab755)
- Support df[my_column] = [a python list] (89b9503)
- Support Index.is_monotonic (8fab755)
- Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument (89b9503)
- Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument (89b9503)
- Support pow() and power operator in DataFrame and Series (8fab755)
- Support read_json with engine=bigquery for newline-delimited JSON files (89b9503)
- Support Series.corr (89b9503)
- Support Series.map (8fab755)
- Support for np.add, np.subtract, np.multiply, np.divide, np.power (8fab755)
- Support MultiIndex for DataFrame columns (a32b747)
- Use pandas.Index for column labels (a32b747)
- Use default session and connection in ml.llm and ml.imported (8fab755)
- Add error message to set_index (a32b747)
- Align column names with pandas in DataFrame.agg results (89b9503)
- Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined (89b9503)
- Check for IAM role on the BigQuery connection when initializing a remote_function (89b9503)
- Check that types are specified in read_gbq_function (a32b747)
- Don't use query cache for Session construction (a32b747)
- Include survey link in abstract NotImplementedError exception messages (89b9503)
- Label temp table creation jobs with source=bigquery-dataframes-temp label (89b9503)
- Make X_train argument names consistent across methods (8fab755)
- Raise AttributeError for unimplemented pandas methods (89b9503)
- Raise exception for invalid function in read_gbq_function (a32b747)
- Support spaces in column names in DataFrame initializer (89b9503)
- Add local cache for __repr_*__ methods (a32b747)
- Lazily instantiate client library objects (89b9503)
- Use row_number() filter for head / tail (8fab755)
- Add ML section under Overview (a32b747)
- Add release status to table of contents (a32b747)
- Add samples and best practices to read_gbq docs (a32b747)
- Correct the return types of Dataframe and Series (a32b747)
- Create subfolders for notebooks (a32b747)
- Fix link to GitHub (89b9503)
- Highlight bigframes is open-source (a32b747)
- Sample ML Drug Name Generation notebook (a32b747)
- Set options.bigquery.project in sample code (89b9503)
- Transform remote function user guide into sample code (a32b747)
- Update remote function notebook with read_gbq_function usage (8fab755)
- Add KMeans.cluster_centers_.
- Allow column labels to be any type handled by bq df, column labels can be integers now.
- Add dataframegroupby.agg().
- Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
- Add match, fullmatch, get, pad str methods.
- Add series isin function.
- Update ML package to use sessions for queries.
- Optimize read_gbq with index_col set to cluster by index_col.
- Raise ValueError if the location mismatched.
- read_gbq no longer uses 'time travel' with query inputs.
- Add docstring to _uniform_sampling to avoid user using it.
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
- Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
- Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.