Releases: mars-project/mars

v0.5.3

24 Oct 04:55
1349072

These are the release notes for v0.5.3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add DataFrame.to_parquet support (#1653)

Enhancements

  • Optimize memory usage for brute-force algorithm in NearestNeighbors (#1648)

Bug fixes

  • Fix the wrong dtypes of DataFrameSetitem's inputs (#1627)
  • Fix issue that output_type does not take effect for df.apply (#1628)
  • Fix registration for DataFrameSetLabel operand (#1633)
  • Eliminate TimeoutError when there are running nodes (#1639)
  • Fix serialization failure of transpose when the input has an unknown shape (#1638)
  • Fix PSRS error when chunks have fewer rows than the partition number (#1644)
  • Fix md.concat occupying a huge amount of memory on the client when all DataFrames have a large RangeIndex (#1651)

v0.6.0a3

01 Oct 01:43
8146d1b
Pre-release

These are the release notes for v0.6.0a3. See here for the complete list of solved issues and merged PRs.

Highlights

  • A brand-new API, fetch_log, lets users in a distributed environment fetch the logs output by custom functions with no extra effort on the client side. For more details, refer to #1564.

New Features

  • DataFrame
    • Implements df.rebalance() (#1572)
    • Add support for {DataFrame,Series}.{where,mask} (#1577)
    • Add read_parquet support (#1576)
    • Add DataFrame.isin support (#1584)
    • Implements DataFrame.stack (#1591)
    • Implements {DataFrame,Series,GroupBy}.{all,any} (#1600)
    • Add support for Pearson correlation coefficients (corr, corrwith and autocorr) (#1587)
  • Learn
  • Deployment
    • Support rescaling worker numbers in Kubernetes (#1571)
  • Others
    • Implements fetch_log API (#1574)

Bug fixes

  • Fix the failure when fetching the result of Series.sum (#1583)
  • Fix the failure of DataFrame reduction operators (#1589)
  • Fix error on fitting LGBMModel twice (#1598)
  • Fix train_test_split when some input is Series (#1610)
  • Fix build_faiss_index when some index type cannot be merged (#1609)
  • Allow LightGBM wrapper to use numpy arrays (#1607)
  • Add an extra sort key in PSRS to make pivots distinct (#1612)
  • Fix md.read_csv when dtypes are not inferred correctly (#1606)
  • Fix Ray 1.0 compatibility (#1620)
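
For readers unfamiliar with PSRS (parallel sorting by regular sampling), the pivot fix listed above can be illustrated with a small pure-Python sketch. This is not Mars's implementation: the helper names psrs_pivots and partition_sizes, the sampling rule, and the (value, chunk index, row index) tie-breaker are all assumptions chosen for illustration. The sketch only shows why heavily duplicated keys can yield identical pivots, and how an extra sort key might make them distinct.

```python
from bisect import bisect_right


def psrs_pivots(chunks, n_partitions):
    """Pick n_partitions - 1 pivots by regular sampling (illustrative only)."""
    samples = []
    for chunk in chunks:
        ordered = sorted(chunk)
        # Take up to n_partitions regularly spaced samples from each sorted chunk.
        step = max(len(ordered) // n_partitions, 1)
        samples.extend(ordered[::step][:n_partitions])
    samples.sort()
    # Choose regularly spaced pivots from the pooled samples.
    step = max(len(samples) // n_partitions, 1)
    return [samples[min(i * step, len(samples) - 1)] for i in range(1, n_partitions)]


def partition_sizes(chunks, pivots):
    """Count how many items fall into each partition defined by the pivots."""
    sizes = [0] * (len(pivots) + 1)
    for chunk in chunks:
        for item in chunk:
            sizes[bisect_right(pivots, item)] += 1
    return sizes


# Hypothetical data: almost every row shares the same sort key.
chunks = [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 1]]

# Sorting on the value alone yields duplicate pivots and one oversized partition.
plain_pivots = psrs_pivots(chunks, 3)
plain_sizes = partition_sizes(chunks, plain_pivots)

# Assumed tie-breaker: extend each key with (chunk index, row index).
keyed_chunks = [
    [(value, c, r) for r, value in enumerate(chunk)]
    for c, chunk in enumerate(chunks)
]
keyed_pivots = psrs_pivots(keyed_chunks, 3)
keyed_sizes = partition_sizes(keyed_chunks, keyed_pivots)

print(plain_pivots, plain_sizes)  # [5, 5] [1, 0, 11]
print(keyed_pivots, keyed_sizes)  # [(5, 0, 2), (5, 1, 2)] [3, 4, 5]
```

With the plain keys, both pivots are equal, so one partition is empty and another receives almost all rows. With the extended keys, the pivots differ and the rows spread across all three partitions.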

Documentation

  • Add docs about reading data from HDFS (#1619)

v0.5.2

30 Sep 17:00
2732b1b

These are the release notes for v0.5.2. See here for the complete list of solved issues and merged PRs.

Highlights

  • A brand-new API, fetch_log, lets users in a distributed environment fetch the logs output by custom functions with no extra effort on the client side. For more details, refer to #1564.

New Features

  • DataFrame
    • Implements df.rebalance() (#1573)
    • Add support for {DataFrame,Series}.{where,mask} (#1579)
    • Add read_parquet support (#1581)
    • Add DataFrame.isin support (#1592)
    • Implements DataFrame.stack (#1594)
    • Add support for {DataFrame,Series,GroupBy}.{all,any} (#1601)
    • Add support for Pearson correlation coefficients (corr, corrwith and autocorr) (#1616)
  • Others
    • Implements fetch_log API (#1582)

Bug fixes

  • Fix the failure when fetching the result of Series.sum (#1585)
  • Fix failures of DataFrame reduction operators (#1595)
  • Fix error on fitting LGBMModel twice (#1599)
  • Add an extra sort key in PSRS to make pivots distinct (#1613)
  • Fix build_faiss_index when some index type cannot be merged (#1614)
  • Fix train_test_split when some input is Series (#1615)
  • Fix md.read_csv when dtypes are not inferred correctly (#1617)

v0.6.0a2

14 Sep 03:08
5e2c743
Pre-release

These are the release notes for v0.6.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Allow submitting Mars jobs in custom functions (#1559)
    • Add df.select_dtypes support (#1565, thanks @lipengsh!)
    • Add df.map_chunk support (#1569)
  • Others
    • Support running Mars under KubeDL (#1549)

Enhancements

  • Add configurable label options in Kubernetes cluster (#1547)

Bug fixes

  • Fix md.read_csv for Ray executor (#1541)
  • Allow returning None when using groupby.apply (#1544)
  • Use relative paths to avoid web rendering issues behind reverse proxies (#1540)
  • Fix bug where a numpy array cannot be passed to mt.swapaxes (#1553, thanks @YoshieraHuang!)
  • Fix pandas 1.1.2 compatibility (#1562)
  • Fix compatibility for tsfresh 0.17.0 (#1566)

v0.5.1

14 Sep 04:39
00c12ae

These are the release notes for v0.5.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Allow submitting Mars jobs in custom functions (#1560)
    • Add df.select_dtypes support (#1568, thanks @lipengsh!)
    • Add df.map_chunk support (#1570)

Enhancements

  • Add configurable label options in Kubernetes cluster (#1550)

Bug fixes

  • Use relative paths to avoid web rendering issues behind reverse proxies (#1545)
  • Allow returning None when using groupby.apply (#1548)
  • Fix bug where a numpy array cannot be passed to mt.swapaxes (#1561, thanks @YoshieraHuang!)
  • Fix pandas 1.1.2 compatibility (#1563)
  • Fix compatibility for tsfresh 0.17.0 (#1567)

v0.5.0

29 Aug 09:32
0352988

These are the release notes for v0.5.0. See here for the complete list of solved issues and merged PRs.

These release notes cover only the differences from v0.5.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases.

New Features

  • DataFrame
    • Support use_arrow_dtype for md.read_csv and md.read_sql (#1495)
  • Others
    • Store and query graph information in batch (#1504)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1493)
  • Add an option to control writing shuffle data to disk (#1516)
  • Unify logic of modes including eager, kernel and build (#1530)
  • Optimize mars.learn.cluster.KMeans when n_clusters is relatively large (#1536)

Bug fixes

  • Update mt.split to support list and tuple (#1509, thanks @YoshieraHuang!)
  • Fix pandas 1.1 compatibility (#1515)
  • Fix mt.isclose when some of the arguments are scalars (#1518)
  • Fix mt.linalg.norm when axis is negative (#1519, thanks @YoshieraHuang!)
  • Fix mt.arctan2 when arguments contain a scalar (#1520)
  • Unregister scheduler observer when destroying actors (#1526)
  • Fix creating Mars DataFrame from an empty pandas DataFrame (#1531)
  • Support df.groupby().count() for arrow dtype with and without pyarrow installed (#1532)
  • Fix DataFrame reduction on GPU (#1535)

v0.4.7

29 Aug 09:22
2832802

These are the release notes for v0.4.7. See here for the complete list of solved issues and merged PRs.

New Features

  • Store and query graph info in batch (#1503)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1494)
  • Add an option to control writing shuffle data to disk (#1527)

Bug fixes

  • Fix arrow_array_to_objects when input is a Series whose index is not RangeIndex(n) (#1496)

Tests

  • Pin the statsmodels version (#1537)

v0.6.0a1

29 Aug 04:39
6d253b8
Pre-release

These are the release notes for v0.6.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support use_arrow_dtype for md.read_csv and md.read_sql (#1491)
  • Others
    • Store and query graph information in batch (#1501)
    • Integrate with Ray (#1508)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1490)
  • Add an option to control writing shuffle data to disk (#1513)
  • Unify logic of modes including eager, kernel and build (#1528)
  • Optimize mars.learn.cluster.KMeans when n_clusters is relatively large (#1511)

Bug fixes

  • Fix mt.linalg.norm when axis is negative (#1499, thanks @YoshieraHuang!)
  • Fix mt.isclose when some of the arguments are scalars (#1498)
  • Fix mt.arctan2 when arguments contain a scalar (#1502)
  • Update mt.split to support list and tuple (#1507, thanks @YoshieraHuang!)
  • Fix pandas 1.1 compatibility (#1437)
  • Fix creating Mars DataFrame from an empty pandas DataFrame (#1522)
  • Unregister scheduler observer when destroying actors (#1525)
  • Support df.groupby().count() for arrow dtype with and without pyarrow installed (#1523)
  • Fix DataFrame reduction on GPU (#1534)

Documentation

v0.4.6

16 Aug 04:46
87aa9fb

These are the release notes for v0.4.6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements md.qcut (#1473)
    • Implements {DataFrame, Series}.reindex (#1483)
    • Add support for ArrowListDtype as well as ArrowListArray (#1487)

Enhancements

  • Serialize results in workers before storing them in shared storage (#1474)
  • Raise a timeout when assignment keeps failing for a long time (#1477)
  • Fix pickling arrow types & allow specifying parallel number in IO runners (#1482)

Bug fixes

  • Support ExtensionDtype in df.astype and complex serialization (#1464)
  • Fix incorrect index_value in df.drop() (#1488)

v0.5.0rc1

15 Aug 10:43
50feb68
Pre-release

These are the release notes for v0.5.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements md.qcut (#1468)
    • Implements {DataFrame, Series}.reindex (#1481)
    • Add support for ArrowListDtype as well as ArrowListArray (#1486)

Enhancements

  • Serialize results in workers before storing them in shared storage (#1470)
  • Raise a timeout when assignment keeps failing for a long time (#1475)
  • Use f-strings to replace most string formatting (#1484)

Bug fixes

  • Fix reference cycle in promise.all_ (#1452)
  • Support ArrowStringDtype for DataFrame.sort_values() (#1455)
  • Support serializing complex scalars (#1459)
  • Support ExtensionDtype in df.astype (#1462)
  • Fix incorrect index_value in df.drop() (#1466)
  • Fix pickling arrow types & allow specifying parallel number in IO runners (#1480)