Releases: mars-project/mars

v0.6.7

22 Mar 03:06
9df5d9a

These are the release notes for v0.6.7. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implement mt.insert and mt.delete (#2042)
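
As a rough illustration of what the two new functions do, the sketch below reproduces the 1-D behaviour on plain Python lists. It assumes mt.insert and mt.delete follow numpy.insert and numpy.delete; insert_1d and delete_1d are hypothetical helper names used only here, not Mars APIs.

```python
def insert_1d(values, index, new_values):
    """Return a new list with new_values placed before position index."""
    values = list(values)
    return values[:index] + list(new_values) + values[index:]


def delete_1d(values, indices):
    """Return a new list with the positions listed in indices removed."""
    drop = set(indices)
    return [v for i, v in enumerate(values) if i not in drop]


original = [1, 2, 3, 4]
inserted = insert_1d(original, 1, [9, 9])  # new list; original is left untouched
deleted = delete_1d(original, [0, 2])      # new list; original is left untouched
```

Both helpers return a new sequence rather than modifying the input, which matches the NumPy behaviour this sketch assumes.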

Enhancements

  • Support reusing kubedl cluster by job name (#2036)

Bug fixes

  • Re-enable the from/to vineyard test cases and set meta for tensor/dataframe properly (#2030)
  • Fix wrong results of mt.insert (#2048)
  • Fix mt.insert when the inserted values are a Mars tensor (#2053)

v0.7.0a7

08 Mar 12:21
204dd90
Pre-release

These are the release notes for v0.7.0a7. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame, Series}.pct_change (#2014)
  • Tensor
    • Implements tree arithmetic for tensor addition and multiplication (#2024)
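
The general idea behind tree-style arithmetic can be sketched as follows: many operands are combined pairwise, level by level, rather than in one long left-to-right chain. This is an assumption about what "tree arithmetic" refers to, not a description of the Mars implementation; tree_reduce and tree_depth are hypothetical names.

```python
import operator


def tree_reduce(func, operands):
    """Combine operands pairwise, level by level, instead of left to right."""
    level = list(operands)
    if not level:
        raise ValueError("tree_reduce() needs at least one operand")
    while len(level) > 1:
        # Pair up neighbours; an odd trailing operand is carried to the next level.
        paired = [func(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])
        level = paired
    return level[0]


def tree_depth(n_operands):
    """Number of pairwise levels needed to combine n_operands values."""
    depth = 0
    while n_operands > 1:
        n_operands = (n_operands + 1) // 2
        depth += 1
    return depth


total = tree_reduce(operator.add, range(1, 9))        # same value as sum(range(1, 9))
product = tree_reduce(operator.mul, [1, 2, 3, 4, 5])  # same value as 1*2*3*4*5
```

With eight operands a left-to-right chain has seven dependent steps, while the pairwise tree needs only tree_depth(8) levels, and the operations within one level are independent of each other.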

Project Galois

  • Oscar
    • Add support for batch interfaces for actors (#2013)
    • [oscar] Add cancel support, optimize error handling, add kill_actor API (#2027)
  • Service
    • Add initial service implementations (#2010)

Enhancements

  • Use mmap files to reduce memory usage in proxima builder (#1866)
  • Support setting column with different index for DataFrame (#2020)

Bug fixes

  • Fix errors when calling where() on reshape results (#2011)
  • Fix log error when yielding to another remote (#2022)

v0.6.6

07 Mar 12:04
c0c0faf

These are the release notes for v0.6.6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame, Series}.pct_change (#2015)
  • Tensor
    • Implements tree arithmetic for tensor addition and multiplication (#2028)
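
For reference, the quantity computed by pct_change can be sketched in plain Python. This assumes the Mars method follows the pandas definition (fractional change relative to the element a given number of periods earlier); pct_change_sketch is a hypothetical name, and None stands in for the NaN that pandas would report for the leading elements.

```python
def pct_change_sketch(values, periods=1):
    """Fractional change between each element and the one `periods` steps earlier."""
    if periods < 1:
        raise ValueError("this sketch only handles periods >= 1")
    out = []
    for i, current in enumerate(values):
        if i < periods:
            out.append(None)  # no earlier element to compare against
        else:
            previous = values[i - periods]
            out.append((current - previous) / previous)
    return out


changes = pct_change_sketch([100.0, 110.0, 99.0])
```

Here the second element is a 10% rise over the first and the third is a 10% drop from the second.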

Enhancements

  • Use mmap files to reduce memory usage in proxima builder (#2016)
  • Support setting column with different index for DataFrame (#2025)

Bug fixes

  • Fix IndexError in Series.sort_values when some chunk is empty (#2001)
  • Fix mars crashes on ray >= 1.2.0 (#2003, thanks @fyrestone!)
  • Add errors argument for groupby.sample to ignore errors when group size is less than n (#2007)
  • Fix errors when calling where() on reshape results (#2012)
  • Fix log error when yielding to another remote (#2026)

v0.7.0a6

25 Feb 12:17
5172ff8
Pre-release

These are the release notes for v0.7.0a6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements Index.__getitem__ (#1971)
    • Implements {DataFrame,Series}.sample (#1983)
    • Implements DataFrameGroupBy.sample (#1994)
  • Tensor
    • Implements stats.chisquare (#1974)
    • Implement ttests and gamma functions (#1986)
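
As a reminder of what a chi-square test computes, here is a minimal sketch of the Pearson statistic in plain Python. It assumes stats.chisquare mirrors scipy.stats.chisquare; chisquare_statistic is a hypothetical name, and the sketch returns only the statistic, not the p-value.

```python
def chisquare_statistic(observed, expected=None):
    """Pearson chi-square statistic: sum((observed - expected)**2 / expected)."""
    observed = list(observed)
    if expected is None:
        # With no expected frequencies given, treat all categories as equally likely.
        mean = sum(observed) / len(observed)
        expected = [mean] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


stat = chisquare_statistic([16, 18, 16, 14, 12, 12])
```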

Project Galois

  • Oscar
    • [oscar] Fix actor promise & add tests (#1958)
    • [oscar] Add communication layer for Mars backend (#1989)
    • [oscar] Implements Mars backend for oscar (#1996)
  • Storage
    • [storage][vineyard] Implement storage lib of vineyard backend (#1952, thanks @acezen!)
    • [storage][shared_memory] Add storage backend of multiprocessing.shared_memory (#1969)
    • [storage][cuda] Add cuda backend storage implementation (#1981)
    • [storage][ray] Implements Ray storage (#1992, thanks @fyrestone!)

Enhancements

  • Allow wrapping existing models with Mars class constructors (#1956)
  • Optimize performance of DataFrame.describe() (#1961)
  • Initialize filesystem and aio libs (#1980)

Bug fixes

  • Fix MarsDMatrix when input tensor has unknown chunk shape (#1966)
  • Fix tensor sorting with empty chunks (#1968)
  • Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly. (#1967)
  • Fix ValueError when reducing tensors with empty chunks (#1978)
  • Fix job hang when error message can't be pickled (#1990)
  • Fix IndexError in Series.sort_values when some chunk is empty (#1999)
  • Fix mars crashes on ray >= 1.2.0 (#1998, thanks @fyrestone!)
  • Add errors argument for groupby.sample to ignore errors when group size is less than n (#2002)

v0.6.5

22 Feb 06:22
c06130b

These are the release notes for v0.6.5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements Index.__getitem__ (#1975)
    • Implements {DataFrame,Series}.sample (#1987)
    • Implements DataFrameGroupBy.sample (#1995)
  • Tensor
    • Implements stats.chisquare (#1976)
    • Implement ttests and gamma functions (#1988)

Enhancements

  • Allow wrapping existing models with Mars class constructors (#1957)
  • Optimize performance of DataFrame.describe() (#1962)
  • Initialize filesystem libs (#1982)

Bug fixes

  • Fix tensor sorting with empty chunks (#1973)
  • Fix MarsDMatrix when input tensor has unknown chunk shape (#1970)
  • Fix ValueError when reducing tensors with empty chunks (#1979)
  • Fix job hang when error message can't be pickled (#1993)

Tests

  • Add tests and releases for Python 3.9 (#1955)

v0.7.0a5

01 Feb 13:22
b190fcd
Pre-release

These are the release notes for v0.7.0a5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements DataFrame.{eval,query} (#1898)
    • Implements {DataFrame, Series}.duplicated() (#1907)
    • Implements is_monotonic properties (#1939)
    • Implements {DataFrame,Series}.set_axis (#1950)

Project Galois

  • [oscar] Add actor driver & structure adjustment (#1925)
  • [oscar][ray backend] Actor creation (#1916, thanks @fyrestone!)
  • Add new serializer implementation (#1937)
  • Implement storage lib of Arrow plasma as well as disk (#1904)

Enhancements

  • Allow setting verify_ssl to False in Kubernetes configuration (#1911)
  • Optimize generating mock DataFrames (#1913)
  • Move opcodes out of protobuf definition (#1944)

Bug fixes

  • When writing to vineyard, avoid copying chunks that are already stored in vineyard (i.e. vineyard is the backend) (#1899)
  • Fix rechunk when input tileable has unknown shape (#1912)
  • Fix KeyError when comparing series (#1920)
  • Fix rechunk when chunks have different dtypes that cannot compare (#1922)
  • Collect available ports before running LightGBM task (#1927)
  • Fix KeyError when column pruning is applied (#1929)
  • Fix shuffling data in mars.learn module (#1931)
  • Fix memory estimation of StartTracker for XGBoost (#1934)
  • Fix accuracy_score for distributed execution (#1945)

Tests

  • Add tests and releases for Python 3.9 (#1954)

v0.6.4

30 Jan 06:09
96704f4

These are the release notes for v0.6.4. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements DataFrame.{eval,query} (#1900)
    • Implements {DataFrame, Series}.duplicated() (#1909)
    • Implements is_monotonic properties (#1946)
    • Implements {DataFrame,Series}.set_axis (#1951)

Enhancements

  • Optimize generating mock DataFrames (#1915)
  • Move opcodes out of protobuf definition (#1947)

Bug fixes

  • Fix rechunk when input tileable has unknown shape (#1914)
  • Fix KeyError when comparing series (#1921)
  • Fix rechunk when chunks have different dtypes that cannot compare (#1926)
  • Collect available ports before running LightGBM task (#1927)
  • Fix KeyError when column pruning is applied (#1933)
  • Fix error when shuffling data in mars.learn module (#1936)
  • Fix memory estimation of StartTracker for XGBoost (#1936)
  • Fix accuracy_score for distributed execution (#1948)

v0.7.0a4

17 Jan 12:29
7f1a8b8
Pre-release

These are the release notes for v0.7.0a4. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add more functionalities for md.Index (#1860)
    • Implements {DataFrame,Series}.rename_axis (#1867)

Enhancements

  • Allow internal serialization to use JSON (#1880)
  • Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1878)
  • Optimize performance of df.sort_values().head() (#1884)
  • Support column pruning for groupby().agg() on data sources (#1886)
  • Improve named_{dataframe, series, tensor} so that they can retrieve more meta (#1896)

Bug fixes

  • Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1869)
  • Fix wrongly raised error: Tileable object must be executed first before being fetched (#1872)
  • Fix reshape when input tensor has unknown shape and 1 chunk (#1874)
  • Fix threaded actor operations getting stuck with gevent==20.12.0 (#1879)
  • Fix sorting string columns with None value & sorting with empty chunks (#1891)
  • Adapt vineyardhandler.py to latest vineyard. (#1887)

Documentation

  • LFAI & Data: Add required documents (#1865)

v0.6.3

17 Jan 14:22
236bbcf

These are the release notes for v0.6.3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add more functionalities for md.Index (#1864)
    • Implements {DataFrame,Series}.rename_axis (#1870)

Enhancements

  • Allow internal serialization to use JSON (#1882)
  • Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1883)
  • Optimize performance of df.sort_values().head() (#1888)
  • Support column pruning for groupby().agg() on data sources (#1889)
  • Improve named_{dataframe, series, tensor} so that they can retrieve more meta (#1897)

Bug fixes

  • Fix wrongly raised error: Tileable object must be executed first before being fetched (#1875)
  • Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1876)
  • Fix threaded actor operations getting stuck with gevent==20.12.0 (#1881)
  • Fix sorting string columns with None value & sorting with empty chunks (#1893)

v0.6.2

03 Jan 04:54
d5d8fab

These are the release notes for v0.6.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements head() on groupby objects (#1851)
  • Learn
    • Implements mars.learn.preprocessing.{MinMaxScaler, minmax_scale} (#1858)
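
The transformation behind min-max scaling can be sketched in a few lines of plain Python. This assumes the Mars versions follow the scikit-learn definition, including mapping a constant input to the lower bound of the range; minmax_scale_sketch is a hypothetical name, not the Mars API.

```python
def minmax_scale_sketch(values, feature_range=(0.0, 1.0)):
    """Linearly map values so that min -> feature_range[0] and max -> feature_range[1]."""
    low, high = feature_range
    if low >= high:
        raise ValueError("feature_range must satisfy low < high")
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # Constant input: assumed here to map every value to the lower bound.
        return [low for _ in values]
    return [low + (v - vmin) * (high - low) / span for v in values]


scaled = minmax_scale_sketch([2.0, 4.0, 6.0])
centered = minmax_scale_sketch([2.0, 4.0, 6.0], feature_range=(-1.0, 1.0))
```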

Enhancements

  • Improve Proxima recall_by_id computation method (#1807, thanks @rg070836rg!)
  • Revise to/from vineyard for Tensor and DataFrame (#1806)
  • Add memory estimation for read_parquet as well as read_csv (#1815)
  • Support using compound agg function in lambda (#1819)
  • Add incremental_index argument to reset_index, which defaults to False (#1842)
  • Support running to_pandas in batches for DataFrame and Series (#1859)
  • Support specifying memory scale in kubernetes (#1861)

Bug fixes

  • Fix compatibility for scikit-learn 0.24.0 (#1820)
  • Remove unnecessary iterative tiling when predicting via XGBoost and data from/to parquet (#1821)
  • Resolve KeyError when calling delete_keys for ray backend (#1854)
  • Fix compatibility for pandas 1.2.0 (#1862)