Merged
8 changes: 4 additions & 4 deletions .travis.yml
@@ -33,7 +33,7 @@ install:
script:
- export PATH="$HOME/miniconda/bin:$PATH"

-- python -m pytest modin/dataframe/test/test_dataframe.py
-- python -m pytest modin/dataframe/test/test_concat.py
-- python -m pytest modin/dataframe/test/test_io.py
-- python -m pytest modin/dataframe/test/test_groupby.py
+- python -m pytest modin/pandas/test/test_dataframe.py
+- python -m pytest modin/pandas/test/test_concat.py
+- python -m pytest modin/pandas/test/test_io.py
+- python -m pytest modin/pandas/test/test_groupby.py
4 changes: 2 additions & 2 deletions README.rst
@@ -9,7 +9,7 @@ Modin

|

-*Modin is the parent project of Pandas on Ray*
+*Modin is a library for unifying the way you interact with your data*

Modin can be installed with pip: ``pip install modin``

@@ -24,7 +24,7 @@ Pandas on Ray
|.. code-block:: python |.. code-block:: python |
| | |
| # Normal pandas import | # Pandas on Ray import |
-| import pandas as pd | import modin.dataframe as pd |
+| import pandas as pd | import modin.pandas as pd |
| | |
| df = pd.DataFrame({'col1': [1, 2, 3], | df = pd.DataFrame({'col1': [1, 2, 3], |
| 'col2': [1.0, 2.0, 3.0]}) | 'col2': [1.0, 2.0, 3.0]}) |
4 changes: 2 additions & 2 deletions docs/index.rst
@@ -7,7 +7,7 @@ Modin
<a href="https://github.com/modin-project/modin"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://camo.githubusercontent.com/365986a132ccd6a44c23a9169022c0b5c890c387/68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f72696768745f7265645f6161303030302e706e67" alt="Fork me on GitHub" data-canonical-src="https://s3.amazonaws.com/github/ribbons/forkme_right_red_aa0000.png"></a>
</embed>

-*Modin is the parent project of Pandas on Ray*
+*Modin is a library for unifying the way you interact with your data*

Modin can be installed with pip: ``pip install modin``

@@ -22,7 +22,7 @@ Pandas on Ray
|.. code-block:: python |.. code-block:: python |
| | |
| # Normal pandas import | # Pandas on Ray import |
-| import pandas as pd | import modin.dataframe as pd |
+| import pandas as pd | import modin.pandas as pd |
| | |
| df = pd.DataFrame({'col1': [1, 2, 3], | df = pd.DataFrame({'col1': [1, 2, 3], |
| 'col2': [1.0, 2.0, 3.0]}) | 'col2': [1.0, 2.0, 3.0]}) |
6 changes: 3 additions & 3 deletions docs/pandas_on_ray.rst
@@ -13,7 +13,7 @@ to use Pandas on Ray just like you would Pandas.
.. code-block:: python

# import pandas as pd
-import modin.dataframe as pd
+import modin.pandas as pd

Currently, we have part of the Pandas API implemented and are working toward
full functional parity with Pandas.
@@ -29,7 +29,7 @@ output:

.. code-block:: text

->>> import modin.dataframe as pd
+>>> import modin.pandas as pd

Waiting for redis server at 127.0.0.1:14618 to respond...
Waiting for redis server at 127.0.0.1:31410 to respond...
@@ -39,7 +39,7 @@ output:
View the web UI at http://localhost:8889/notebooks/ray_ui36796.ipynb?token=ac25867d62c4ae87941bc5a0ecd5f517dbf80bd8e9b04218
======================================================================

-Once you have executed ``import modin.dataframe as pd``, you're ready to begin
+Once you have executed ``import modin.pandas as pd``, you're ready to begin
running your pandas pipeline as you were before.

APIs Supported
2 changes: 1 addition & 1 deletion docs/pandas_supported.rst
@@ -716,7 +716,7 @@ this.
List of Other Supported Operations Available on Import
------------------------------------------------------

-If you ``import modin.dataframe as pd`` the following operations are available
+If you ``import modin.pandas as pd`` the following operations are available
from ``pd.<op>``, e.g. ``pd.concat``. If you do not see an operation that
**pandas** enables and would like to request it, feel free to `open an issue`_.
Make sure you tell us your primary use-case so we can make it happen faster!
File renamed without changes.
2 changes: 1 addition & 1 deletion modin/dataframe/concat.py → modin/pandas/concat.py
@@ -37,7 +37,7 @@ def concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False,
if type_check is not None:
raise ValueError("cannot concatenate object of type \"{0}\"; only "
"pandas.Series, pandas.DataFrame, "
-"and modin.dataframe.DataFrame objs are "
+"and modin.pandas.DataFrame objs are "
"valid", type(type_check))

all_series = all(isinstance(obj, pandas.Series)
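The hunk above touches only the module name inside `concat`'s error message, but the surrounding logic is a type check: anything other than a Series or DataFrame is rejected. Below is a stdlib-only sketch of that check, with hypothetical stand-in classes in place of the real pandas/Modin types so it runs without either library installed; it is not Modin's implementation.

```python
# Stand-ins for pandas.Series / modin.pandas.DataFrame (hypothetical,
# so this sketch has no third-party dependencies).
class Series:
    pass

class DataFrame:
    pass

def concat(objs):
    """Reject any object that is not a Series or DataFrame, mirroring
    the ValueError raised in the hunk above."""
    for obj in objs:
        if not isinstance(obj, (Series, DataFrame)):
            raise ValueError(
                'cannot concatenate object of type "{0}"; only '
                "pandas.Series, pandas.DataFrame, "
                "and modin.pandas.DataFrame objs are "
                "valid".format(type(obj)))
    # The real implementation would go on to merge the partitions here.
    return objs
```

The real `concat` also special-cases all-Series inputs, as the following context line shows; this sketch covers only the validation step.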
34 changes: 17 additions & 17 deletions modin/dataframe/dataframe.py → modin/pandas/dataframe.py
@@ -65,22 +65,22 @@ def __init__(self, data=None, index=None, columns=None, dtype=None,
Dict can contain Series, arrays, constants, or list-like
objects.
index (pandas.Index, list, ObjectID): The row index for this
-dataframe.
-columns (pandas.Index): The column names for this dataframe, in
+DataFrame.
+columns (pandas.Index): The column names for this DataFrame, in
pandas Index object.
dtype: Data type to force. Only a single dtype is allowed.
If None, infer
copy (boolean): Copy data from inputs.
Only affects DataFrame / 2d ndarray input
col_partitions ([ObjectID]): The list of ObjectIDs that contain
-the column dataframe partitions.
+the column DataFrame partitions.
row_partitions ([ObjectID]): The list of ObjectIDs that contain the
-row dataframe partitions.
+row DataFrame partitions.
block_partitions: A 2D numpy array of block partitions.
row_metadata (_IndexMetadata):
-Metadata for the new dataframe's rows
+Metadata for the new DataFrame's rows
col_metadata (_IndexMetadata):
-Metadata for the new dataframe's columns
+Metadata for the new DataFrame's columns
"""
if isinstance(data, DataFrame):
self._frame_data = data._frame_data
@@ -2001,7 +2001,7 @@ def filter(self, items=None, like=None, regex=None, axis=None):
axis: axis to filter on

Returns:
-A new dataframe with the filter applied.
+A new DataFrame with the filter applied.
"""
nkw = com._count_not_none(items, like, regex)
if nkw > 1:
@@ -2159,13 +2159,13 @@ def gt(self, other, axis='columns', level=None):
return self._operator_helper(pandas.DataFrame.gt, other, axis, level)

def head(self, n=5):
-"""Get the first n rows of the dataframe.
+"""Get the first n rows of the DataFrame.

Args:
n (int): The number of rows to return.

Returns:
-A new dataframe with the first n rows of the dataframe.
+A new DataFrame with the first n rows of the DataFrame.
"""
if n >= len(self._row_metadata):
return self.copy()
@@ -2254,7 +2254,7 @@ def info_helper(df):
lines = result.split('\n')

# Class denoted in info() output
-class_string = '<class \'modin.dataframe.dataframe.DataFrame\'>\n'
+class_string = '<class \'modin.pandas.dataframe.DataFrame\'>\n'

# Create the Index info() string by parsing self.index
index_string = self.index.summary() + '\n'
@@ -3492,7 +3492,7 @@ def reset_index(self, level=None, drop=False, inplace=False, col_level=0,
Args:
level: Only remove the given levels from the index. Removes all
levels by default
-drop: Do not try to insert index into dataframe columns. This
+drop: Do not try to insert index into DataFrame columns. This
resets the index to the default integer index.
inplace: Modify the DataFrame in place (do not create a new object)
col_level : If the columns have multiple levels, determines which
@@ -4244,13 +4244,13 @@ def swaplevel(self, i=-2, j=-1, axis=0):
"github.com/ray-project/ray.")

def tail(self, n=5):
-"""Get the last n rows of the dataframe.
+"""Get the last n rows of the DataFrame.

Args:
n (int): The number of rows to return.

Returns:
-A new dataframe with the last n rows of this dataframe.
+A new DataFrame with the last n rows of this DataFrame.
"""
if n >= len(self._row_metadata):
return self
@@ -4873,10 +4873,10 @@ def __setitem__(self, key, value):
self.insert(loc=loc, column=key, value=value)

def __len__(self):
-"""Gets the length of the dataframe.
+"""Gets the length of the DataFrame.

Returns:
-Returns an integer length of the dataframe object.
+Returns an integer length of the DataFrame object.
"""
return len(self._row_metadata)

@@ -4899,7 +4899,7 @@ def __iter__(self):
"""Iterate over the columns

Returns:
-An Iterator over the columns of the dataframe.
+An Iterator over the columns of the DataFrame.
"""
return iter(self.columns)

@@ -5260,7 +5260,7 @@ def _copartition(self, other, new_index):
return zip(new_partitions_self, new_partitions_other)

def _operator_helper(self, func, other, axis, level, *args):
-"""Helper method for inter-dataframe and scalar operations"""
+"""Helper method for inter-DataFrame and scalar operations"""
if isinstance(other, DataFrame):
return self._inter_df_op_helper(
lambda x, y: func(x, y, axis, level, *args),
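The `head()` and `tail()` docstrings above both guard with `n >= len(self._row_metadata)` and return the whole frame in that case. A rough sketch of the same semantics over a plain list follows; the partitioned DataFrame and its row metadata are replaced by stdlib types, and the names are illustrative rather than Modin's API.

```python
def head(rows, n=5):
    """First n rows; returns a copy of everything when n covers the
    whole sequence, mirroring the short-circuit in the hunk above."""
    if n >= len(rows):
        return list(rows)
    return rows[:n]

def tail(rows, n=5):
    """Last n rows, with the same whole-sequence short-circuit."""
    if n >= len(rows):
        return list(rows)
    return rows[-n:]
```

In the real class the short-circuit returns `self.copy()` (or `self` for `tail`) so no partition work is done at all; the list copy above plays that role.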
File renamed without changes.
File renamed without changes.
@@ -36,7 +36,7 @@ def __init__(self, dfs=None, index=None, axis=0, lengths_oid=None,
"""Inits a IndexMetadata from Ray DataFrame partitions

Args:
-dfs ([ObjectID]): ObjectIDs of dataframe partitions
+dfs ([ObjectID]): ObjectIDs of DataFrame partitions
index (pandas.Index): Index of the Ray DataFrame.
axis: Axis of partition (0=row partitions, 1=column partitions)

@@ -72,7 +72,7 @@ def _set__lengths(self, lengths):
_lengths = property(_get__lengths, _set__lengths)

def _get__coord_df(self):
-"""Get the coordinate dataframe wrapped by this _IndexMetadata.
+"""Get the coordinate DataFrame wrapped by this _IndexMetadata.

Since we may have had an index set before our coord_df was
materialized, we'll have to apply it to the newly materialized df
@@ -85,7 +85,7 @@ def _get__coord_df(self):
return self._coord_df_cache

def _set__coord_df(self, coord_df):
-"""Set the coordinate dataframe wrapped by this _IndexMetadata.
+"""Set the coordinate DataFrame wrapped by this _IndexMetadata.

Sometimes we set the _IndexMetadata's coord_df outside of the
constructor, generally using fxns like drop(). This produces a modified
10 changes: 5 additions & 5 deletions modin/dataframe/indexing.py → modin/pandas/indexing.py
@@ -24,7 +24,7 @@
and perform lookup/item write.

_LocIndexer and _iLocIndexer is responsible for indexer specific logic and
-lookup computation. Loc will take care of enlarge dataframe. Both indexer
+lookup computation. Loc will take care of enlarge DataFrame. Both indexer
will take care of translating pandas's lookup to Ray DataFrame's internal
lookup.

@@ -145,8 +145,8 @@ def __init__(self, ray_df):
def __getitem__(self, row_lookup, col_lookup, ndim):
"""
Args:
-row_lookup: A pandas dataframe, a partial view from row_coord_df
-col_lookup: A pandas dataframe, a partial view from col_coord_df
+row_lookup: A pandas DataFrame, a partial view from row_coord_df
+col_lookup: A pandas DataFrame, a partial view from col_coord_df
ndim: the dimension of returned data
"""
if ndim == 2:
@@ -218,8 +218,8 @@ def _generate_view(self, row_lookup, col_lookup):
def __setitem__(self, row_lookup, col_lookup, item):
"""
Args:
-row_lookup: A pandas dataframe, a partial view from row_coord_df
-col_lookup: A pandas dataframe, a partial view from col_coord_df
+row_lookup: A pandas DataFrame, a partial view from row_coord_df
+col_lookup: A pandas DataFrame, a partial view from col_coord_df
item: The new item needs to be set. It can be any shape that's
broadcastable to the product of the lookup tables.
"""
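The indexing hunks above describe `__getitem__`/`__setitem__` taking a row lookup and a column lookup — partial views of the coordinate DataFrames — and resolving them against the underlying data. The following is a toy, stdlib-only sketch of that resolution over nested lists; the function names and the scalar-only broadcast are assumptions for illustration, not Modin internals.

```python
def lookup_view(data, row_lookup, col_lookup):
    """Select the 2-D block addressed by the two position lookups,
    the read path sketched by __getitem__ above."""
    return [[data[r][c] for c in col_lookup] for r in row_lookup]

def lookup_assign(data, row_lookup, col_lookup, item):
    """Write a scalar `item` into every addressed cell, a simplified
    version of the broadcastable-item write path in __setitem__."""
    for r in row_lookup:
        for c in col_lookup:
            data[r][c] = item
```

The real indexers additionally translate label-based lookups through the coordinate DataFrames before reaching this positional step; that translation is omitted here.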
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -4,8 +4,8 @@

import pytest
import pandas
-import modin.dataframe as pd
-from modin.dataframe.utils import (
+import modin.pandas as pd
+from modin.pandas.utils import (
to_pandas,
from_pandas
)
@@ -7,8 +7,8 @@
import pandas
import pandas.util.testing as tm
from pandas.tests.frame.common import TestData
-import modin.dataframe as pd
-from modin.dataframe.utils import to_pandas
+import modin.pandas as pd
+from modin.pandas.utils import to_pandas


@pytest.fixture
@@ -1997,7 +1997,7 @@ def test_infer_objects():
@pytest.fixture
def test_info(ray_df):
info_string = ray_df.info()
-assert '<class \'modin.dataframe.dataframe.DataFrame\'>\n' in info_string
+assert '<class \'modin.pandas.dataframe.DataFrame\'>\n' in info_string
info_string = ray_df.info(memory_usage=True)
assert 'memory_usage: ' in info_string

@@ -6,8 +6,8 @@
import sys
import pandas
import numpy as np
-import modin.dataframe as pd
-from modin.dataframe.utils import (
+import modin.pandas as pd
+from modin.pandas.utils import (
from_pandas,
to_pandas)

@@ -5,8 +5,8 @@
import pytest
import numpy as np
import pandas
-from modin.dataframe.utils import to_pandas
-import modin.dataframe as pd
+from modin.pandas.utils import to_pandas
+import modin.pandas as pd
import os
import sqlite3

@@ -3,7 +3,7 @@
from __future__ import print_function

import pytest
-import modin.dataframe as pd
+import modin.pandas as pd


@pytest.fixture
14 changes: 7 additions & 7 deletions modin/dataframe/utils.py → modin/pandas/utils.py
@@ -125,11 +125,11 @@ def _get_nan_block_id(n_row=1, n_col=1, transpose=False):


def _get_lengths(df):
-"""Gets the length of the dataframe.
+"""Gets the length of the DataFrame.
Args:
df: A remote pandas.DataFrame object.
Returns:
-Returns an integer length of the dataframe object. If the attempt
+Returns an integer length of the DataFrame object. If the attempt
fails, returns 0 as the length.
"""
try:
@@ -141,11 +141,11 @@ def _get_widths(df):


def _get_widths(df):
-"""Gets the width (number of columns) of the dataframe.
+"""Gets the width (number of columns) of the DataFrame.
Args:
df: A remote pandas.DataFrame object.
Returns:
-Returns an integer width of the dataframe object. If the attempt
+Returns an integer width of the DataFrame object. If the attempt
fails, returns 0 as the length.
"""
try:
@@ -164,7 +164,7 @@ def _partition_pandas_dataframe(df, num_partitions=None, row_chunksize=None):
into. Has priority over chunksize.
row_chunksize (int): The number of rows to put in each partition.
Returns:
-[ObjectID]: A list of object IDs corresponding to the dataframe
+[ObjectID]: A list of object IDs corresponding to the DataFrame
partitions
"""
if num_partitions is not None:
@@ -342,7 +342,7 @@ def _build_row_lengths(df_row):

@ray.remote
def _build_coord_df(lengths, index):
-"""Build the coordinate dataframe over all partitions."""
+"""Build the coordinate DataFrame over all partitions."""
filtered_lengths = [x for x in lengths if x > 0]
coords = None
if len(filtered_lengths) > 0:
@@ -462,7 +462,7 @@ def decorator(cls):

@ray.remote
def _reindex_helper(old_index, new_index, axis, npartitions, *df):
-"""Reindexes a dataframe to prepare for join/concat.
+"""Reindexes a DataFrame to prepare for join/concat.

Args:
df: The DataFrame partition
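Per the `_partition_pandas_dataframe` docstring above, `num_partitions` has priority over `row_chunksize`. A hypothetical, stdlib-only sketch of that chunking rule, with plain lists standing in for DataFrame rows (function name and list representation are assumptions for illustration):

```python
def partition_rows(rows, num_partitions=None, row_chunksize=None):
    """Split `rows` into contiguous chunks. If num_partitions is given
    it wins, as documented above: derive row_chunksize from it using
    ceiling division so every row lands in some partition."""
    if num_partitions is not None:
        row_chunksize = -(-len(rows) // num_partitions)  # ceil division
    return [rows[i:i + row_chunksize]
            for i in range(0, len(rows), row_chunksize)]
```

Note the last chunk may be shorter than the rest; this is why the library tracks per-partition lengths (`_build_row_lengths`, `_get_lengths`) rather than assuming uniform chunks.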