[DataFrame] Implements df.as_matrix #2001
Changes from 2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -994,10 +994,31 @@ def test_as_blocks():` |
| | | `def test_as_matrix():` |
| | | `    ray_df = create_test_dataframe()` |
| | | `    test_data = TestData()` |
| | | `    frame = rdf.DataFrame(test_data.frame)` |
| | | `    mat = frame.as_matrix()` |
| | | `    frameCols = frame.columns` |
| | | `    for i, row in enumerate(mat):` |
| | | `        for j, value in enumerate(row):` |
| | | `            col = frameCols[j]` |
| | | `            if np.isnan(value):` |
| | | `                assert np.isnan(frame[col][i])` |
| | | `            else:` |
| | | `                assert value == frame[col][i]` |
| | | `    with pytest.raises(NotImplementedError):` |
| | | `        ray_df.as_matrix()` |
| | | `    # mixed type` |
| | | `    mat = rdf.DataFrame(test_data.mixed_frame).as_matrix(['foo', 'A'])` |
| | | `    assert mat[0, 0] == 'bar'` |
| | | `    df = rdf.DataFrame({'real': [1, 2, 3], 'complex': [1j, 2j, 3j]})` |
| | | `    mat = df.as_matrix()` |
| | | `    assert mat[0, 0] == 1j` |
| | | `    # single block corner case` |
| | | `    mat = rdf.DataFrame(test_data.frame).as_matrix(['A', 'B'])` |
| | | `    expected = test_data.frame.reindex(columns=['A', 'B']).values` |
| | | `    tm.assert_almost_equal(mat, expected)` |
| | | `def test_asfreq():` |
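For readers who want to see the conversion rules this test exercises, here is a minimal sketch using plain pandas rather than the PR's `rdf.DataFrame`. It assumes `DataFrame.to_numpy()` as a stand-in for `as_matrix()`, and the small frames below are made up for illustration. Column order is set explicitly because the position of the `complex` column depends on how the frame orders dict keys.

```python
import numpy as np
import pandas as pd

# Mixed string/float columns: the resulting matrix falls back to object dtype.
mixed = pd.DataFrame({"foo": ["bar", "baz"], "A": [1.0, np.nan]})
mixed_mat = mixed.reindex(columns=["foo", "A"]).to_numpy()

# Integer and complex columns: the integers are upcast to complex.
# The column order is given explicitly so that [0, 0] is the complex column.
cplx = pd.DataFrame(
    {"real": [1, 2, 3], "complex": [1j, 2j, 3j]},
    columns=["complex", "real"],
)
cplx_mat = cplx.to_numpy()

# Selecting a subset of columns, as as_matrix(['A', 'B']) does in the test.
frame = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [5.0, 6.0]})
subset_mat = frame.reindex(columns=["A", "B"]).to_numpy()
```

NaN cells need the `np.isnan` branch in the test because `nan == nan` is false, so a plain equality assertion would fail on them.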
`__array__` does the same thing here. It would be better if this called that under the hood, so that it can be optimized in one place later.

Do you think that, because `as_matrix` has the `columns` kwarg, we should leave `as_matrix` as it is now and use `return self.as_matrix()` for `__array__`? That way it will be optimized in one place, but we can still deal with the columns.

Sure, that works too.

After review, it looks like they do similar things, but `__array__` takes `dtypes` (our implementation is just disregarding that for some reason), and `as_matrix` takes `columns`, so let's just keep them separate for now. I'll add a TODO here also, though.
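
The delegation idea discussed in this thread could look roughly like the sketch below. It is not the PR's code: `FrameSketch` is a hypothetical class wrapping a pandas frame, and both the `dtype` handling in `__array__` and the use of `to_numpy()` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


class FrameSketch:
    """Hypothetical stand-in for the PR's DataFrame; wraps a pandas frame."""

    def __init__(self, data):
        self._df = pd.DataFrame(data)

    def as_matrix(self, columns=None):
        # Select the requested columns (all of them when None), then convert.
        df = self._df if columns is None else self._df.reindex(columns=columns)
        return df.to_numpy()

    def __array__(self, dtype=None, copy=None):
        # Delegate to as_matrix so that any later optimization lives in one place.
        # `copy` is accepted for newer NumPy versions and ignored in this sketch.
        mat = self.as_matrix()
        return mat if dtype is None else mat.astype(dtype)


frame = FrameSketch({"A": [1.0, 2.0], "B": [3.0, 4.0]})
full = np.asarray(frame)                      # goes through __array__
as_f32 = np.asarray(frame, dtype=np.float32)  # dtype is honored, not dropped
only_b = frame.as_matrix(["B"])               # column selection stays on as_matrix
```

Keeping `columns` on `as_matrix` and `dtype` on `__array__` reflects the difference noted above, while still sharing a single conversion path.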