Add functions to compare Column objects with iterable references and to compare DataFrame objects with mapping references #66
nice idea! do we want to check the data type too?
force-single-line = true

[tool.black]
line-length = 90
To sync with pre-commit.
import dataframe_api_compat.pandas_standard
import dataframe_api_compat.polars_standard

DType = TypeVar("DType")
Looks unused, can return if needed.
def test_column_sorted_indices_ascending(library: str) -> None:
    df = integer_dataframe_6(library).persist()
I removed the .persist() call in several places because the new comparison functions make the same call. Calling it twice emits a warning, which the repository settings turn into an error. If removing it is incorrect, then we need a public way to check the ._is_persisted field, so that the method is not called more than once.
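To illustrate the warning-to-error interaction described above, here is a minimal sketch. `FakeFrame`, its `persist` method, and `compare` are hypothetical stand-ins, not the library's API. The assumptions are that a second `persist()` call emits a `UserWarning` and that the test configuration promotes warnings to errors (as pytest's `filterwarnings = error` setting does).

```python
import warnings


class FakeFrame:
    """Hypothetical stand-in for a dataframe that warns when persisted twice."""

    def __init__(self) -> None:
        self._is_persisted = False

    def persist(self) -> "FakeFrame":
        if self._is_persisted:
            warnings.warn("persist() called more than once", UserWarning, stacklevel=2)
        self._is_persisted = True
        return self


def compare(frame: FakeFrame) -> None:
    """Stand-in for a comparison helper that persists its input itself."""
    frame.persist()


def second_persist_raises() -> bool:
    """Return True if persisting before compare() raises under 'error' filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # promote every warning to an exception
        frame = FakeFrame().persist()   # the call the tests used to make
        try:
            compare(frame)              # the helper persists a second time
        except UserWarning:
            return True
    return False
```

Under these assumptions, dropping the explicit `.persist()` in the test means the helper's call is the only one, so no warning is raised.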
    pd.testing.assert_frame_equal(result_pd, expected)
    expected = {"a": [1, 2, 3], "b": [4, 5, 6], "result": [1.0, 32.0, 729.0]}
    expected_dtype = {"a": ns.Int64, "b": ns.Int64, "result": ns.Float64}
    compare_dataframe_with_reference(result, expected, expected_dtype)  # type: ignore[arg-type]
I don't know exactly why mypy reports an error in some places that has to be silenced; it looks like a false positive. My first guess is that the lists inside the dictionaries have different element types, for example int and float (the value type is not homogeneous).
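One possible way to avoid such an ignore, sketched under the assumption that the helper accepts a mapping from column names to lists: annotate the reference explicitly, so that mypy does not have to infer a common element type from lists of `int` and lists of `float`. The annotation relies on mypy accepting `int` where `float` is expected. Whether this removes the error in the actual tests is not verified here, and `column_names` is a hypothetical helper used only for illustration.

```python
from typing import Mapping

# Explicit annotation: every column is declared as a list of floats.
# mypy accepts int literals where float is expected, so the integer
# columns type-check under this annotation as well.
expected: Mapping[str, list[float]] = {
    "a": [1, 2, 3],
    "b": [4, 5, 6],
    "result": [1.0, 32.0, 729.0],
}


def column_names(reference: Mapping[str, list[float]]) -> list[str]:
    """Return the column names of a reference mapping, in insertion order."""
    return list(reference)
```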
  if dtype == "Float32":
      return Namespace.Float32()
- if dtype == "bool":
+ if dtype in ("bool", "boolean"):
I discovered this by accident while experimenting. It may no longer be necessary for the current changes.
"UInt16": "uint16",
"UInt8": "uint8",
"boolean": "bool",
"Float64": "float64",
I also found this by accident: the float type seems to have been missing. If that was done on purpose, I can redo it.
i probably just forgot it - let's add float32 too?
@MarcoGorelli ready for review :)
@MarcoGorelli friendly ping :) A little context: once I have rewritten the tests in a backend-independent manner, I will try to integrate Modin into your repository. These preliminary changes are needed to avoid code duplication.
MarcoGorelli
left a comment
awesome!
sorry it took a while to get to
just got two minor comments, but this is great
if not hasattr(dtype, "startswith"):
    dtype = str(dtype)
is it possible to do this in a less hacky way?
We can try to use the name attribute if it exists.
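A minimal sketch of that suggestion, assuming the goal is only to obtain a string for a dtype-like object. `dtype_to_str` is a hypothetical helper name, not something from this PR; it prefers a string-valued `name` attribute and falls back to `str(dtype)` when none exists.

```python
from typing import Any


def dtype_to_str(dtype: Any) -> str:
    """Return a string name for a dtype-like object.

    Hypothetical helper: strings pass through unchanged, objects with a
    string-valued `name` attribute use it, anything else uses str().
    """
    if isinstance(dtype, str):
        return dtype
    name = getattr(dtype, "name", None)
    if isinstance(name, str):
        return name
    return str(dtype)
```

This replaces the `hasattr(dtype, "startswith")` duck-typing check with an explicit `isinstance` test, which states the intent (is it already a string?) more directly.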
@MarcoGorelli there are new deprecation warnings from the new polars release:
FAILED tests/groupby/aggregate_test.py::test_aggregate[polars-lazy] - DeprecationWarning: `pl.count()` is deprecated. Please use `pl.len()` instead.
FAILED tests/groupby/aggregate_test.py::test_aggregate_only_size[polars-lazy] - DeprecationWarning: `pl.count()` is deprecated. Please use `pl.len()` instead.
FAILED tests/groupby/size_test.py::test_group_by_size[polars-lazy] - DeprecationWarning: `count` is deprecated. It has been renamed to `len`.
What should I do in this case?
easiest thing would be to address that in a separate PR, and to set the new polars release as the minimum version (polars is moving quite fast so backwards compatibility is less of a concern there)
Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
  @pytest.mark.skipif(
-     tuple(int(v) for v in pl.__version__.split(".")) < (0, 19, 0),
+     parse(pl.__version__) < Version("0.19.0"),
This will help the tests work with release candidates, such as polars==0.20.6rc1
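A short sketch of why the change helps, assuming the `packaging` library is installed. The helper names `naive_version_tuple` and `is_older_than` are made up for this illustration: splitting on dots and converting each part to `int` fails on a release-candidate suffix, while `packaging.version.parse` understands pre-release segments.

```python
from packaging.version import Version, parse


def naive_version_tuple(version: str) -> tuple[int, ...]:
    """The old approach: split on dots and convert each part to int."""
    return tuple(int(part) for part in version.split("."))


def is_older_than(version: str, minimum: str) -> bool:
    """Compare version strings with packaging, which handles pre-releases."""
    return parse(version) < Version(minimum)


release_candidate = "0.20.6rc1"

try:
    naive_version_tuple(release_candidate)
    naive_failed = False
except ValueError:
    # int("6rc1") is not a valid integer literal
    naive_failed = True
```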
@MarcoGorelli ready for review
MarcoGorelli
left a comment
thanks @anmyachev !
thanks for the review @MarcoGorelli!
The changes aim to remove the use of the `interchange_to_pandas` function, so that the tests are implementation-independent. So far the new functions have only been applied to the `tests\column` folder.