Remote IO support in cudf-polars#19921
Remote IO support in cudf-polars#19921rapids-bot[bot] merged 45 commits intorapidsai:branch-25.10from
Conversation
…to use-remote-io-easy-interface
|
/ok to test ef1e0a2 |
| # This works fine when the file has no leading blank lines, | ||
| # but currently we do some file introspection | ||
| # to skip blanks before parsing the header. | ||
| # For remote files we cannot determine if leading blank lines | ||
| # exist, so we're punting on CSV support. | ||
| # TODO: Once the CSV reader supports skipping leading | ||
| # blank lines natively, we can remove this guard. |
|
/ok to test 4e4cb87 |
jameslamb
left a comment
There was a problem hiding this comment.
Right now this PR is adding pytest-httpserver as a hard runtime dependency of cudf-polars, which I think was not your intention. I suggested a fix for that.
dependencies.yaml
Outdated
| - output_types: conda | ||
| packages: | ||
| - cudf-polars==25.10.*,>=0.0.0a0 | ||
| - pytest-httpserver |
There was a problem hiding this comment.
This doesn't belong in this depends_on_cudf_polars list.
- these
depends_on_*lists are only intended to serve one dependency - placing this here makes
pytest-httpservera runtime dependency ofcudf-polars(which I believe you didn't intend)
Please move this to cudf-polars testing dependencies instead:
Line 889 in 8921686
|
/ok to test 4cb3909 |
|
/ok to test 482e70d |
| if not isinstance(src, (os.PathLike, str)): | ||
| raise ValueError("All sources must be of the same type!") | ||
| if not (os.path.isfile(src) or self._is_remote_file_pattern.match(src)): | ||
| if not (os.path.isfile(src) or SourceInfo._is_remote_uri(src)): |
There was a problem hiding this comment.
| if not (os.path.isfile(src) or SourceInfo._is_remote_uri(src)): | |
| if not (os.path.isfile(src) or self._is_remote_uri(src)): |
Nit
There was a problem hiding this comment.
Since _is_remote_uri is a staticmethod, SourceInfo._is_remote_uri is preferred, right?
There was a problem hiding this comment.
I'm not sure if there's a particular preference. Generally using self would still automatically respect a subclass' override of _is_remote_uri which is a nice guarantee.
Probably not worth another commit & CI run if this PR is close to merging, but good to consider in the future.
jameslamb
left a comment
There was a problem hiding this comment.
Approving as the packaging changes look ok to me, thanks for fixing that.
|
/ok to test 30a0594 |
|
/merge |
Description
Checklist