feat: Add `{nw,DataFrame}.from_dicts` #3148

felixgwilliams · 2025-09-20T18:15:27Z

What type of PR is this? (check all applicable)

Related issues

Related issue [Enh]: Add nw.from_dicts to convert a sequence of dictionaries representing rows to a data frame #3145
Closes [Enh]: Add nw.from_dicts to convert a sequence of dictionaries representing rows to a data frame #3145

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

This PR adds a from_dicts function and a matching DataFrame.from_dicts method, which create a data frame from a sequence of dicts that each represent one row. It is available for eager backends only.
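
As a rough illustration of what "a sequence of dicts that represent rows" means, the stdlib-only sketch below transposes row-oriented records into the column-oriented shape that `from_dict` already accepts. The helper name `rows_to_columns` and the choice to fill missing keys with `None` are assumptions made for this example; it is not the narwhals implementation, which delegates to the native backends.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def rows_to_columns(rows: Sequence[Mapping[str, Any]]) -> dict[str, list[Any]]:
    """Transpose row-oriented records into a column-oriented dict.

    Column order follows first appearance. A key that is missing from a row
    is filled with None (an assumption of this sketch, not backend behaviour).
    """
    columns: dict[str, list[Any]] = {}
    for i, row in enumerate(rows):
        # Register keys seen for the first time, back-filling earlier rows.
        for key in row:
            if key not in columns:
                columns[key] = [None] * i
        # Append this row's value (or None) to every known column.
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
columns = rows_to_columns(rows)
```

Here `columns` holds one list per key, so the two rows become `{"a": [1, 2], "b": ["x", "y"]}`, and an empty input yields an empty dict.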

Tracking

fix(python): Widen from_dicts to Iterable[Mapping[str, Any]] pola-rs/polars#24584

FBruzzesi

Thanks @felixgwilliams, this looks promising and is already close to the finish line.

I left a few comments, arguably nitpicks.

A couple of things that are missing:

- Exposing the functionalities in stable.v1
- A test to check for both empty data and no schema resulting in a shape=(0,0) dataframe.

The same as:

def test_from_dict_empty(eager_backend: EagerAllowed) -> None:
    result = nw.DataFrame.from_dict({}, backend=eager_backend)
    assert result.shape == (0, 0)

narwhals/dataframe.py

narwhals/functions.py

narwhals/dataframe.py

- expose from_dicts in v1
- change docstrings for consistency
- add test for empty data without a schema and get it running for polars
- remove native namespace argument so that tests pass
- Not done: replace Sequence[dict[str, Any]] with Sequence[dict[str, PythonLiteral]]

FBruzzesi · 2025-09-21T10:29:30Z

Thanks for all the adjustments @felixgwilliams - I would stall for a moment while waiting for an opinion on #3148 (comment); aside from that, the feature seems ready 👌🏼

Closes pola-rs#24583

Downstream in `narwhals`, we discovered the typing wasn't updated alongside the runtime support added in `1.30.0`

### Related

- pola-rs#22638
- pola-rs#19322
- narwhals-dev/narwhals#3148 (comment)
- narwhals-dev/narwhals#3148 (comment)
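
The gap between typing and runtime support comes down to rows that are a `Mapping` but not a `dict`. A minimal sketch, assuming a constructor that only accepts plain dicts: the hypothetical helper `as_dict_rows` copies non-dict mappings into dicts and leaves real dicts untouched. It is an illustration only, not code taken from either PR.

```python
from collections.abc import Mapping, Sequence
from types import MappingProxyType
from typing import Any


def as_dict_rows(rows: Sequence[Mapping[str, Any]]) -> list[dict[str, Any]]:
    """Return rows as plain dicts, copying only the mappings that are not dicts."""
    return [row if isinstance(row, dict) else dict(row) for row in rows]


# MappingProxyType is a read-only Mapping from the standard library; it is not a dict.
frozen_rows = [MappingProxyType({"a": 1}), MappingProxyType({"a": 2})]
plain_rows = as_dict_rows(frozen_rows)
```

After the call, every element of `plain_rows` is a regular `dict` with the same keys and values as the original mapping.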

felixgwilliams · 2025-09-24T20:29:19Z

I noticed that when the schemas of the dicts are not consistent and schema is not specified, the behavior differs between backends: PyArrow only looks at the first row to establish the schema (see here), whereas pandas uses all the rows and polars seems to use the first 100 by default.

Do you think it's worth highlighting the differences in the docstring or do you think "If not specified, the schema will be inferred by the native library." covers it? I think it's surprising enough to be worth mentioning, but I'm conscious of not wanting to be too verbose.
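
To illustrate why the inference window matters, here is a minimal stdlib-only sketch. `infer_columns` and its `infer_length` parameter are hypothetical names; the function models only which keys fall inside the window, not how each library actually builds its schema or what it does with keys it never saw.

```python
from collections.abc import Mapping, Sequence
from typing import Any, Optional


def infer_columns(
    rows: Sequence[Mapping[str, Any]], infer_length: Optional[int]
) -> list[str]:
    """Collect column names from the first `infer_length` rows (all rows if None)."""
    sample = rows if infer_length is None else rows[:infer_length]
    seen: dict[str, None] = {}  # a dict keeps first-seen order
    for row in sample:
        for key in row:
            seen.setdefault(key, None)
    return list(seen)


rows: list[dict[str, Any]] = [{"a": i} for i in range(150)]
rows[100] = {"a": 100, "b": "late"}  # a new key first appears in the 101st row

first_row_only = infer_columns(rows, 1)    # window reported above for PyArrow
first_hundred = infer_columns(rows, 100)   # default window reported above for polars
all_rows = infer_columns(rows, None)       # window reported above for pandas
```

Only the all-rows window sees the late key `"b"`; the other two windows end before the 101st row and report just `"a"`.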

dangotbanned · 2025-09-25T10:32:17Z

I noticed that when the schemas of the dicts is not consistent and schema is not specified, the behavior is different

pyarrow only looks at the first row to establish the schema (see here)

pandas uses all the rows

polars uses the first 100 by default.

Well spotted!

Do you think it's worth highlighting the differences in the docstring or do you think

"If not specified, the schema will be inferred by the native library."

covers it?
I think it's surprising enough to be worth mentioning, but I'm conscious of not wanting to be too verbose.

Agreed, this does seem like it would be helpful to document, considering:

the JSON context ([Enh]: Add nw.from_dicts to convert a sequence of dictionaries representing rows to a data frame #3145 (comment)) I framed this in
missing keys are common enough to become a typing feature

Someone will get burned by this eventually 😳

What to do?

Docs

Although we could specify these differences in the docstring of from_dicts, I suspect we might also see the same kind of differences in:

DataFrame.from_dict
Series.from_iterable
... maybe others in API: io functions for v2 #2116

If that's the case, then we could benefit from some narrative docs in the user guide.
That way we could do side-by-side comparisons without clogging up the docstrings 😅

Note

Might be best to split that out into a follow-up issue?

dangotbanned · 2025-09-25T10:32:25Z

Tests

I noticed that when the schemas of the dicts is not consistent and schema is not specified, the behavior is different

Add tests for this then 😄

Each should have data with a schema change in row:

2 (xfail pyarrow)
99 (xfail pyarrow)
101 (xfail pyarrow, polars)
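
One possible way to build the input data for those three cases is sketched below. The helper `rows_with_schema_change` is hypothetical, and it assumes that "schema change in row n" means an extra key appears from row n (1-based) onwards; the actual tests may define the change differently.

```python
from typing import Any


def rows_with_schema_change(n_rows: int, change_at: int) -> list[dict[str, Any]]:
    """Build `n_rows` row dicts where key "b" first appears in row `change_at` (1-based)."""
    if not 1 <= change_at <= n_rows:
        raise ValueError("change_at must be between 1 and n_rows")
    return [
        {"a": i, "b": f"b{i}"} if i >= change_at else {"a": i}
        for i in range(1, n_rows + 1)
    ]


# One data set per case: the schema changes in row 2, 99, and 101.
cases = {n: rows_with_schema_change(120, n) for n in (2, 99, 101)}
```

Each data set could then be passed to from_dicts per backend, with the xfail marks applied as listed above.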

I'd recommend using this fixture instead of eager_backend. It's normally meant for heavily parametric tests, but in this case the reason is that the input data is larger than normal.

narwhals/tests/conftest.py

Lines 320 to 323 in 63c5022

@pytest.fixture(params=[el for el in TEST_EAGER_BACKENDS if not isinstance(el, str)])
def eager_implementation(request: pytest.FixtureRequest) -> EagerAllowed:
    """Use if a test is heavily parametric, skips `str` backend."""
    return request.param  # type: ignore[no-any-return]

felixgwilliams · 2025-09-25T10:55:40Z

Add tests for this then 😄

I'll write the tests this evening. Thanks for the pointers 👍.

As for the docstrings, are we happy with leaving them as is and covering the issue separately? Otherwise I wrote some bullet points explaining the differences that I haven't committed yet. I don't really have a good feeling for what is too long for a narwhals docstring.

dangotbanned · 2025-09-25T11:03:40Z

Otherwise I wrote some bullet points explaining the differences that I haven't committed yet. I don't really have a good feeling for what is too long for a narwhals docstring.

I'm not 100% sure if you need more permissions for it, but I'd normally use a suggestion (step 7) for that kinda thing

I'm happy to take a look

dangotbanned · 2025-09-29T09:55:05Z

@felixgwilliams sorry for the delay, I'm hoping to review this later today 🙏

dangotbanned

Thanks for the PR @felixgwilliams

I've only got a few minor suggestions - I think we're pretty much good to go

narwhals/dataframe.py

narwhals/functions.py

tests/frame/from_dicts_test.py

tests/from_dicts_test.py

felixgwilliams · 2025-09-29T17:00:45Z

Thanks @dangotbanned for your helpful suggestions. I'll be sure to remember the tip about comments being part of the code.

dangotbanned

Thanks again @felixgwilliams, welcome aboard 😉

felixgwilliams added 3 commits September 20, 2025 19:15

add placeholders for from_dicts function

592d6fc

add implementation and unit tests

9ea6a97

TODO: examples for docstring

add examples to docstrings copied from from_dict

0e49901

felixgwilliams force-pushed the feat/from_dicts branch from 86400b3 to 0e49901 Compare September 20, 2025 18:15

felixgwilliams marked this pull request as ready for review September 20, 2025 22:11

FBruzzesi changed the title ~~Feat/from dicts~~ feat: Add support for {nw, DataFrame}.from_dicts Sep 21, 2025

FBruzzesi added the enhancement New feature or request label Sep 21, 2025

FBruzzesi reviewed Sep 21, 2025

felixgwilliams added 3 commits September 21, 2025 10:09

remove test for deprecated call in from_dicts_test

a734b9b

add v1 and v2 tests for from_dicts

01bb262

dangotbanned and others added 4 commits September 21, 2025 11:47

refactor: Skip unreachable Unknown

77b3921

- narwhals-dev#2786 - narwhals-dev#3148 (comment)

change data type on from_dicts to Sequence[Mapping[str, Any]] and add…

076a821

… tests for a non-dict mapping

use types.MappingProxyType instead of custom type to test non-dict …

1dde847

…mappings with `from_dicts`

Merge branch 'main' into feat/from_dicts

130427d

This comment was marked as resolved.

dangotbanned mentioned this pull request Sep 23, 2025

typing: from_dicts too narrow Iterable[dict[str, Any]] pola-rs/polars#24583

Closed

2 tasks

handle mappings as dicts in from_dicts for polars <1.30

697bb66

dangotbanned mentioned this pull request Sep 23, 2025

fix(python): Widen from_dicts to Iterable[Mapping[str, Any]] pola-rs/polars#24584

Merged

This comment was marked as resolved.

dangotbanned added 2 commits September 24, 2025 10:42

docs: Move compat info to FROM_DICTS_ACCEPTS_MAPPINGS flag

cbbdc5d

refactor(polars): Simplify empty cases

3b5ff0c

- `pl.DataFrame(schema)` defaults to `None`
- Having a single branch for empty `data`, means we can safely index into it

dangotbanned reviewed Sep 24, 2025

narwhals/_polars/dataframe.py (outdated, resolved)

felixgwilliams and others added 3 commits September 24, 2025 13:48

Update narwhals/_polars/dataframe.py

06b0482

Co-authored-by: Dan Redding <[email protected]>

revert: Avoid introducing deprecated native_namespace

5e1164d

narwhals-dev#3148 (comment)

refactor(typing): Use IntoSchema everywhere

3a2d4a3

I should really update the others soon

dangotbanned added 2 commits September 24, 2025 15:21

docs: Remove *Returns* section

c3e5936

Seems a few of these were missed in narwhals-dev#2981

refactor(suggestion): Make stable functions, aliases

1b37317

Will add an IDE preview

dangotbanned reviewed Sep 24, 2025

narwhals/stable/v1/__init__.py (outdated, resolved)

dangotbanned reviewed Sep 24, 2025

narwhals/dataframe.py (outdated, resolved)

dangotbanned and others added 3 commits September 24, 2025 17:58

test: Merge dict vs Mapping cases

e0cf86d

The ids look like `"polars0-dict"`, `"pyarrow0-mappingproxy"`

test: Use assert_equal_data

4154c81

add a more distinct example to from_dicts docstring

9b12199

dangotbanned reviewed Sep 25, 2025

narwhals/dataframe.py (outdated, resolved)

dangotbanned reviewed Sep 25, 2025

narwhals/functions.py (outdated, resolved)

dangotbanned reviewed Sep 25, 2025

narwhals/dataframe.py (resolved)

Apply suggestions from code review

6603bf5

Co-authored-by: Dan Redding <[email protected]>

felixgwilliams commented Sep 25, 2025

narwhals/dataframe.py (resolved)

felixgwilliams and others added 2 commits September 25, 2025 21:52

tests: test to highlight how eager backends deal with inconsistent di…

868b488

…ctionary keys in from_dicts

Merge branch 'main' into feat/from_dicts

8de7904

dangotbanned reviewed Sep 29, 2025

narwhals/dataframe.py (resolved)

narwhals/functions.py (resolved)

tests/frame/from_dicts_test.py (outdated, resolved)

tests/from_dicts_test.py (outdated, resolved)

dangotbanned added the eager-only label Sep 29, 2025

felixgwilliams and others added 2 commits September 29, 2025 17:57

Apply suggestions from code review

f2b45b6

Co-authored-by: Dan Redding <[email protected]>

add tip on non-uniform keys to nw.from_dicts

a2e3d80

dangotbanned mentioned this pull request Sep 29, 2025

Document differences in schema inference between backends #3164

Open

dangotbanned approved these changes Sep 29, 2025

dangotbanned changed the title ~~feat: Add support for {nw, DataFrame}.from_dicts~~ feat: Add {nw,DataFrame}.from_dicts Sep 29, 2025

dangotbanned merged commit 25447d3 into narwhals-dev:main Sep 29, 2025
28 of 31 checks passed

felixgwilliams deleted the feat/from_dicts branch September 30, 2025 17:21

3 participants
