chore: Remove unreachable hash check by dangotbanned · Pull Request #2750 · narwhals-dev/narwhals

dangotbanned · 2025-06-28T13:44:29Z

What type of PR is this? (check all applicable)

Related issues

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

Could you give an example of how we'd get columns like that in the places the function is currently used?
>>> import pandas as pd
>>> pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=([1], [2], [3]))
TypeError: unhashable type: 'list'
AFAIK, pandas only supports Hashable column "names" - so I'm a little confused 🤔

We're checking an existing NativeFrame.columns - so I would have thought that condition is not reachable

Follow-up to #2749 (comment). If CI can catch this, then I'll add a comment explaining where this is needed.

MarcoGorelli · 2025-06-28T13:51:53Z

ah we ended up removing it in https://github.com/narwhals-dev/narwhals/pull/2011/files

MarcoGorelli · 2025-06-28T14:52:13Z

Doesn't hurt to keep it, as per the linked test case? Like this it gives a clearer error message for malformed pandas dfs?

dangotbanned · 2025-06-28T15:08:53Z

> Doesn't hurt to keep it, as per the linked test case? Like this it gives a clearer error message for malformed pandas dfs?

Sorry I'm not sure I understand.

You linked a test that was removed 4 months ago that has the comment:

def test_from_non_hashable_column_name() -> None:
    # This is technically super-illegal
    # BUT, it shows up in a scikit-learn test, so...

If that test is important - we should add it back?

Personally, I can't see why we should add error handling for a test that fails in scikit-learn.
Maybe I could see this as important if we ran scikit-learn's test suite or if we had a test ourselves.

As-is, we just have a check that's added overhead on every .with_native 🤔

MarcoGorelli · 2025-06-28T15:23:27Z

I doubt the overhead is noticeable, a comment with a link to the pandas issue should be fine

FBruzzesi · 2025-06-28T17:36:59Z

@MarcoGorelli I am also having a hard time figuring out how this can be used in practice. I am able to create a dataframe, but any operation I run leads to an error, with the exception of accessing the columns attribute itself:

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '1.1.3'

In [3]: df = pd.DataFrame([[1, 2], [3, 4]], columns=["pizza", ["a", "b"]])

In [4]: df.columns
Out[4]: Index(['pizza', ['a', 'b']], dtype='object')

In [5]: df
Out[5]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File pandas/_libs/hashtable_class_helper.pxi:1709, in pandas._libs.hashtable.PyObjectHashTable.map_locations()

TypeError: unhashable type: 'list'
Exception ignored in: 'pandas._libs.index.IndexEngine._call_map_locations'
Traceback (most recent call last):
  File "pandas/_libs/hashtable_class_helper.pxi", line 1709, in pandas._libs.hashtable.PyObjectHashTable.map_locations
TypeError: unhashable type: 'list'

If that's the case, len(set(...)) would raise the same error, so we don't necessarily need to check for it explicitly as well, at least for now. If scikit-learn eventually starts developing and finds issues, we can change it or add a specific check for pandas 🤔
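A minimal sketch of the point above, using plain Python lists as a stand-in for `df.columns` so that it runs without pandas. The helper name `check_column_names_are_unique` is borrowed from this PR's branch name, and its body is an assumption for illustration, not the narwhals implementation:

```python
from collections import Counter
from collections.abc import Sequence
from typing import Any


def check_column_names_are_unique(columns: Sequence[Any]) -> None:
    """Hypothetical uniqueness check with no explicit hashability test."""
    # set() must hash every element, so an unhashable name such as a list
    # raises TypeError here before any length comparison happens.
    if len(set(columns)) != len(columns):
        duplicates = {name: n for name, n in Counter(columns).items() if n > 1}
        msg = f"Expected unique column names, got duplicates: {duplicates}"
        raise ValueError(msg)


check_column_names_are_unique(["pizza", "pasta"])  # passes silently

try:
    check_column_names_are_unique(["pizza", "pizza"])
except ValueError as exc:
    print(f"ValueError: {exc}")

try:
    # Same shape as the malformed column index in the session above.
    check_column_names_are_unique(["pizza", ["a", "b"]])
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The last call fails inside `set(...)` itself, so a separate hashability check placed ahead of it would only change the error message, not whether an error is raised.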

dangotbanned · 2025-06-28T17:42:03Z

> I doubt the overhead is noticeable, a comment with a link to the pandas issue should be fine

It isn't necessarily the overhead per call that I'm worried about; the issue is that this check runs on most/all method calls, potentially multiple times.
I really feel like a stronger case needs to be made to justify this.
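One way to put a rough number on that concern is to time a uniqueness check with and without an explicit hashability scan across many calls. The thread does not show the exact check being removed, so the `isinstance(..., Hashable)` scan below is an assumed stand-in, and the function names and the 50-column frame are hypothetical:

```python
import timeit
from collections.abc import Hashable

# Hypothetical frame with 50 string column names.
columns = [f"col_{i}" for i in range(50)]


def unique_only(names: list) -> bool:
    # Relies on set() to raise TypeError for unhashable names.
    return len(set(names)) == len(names)


def hashable_then_unique(names: list) -> bool:
    # Assumed stand-in for an explicit hashability check before the set() call.
    if not all(isinstance(name, Hashable) for name in names):
        msg = "Expected hashable column names"
        raise TypeError(msg)
    return len(set(names)) == len(names)


# Simulate the check running on every method call in a long pipeline.
n_calls = 10_000
t_plain = timeit.timeit(lambda: unique_only(columns), number=n_calls)
t_checked = timeit.timeit(lambda: hashable_then_unique(columns), number=n_calls)

print(f"unique only:          {t_plain:.4f}s for {n_calls} calls")
print(f"hashable then unique: {t_checked:.4f}s for {n_calls} calls")
```

Absolute timings depend on the machine and Python version; the comparison is only meant to show how a small per-call cost accumulates when the check sits on a hot path.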

MarcoGorelli · 2025-06-28T17:43:10Z

> I am able to create a dataframe, but any operation I am running, is leading to an error

hmmm ok thanks, that is a sign that such a state is so broken that we shouldn't deal with it

MarcoGorelli

thanks both!

dangotbanned · 2025-06-28T17:45:47Z

> thanks both!

Thanks @MarcoGorelli! 😍

chore: Do we really have coverage?

cf09822

Follow-up to #2749 (comment). If CI can catch this, then I'll add a comment explaining where this is needed.

dangotbanned added the internal label Jun 28, 2025

dangotbanned changed the title ~~chore: Do we really have coverage?~~ chore: Remove unreachable hash check Jun 28, 2025

dangotbanned marked this pull request as ready for review June 28, 2025 13:53

Merge branch 'main' into cov-check_column_names_are_unique

d610182

dangotbanned requested a review from FBruzzesi June 28, 2025 14:46

MarcoGorelli approved these changes Jun 28, 2025


MarcoGorelli merged commit 11343f1 into main Jun 28, 2025
43 of 45 checks passed

MarcoGorelli deleted the cov-check_column_names_are_unique branch June 28, 2025 17:43

MarcoGorelli merged 2 commits into main from cov-check_column_names_are_unique
