Skip to content

chore: Remove pandas-like specific check_column_names_are_unique util#2749

Merged
MarcoGorelli merged 1 commit intomainfrom
chore/single-check_column_names_are_unique
Jun 28, 2025
Merged

chore: Remove pandas-like specific check_column_names_are_unique util#2749
MarcoGorelli merged 1 commit intomainfrom
chore/single-check_column_names_are_unique

Conversation

@FBruzzesi
Copy link
Member

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Spotted by @dangotbanned, see comment

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

len_unique_columns = len(set(columns))
try:
len_unique_columns = len(set(columns))
except TypeError as exc: # pragma: no cover
Copy link
Member Author

@FBruzzesi FBruzzesi Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set on unhashable type raises a TypeError:

set(([1], [2]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 set(([1], [2]))

TypeError: unhashable type: 'list'

Should we raise a ValueError as before instead?

Copy link
Member

@dangotbanned dangotbanned Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give an example of how we'd get columns like that in the places the function is currently used?

import pandas as pd

>>> pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=([1], [2], [3]))
TypeError: unhashable type: 'list'

AFAIK, pandas only supports Hashable column "names" - so I'm a little confused 🤔

We're checking an existing NativeFrame.columns - so I would have thought that condition is not reachable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is, there's a test which gets there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is, there's a test which gets there

Is there?

Why do we have a # pragma: no cover?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a pandas version-specific thing - then a comment explaining the no cover would help

I made the suggestion in (#2511 (comment)) based on the fact this appears unreachable

@FBruzzesi FBruzzesi requested a review from dangotbanned June 28, 2025 12:44
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli MarcoGorelli merged commit 3b9247e into main Jun 28, 2025
34 checks passed
@MarcoGorelli MarcoGorelli deleted the chore/single-check_column_names_are_unique branch June 28, 2025 13:03
dangotbanned added a commit that referenced this pull request Jun 28, 2025
Follow-up to #2749 (comment)

If CI can catch this, then I'll add a comment explaining where this is needed
MarcoGorelli pushed a commit that referenced this pull request Jun 28, 2025
chore: Do we really have coverage?

Follow-up to #2749 (comment)

If CI can catch this, then I'll add a comment explaining where this is needed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants