On master (commit 40b4bb4):
>>> data = """A
1
CAT
3"""
>>> f = lambda x: x
>>> read_csv(StringIO(data), na_values='CAT', converters={'A': f}, engine='c')
A
0 1
1 CAT
2 3
>>> read_csv(StringIO(data), na_values='CAT', converters={'A': f}, engine='python')
A
0 1
1 NaN
2 3

I expect both engines to give the same output. I believe the Python engine's output is the more correct one, because it respects na_values and the C engine does not. I thought the simple fix would be to remove the continue statement here, but that causes test failures, so a more involved refactoring is probably needed to align the order of converter application, NaN value conversion, and dtype conversion.
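To make the ordering problem concrete, here is a toy model of the two behaviours in plain Python. It is only a sketch and not pandas internals: `parse_column` and its `skip_na_after_converter` flag are hypothetical names. It also assumes that the NaN substitution runs after the converter, which the identity converter above cannot distinguish from the reverse order.

```python
import math


def parse_column(raw_values, na_values, converter=None, skip_na_after_converter=False):
    """Toy model of per-column post-processing (hypothetical, not pandas code)."""
    values = list(raw_values)
    if converter is not None:
        values = [converter(v) for v in values]
        if skip_na_after_converter:
            # Models an early exit after the converter: na_values is never consulted.
            return values
    # Replace any value listed in na_values with NaN.
    return [math.nan if v in na_values else v for v in values]


raw = ["1", "CAT", "3"]
identity = lambda x: x

# Mirrors the C engine result reported above: 'CAT' survives.
c_like = parse_column(raw, {"CAT"}, converter=identity, skip_na_after_converter=True)

# Mirrors the Python engine result reported above: 'CAT' becomes NaN.
python_like = parse_column(raw, {"CAT"}, converter=identity)

print(c_like)
print(python_like)
```

In this model, dropping the early return makes the two paths agree. In the real parser the dtype conversion step also interacts with this, which may be why the one-line fix breaks tests.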
IMO this should be added to #12686, as this is a difference in behaviour between the two engines.
xref #5232