nullable ints interpreted as floats #1

Open
kokes opened this issue Jun 2, 2020 · 1 comment
kokes commented Jun 2, 2020

While pandas supports nullable ints via extension arrays, they are still not the default when reading data in, so a column of nullable ints easily ends up as float64. Memory could be saved by converting those floats to the nullable int types.

This depends on being able to accurately detect that those floats can be converted back to ints without loss (or at least without much loss, since only some floats map onto ints exactly). Then again, some precision was already lost in the automatic conversion to floats in the first place, so we'd effectively just be reverting that loss.

In [8]: data = list(range(1000)) + [None]
In [9]: s = pd.Series(data)   # defaults to float64 due to NaN
In [10]: s2 = s.astype('Int16')   # Int16 nullable dtype

In [11]: s.memory_usage()
Out[11]: 8136

In [12]: s2.memory_usage()
Out[12]: 3131
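
A minimal sketch of the kind of losslessness check described above, assuming it only needs to verify that every non-null value is integral and fits within the target width; the function name floats_are_integral is purely illustrative, not part of any existing tool:

import numpy as np
import pandas as pd

def floats_are_integral(s: pd.Series, target: str = "Int16") -> bool:
    """True if every non-null float maps exactly onto an integer that
    fits within the bounds of the target nullable integer dtype."""
    non_null = s.dropna()
    if not (non_null == non_null.round()).all():
        return False  # at least one value has a fractional part
    bounds = np.iinfo(target.lower())  # "Int16" -> numpy's int16 bounds
    return bool(non_null.between(bounds.min, bounds.max).all())

s = pd.Series(list(range(1000)) + [None])  # float64 because of the None
if floats_are_integral(s, "Int16"):
    s2 = s.astype("Int16")  # nullable Int16, as in the session above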
ianozsvald (Owner) commented

Much obliged for the feedback; I've updated the README to note convert_dtypes in Pandas for these. I figure I'll wait for some more feedback from others (especially for how this tool crashes on datasets I haven't considered!) before I make a first round of fixes. Cheers!
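
For reference, a rough illustration of what pandas' convert_dtypes() does with the example above (assuming that is the method the README note refers to); as far as I'm aware it infers nullable Int64 for integral floats with missing values, so a narrower width such as Int16 still needs an explicit astype:

import pandas as pd

s = pd.Series(list(range(1000)) + [None])  # float64 because of the None

# convert_dtypes() re-infers pandas' nullable extension types; integral
# floats with missing values become nullable Int64 instead of float64.
s_nullable = s.convert_dtypes()
print(s_nullable.dtype)  # Int64

# A narrower width such as Int16 still requires an explicit astype.
s_small = s_nullable.astype("Int16")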
