CSV.write should conditionally convert type unstable iterators #1087

Open
robsmith11 opened this issue May 7, 2023 · 0 comments
After trying to save a 6-column DataFrame to a 5 GB CSV file, I had to kill the Julia session after a few minutes because it was swapping heavily on my laptop, which has 16 GB of memory.

As pointed out here [1], the memory usage can be significantly reduced by making the iteration type-stable with the Tables.columntable function. I was able to write my file in a few seconds after making that change.
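
For reference, a minimal sketch of the workaround described above. The DataFrame contents and the file name are made up for illustration; the real table was much larger. Tables.columntable returns a NamedTuple of column vectors, so the column types are visible to the compiler when CSV.write iterates rows.

```julia
using CSV, DataFrames, Tables

# Small stand-in for the real (much longer) 6-column DataFrame; values are illustrative only.
df = DataFrame(id = 1:5, value = rand(5), label = string.('a':'e'))

# Hypothetical output location for this example.
path = joinpath(mktempdir(), "stable.csv")

# Convert to a NamedTuple of column vectors before writing,
# so row iteration inside CSV.write is type-stable.
CSV.write(path, Tables.columntable(df))
```

The only change from the direct call is wrapping the table in Tables.columntable; the output file should be the same either way.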

Since it's quite common to work with narrow but long DataFrames, shouldn't CSV.write check the table's dimensions and decide when to convert to a type-stable table?
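
To make the proposal concrete, here is a rough sketch of what such a check could look like as a user-side wrapper. The function name write_maybe_stable and the cutoff MAX_STABLE_COLS are hypothetical and not part of CSV.jl, and the cutoff value is an arbitrary guess. The assumption is that conversion is worthwhile for narrow tables, while for very wide tables the NamedTuple type would encode every column and could become expensive to compile.

```julia
using CSV, DataFrames, Tables

# Hypothetical cutoff, chosen arbitrarily for illustration; not a CSV.jl setting.
const MAX_STABLE_COLS = 100

# Hypothetical helper, not part of CSV.jl: convert narrow tables to a
# type-stable column table before writing, and pass wide tables through unchanged.
function write_maybe_stable(path, table; kwargs...)
    cols = Tables.columns(table)
    ncols = length(Tables.columnnames(cols))
    source = ncols <= MAX_STABLE_COLS ? Tables.columntable(cols) : table
    return CSV.write(path, source; kwargs...)
end

# Usage with a small illustrative table.
df = DataFrame(a = 1:3, b = ["x", "y", "z"])
out = joinpath(mktempdir(), "narrow.csv")
write_maybe_stable(out, df)
```

Something along these lines inside CSV.write itself would mean users don't need to know about the workaround.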

[1] https://stackoverflow.com/questions/65584387/julia-csv-write-very-memory-inefficient

nickrobinson251 added the improvement, performance, and design labels and removed the improvement label on May 31, 2023