[FEAT] Add dataframe iteration on rows and change default buffer size #2685

Merged
jaychia merged 4 commits into main from jay/defaults-iter on Aug 19, 2024

@jaychia jaychia commented Aug 19, 2024

  1. Changes the default `results_buffer_size` on our dataframe iteration methods from 1 to the total number of CPUs available on the current machine. Users empirically observed that a buffer size of 1 significantly slows down processing.
  2. Adds a new `df.iter_rows()` API, which `__iter__` now uses, and which gives users a way to configure `results_buffer_size` explicitly.
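
To illustrate why the buffer size matters, here is a minimal, self-contained sketch of a bounded results buffer. It is not the code in this PR: `iter_rows_buffered`, `partition_tasks`, and `DEFAULT_BUFFER_SIZE` are hypothetical names, and a single background thread stands in for the real execution engine. The only detail taken from the description above is the default of one buffer slot per available CPU.

```python
import os
import queue
import threading
from typing import Callable, Iterable, Iterator, List

# Assumption for this sketch: one buffer slot per available CPU,
# mirroring the new default described in the PR.
DEFAULT_BUFFER_SIZE = os.cpu_count() or 1


def iter_rows_buffered(
    partition_tasks: Iterable[Callable[[], List[dict]]],
    results_buffer_size: int = DEFAULT_BUFFER_SIZE,
) -> Iterator[dict]:
    """Yield rows from partitions computed ahead of the consumer.

    Hypothetical stand-in, not Daft's implementation. A bounded queue caps
    how many finished partitions may wait for the consumer at once.
    """
    if results_buffer_size < 1:
        raise ValueError("results_buffer_size must be at least 1")

    buffer: queue.Queue = queue.Queue(maxsize=results_buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        try:
            for task in partition_tasks:
                # put() blocks while the buffer is full, so a small buffer
                # stalls upstream work until the consumer catches up.
                buffer.put(task())
        finally:
            # Always signal completion so the consumer never waits forever.
            buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        partition = buffer.get()
        if partition is done:
            return
        yield from partition


# Three toy "partitions" of two rows each.
partitions = [lambda i=i: [{"x": i * 2 + j} for j in range(2)] for i in range(3)]
rows = list(iter_rows_buffered(partitions, results_buffer_size=1))
```

With `results_buffer_size=1` the producer can run at most one partition ahead of the consumer; a larger buffer lets more completed partitions queue up, which is the effect the new default is meant to provide.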

@github-actions github-actions bot added the enhancement New feature or request label Aug 19, 2024
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Aug 19, 2024

jaychia commented Aug 19, 2024

Thread with more context on the change:
https://dist-data.slack.com/archives/C052CA6Q9N1/p1723125121790669

@jaychia jaychia merged commit 9f5aee8 into main Aug 19, 2024
44 checks passed
@jaychia jaychia deleted the jay/defaults-iter branch August 19, 2024 19:47