Replace pandas-based LP file writing with polars implementation #496
Conversation
- Remove pandas-based LP writing functions and replace them with the polars versions
- Rename the polars functions to drop the '_polars' suffix for a consistent API
- Create a separate get_printers_scalar() for non-LP interfaces (highspy, gurobi, mosek)
- Update get_printers() to handle polars dataframes for LP writing
- Consolidate the "lp" and "lp-polars" io_api options into the same implementation
- Remove unused imports and clean up the handle_batch function
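For illustration, here is a minimal sketch of what vectorized LP-line formatting with polars can look like. It is not the actual linopy implementation; the column names (`labels`, `lower`, `upper`) and the `format_bounds` helper are assumptions made for the example.

```python
import polars as pl

# Illustrative sketch only (column names and helper are assumptions, not linopy's
# actual code): build one LP "bounds" line per variable with a single vectorized
# polars string expression instead of Python-level row-by-row formatting.
def format_bounds(df: pl.DataFrame) -> pl.Series:
    return df.select(
        pl.concat_str(
            pl.col("lower").cast(pl.Utf8),
            pl.lit(" <= x"),
            pl.col("labels").cast(pl.Utf8),
            pl.lit(" <= "),
            pl.col("upper").cast(pl.Utf8),
            pl.lit("\n"),
        ).alias("line")
    ).get_column("line")
```

The string formatting then runs column-wise in polars' native engine rather than row-by-row in Python, which is where the gain for large problems comes from.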
The replacement comes with a small trade-off for smaller problems, where the pandas-based IO is still a bit faster. For larger problems, however, polars speeds things up significantly. Any opinions @coroa @lkstrp? (Note that the line marking the crossover point in the upper right is a bit off, but you get the message.)
Also tagging @fneum.
Thinking about it, I would say we go for it. We are talking about at most ~7 ms slower in bad configurations (tiny problems), but minutes faster for large problems, and the code gets streamlined.
Agreed! Also the cleanup is nice
The polars migration broke NaN validation because check_has_nulls_polars only checked for null values, not NaN values. In polars, these are distinct concepts. This fix enhances the validation to detect both null and NaN values in numeric columns while avoiding type errors on non-numeric columns. Fixes failing tests in test_inconsistency_checks.py that expected ValueError to be raised when variables have NaN bounds.
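As a rough sketch of the distinction described above (the actual check_has_nulls_polars lives in linopy; this only illustrates the null-vs-NaN handling with a hypothetical function):

```python
import polars as pl

# Minimal sketch, not the actual linopy code: flag a column if it contains nulls,
# or NaNs when it is a float column. In polars, null and NaN are different things,
# and is_nan() on a non-numeric column would raise, hence the dtype guard.
def check_has_nulls(df: pl.DataFrame, name: str) -> None:
    for col in df.columns:
        s = df.get_column(col)
        has_null = s.null_count() > 0
        has_nan = s.dtype in (pl.Float32, pl.Float64) and s.is_nan().any()
        if has_null or has_nan:
            raise ValueError(f"{name} contains missing (null/NaN) values in field '{col}'")
```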
Agree as well, but we should also do a memory comparison (e.g. with a PyPSA-Eur case).
Good point, but no need to worry: we have the slicing logic, which ensures that memory requirements stay bounded.
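For context, a hedged sketch of what such slicing can look like, reusing the `format_bounds` sketch from above; the slice size is an assumption and this is not necessarily how handle_batch is actually implemented:

```python
import polars as pl

SLICE = 5_000_000  # assumed rows per slice

# Hedged sketch of the slicing idea (not linopy's actual handle_batch): process
# the frame in fixed-size slices so that only one slice of formatted strings
# exists at a time; peak memory then scales with the slice size, not the problem size.
def write_bounds_sliced(df: pl.DataFrame, f) -> None:
    for start in range(0, df.height, SLICE):
        chunk = df.slice(start, SLICE)  # cheap view of up to SLICE rows
        f.write("".join(format_bounds(chunk).to_list()))
```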
Maybe someone can do it nevertheless? I am not sure anyone has ever tested the polars writing implementation on a PyPSA-Eur-sized problem (even if it has the same slicing logic).
Does anyone know whether the polars code here stays within the confines of the narwhals compat layer (which is a subset of the full polars API)? Then switching between pandas and polars could easily be made a config option, and atlite would not have to depend on polars; it could remain optional.
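For reference, narwhals wraps a dataframe and exposes a polars-like expression subset, so backend-agnostic code can look roughly like the sketch below. This is only a generic illustration with a placeholder column name, not an audit of whether every polars call in this PR falls inside that subset.

```python
import narwhals as nw

# Illustration of the narwhals idea: the same expression code accepts either a
# pandas or a polars frame and returns the same type it was given.
def drop_zero_coeffs(native_df):
    df = nw.from_native(native_df)          # wraps pandas or polars transparently
    out = df.filter(nw.col("coeffs") != 0)  # backend-agnostic expression
    return out.to_native()                  # returns the original frame type
```

If the writer only needs operations covered by that subset, polars could indeed remain an optional dependency.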

Closes # (if applicable).
Changes proposed in this Pull Request
Checklist
A note for the release notes doc/release_notes.rst of the upcoming release is included.