Conversion to pyarrow crashes the interpreter with a core dump when using decimals #17425

Closed
2 tasks done
cpcloud opened this issue Jul 4, 2024 · 4 comments · Fixed by #17445
Labels
A-dtype-decimal Area: decimal data type accepted Ready for implementation bug Something isn't working P-high Priority: high python Related to Python Polars

Comments

cpcloud commented Jul 4, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

In [9]: import polars as pl

In [10]: pl.__version__
Out[10]: '1.0.0'

In [11]: df = pl.DataFrame({"a": [1.0]})

In [12]: df
Out[12]:
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 1.0 │
└─────┘

In [13]: df = df.select(b=pl.col("a").cast(pl.Decimal(76, 38)))

In [14]: df
Out[14]:
shape: (1, 1)
┌─────────────────────────────────┐
│ b                               │
│ ---                             │
│ decimal[76,38]                  │
╞═════════════════════════════════╡
│ 0.9999999999999999774880982345… │
└─────────────────────────────────┘

In [15]: df.to_arrow()
/arrow/cpp/src/arrow/type.cc:1466:  Check failed: (precision) <= (kMaxPrecision)
zsh: abort (core dumped)  ipython

Log output

No response

Issue description

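Polars dumps core when a DataFrame with a Decimal(76, 38) column is converted to pyarrow with to_arrow(). The cast to decimal itself succeeds; the abort happens during the conversion.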

Expected behavior

I would expect an error message, or a successful conversion to pyarrow, really anything but dumping core and crashing the Python interpreter.

Installed versions

--------Version info---------
Polars:               1.0.0
Index type:           UInt32
Platform:             Linux-6.6.36-x86_64-with-glibc2.39
Python:               3.10.14 (main, Mar 19 2024, 21:46:16) [GCC 13.3.0]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            0.18.1
fastexcel:            <not installed>
fsspec:               2024.6.1
gevent:               <not installed>
great_tables:         <not installed>
hvplot:               <not installed>
matplotlib:           3.9.0
nest_asyncio:         1.6.0
numpy:                2.0.0
openpyxl:             <not installed>
pandas:               2.2.2
pyarrow:              16.1.0
pydantic:             2.7.4
pyiceberg:            <not installed>
sqlalchemy:           2.0.31
torch:                <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@cpcloud cpcloud added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Jul 4, 2024
@stinodego stinodego added P-high Priority: high A-dtype-decimal Area: decimal data type and removed needs triage Awaiting prioritization by a maintainer labels Jul 4, 2024
@ritchie46
Member

It seems that we allow a higher precision than pyarrow/Arrow does, and that Arrow aborts the process once its precision check fails.

@ritchie46
Member

We should raise: pyarrow's 16-byte decimal has a maximum precision of 38.
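
For context, the limit of 38 follows from the storage width: a 16-byte decimal keeps its unscaled value in a signed 128-bit integer, and 38 is the largest digit count for which every value still fits. A minimal sketch of that arithmetic, using only the Python standard library (the helper name max_decimal_precision is made up for illustration and is not part of Polars or pyarrow):

```python
def max_decimal_precision(bits: int) -> int:
    """Largest p such that every p-digit integer fits in a signed `bits`-bit integer."""
    largest = 2 ** (bits - 1) - 1  # largest value of a two's-complement integer
    precision = 0
    # 10**(p + 1) - 1 is the largest integer with p + 1 decimal digits.
    while 10 ** (precision + 1) - 1 <= largest:
        precision += 1
    return precision


print(max_decimal_precision(128))  # 38
print(max_decimal_precision(256))  # 76
```

By the same arithmetic, the precision of 76 used in the reproducible example is what a 32-byte (256-bit) decimal could hold, which is too wide for the 16-byte representation.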

@cpcloud
Author

cpcloud commented Jul 4, 2024

Yeah, it looks like they're using C asserts, which tend to call std::abort 😬

@ritchie46
Member

Yeah, given that we cross the FFI boundary, an abort is probably the most sensible thing they can do if they cannot consume the pointers. Will ensure we raise. 👍
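
As a rough illustration of what "ensure we raise" could look like from the user's side, here is a sketch of a precision check that runs before any data crosses the FFI boundary. It is not the actual fix from #17445. DecimalType and check_decimal_precision are hypothetical names, and the check is duck-typed on a precision attribute; the assumption is that a real schema (for example df.schema in Polars) exposes decimal dtypes with such an attribute.

```python
from dataclasses import dataclass

# Limit for the 16-byte decimal, as cited in the thread.
ARROW_DECIMAL128_MAX_PRECISION = 38


@dataclass
class DecimalType:
    """Hypothetical stand-in for a decimal dtype with a precision and a scale."""

    precision: int
    scale: int


def check_decimal_precision(schema, max_precision=ARROW_DECIMAL128_MAX_PRECISION):
    """Raise ValueError if any column's decimal precision exceeds max_precision."""
    for name, dtype in schema.items():
        # Non-decimal dtypes are assumed to have no `precision` attribute.
        precision = getattr(dtype, "precision", None)
        if precision is not None and precision > max_precision:
            raise ValueError(
                f"column {name!r} has decimal precision {precision}, "
                f"which exceeds the maximum of {max_precision}"
            )


# Schema from the reproducible example: a single decimal[76,38] column.
try:
    check_decimal_precision({"b": DecimalType(76, 38)})
except ValueError as exc:
    print(exc)
```

Raising a regular Python exception at this point keeps the interpreter alive and gives the caller a chance to cast to a narrower precision first.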
