Lightning on x86_64 will crash when importing an empty table via Parquet #52518
Comments
The panic is caused by an incorrect file size calculation at tidb/pkg/lightning/mydump/loader.go Line 866 in 572e5c4
When a Parquet file contains 0 rows, this line is reached with size := int64(0.0 / 0.0 * 0.0). The result of converting NaN to int64 in Go is architecture-dependent: on arm64 the result is 0, but on x86_64 it is -9223372036854775808 (the minimum int64). This value later flows into tidb/lightning/pkg/importer/import.go Line 1069 in 572e5c4
and is finally fed into Prometheus, which causes the panic at tidb/lightning/pkg/importer/import.go Lines 1092 to 1093 in 572e5c4
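To illustrate the failure mode described above, here is a minimal sketch of a guarded size estimate. The helper name `safeEstimate` and its parameters are hypothetical and are not taken from loader.go; the sketch only shows one way to keep NaN and out-of-range values away from the float-to-int64 conversion, whose result the Go specification leaves implementation-dependent.

```go
package main

import (
	"fmt"
	"math"
)

// safeEstimate is a hypothetical helper, not the code in loader.go.
// It scales a sampled byte count up to the whole file and guards the
// float64-to-int64 conversion, whose result is implementation-dependent
// for NaN and for values outside the int64 range.
func safeEstimate(sampledBytes, sampledRows, totalRows float64) int64 {
	// With an empty file all three inputs are 0, so this is 0/0*0 = NaN.
	est := sampledBytes / sampledRows * totalRows
	if math.IsNaN(est) || est <= 0 {
		return 0
	}
	// Clamp +Inf and anything at or above 2^63 instead of converting it.
	if est >= math.MaxInt64 {
		return math.MaxInt64
	}
	return int64(est)
}

func main() {
	fmt.Println(safeEstimate(0, 0, 0))      // empty Parquet file
	fmt.Println(safeEstimate(1024, 8, 100)) // 1024 bytes sampled over 8 of 100 rows
}
```

Returning 0 for an empty file matches what arm64 happened to produce, and it keeps a negative size from reaching later accounting and metrics code.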
BTW, I think it is too resource-intensive to estimate the Parquet file size through the current implementation of
This means we need to open the file twice to guess the size. For external storage like S3, this means Lightning has to download every Parquet file twice during the "load data source" step. For the customer who triggered this bug, it took 1h42m for a step that should have been just a very quick walk-dir (similar issue for
To get the BTW, I think the uncompressed size in the Parquet header should be usable, but the PR author said no in #46984 (comment). A bit suspicious.
@lance6716 the first open used In any case, perhaps even O(number of files) is still too heavy compared to the later sampledIndexRatio procedure, which is O(number of tables). It should not waste 2 hours before the pre-check is able to run.
/found customer
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
On Amazon Aurora, create an empty table.
Create an AWS Aurora Snapshot (or directly unzip the attached file → account.audits_log.parquet.zip)
Import the snapshot using https://docs.pingcap.com/tidb/stable/migrate-aurora-to-tidb
2. What did you expect to see? (Required)
Import success
3. What did you see instead (Required)
TiDB Lightning panicked with
4. What is your TiDB version? (Required)
CPU: x86_64, OS: Linux