@gszadovszky
Contributor

Because the sort order of float/double values is ambiguous, the following changes were made to how column indexes are built (write path):

  • No column index is written for a column if any of its min/max values is NaN.
  • -0.0 is used as the min value and +0.0 as the max value, regardless of which 0.0 was actually present in the values.
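
The two rules above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the class `DoubleMinMaxSketch` and the method `normalize` are hypothetical names, and the sketch assumes the per-page min and max have already been computed as plain `double` values.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of parquet-mr.
final class DoubleMinMaxSketch {

    /**
     * Returns an empty Optional when either bound is NaN (meaning: write no
     * column index for this column), otherwise the normalized {min, max} pair.
     */
    static Optional<double[]> normalize(double min, double max) {
        // Rule 1: a NaN bound makes the index unusable, so skip it entirely.
        if (Double.isNaN(min) || Double.isNaN(max)) {
            return Optional.empty();
        }
        // Rule 2: == treats -0.0 and +0.0 as equal, so either zero matches here.
        // A zero min is widened to -0.0 and a zero max to +0.0.
        double normalizedMin = (min == 0.0) ? -0.0 : min;
        double normalizedMax = (max == 0.0) ? +0.0 : max;
        return Optional.of(new double[] {normalizedMin, normalizedMax});
    }
}
```

Widening a zero bound this way keeps the stored range valid for readers that order -0.0 before +0.0 as well as for readers that treat the two as equal.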

@gszadovszky merged commit 1f95eca into apache:column-indexes on Aug 23, 2018
zivanfi pushed a commit that referenced this pull request Oct 18, 2018
This is a squashed feature branch merge including the changes listed below. The detailed history can be found in the 'column-indexes' branch.

* PARQUET-1211: Column indexes: read/write API (#456)
* PARQUET-1212: Column indexes: Show indexes in tools (#479)
* PARQUET-1213: Column indexes: Limit index size (#480)
* PARQUET-1214: Column indexes: Truncate min/max values (#481)
* PARQUET-1364: Invalid row indexes for pages starting with nulls (#507)
* PARQUET-1310: Column indexes: Filtering (#509)
* PARQUET-1386: Fix issues of NaN and +-0.0 in case of float/double column indexes (#515)
* PARQUET-1389: Improve value skipping at page synchronization (#514)
* PARQUET-1381: Fix missing endRecord after merging columnIndex