Skip to content

Conversation

@bkietz
Copy link
Member

@bkietz bkietz commented Oct 14, 2020

No description provided.

@github-actions
Copy link

@jorisvandenbossche
Copy link
Member

Thanks! So this was already fixed in partition type simplification PR (#8367). The added tests look good.

Now, falling back to string is certainly better as raising an error. But I am wondering: do we want to do more inference than just int32 and string? Could we infer int64 here? That might quickly get complicated / a can of worms ?

@bkietz
Copy link
Member Author

bkietz commented Oct 15, 2020

I think it's not worthwhile to fallback to int64. The only use cases I can think of for long integers in a partition column are

  • A low cardinality* set of externally derived identifiers which happen to be numeric. Arithmetic operations on such identifiers would not be meaningful, so there is no benefit to making that column integral
  • Year stored as nanoseconds since the epoch (or similar int-stored-as-multiple, int-stored-with-offset situation). In this situation I'd advise the user to derive a new column which stores the year more meaningfully

* If the column has high cardinality then it's not really a candidate for partitioning in the first place.

@jorisvandenbossche
Copy link
Member

to derive a new column which stores the year more meaningfully

But to derive that, you might need to cast the numbers/strings first to ints, to be able to do the calculation to timestamps?

But anyway, your arguments are convincing that this is probably not a typical use case. And, the user can still provide a manual schema is they really want to). So this is fine for me!

kszucs pushed a commit that referenced this pull request Oct 19, 2020
…alls back to string

Closes #8462 from bkietz/10145-Integer-like-partition-fi

Authored-by: Benjamin Kietzman <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@bkietz bkietz deleted the 10145-Integer-like-partition-fi branch October 30, 2020 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants