Fix msg "'<' not supported between instances of 'str' and 'int'" #4236
Druid returns NULL as 0, typed as int. This causes pandas to fail when it tries to sort heterogeneous types.
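The failure in the title can be reproduced in plain Python, without pandas or Druid. The following is a minimal sketch, not the code from this PR: the sample values and the helper names `sort_error` and `as_strings` are made up for illustration, and casting to `str` is only one possible way to homogenize a column.

```python
# Plain-Python reproduction of the comparison failure; the values are made up.
mixed = ["US", 0, "FR"]  # 0 stands in for a NULL that came back typed as int


def sort_error(values):
    """Return the TypeError message raised by sorting, or None if sorting works."""
    try:
        sorted(values)
    except TypeError as exc:
        return str(exc)
    return None


def as_strings(values):
    """Cast every value to str so the column holds a single type."""
    return [str(v) for v in values]


error = sort_error(mixed)                # sorting str and int together fails
homogeneous = sorted(as_strings(mixed))  # sorting works once types agree
print(error)
print(homogeneous)
```

Once every value has the same type, the sort no longer needs to compare a `str` with an `int`, so the error goes away.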
(force-pushed from 631135e to 1e4f6d8)
betodealmeida left a comment:
Should we change pydruid to do this automatically, or maybe convert the zero to None?
    cols += query_obj.get('groupby') or []
    cols += query_obj.get('columns') or []
    cols += query_obj.get('metrics') or []
    cols += groupby
cols = [DTTM_ALIAS] + groupby + columns + metrics
may be a bit cheaper and cleaner
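A minimal sketch of that suggestion, assuming `groupby`, `columns`, and `metrics` are read from `query_obj` with the same `or []` fallback as in the diff. The function name `select_columns`, the `available` parameter (a stand-in for `df.columns`), and the value given to `DTTM_ALIAS` are assumptions for illustration only.

```python
DTTM_ALIAS = "__timestamp"  # assumed value, for illustration only


def select_columns(query_obj, available):
    """Build the ordered column list in one expression, then keep known columns."""
    groupby = query_obj.get("groupby") or []
    columns = query_obj.get("columns") or []
    metrics = query_obj.get("metrics") or []
    # Single concatenation instead of several in-place `+=` steps.
    cols = [DTTM_ALIAS] + groupby + columns + metrics
    # Same filter as in the diff: drop names that are not actually present.
    return [col for col in cols if col in available]


query_obj = {"groupby": ["country"], "columns": None, "metrics": ["count"]}
available = {"__timestamp", "country", "count", "extra"}
selected = select_columns(query_obj, available)
print(selected)
```

Because the concatenation builds a new list, it also avoids mutating any list that `cols` might share with the caller.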
    cols = [col for col in cols if col in df.columns]
    df = df[cols]

    for col in groupby + columns:
Would it be possible for some of these columns to be missing from df.columns, given the list comprehension at line 1249?
@betodealmeida I agree that pydruid should take care of that, but that change may not be backward compatible, so I'm not sure how to handle it. I have to admit I'm confused by the related bugs. It feels like new behavior (a new version of Druid? PyDruid? an error in Druid ingestion?), since otherwise we should have hit this problem before... related?
Dug deeper and got to the root cause in #4358