Fix array casting #8253

betodealmeida · 2019-09-18T20:20:23Z

SUMMARY

The fix introduced in #8226 does not work for some types. We have a query that returns the following error:

<U10 cannot be converted to an IntegerDtype

This happens because we cast the data into a NumPy array before creating the Pandas dataframe, and NumPy coerces all columns to a single common dtype. I fixed it by creating the array with dtype "object", which keeps each value's original type.

TEST PLAN

Query now runs successfully.

ADDITIONAL INFORMATION

REVIEWERS

@khtruong

codecov-io · 2019-09-18T20:43:26Z

Codecov Report

Merging #8253 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master    #8253   +/-   ##
=======================================
  Coverage   65.68%   65.68%           
=======================================
  Files         481      481           
  Lines       23348    23348           
  Branches     2572     2572           
=======================================
  Hits        15335    15335           
  Misses       7875     7875           
  Partials      138      138

Impacted Files	Coverage Δ
superset/dataframe.py	`94.48% <100%> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 12fb8e7...02f51d4.

DiggidyDave · 2019-09-18T20:54:42Z

what are the cases where they would not be the same type? is that suggestive of a bigger issue?

otherwise, LGTM

betodealmeida · 2019-09-18T21:30:51Z

> what are the cases where they would not be the same type? is that suggestive of a bigger issue?
>
> otherwise, LGTM

@DiggidyDave, we're casting the results from the DB (a list of tuples) into a NumPy array so we can address each column efficiently:

>>> import numpy as np
>>> data = [("a", 1), ("b", 10)]
>>> np.array(data)
array([['a', '1'],
       ['b', '10']], dtype='<U2')
>>> np.array(data)[:,0]  # first column
array(['a', 'b'], dtype='<U2')
>>> np.array(data)[:,1]  # second column
array(['1', '10'], dtype='<U2')

Note that the numbers were cast to unicode, since that's the common type between int and unicode.

If we use "object", though:

>>> np.array(data, dtype='object')
array([['a', 1],
       ['b', 10]], dtype=object)
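
To connect this back to the IntegerDtype error, here is a minimal sketch (not the actual code in superset/dataframe.py) of slicing a column out of the object array and converting it to a nullable integer column. The sample rows and the variable names are made up for illustration, and it assumes a pandas version that supports the "Int64" extension dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical rows, shaped like a DB-API cursor result (a list of tuples).
data = [("a", 1), ("b", 10), ("c", None)]

# dtype=object keeps every cell as its original Python object
# instead of coercing the whole array to a common string dtype.
array = np.array(data, dtype=object)

# Column slices are still cheap, and the values keep their types.
names = array[:, 0]
counts = array[:, 1]
cell_types = [type(value).__name__ for value in counts]

# Because the cells are ints (or None) rather than strings, the
# column can be converted to a nullable integer array.
int_column = pd.array(counts, dtype="Int64")
```

Here `cell_types` shows that the second column still holds Python ints and None, and the missing value ends up as `pd.NA` in `int_column`.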

Fix array casting

02f51d4

pull-request-size bot added the size/XS label Sep 18, 2019

khtruong approved these changes Sep 18, 2019

betodealmeida merged commit 8e1fc2b into apache:master Sep 18, 2019

betodealmeida mentioned this pull request Sep 20, 2019

Fix no data in Presto #8268

Merged

12 tasks

DanyRay420 mentioned this pull request Feb 23, 2024

[Snyk] Upgrade deck.gl from 8.8.27 to 8.9.34 DanyRay420/superset#2

Open

mistercrunch added the 🏷️ bot and 🚢 0.35.0 labels Feb 28, 2024
