@dhirschfeld
Contributor

The names returned by ResultProxy.keys can be truncated/translated, so they
won't always map back to the column names.

@dhirschfeld
Contributor Author

import sqlalchemy as sa

col_name = 'a_column_which_has_more_than_24_chars'
table = sa.Table(
    'Test', sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column(col_name, sa.Float(32), nullable=False),
)

In SQLAlchemy, a compiled expression can include truncated column names:

In [88]: from sqlalchemy.dialects.oracle.cx_oracle import OracleDialect_cx_oracle

In [89]: dialect = OracleDialect_cx_oracle()

In [90]: s = table.select().limit(10)

In [91]: print(s)
SELECT "Test".id, "Test".a_column_which_has_more_than_24_chars 
FROM "Test"
 LIMIT :param_1

In [92]: print(s.compile(dialect=dialect))
SELECT id, a_column_which_has_more__1 
FROM (SELECT "Test".id AS id, "Test".a_column_which_has_more_than_24_chars AS a_column_which_has_more__1 
FROM "Test") 
WHERE ROWNUM <= :param_1

Currently odo relies on ResultProxy.keys to get the column names and has no code to handle potentially truncated names, which can result in errors like the one below:

In [41]: odo(data.head(), pd.DataFrame)
Traceback (most recent call last):

  File "<ipython-input-41-15ecd5ef9d0d>", line 1, in <module>
    odo(data.head(), pd.DataFrame)

  File "C:\Miniconda3\lib\site-packages\odo\odo.py", line 91, in odo
    return into(target, source, **kwargs)

  File "C:\Miniconda3\lib\site-packages\multipledispatch\dispatcher.py", line 164, in __call__
    return func(*args, **kwargs)

  File "C:\Miniconda3\lib\site-packages\blaze\compute\core.py", line 379, in into
    return into(a, result, **kwargs)

  File "C:\Miniconda3\lib\site-packages\multipledispatch\dispatcher.py", line 164, in __call__
    return func(*args, **kwargs)

  File "C:\Miniconda3\lib\site-packages\odo\into.py", line 43, in wrapped
    return f(*args, **kwargs)

  File "C:\Miniconda3\lib\site-packages\odo\into.py", line 53, in into_type
    return convert(a, b, dshape=dshape, **kwargs)

  File "C:\Miniconda3\lib\site-packages\odo\core.py", line 83, in __call__
    return _transform(self.graph, *args, **kwargs)

  File "C:\Miniconda3\lib\site-packages\odo\core.py", line 106, in _transform
    x = f(x, excluded_edges=excluded_edges, **kwargs)

  File "C:\Miniconda3\lib\site-packages\odo\backends\sql.py", line 756, in select_or_selectable_to_frame
    dtype=[(str(c), dtypes[c]) for c in columns]))

  File "C:\Miniconda3\lib\site-packages\odo\backends\sql.py", line 756, in <listcomp>
    dtype=[(str(c), dtypes[c]) for c in columns]))

KeyError: 'pasaavailability_schedul_1'

This can be avoided entirely by instead getting the name from each column in the Select.columns collection.
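
To make the failure mode concrete, here is a minimal pure-Python sketch of the lookup that raises the KeyError above. It does not use odo or SQLAlchemy: `dtypes`, `result_keys`, `select_column_names` and `build_dtype` are hypothetical stand-ins, and the dtype strings are illustrative. Only the list comprehension mirrors the line shown in the traceback.

```python
# Hypothetical stand-in for the dtype mapping built from the table schema;
# it is keyed by the full column names.
dtypes = {
    "id": "int64",
    "a_column_which_has_more_than_24_chars": "float32",
}

# Keys as the result set may report them once the dialect has truncated
# the label (taken from the compiled Oracle query above).
result_keys = ["id", "a_column_which_has_more__1"]

# Names read from the columns of the select itself, which keep the full names.
select_column_names = ["id", "a_column_which_has_more_than_24_chars"]


def build_dtype(columns, dtypes):
    """Pair each column name with its dtype, as in the failing comprehension."""
    return [(str(c), dtypes[c]) for c in columns]


# Looking up the truncated key fails because the mapping has no such entry.
try:
    build_dtype(result_keys, dtypes)
    truncated_lookup_failed = False
except KeyError as exc:
    truncated_lookup_failed = True
    missing_key = exc.args[0]

# Looking up the full names succeeds.
full_name_dtype = build_dtype(select_column_names, dtypes)
```

With the truncated keys the lookup fails on the shortened label; with names taken from the select's columns every name is found in the mapping.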

@dhirschfeld
Contributor Author

Test failure is the same random failure which is fixed by #557

@dhirschfeld
Contributor Author

Would also be great to get this one merged

@llllllllll
Member

looks good, thanks!

@llllllllll llllllllll merged commit 0a24032 into blaze:master Jul 24, 2017
@dhirschfeld dhirschfeld deleted the column-names branch July 25, 2017 00:01