Issue with migrating directly from AWS Glue to Hive #103

@hlmiao

I am trying to migrate the Glue Data Catalog to the Hive metastore of an EMR cluster (I use an external MySQL database as the Hive metastore).

I followed all the steps to migrate directly from AWS Glue to Hive, but I get the error "'str' object has no attribute '_jdf'" when I run the Glue ETL job. The full error message is below:

2021-11-11 09:33:53,573 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
  File "/tmp/export_from_datacatalog.py", line 138, in <module>
    main()
  File "/tmp/export_from_datacatalog.py", line 134, in main
    connection=glue_context.extract_jdbc_conf(connection_name)
  File "/tmp/export_from_datacatalog.py", line 38, in datacatalog_migrate_to_hive_metastore
    transform_databases_tables_partitions(sc, sql_context, hive_metastore, databases, tables, partitions)
  File "/tmp/localPyFiles-3222c3b6-ae99-42e0-be66-ac44ed10e9ab/hive_metastore_migration.py", line 1445, in transform_databases_tables_partitions
    .transform(hms=hive_metastore, databases=databases, tables=tables, partitions=partitions)
  File "/tmp/localPyFiles-3222c3b6-ae99-42e0-be66-ac44ed10e9ab/hive_metastore_migration.py", line 1227, in transform
    (ms_sds, ms_tbls, ms_partitions) = self.extract_sds(ms_tbls, ms_partitions)
  File "/tmp/localPyFiles-3222c3b6-ae99-42e0-be66-ac44ed10e9ab/hive_metastore_migration.py", line 1018, in extract_sds
    .drop_columns(['ID', 'type'])
  File "/tmp/localPyFiles-3222c3b6-ae99-42e0-be66-ac44ed10e9ab/hive_metastore_migration.py", line 182, in drop_columns
    df = df.drop(col)
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 2519, in drop
    jdf = self._jdf.drop(self._jseq(cols))
AttributeError: 'str' object has no attribute '_jdf'
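
For anyone trying to narrow this down: the last frame shows `DataFrame.drop` running with a string as `self`. Below is a minimal sketch, without PySpark, of one way that exact message can arise in plain Python, namely when `drop` is reached through the class instead of a DataFrame instance, so the column name is bound as `self`. I have not confirmed that this is what happens in the Glue job. `FakeDataFrame`, `reproduce_error`, and the type guard in `drop_columns` are all hypothetical stand-ins for illustration; they are not code from `hive_metastore_migration.py`.

```python
class FakeDataFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame (not the real class)."""

    def __init__(self, columns):
        # Real PySpark keeps a Java DataFrame handle in _jdf; here it is a plain list.
        self._jdf = list(columns)

    def drop(self, *cols):
        # Like the failing line in the traceback, this dereferences self._jdf.
        remaining = [c for c in self._jdf if c not in cols]
        return FakeDataFrame(remaining)


def drop_columns(df, columns_to_drop):
    """Hypothetical guarded variant of the helper named in the traceback."""
    if not isinstance(df, FakeDataFrame):
        # Fail early with a clearer message than the AttributeError.
        raise TypeError("expected a DataFrame, got " + type(df).__name__)
    for col in columns_to_drop:
        df = df.drop(col)
    return df


def reproduce_error():
    """Call drop through the class so the string "ID" is bound as self."""
    try:
        FakeDataFrame.drop("ID")
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_error())
    print(drop_columns(FakeDataFrame(["ID", "type", "name"]), ["ID", "type"])._jdf)
```

If the real job behaves the same way, logging `type(df)` just before the `df.drop(col)` call in `drop_columns` should show whether `df` is still a DataFrame instance at that point.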
