Create SerDe with "column", "column.types" #44

HotSushi · 2020-10-26T21:51:08Z

How does the SerDe work for Avro, ORC, and Parquet in Hive?

AvroSerDe and ParquetSerDe look at two properties that Hive sets: "columns" and "columns.types". When schema evolution is involved, a few more properties come into play: "hive.exec.schema.evolution", "schema.evolution.columns", and "schema.evolution.columns.types". The SerDe constructs object inspectors for just those columns by first converting them to the native file-format schema. For example, ORC does "strings -> TypeDescription -> oi" and Avro does "strings -> Schema -> oi". The returned oi contains only the schema that Hive expects.
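To make the "strings" step concrete, here is a minimal, self-contained sketch of turning the two properties into an ordered name-to-type mapping. The class and method names (`HiveColumnProps`, `splitTypes`, `expectedColumns`) are hypothetical, and the splitter is a simplification: it assumes column names are comma-separated and that types are separated by a top-level `:` or `,`. Real SerDes typically delegate type parsing to Hive's own utilities (for example `TypeInfoUtils.getTypeInfosFromTypeString`) rather than hand-rolling it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of Hive or Iceberg.
final class HiveColumnProps {

  // Splits a type string such as "int:string:struct<a:int,b:string>" on
  // top-level ':' or ',' only, so nested types stay in one piece.
  static List<String> splitTypes(String types) {
    List<String> out = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int depth = 0;
    for (char c : types.toCharArray()) {
      if (c == '<' || c == '(') {
        depth++;
      } else if (c == '>' || c == ')') {
        depth--;
      }
      if ((c == ':' || c == ',') && depth == 0) {
        out.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      out.add(current.toString());
    }
    return out;
  }

  // Pairs each name in "columns" with its type in "columns.types",
  // preserving the column order that Hive expects.
  static Map<String, String> expectedColumns(Properties props) {
    String names = props.getProperty("columns", "");
    String types = props.getProperty("columns.types", "");
    String[] nameList = names.isEmpty() ? new String[0] : names.split(",");
    List<String> typeList = splitTypes(types);
    if (nameList.length != typeList.size()) {
      throw new IllegalArgumentException(
          "columns and columns.types differ in length: "
              + nameList.length + " vs " + typeList.size());
    }
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i < nameList.length; i++) {
      result.put(nameList[i], typeList.get(i));
    }
    return result;
  }
}
```

A SerDe following this approach would build its native schema and object inspector from the resulting mapping instead of from the full table schema.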

Who sets "columns" and "columns.types", and how?

Hive sets them from the Hive table schema.

How does Iceberg work today?

It doesn't look at the properties set by Hive at all.
It doesn't look at the schema evolution properties.
It creates a raw object inspector out of whatever table schema is set.

How do RecordReaders work for Avro, ORC, and Parquet in Hive?

In ORC and Avro (AvroContainerInputFormat), the record reader again looks at "columns", "columns.types", "hive.exec.schema.evolution", "schema.evolution.columns", and "schema.evolution.columns.types" to get the schema that Hive expects. It then reads the file using that schema as the projection.
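The effect of reading with the expected schema as a projection can be sketched as follows. This is an illustration only: `ProjectionSketch` and `project` are hypothetical names, rows are modeled as plain maps, and columns are matched by name, which is an assumption (real readers apply the projection at the file-schema level, and some match by position instead).

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the actual record reader code.
final class ProjectionSketch {

  // Returns the row in the column order Hive expects. Columns present in
  // the file but not expected are dropped; expected columns that the file
  // lacks (for example, ones added later) come back as null.
  static Object[] project(Map<String, Object> fileRow, List<String> expectedColumns) {
    Object[] row = new Object[expectedColumns.size()];
    for (int i = 0; i < row.length; i++) {
      row[i] = fileRow.get(expectedColumns.get(i));
    }
    return row;
  }
}
```

With this shape, the row handed to Hive always lines up with the object inspector built from the same "columns" list, regardless of what extra columns the file carries.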

HotSushi · 2020-10-29T22:34:02Z

This was just a prototype. Closing this, as a better solution is available here: #45

shardulm94 and others added 3 commits October 12, 2020 12:53

Hive: Use Hive table location in HiveIcebergSplit

89d15fe

Changes to make it work with Hive 1.1

3ad04d7

create SerDe from column, column.types

6e6d65f

HotSushi closed this Oct 29, 2020

HotSushi deleted the hive-location-fix-with-new-serde-path branch November 20, 2020 00:47
