@marton-bod
Contributor

@marton-bod marton-bod commented Apr 8, 2021

Note: the CI build will fail because the Tez dependency is a SNAPSHOT version that uses apache/tez#101.

  • TezTask queries the DAGClient for the vertex ID and number of tasks of each reducer vertex (or mapper vertex, if the job is map-only). Some translation between vertex ID and job ID is necessary.
  • It puts this info into the session conf under keys suffixed with the table name (to handle multi-table inserts). For non-Iceberg table writes, this info should not be populated.
  • The jobId/taskNum info can then be retrieved on the MetaHook side, which uses it to create the job context and commit the operation.
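
To illustrate the handoff described above, here is a minimal, dependency-free sketch. It is not the code from this PR: a plain `Map` stands in for the session conf, and the key prefixes, class name, and method names are all hypothetical. The job ID is treated as an opaque string, so the vertex ID to job ID translation is left out.

```java
import java.util.HashMap;
import java.util.Map;

class CommitInfoSketch {
    // Hypothetical key prefixes; the actual keys used by the PR are not shown in this thread.
    static final String JOB_ID_PREFIX = "sketch.iceberg.commit.job.id.";
    static final String TASK_NUM_PREFIX = "sketch.iceberg.commit.task.num.";

    /** Simple holder for what the commit side needs to rebuild a job context. */
    static final class CommitInfo {
        final String jobId;
        final int numTasks;

        CommitInfo(String jobId, int numTasks) {
            this.jobId = jobId;
            this.numTasks = numTasks;
        }
    }

    /** Writer side (the role TezTask plays): store job ID and task count keyed by table name. */
    static void storeCommitInfo(Map<String, String> sessionConf, String tableName,
                                String jobId, int numTasks) {
        sessionConf.put(JOB_ID_PREFIX + tableName, jobId);
        sessionConf.put(TASK_NUM_PREFIX + tableName, Integer.toString(numTasks));
    }

    /** Reader side (the role the MetaHook plays): returns null if nothing was stored for the table. */
    static CommitInfo loadCommitInfo(Map<String, String> sessionConf, String tableName) {
        String jobId = sessionConf.get(JOB_ID_PREFIX + tableName);
        String taskNum = sessionConf.get(TASK_NUM_PREFIX + tableName);
        if (jobId == null || taskNum == null) {
            return null; // e.g. a non-Iceberg table write, where the info is not populated
        }
        return new CommitInfo(jobId, Integer.parseInt(taskNum));
    }

    public static void main(String[] args) {
        Map<String, String> sessionConf = new HashMap<>();
        // Multi-table insert: one entry per target table, assuming one writing vertex per table.
        storeCommitInfo(sessionConf, "db.tbl_a", "job_a", 4);
        storeCommitInfo(sessionConf, "db.tbl_b", "job_b", 2);

        CommitInfo info = loadCommitInfo(sessionConf, "db.tbl_a");
        System.out.println(info.jobId + " with " + info.numTasks + " tasks");
    }
}
```

Suffixing the keys with the table name keeps the entries for different target tables apart, which is why the one-vertex-per-table assumption matters: a second writing vertex for the same table would overwrite the first entry.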

Assumption: for each table, there is only one vertex that writes to it.

@marton-bod
Contributor Author

@pvary @lcspinter can you please take a look?

throws MetaException {
// construct the job context
JobConf jobConf = new JobConf(conf);
String tableName = TableIdentifier.of(table.getDbName(), table.getTableName()).toString();
Contributor

@pvary pvary Apr 8, 2021

How will this get the catalog information once @lcspinter's catalog-related changes get in?

@marton-bod
Contributor Author

Not sure yet, I'll look into it. The catalog name might be stored in the table params, or we might need to load the Iceberg table to find out, if that's possible. @lcspinter might have some more insights.

@marton-bod
Contributor Author

@pvary I've uploaded a new set of commits that address your review comments and swap the long-term approach for the temporary listing solution, as discussed. The listing solution is mostly based on the current downstream TP solution, just simplified and refined a bit so that it saves the information into the session conf. This should make it easy to swap in the proper vertex ID-based implementation later.
