[FLINK-1154] Quickfix to kill TaskManagers in YARN mode. #208
Conversation
I would strongly vote to remove that. There is no need to kill a TaskManager just because it lost its heartbeat. The TaskManager may very well reconnect later and be available again.
I agree. I'll implement it using a separate method just for YARN.
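A minimal sketch of the split being discussed, assuming hypothetical class and method names (the actual Flink code paths differ): the generic handler only deregisters a TaskManager whose heartbeat is lost, since it may reconnect later, while a YARN-specific subclass additionally kills it.

```java
// Illustrative sketch only; class and method names are assumptions,
// not the actual Flink implementation.
class TaskManagerHeartbeatHandler {
    /** Called when a TaskManager misses its heartbeat deadline. */
    void onHeartbeatTimeout(String taskManagerId) {
        // Generic mode: only deregister. The TaskManager may very well
        // reconnect later and be available again.
        deregister(taskManagerId);
    }

    void deregister(String taskManagerId) {
        System.out.println("TaskManager " + taskManagerId + " marked as lost");
    }
}

class YarnTaskManagerHeartbeatHandler extends TaskManagerHeartbeatHandler {
    @Override
    void onHeartbeatTimeout(String taskManagerId) {
        super.onHeartbeatTimeout(taskManagerId);
        // YARN mode: containers are managed by the ResourceManager, so a
        // lost TaskManager process is shut down via a separate call path.
        killTaskManager(taskManagerId);
    }

    private void killTaskManager(String taskManagerId) {
        System.out.println("Requesting container kill for " + taskManagerId);
    }
}
```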
I am not sure that the […]. I think putting this into the […].
Force-pushed from 9a78e78 to 58788a8
I've updated the pull request so that the TM killing happens in a separate call hierarchy. I also added code to the YARN client to make it work with the Google Storage file system wrapper. Only the Flink YARN client works with GCloud storage, not the runtime (see FLINK-1266).
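For the GCloud storage part, a hedged sketch of how a client can resolve gs:// paths through Hadoop's GCS connector; the `fs.gs.impl` key and the `GoogleHadoopFileSystem` class come from the connector's documentation, while the wiring shown here is an assumption about this PR, not its actual code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: resolve a gs:// URI through the Google Cloud Storage Hadoop
// connector (the connector jar must be on the classpath).
final class GcsAccessExample {
    static FileSystem openGcs(String gcsUri) throws Exception {
        Configuration conf = new Configuration();
        // Map the gs:// scheme to the GCS connector's FileSystem class.
        conf.set("fs.gs.impl",
                "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
        return new Path(gcsUri).getFileSystem(conf);
    }
}
```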
…ng support (apache#208)

Currently, Apache Flink does not support storage partition joins, which can lead to unnecessary data shuffles in batch mode. This PR implements a basic version of that support. It introduces a new optimizer configuration option `table.optimizer.storage-partition-join-enabled` and query planner changes to detect when both sides of a join are partitioned by the join keys and compatible, allowing the planner to apply a storage partition join strategy. This avoids unnecessary shuffles by leveraging the source's partitioning.

Key changes include:
- Addition of the `SupportsPartitioning` interface for table sources to expose partitioning information.
- Implementation of `KeyGroupedPartitioning` to represent partitioning schemes.
- Integration of partitioning awareness into the batch physical sort-merge join rule to conditionally use the storage partition join when enabled and applicable.
- [for testing] Serialization and deserialization utilities for partitioning metadata.
- [for testing] Extension of the test values table factory to support partitioning.
- Comprehensive unit and integration tests verifying the new join strategy and its configuration.

This enhancement is currently applicable only in batch mode and requires the source tables to be partitioned by the join keys.

Testing:
- Added a test util for serialization and deserialization of partitioning metadata, so we can create a test table with a `KeyGroupPartition`.
- Added integration tests (`TestStoragePartitionJoin`) that verify the optimizer plan changes when the storage partition join is enabled or disabled.
- Verified that existing tests pass and that the new join strategy is applied only when the configuration is enabled and the partitioning is compatible.
- Manually verified execution plans to confirm the absence of unnecessary shuffles when the storage partition join is enabled.
- Verified that the unit test results in the table-planner module are the same as before the change:

```
[ERROR] Tests run: 8671, Failures: 4, Errors: 0, Skipped: 1
```

The same 4 failures were present before the change.

Co-authored-by: Jeyhun Karimov <[email protected]>
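A sketch of how the option described in that commit message would be toggled from the Table API, assuming it is exposed as a plain boolean config key (the key name is taken from the message above, not verified against the actual code):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class EnableStoragePartitionJoin {
    public static void main(String[] args) {
        // The feature applies only in batch mode.
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // Opt in to the storage partition join strategy (assumed default: false).
        tEnv.getConfig().getConfiguration().setBoolean(
                "table.optimizer.storage-partition-join-enabled", true);
    }
}
```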
This quickfix should probably only go into the 0.7 branch for the 0.7.1 release since the new Akka-based JM/TMs have this issue fixed.