Conversation

@Ngone51 (Member) commented May 22, 2024

What changes were proposed in this pull request?

This PR cleans up mapIdToMapIndex when the corresponding mapstatus is unregistered in three places (see the sketch after the list):

  • removeMapOutput
  • removeOutputsByFilter
  • addMapOutput (old mapstatus overwritten)
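
A minimal sketch of the cleanup idea (illustrative names only, not the actual Spark code): whenever the mapstatus at a given mapIndex is unregistered or overwritten, the stale mapId entry is dropped from mapIdToMapIndex at the same time.

```scala
import scala.collection.mutable

// Illustrative stand-ins for the driver-side tracker state (not the actual Spark classes).
case class MapStatusLike(mapId: Long)

val mapStatuses     = new Array[MapStatusLike](4)        // one slot per map partition index
val mapIdToMapIndex = new mutable.HashMap[Long, Int]()

def addMapOutput(mapIndex: Int, status: MapStatusLike): Unit = {
  // If an old mapstatus is being overwritten, drop its stale mapId entry first.
  Option(mapStatuses(mapIndex)).foreach(old => mapIdToMapIndex.remove(old.mapId))
  mapStatuses(mapIndex) = status
  mapIdToMapIndex(status.mapId) = mapIndex
}

def removeMapOutput(mapIndex: Int): Unit = {
  // Unregistering a mapstatus also unregisters its mapId -> mapIndex entry.
  Option(mapStatuses(mapIndex)).foreach { old =>
    mapIdToMapIndex.remove(old.mapId)
    mapStatuses(mapIndex) = null
  }
}
```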

Why are the changes needed?

There is only one valid mapstatus for the same mapIndex at any given time in Spark. mapIdToMapIndex should also follow the same rule to avoid chaos.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

Was this patch authored or co-authored using generative AI tooling?

No.

 * Exposed for testing.
 */
-private[this] val mapIdToMapIndex = new OpenHashMap[Long, Int]()
+private[spark] val mapIdToMapIndex = new HashMap[Long, Int]()
Contributor:

QQ: Why change to HashMap from OpenHashMap? (It is specialized for Long and Int.)

Member:

+1 for the above question.

@Ngone51 (Member Author):

OpenHashMap doesn't support the remove operation.

Contributor:

Ah, yes!
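
For context on the exchange above, a minimal sketch of the operation that motivated the switch: the standard-library mutable.HashMap supports remove, which the cleanup in this PR relies on (the key/value below is purely illustrative).

```scala
import scala.collection.mutable

val mapIdToMapIndex = new mutable.HashMap[Long, Int]()
mapIdToMapIndex(1001L) = 0     // register mapId -> mapIndex
mapIdToMapIndex.remove(1001L)  // unregister; Spark's OpenHashMap has no remove
```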

 _numAvailableMapOutputs -= 1
 mapStatusesDeleted(mapIndex) = mapStatuses(mapIndex)
+val currentMapStatus = mapStatuses(mapIndex)
+mapIdToMapIndex.remove(currentMapStatus.mapId)
Contributor:

Removing it here will mean we can't query for it via mapStatusesDeleted, where we rely on the mapId -> mapIndex entry still being in mapIdToMapIndex even when the mapIndex is in mapStatusesDeleted.

We should move this cleanup to where mapStatusesDeleted itself is cleaned up.

Same applies to the cases below as well.
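
To restate the concern (an illustrative sketch with hypothetical names, not the actual MapOutputTracker code): if the mapId entry is dropped as soon as the status moves into mapStatusesDeleted, a later mapId-based lookup can no longer recover the deleted status.

```scala
import scala.collection.mutable

// Hypothetical names, only to restate the reviewer's concern.
case class Status(mapId: Long, execId: String)

val mapStatuses        = Array[Status](null)              // live mapstatus slot for P0 (now empty)
val mapStatusesDeleted = Array[Status](Status(1L, "e1"))  // deleted mapstatus retained here
val mapIdToMapIndex    = mutable.HashMap[Long, Int]()     // 1L -> 0 was removed on delete

// A mapId-based lookup that also consults mapStatusesDeleted now finds nothing:
val found = mapIdToMapIndex.get(1L)
  .flatMap(i => Option(mapStatuses(i)).orElse(Option(mapStatusesDeleted(i))))
println(found) // None, even though the deleted mapstatus is still retained
```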

@Ngone51 (Member Author):

This is a good point. But IIUC, mapStatusesDeleted is only cleaned up when recovery happens on K8s, so it's not guaranteed to always be cleaned up in the end. I removed mapStatusesDeleted's dependency on mapIdToMapIndex instead, as that's not a common use case.

@dongjoon-hyun (Member) left a comment:

BTW, @Ngone51, this should have a new JIRA ID because the original one was already released in Apache Spark 3.5.0. This PR cannot be a follow-up of a released JIRA issue.

@Ngone51 Ngone51 changed the title [SPARK-43043][FOLLOW-UP] Cleanup mapIdToMapIndex on mapoutput unregister [SPARK-48394][CORE] Cleanup mapIdToMapIndex on mapoutput unregister May 23, 2024
@Ngone51 (Member Author) commented May 23, 2024

@dongjoon-hyun Thanks for the reminder. Have created a separate ticket: SPARK-48394.

@mridulm (Contributor) left a comment:

Looks good to me, thanks for fixing this @Ngone51 !

@Ngone51 Ngone51 requested a review from dongjoon-hyun May 24, 2024 00:25
@dongjoon-hyun (Member) left a comment:

+1, LGTM (with one minor test case prefix comment).

}
}

test("mapIdToMapIndex should cleanup unused mapIndexes after removeOutputsByFilter") {
Member:

Please use the JIRA ID as the test name prefix for this bug fix.

@dongjoon-hyun (Member) commented:

Merged to master only, because this was defined as an Improvement.

@Ngone51 (Member Author) commented May 25, 2024

@dongjoon-hyun @mridulm Sorry, can we make it a bug and backport it to the maintenance release branches? This actually caused an issue for us internally; I was pushing a quick fix before realizing this is the root cause.

The issue leads to a shuffle fetch failure and, ultimately, job failure. It happens this way:

  1. Stage A computes partition P0 by task t1 (TID, a.k.a. mapId) on executor e1.
  2. Executor e1 starts decommissioning.
  3. Executor e1 reports a false-positive loss to the driver during its decommission.
  4. Stage B reuses the shuffle dependency of Stage A and computes partition P0 again by task t2 on executor e2.
  5. When task t2 finishes, we see two entries ((t1 -> P0), (t2 -> P0)) for the same partition in mapIdToMapIndex, but only one entry (mapStatuses(P0) = MapStatus(t2, e2)) in mapStatuses.
  6. Executor e1 starts to migrate task t1's mapstatus (to executor e3, for example) and calls updateMapOutput on the driver. Given 5), we'd use the mapId (i.e., t1) to get the mapIndex (i.e., P0) and then use P0 to fetch task t2's mapstatus:
// updateMapOutput (driver side)
val mapIndex = mapIdToMapIndex.get(mapId)                          // stale entry: t1 still maps to P0
val mapStatusOpt = mapIndex.map(mapStatuses(_)).flatMap(Option(_)) // but P0 now holds t2's mapstatus
  7. Task t2's mapstatus location would then be updated to executor e3, but the output is actually still located on executor e2. This finally leads to the fetch failure.
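
The scenario can be reproduced with a small self-contained sketch (plain collections standing in for the driver-side state; all names are illustrative):

```scala
import scala.collection.mutable

// Plain collections standing in for the driver-side state; FakeMapStatus is illustrative.
case class FakeMapStatus(mapId: Long, execId: String)

val mapStatuses     = new Array[FakeMapStatus](1)       // a single partition P0 at index 0
val mapIdToMapIndex = new mutable.HashMap[Long, Int]()

// Step 1: task t1 (mapId = 1) computes P0 on executor e1.
mapStatuses(0) = FakeMapStatus(1L, "e1"); mapIdToMapIndex(1L) = 0
// Steps 4-5: task t2 (mapId = 2) recomputes P0 on e2; without the cleanup, t1's entry lingers.
mapStatuses(0) = FakeMapStatus(2L, "e2"); mapIdToMapIndex(2L) = 0

// Step 6: migrating t1's output looks up by mapId and lands on t2's mapstatus.
val mapStatusOpt = mapIdToMapIndex.get(1L).map(mapStatuses(_)).flatMap(Option(_))
println(mapStatusOpt) // Some(FakeMapStatus(2,e2)) -> its location would be wrongly updated to e3
```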

@mridulm (Contributor) commented May 25, 2024

Looks like a valid bug to me - can you raise a backport, please?
@dongjoon-hyun, thoughts on changing the type from Improvement to Bug (to commit to 3.5)?

@dongjoon-hyun (Member) commented:
Of course, @Ngone51 can.

Feel free to update the JIRA issue and backport this.

BTW, please create the JIRA issue properly next time, @Ngone51, because it's used for our communication.

@Ngone51 (Member Author) commented May 26, 2024

Thanks. Created the backport PR (#46747) for branch-3.5.

Ngone51 added a commit to Ngone51/spark that referenced this pull request May 28, 2024
This PR cleans up `mapIdToMapIndex` when the corresponding mapstatus is unregistered in three places:
* `removeMapOutput`
* `removeOutputsByFilter`
* `addMapOutput` (old mapstatus overwritten)

There is only one valid mapstatus for the same `mapIndex` at the same time in Spark. `mapIdToMapIndex` should also follow the same rule to avoid chaos.

No.

Unit tests.

No.

Closes apache#46706 from Ngone51/SPARK-43043-followup.

Lead-authored-by: Yi Wu <[email protected]>
Co-authored-by: wuyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Ngone51 added a commit to Ngone51/spark that referenced this pull request May 31, 2024
This PR cleans up `mapIdToMapIndex` when the corresponding mapstatus is unregistered in three places:
* `removeMapOutput`
* `removeOutputsByFilter`
* `addMapOutput` (old mapstatus overwritten)

There is only one valid mapstatus for the same `mapIndex` at the same time in Spark. `mapIdToMapIndex` should also follow the same rule to avoid chaos.

No.

Unit tests.

No.

Closes apache#46706 from Ngone51/SPARK-43043-followup.

Lead-authored-by: Yi Wu <[email protected]>
Co-authored-by: wuyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
yaooqinn pushed a commit that referenced this pull request Jun 3, 2024
This PR backports #46706 to branch 3.5.

### What changes were proposed in this pull request?

This PR cleans up `mapIdToMapIndex` when the corresponding mapstatus is unregistered in three places:
* `removeMapOutput`
* `removeOutputsByFilter`
* `addMapOutput` (old mapstatus overwritten)

### Why are the changes needed?

There is only one valid mapstatus for the same `mapIndex` at the same time in Spark. `mapIdToMapIndex` should also follow the same rule to avoid chaos.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46768 from Ngone51/SPARK-48394-3.5.

Authored-by: Yi Wu <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
This PR backports apache#46706 to branch 3.5.

### What changes were proposed in this pull request?

This PR cleans up `mapIdToMapIndex` when the corresponding mapstatus is unregistered in three places:
* `removeMapOutput`
* `removeOutputsByFilter`
* `addMapOutput` (old mapstatus overwritten)

### Why are the changes needed?

There is only one valid mapstatus for the same `mapIndex` at the same time in Spark. `mapIdToMapIndex` should also follow the same rule to avoid chaos.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46768 from Ngone51/SPARK-48394-3.5.

Authored-by: Yi Wu <[email protected]>
Signed-off-by: Kent Yao <[email protected]>