[AutoParallel]Revise Infermeta of LayerNorm for Sequence-Data Hybrid Parallelism #58776
Conversation
Are there corresponding cases in the unit tests? If so, the unit tests should also be revised.
LGTM for composite rules
LGTM
LGTM
LGTM for spmd rules
…Parallelism (PaddlePaddle#58776) * modify infermeta * bugfix for kernel and spmd * fix prim * update unittest
PR types
Function optimization
PR changes
Others
Description
Pcard-76459
The current LayerNorm implementation flattens the broadcast axes (the axes before "begin_norm_axis") of LayerNorm, which prevents those axes from being sharded along different mesh dimensions.
In sequence-data hybrid parallelism, we need to shard both the batch and sequence axes (broadcast axes) of the LayerNorm input to get the best performance.
Therefore, this PR removes the "flatten logic" from the LayerNorm op.
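As a hedged illustration of the target sharding pattern, the sketch below uses Paddle's semi-auto parallel API (paddle.distributed.ProcessMesh, shard_tensor, Shard); the 2x2 mesh, dim names, and tensor sizes are illustrative assumptions, and the script is meant to be launched with 4 ranks:

```python
import paddle
import paddle.distributed as dist

# Hypothetical 2x2 mesh: "dp" shards the batch axis, "sp" shards the sequence axis.
mesh = dist.ProcessMesh([[0, 1], [2, 3]], dim_names=["dp", "sp"])

x = paddle.randn([8, 1024, 4096])  # [batch, seq_len, hidden], illustrative sizes
# Shard batch (dim 0) along mesh dim "dp" and sequence (dim 1) along "sp";
# the hidden axis stays replicated.
dist_x = dist.shard_tensor(x, mesh, [dist.Shard(0), dist.Shard(1)])

# LayerNorm over the hidden axis (begin_norm_axis == 2 for this input).
# With the flattening removed, mean/variance keep both broadcast axes,
# so batch and sequence shardings can be propagated through the op.
norm = paddle.nn.LayerNorm(4096)
y = norm(dist_x)
```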
Before PR:
All axes before "begin_norm_axis" are flattened into one axis for the mean and variance outputs of LayerNorm.
After PR:
The mean and variance keep the same shape as the input's leading axes (those before "begin_norm_axis"), with no flattening.
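A minimal plain-Python sketch of the shape change (the helper names and the [batch, seq_len, hidden] example are hypothetical, not Paddle's actual C++ InferMeta code):

```python
import math

def mean_var_shape_before(x_shape, begin_norm_axis):
    # Before this PR: all axes in front of begin_norm_axis are flattened
    # into a single dimension, e.g. [batch, seq_len, hidden] -> [batch * seq_len].
    return [math.prod(x_shape[:begin_norm_axis])]

def mean_var_shape_after(x_shape, begin_norm_axis):
    # After this PR: the leading axes are kept as-is, so each broadcast axis
    # (batch, sequence, ...) can be sharded along a different mesh dimension.
    return list(x_shape[:begin_norm_axis])

x_shape = [8, 1024, 4096]                 # [batch, seq_len, hidden], illustrative sizes
print(mean_var_shape_before(x_shape, 2))  # [8192]
print(mean_var_shape_after(x_shape, 2))   # [8, 1024]
```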
The logic modified in this PR: