Conversation

@AdvancedCompiler
Contributor

@AdvancedCompiler AdvancedCompiler commented Nov 18, 2025

PR Category

Operator

Type of Change

New Feature

Description

Add transformer_engine_torch.swiglu and transformer_engine_torch.dswiglu.

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

@CLAassistant
CLAassistant commented Nov 18, 2025

CLA assistant check
All committers have signed the CLA.

@yy33min
yy33min commented Nov 18, 2025

@0x45f
Collaborator

0x45f commented Nov 18, 2025

Please sign the CLA.

@yy33min
yy33min commented Nov 18, 2025

The original code seems to fluctuate and produce intermittent errors, which caused special-op-test / container-unit-test (pull_request) to fail. We don't have the option to rerun it on our end. If convenient, please have the CI rerun this test separately.

@kiddyjinjin changed the title from "Add te.swiglu&dswiglu" to "【AdvancedCompiler】Add te.swiglu&dswiglu" on Nov 19, 2025
ctx.quantizer = quantizer

shape = input_tensor.shape
H = shape[-1] // 2
Collaborator

Is it necessary to add an assertion here, e.g. assert shape[-1] % 2 == 0?

An assertion has now been added in dswiglu, following the implementation in TE. swiglu already contains checking code, so the assertion is not repeated there.

Collaborator

We may need to add more test cases to meet the test coverage requirement.

The correctness test cases have been increased to 12, resulting in 72 outputs in total: 12 test cases × 3 dtypes × 2 (forward and backward).
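
The count above can be reproduced with a short enumeration. This is only an illustrative sketch: the shapes and dtype names below are hypothetical placeholders, since the thread does not list the actual values used in the tests.

```python
from itertools import product

# Hypothetical placeholders: 3 row counts x 4 half-widths = 12 test cases.
# Every last dimension is 2 * h, so it is always even.
SHAPES = [(rows, 2 * h) for rows in (1, 16, 128) for h in (32, 64, 256, 1024)]
DTYPES = ["float16", "bfloat16", "float32"]  # assumed set of 3 dtypes
PASSES = ["forward", "backward"]


def enumerate_checks():
    """Return every (shape, dtype, pass) combination that yields one output."""
    return list(product(SHAPES, DTYPES, PASSES))


print(len(enumerate_checks()))  # 12 * 3 * 2 = 72
```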

bench.run()


class SwigluBenchmarkResult(BenchmarkResult):
Collaborator

Can we add a base class that can be shared by the TE ops benchmark tests, such as #1062, #1055, #1052, and #1056?
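
One possible shape for such a base class, as a rough sketch only. Everything here is hypothetical: `TEOpBenchmarkBase`, `make_inputs`, `run_op`, and `measure` are illustrative names, the toy subclass stands in for a real op, and nothing below reflects the existing `BenchmarkResult` API or the code in the referenced PRs.

```python
import time
from abc import ABC, abstractmethod


class TEOpBenchmarkBase(ABC):
    """Hypothetical shared base: subclasses supply inputs and the op to time."""

    op_name = "unnamed"

    @abstractmethod
    def make_inputs(self, shape):
        """Build the inputs for one benchmark shape."""

    @abstractmethod
    def run_op(self, inputs):
        """Run the operator under test once."""

    def measure(self, shape, repeats=10):
        inputs = self.make_inputs(shape)
        self.run_op(inputs)  # warm-up call, excluded from timing
        start = time.perf_counter()
        for _ in range(repeats):
            self.run_op(inputs)
        elapsed = time.perf_counter() - start
        return {"op": self.op_name, "shape": shape, "avg_seconds": elapsed / repeats}


class DoubleBenchmark(TEOpBenchmarkBase):
    """Toy subclass used only to show how an op would plug in."""

    op_name = "double"

    def make_inputs(self, shape):
        rows, cols = shape
        return [[1.0] * cols for _ in range(rows)]

    def run_op(self, inputs):
        return [[2.0 * value for value in row] for row in inputs]
```

Each op-specific benchmark would then only override the two abstract methods, while the timing loop and result layout live in one place.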

7 participants