Add high-performance GLU activation variants (GLU, GeGLU, ReGLU) with comprehensive benchmarking · Ar develop #2998

artem1984A · 2025-06-20T11:53:38Z

High-Performance Core Implementation

GLU: Classic sigmoid-gated activation, σ(x_left) ⊙ x_right
GeGLU: GELU-gated variant, GELU(x_left) ⊙ x_right (transformer standard)
ReGLU: ReLU-gated variant, ReLU(x_left) ⊙ x_right, with a reported 10-20x speedup over GeGLU
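
The three variants differ only in the function applied to the gating half. A minimal sketch over plain `f32` slices, not the PR's tensor implementation: the function names, the equal split of a flat slice into left and right halves, and the tanh approximation of GELU are all assumptions made for illustration.

```rust
/// Logistic sigmoid.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// GELU via the common tanh approximation (an assumption here; an
/// erf-based form is equally valid).
fn gelu(x: f32) -> f32 {
    let c = (2.0_f32 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// Rectified linear unit.
fn relu(x: f32) -> f32 {
    x.max(0.0)
}

/// Splits `input` into equal left and right halves and returns
/// gate(left) * right element-wise. Returns `None` for odd lengths.
fn gated(input: &[f32], gate: fn(f32) -> f32) -> Option<Vec<f32>> {
    if input.len() % 2 != 0 {
        return None;
    }
    let (left, right) = input.split_at(input.len() / 2);
    Some(left.iter().zip(right).map(|(&l, &r)| gate(l) * r).collect())
}

/// sigmoid(x_left) * x_right
fn glu(input: &[f32]) -> Option<Vec<f32>> {
    gated(input, sigmoid)
}

/// gelu(x_left) * x_right
fn geglu(input: &[f32]) -> Option<Vec<f32>> {
    gated(input, gelu)
}

/// relu(x_left) * x_right
fn reglu(input: &[f32]) -> Option<Vec<f32>> {
    gated(input, relu)
}
```

In this sketch the output has half as many elements as the input. ReGLU needs only a comparison and a multiply per element, while GeGLU evaluates a `tanh`, which is in line with the speed gap the PR reports.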

Performance Excellence

Activation	Time per call, 8192 elements (F32)	Use case
ReGLU	~4.9 µs	High-speed inference
GLU	~31 µs	Balanced performance
GeGLU	~62 µs	Training quality
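
Timings like these depend on hardware, build flags, and the harness used. One way to collect comparable per-call figures with only the standard library is sketched below. `mean_time_per_call`, `reglu_into`, and `bench_reglu` are hypothetical helpers, not the PR's benchmark code, and treating "8192 elements" as the input length is an assumption; numbers measured this way will not match the table exactly.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times after a short warm-up and returns the mean
/// wall-clock time per call.
fn mean_time_per_call<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "iters must be positive");
    for _ in 0..10 {
        f(); // warm-up, not timed
    }
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

/// ReLU-gated unit over a flat slice: relu(left) * right, written into
/// `out`, which is expected to hold `input.len() / 2` elements.
fn reglu_into(input: &[f32], out: &mut [f32]) {
    let (left, right) = input.split_at(input.len() / 2);
    for ((o, &l), &r) in out.iter_mut().zip(left).zip(right) {
        *o = l.max(0.0) * r;
    }
}

/// Times `reglu_into` on an `n`-element input (`n` even).
fn bench_reglu(n: usize, iters: u32) -> Duration {
    let input: Vec<f32> = (0..n).map(|i| i as f32 * 0.001 - 4.0).collect();
    let mut out = vec![0.0_f32; n / 2];
    mean_time_per_call(iters, || {
        // black_box keeps the optimizer from eliding the work.
        reglu_into(black_box(input.as_slice()), &mut out);
        black_box(&out);
    })
}
```

Calling `bench_reglu(8192, 1000)` in a release build gives a rough per-call figure; a dedicated benchmarking harness with statistical reporting is preferable for published numbers.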

Architecture Integration

Dual API Design

// Direct tensor methods (maximum performance)
let output = input.reglu()?;

// Activation enum (configuration-driven)
let config = Config { hidden_act: Activation::GeGlu, .. };
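
The configuration-driven path amounts to a match over the activation enum. The sketch below uses hypothetical stand-ins, `GluKind` for the PR's `Activation` and `MlpConfig` for its `Config`, and operates on flat `f32` slices; none of these names or signatures are taken from the crate.

```rust
/// Hypothetical stand-in for the GLU members of the activation enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum GluKind {
    Glu,
    GeGlu,
    ReGlu,
}

impl GluKind {
    /// Gate function applied to the left half of the input.
    fn gate(self, x: f32) -> f32 {
        match self {
            GluKind::Glu => 1.0 / (1.0 + (-x).exp()),
            GluKind::GeGlu => {
                // tanh approximation of GELU (an assumption of this sketch)
                let c = (2.0_f32 / std::f32::consts::PI).sqrt();
                0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
            }
            GluKind::ReGlu => x.max(0.0),
        }
    }

    /// Computes gate(left) * right over the two halves of `input`.
    fn forward(self, input: &[f32]) -> Result<Vec<f32>, String> {
        if input.len() % 2 != 0 {
            return Err(format!("input length {} is not even", input.len()));
        }
        let (left, right) = input.split_at(input.len() / 2);
        Ok(left
            .iter()
            .zip(right)
            .map(|(&l, &r)| self.gate(l) * r)
            .collect())
    }
}

/// Hypothetical config holding only the activation choice.
struct MlpConfig {
    hidden_act: GluKind,
}

impl MlpConfig {
    fn with_activation(hidden_act: GluKind) -> Self {
        Self { hidden_act }
    }
}
```

With this shape, `MlpConfig::with_activation(GluKind::ReGlu).hidden_act.forward(&x)` selects the gate at run time, while a direct method call fixes it at the call site and skips the dispatch.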

Transformer Integration

Phi-3 native support with configurable GLU variants
Performance-quality tradeoffs for different deployment scenarios
Zero-config defaults (GeGLU standard, ReGLU for speed)

// Mobile/Edge: 10-20x faster inference
Config::with_activation(Activation::ReGlu)

// Research/Training: Maximum expressiveness  
Config::with_activation(Activation::GeGlu)

Artem Ryzhov added 2 commits June 20, 2025 12:21

feat: Add GLU activation variants (GLU, GeGLU, ReGLU) with performance optimizations

82dec69

Add GLU integration to CPU benchmarks and Phi-3 model

8c0429a

artem1984A force-pushed the ar-develop branch from 0097cd2 to 8c0429a on July 5, 2025 10:53
