
Commit 7c1b9b2

Update
[ghstack-poisoned]
2 parents 6a6ba04 + e8b9828 commit 7c1b9b2

1 file changed: +27 -23 lines changed

backends/apple/metal/runtime/shims/et_metal.h

Lines changed: 27 additions & 23 deletions
@@ -80,16 +80,18 @@ enum class SyncType {
 
 /**
  * @class ETMetalShaderLibrary
- * @brief Manages Metal shader library compilation and kernel function retrieval.
+ * @brief Manages Metal shader library compilation and kernel function
+ * retrieval.
  *
- * This class provides a high-level interface for compiling Metal shading language
- * source code into a Metal library and creating compute pipeline states for
- * kernel functions. It handles the creation and caching of Metal compute pipeline
- * states and functions, which should be reused across multiple kernel dispatches.
+ * This class provides a high-level interface for compiling Metal shading
+ * language source code into a Metal library and creating compute pipeline
+ * states for kernel functions. It handles the creation and caching of Metal
+ * compute pipeline states and functions, which should be reused across multiple
+ * kernel dispatches.
  *
- * The class automatically compiles the provided shader source code upon construction
- * and maintains an internal cache of compute pipeline states for different kernel
- * functions to avoid redundant compilation.
+ * The class automatically compiles the provided shader source code upon
+ * construction and maintains an internal cache of compute pipeline states for
+ * different kernel functions to avoid redundant compilation.
  *
  * Example usage:
  * @code
@@ -137,18 +139,18 @@ class ETMetalShaderLibrary {
  * @class ETMetalKernelFunction
  * @brief Represents a Metal compute kernel function ready for execution.
  *
- * This class encapsulates a Metal compute pipeline state and function, providing
- * a high-level interface for setting kernel arguments and dispatching compute
- * work to the GPU. It handles the encoding of compute commands and manages the
- * interaction with Metal's compute command encoder.
+ * This class encapsulates a Metal compute pipeline state and function,
+ * providing a high-level interface for setting kernel arguments and dispatching
+ * compute work to the GPU. It handles the encoding of compute commands and
+ * manages the interaction with Metal's compute command encoder.
  *
  * The class supports different dispatch patterns:
  * - Single-dimension dispatch for linear workloads
  * - Multi-dimensional dispatch for grid-based workloads
  * - Custom thread group sizes for performance optimization
  *
- * Kernel arguments can be set using tensors (which will be mapped to Metal buffers)
- * or scalar values. The class handles the encoding of these arguments
+ * Kernel arguments can be set using tensors (which will be mapped to Metal
+ * buffers) or scalar values. The class handles the encoding of these arguments
  * into the compute command encoder.
  *
  * Example usage:
@@ -203,23 +205,25 @@ class ETMetalKernelFunction {
 
 /**
  * @class ETMetalStream
- * @brief Manages Metal compute command streams and provides GPU synchronization.
+ * @brief Manages Metal compute command streams and provides GPU
+ * synchronization.
  *
- * This class serves as the central management hub for Metal GPU operations, providing
- * a stream-based abstraction similar to CUDA streams. It handles command buffer lifecycle,
- * compute command encoder management, and various synchronization patterns required for
- * efficient GPU computation.
+ * This class serves as the central management hub for Metal GPU operations,
+ * providing a stream-based abstraction similar to CUDA streams. It handles
+ * command buffer lifecycle, compute command encoder management, and various
+ * synchronization patterns required for efficient GPU computation.
  *
  * Key features:
  * - Lazy command buffer and encoder creation for optimal resource usage
  * - Thread-safe operations using serial dispatch queues
- * - Multiple synchronization modes (COMMIT, COMMIT_AND_WAIT, COMMIT_AND_CONTINUE)
+ * - Multiple synchronization modes (COMMIT, COMMIT_AND_WAIT,
+ * COMMIT_AND_CONTINUE, etc.)
  * - Kernel coalescing to batch multiple operations efficiently
- * - MPSGraph integration for high-level neural network operations
+ * - MPSGraph integration for executing fall back operations (mm, conv, sdpa)
  * - Memory operations (copy, fill) with GPU acceleration via blit encoders
  *
- * The stream follows PyTorch's MPS stream design patterns, providing similar semantics
- * for command buffer management and synchronization.
+ * The stream follows PyTorch's MPS stream design patterns, providing similar
+ * semantics for command buffer management and synchronization.
  *
  * Example usage:
  * @code
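
Note on the cached pipeline states described in the ETMetalShaderLibrary comment: the doc comment says the class keeps an internal cache of compute pipeline states per kernel function so that nothing is compiled twice. The following is a minimal, Metal-free C++ sketch of that compile-once, reuse-afterwards pattern. `FakePipeline`, `PipelineCache`, `get`, and `build_count` are hypothetical names invented for illustration; they are not part of et_metal.h, and the real class may be structured differently.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compiled compute pipeline state.
struct FakePipeline {
  std::string kernel_name;
};

// Hypothetical cache: builds a pipeline the first time a kernel name is
// requested and hands back the same object on every later request.
class PipelineCache {
 public:
  std::shared_ptr<FakePipeline> get(const std::string& name) {
    auto it = cache_.find(name);
    if (it != cache_.end()) {
      return it->second;  // cache hit: nothing is rebuilt
    }
    ++build_count_;  // cache miss: "compile" exactly once for this name
    auto pipeline = std::make_shared<FakePipeline>(FakePipeline{name});
    cache_.emplace(name, pipeline);
    return pipeline;
  }

  // Number of pipelines actually built so far.
  std::size_t build_count() const { return build_count_; }

 private:
  std::unordered_map<std::string, std::shared_ptr<FakePipeline>> cache_;
  std::size_t build_count_ = 0;
};
```

Requesting the same kernel name twice returns the same object and builds only once, which is the property the doc comment relies on when it says pipeline states "should be reused across multiple kernel dispatches". The sketch is not thread-safe; a real implementation would need its own synchronization.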
