@@ -80,16 +80,18 @@ enum class SyncType {
 
 /**
  * @class ETMetalShaderLibrary
- * @brief Manages Metal shader library compilation and kernel function retrieval.
+ * @brief Manages Metal shader library compilation and kernel function
+ * retrieval.
  *
- * This class provides a high-level interface for compiling Metal shading language
- * source code into a Metal library and creating compute pipeline states for
- * kernel functions. It handles the creation and caching of Metal compute pipeline
- * states and functions, which should be reused across multiple kernel dispatches.
+ * This class provides a high-level interface for compiling Metal shading
+ * language source code into a Metal library and creating compute pipeline
+ * states for kernel functions. It handles the creation and caching of Metal
+ * compute pipeline states and functions, which should be reused across multiple
+ * kernel dispatches.
  *
- * The class automatically compiles the provided shader source code upon construction
- * and maintains an internal cache of compute pipeline states for different kernel
- * functions to avoid redundant compilation.
+ * The class automatically compiles the provided shader source code upon
+ * construction and maintains an internal cache of compute pipeline states for
+ * different kernel functions to avoid redundant compilation.
  *
  * Example usage:
  * @code
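The @code example itself falls outside this hunk. As a hedged sketch of the behavior described above (compile the source once, then cache a compute pipeline state per kernel name), the underlying Metal calls look roughly like the following; the pipelineFor helper is hypothetical and is not the ETMetalShaderLibrary interface:

#import <Metal/Metal.h>
#include <string>
#include <unordered_map>

// Illustrative only: roughly the raw Metal calls a wrapper like
// ETMetalShaderLibrary is described as performing. Not the actual implementation.
static id<MTLComputePipelineState> pipelineFor(id<MTLDevice> device,
                                               NSString* source,
                                               const std::string& kernelName) {
  // Cache keyed by kernel name so repeated dispatches reuse the same pipeline state.
  static std::unordered_map<std::string, id<MTLComputePipelineState>> cache;
  auto it = cache.find(kernelName);
  if (it != cache.end()) {
    return it->second;
  }

  NSError* error = nil;
  // Compile the Metal shading language source into a library.
  id<MTLLibrary> library = [device newLibraryWithSource:source
                                                options:[MTLCompileOptions new]
                                                  error:&error];
  // Look up the kernel function and build a compute pipeline state for it.
  id<MTLFunction> fn =
      [library newFunctionWithName:[NSString stringWithUTF8String:kernelName.c_str()]];
  id<MTLComputePipelineState> pso =
      [device newComputePipelineStateWithFunction:fn error:&error];
  cache[kernelName] = pso;
  return pso;
}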
@@ -137,18 +139,18 @@ class ETMetalShaderLibrary {
  * @class ETMetalKernelFunction
  * @brief Represents a Metal compute kernel function ready for execution.
  *
- * This class encapsulates a Metal compute pipeline state and function, providing
- * a high-level interface for setting kernel arguments and dispatching compute
- * work to the GPU. It handles the encoding of compute commands and manages the
- * interaction with Metal's compute command encoder.
+ * This class encapsulates a Metal compute pipeline state and function,
+ * providing a high-level interface for setting kernel arguments and dispatching
+ * compute work to the GPU. It handles the encoding of compute commands and
+ * manages the interaction with Metal's compute command encoder.
  *
  * The class supports different dispatch patterns:
  * - Single-dimension dispatch for linear workloads
  * - Multi-dimensional dispatch for grid-based workloads
  * - Custom thread group sizes for performance optimization
  *
- * Kernel arguments can be set using tensors (which will be mapped to Metal buffers)
- * or scalar values. The class handles the encoding of these arguments
+ * Kernel arguments can be set using tensors (which will be mapped to Metal
+ * buffers) or scalar values. The class handles the encoding of these arguments
  * into the compute command encoder.
  *
  * Example usage:
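The @code block for this class is likewise outside the hunk. A minimal sketch of the dispatch pattern described above, written against the raw Metal encoder API rather than the actual ETMetalKernelFunction interface (the dispatch1D helper and its parameters are assumptions for illustration):

#import <Metal/Metal.h>

// Illustrative sketch: tensor-backed arguments become Metal buffers, scalars
// are passed by value, and the work is dispatched over a 1-D grid.
static void dispatch1D(id<MTLCommandBuffer> commandBuffer,
                       id<MTLComputePipelineState> pso,
                       id<MTLBuffer> input,
                       id<MTLBuffer> output,
                       float scale,
                       NSUInteger numel) {  // assumes numel > 0
  id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
  [encoder setComputePipelineState:pso];
  [encoder setBuffer:input offset:0 atIndex:0];              // tensor-backed argument
  [encoder setBuffer:output offset:0 atIndex:1];             // tensor-backed argument
  [encoder setBytes:&scale length:sizeof(scale) atIndex:2];  // scalar argument
  // Single-dimension dispatch; a custom threadgroup size could be used instead
  // for performance tuning.
  NSUInteger tg = MIN(pso.maxTotalThreadsPerThreadgroup, numel);
  [encoder dispatchThreads:MTLSizeMake(numel, 1, 1)
     threadsPerThreadgroup:MTLSizeMake(tg, 1, 1)];
  [encoder endEncoding];
}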
@@ -203,23 +205,25 @@ class ETMetalKernelFunction {
 
 /**
  * @class ETMetalStream
- * @brief Manages Metal compute command streams and provides GPU synchronization.
+ * @brief Manages Metal compute command streams and provides GPU
+ * synchronization.
  *
- * This class serves as the central management hub for Metal GPU operations, providing
- * a stream-based abstraction similar to CUDA streams. It handles command buffer lifecycle,
- * compute command encoder management, and various synchronization patterns required for
- * efficient GPU computation.
+ * This class serves as the central management hub for Metal GPU operations,
+ * providing a stream-based abstraction similar to CUDA streams. It handles
+ * command buffer lifecycle, compute command encoder management, and various
+ * synchronization patterns required for efficient GPU computation.
  *
  * Key features:
  * - Lazy command buffer and encoder creation for optimal resource usage
  * - Thread-safe operations using serial dispatch queues
- * - Multiple synchronization modes (COMMIT, COMMIT_AND_WAIT, COMMIT_AND_CONTINUE)
+ * - Multiple synchronization modes (COMMIT, COMMIT_AND_WAIT,
+ *   COMMIT_AND_CONTINUE, etc.)
  * - Kernel coalescing to batch multiple operations efficiently
- * - MPSGraph integration for high-level neural network operations
+ * - MPSGraph integration for executing fallback operations (mm, conv, sdpa)
  * - Memory operations (copy, fill) with GPU acceleration via blit encoders
  *
- * The stream follows PyTorch's MPS stream design patterns, providing similar semantics
- * for command buffer management and synchronization.
+ * The stream follows PyTorch's MPS stream design patterns, providing similar
+ * semantics for command buffer management and synchronization.
  *
  * Example usage:
  * @code
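As before, the @code block is not part of this hunk. A rough sketch of the synchronization modes and blit-based memory operations described above, using plain Metal calls; the fillCopyAndSync helper is hypothetical, and only the COMMIT and COMMIT_AND_WAIT behaviors from the SyncType enum named in the first hunk header are illustrated:

#import <Metal/Metal.h>

// Illustrative only: the command-buffer lifecycle and blit operations the
// comment describes, written against the raw Metal API rather than ETMetalStream.
static void fillCopyAndSync(id<MTLCommandQueue> queue,
                            id<MTLBuffer> src,
                            id<MTLBuffer> dst,
                            NSUInteger length,
                            bool waitUntilDone) {
  id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];

  // GPU-accelerated memory operations via a blit encoder (fill, then copy).
  id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];
  [blit fillBuffer:dst range:NSMakeRange(0, length) value:0];
  [blit copyFromBuffer:src
          sourceOffset:0
              toBuffer:dst
     destinationOffset:0
                  size:length];
  [blit endEncoding];

  // COMMIT: submit the work without blocking the CPU.
  [commandBuffer commit];
  if (waitUntilDone) {
    // COMMIT_AND_WAIT: block until the GPU has finished this command buffer.
    [commandBuffer waitUntilCompleted];
  }
}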