Make triton_kernels.tensor.convert_layout idempotent#10401
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c80e52647c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if tensor.storage.layout == layout and not layout_transformation_kwargs: | ||
| return tensor |
There was a problem hiding this comment.
Check canonical strides before returning no-op
When wrap_torch_tensor sees a strided view such as base[::2, :], it records only the unit-stride dimension as StridedLayout(-1), even though the storage still has non-dense strides. With this early return, convert_layout(tensor, StridedLayout(-1)) now skips the existing unswizzle/swizzle path that would allocate canonical dense storage, so callers using convert_layout to normalize an already-labeled tensor can pass a view with gaps/padding to downstream kernels or reference paths that expect the layout transform's dense strides.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The non-dense case is intentional: StridedLayout identifies the packed/contiguous logical dimension, not compact physical strides. Consumers such as downcast_to_mxfp and matmul use the actual strides, so a valid pitched view remains a valid StridedLayout(-1) tensor. I documented that convert_layout does not densify/canonicalize already-valid storage and added test_downcast_to_mxfp_accepts_pitched_strided_input. I also added test_convert_layout_rejects_strided_view_without_contiguous_dimension so a view with no unit-stride dimension is still rejected rather than incorrectly no-oping.
|
|
||
|
|
||
| def convert_layout(tensor: Tensor, layout: Layout, **layout_transformation_kwargs): | ||
| if tensor.storage.layout == layout and not layout_transformation_kwargs: |
There was a problem hiding this comment.
Avoid dataclass equality on ragged layouts
When both the source and target layouts are BlackwellActMXScaleLayout instances with distinct non-None RaggedTensorMetadata, this == comparison descends into the metadata's torch.Tensor fields. Tensor equality produces a boolean tensor rather than a Python bool, so the layout check can raise before conversion instead of re-swizzling or returning a no-op; this affects ragged activation scale tensors whose target layout is reconstructed from equivalent metadata.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed. The no-op decision is now Layout.can_preserve_storage_as(), and BlackwellActMXScaleLayout overrides it with metadata identity (self.ragged_metadata is other.ragged_metadata) instead of tensor-valued dataclass equality. test_act_scale_storage_preservation covers storage reuse with the same metadata object and conversion with independently reconstructed metadata.
triton_kernels.tensor.convert_layout idempotent
Make
triton_kernels.tensor.convert_layoutreturn its input when its storage already represents the requested layout.Motivation
Callers should not need to guard an idempotent layout conversion. This avoids redundant unswizzle/swizzle copies for already-correct MXFP intermediates.
Strided layouts
StridedLayoutidentifies the packed dimension, not compact physical strides. For example:Before this PR, the matching-layout conversion still copied through canonical storage and incidentally densified the tensor:
After this PR, the existing valid storage is preserved:
Both strides are valid
StridedLayout(-1)storage. Kernels and TMA checks consume physical strides separately. Invalid strided storage with no contiguous dimension is still rejected.Swizzled layouts
Matching swizzled layouts are also already in the requested storage encoding. For example:
This avoids an unnecessary unswizzle/swizzle round trip and preserves any layout-specific padded storage. A different parameterized layout still converts normally:
convert_layoutchanges storage encoding. It does not clone, densify, or scrub padding when the existing storage is already valid for the requested layout.