Merged
10 changes: 7 additions & 3 deletions mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
@@ -219,13 +219,17 @@ bool isLinearizableVector(VectorType type);
 
 /// Creates a TransferReadOp from `source`.
 ///
-/// The shape of the vector to read is specified via `inputVectorSizes`. If the
-/// shape of the output vector differs from the shape of the value being read,
-/// masking is used to avoid out-of-bounds accesses. Set
+/// If the shape of the vector to read differs from the shape of the value
+/// being read, masking is used to avoid out-of-bounds accesses. Set
 /// `useInBoundsInsteadOfMasking` to `true` to use the "in_bounds" attribute
 /// instead of explicit masks.
 ///
 /// Note: all read offsets are set to 0.
+Value createReadOrMaskedRead(OpBuilder &builder, Location loc, Value source,
+                             const VectorType &vecToReadTy,
Contributor
Does this start requiring users to create a VectorType when they don't have to? E.g., I searched for uses in IREE, and we have one use. The type there is ShapedType; this API change forces that caller to create a VectorType when it doesn't have to.

https://github.com/iree-org/iree/blob/fa46601cec6be66f0a2c95b7c95bd18682cc58be/compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/Transforms.cpp#L1856-L1858

(I don't have a strong opinion, and I can fix IREE side. I mainly want to provide a data point that users will have to create VectorType when they don't care about scalable flags.)

Contributor Author
Thanks for the feedback!

> I mainly want to provide a data point that users will have to create VectorType when they don't care about scalable flags

My thinking here was that createReadOrMaskedRead always creates an instance of VectorType, so:

- When users do require VectorType, we make sure that we create it only once, and the sizes become easier to track (i.e. "Where are these sizes coming from?").
- When users do not require VectorType (like in your case), my PR creates an extra burden on the users, but the number of VectorType instances does not change.

To make it easier for you, how about adding an overload:

Value createReadOrMaskedRead(OpBuilder &builder, Location loc, Value source,
                             ArrayRef<int64_t> inputVectorSizes,
                             std::optional<Value> padValue = std::nullopt,
                             bool useInBoundsInsteadOfMasking = false,
                             ArrayRef<bool> inputScalableVecDims = {}) {
  VectorType readVecType =
      VectorType::get(inputVectorSizes,
                      cast<ShapedType>(source.getType()).getElementType(),
                      inputScalableVecDims);
  return createReadOrMaskedRead(builder, loc, source, readVecType, padValue,
                                useInBoundsInsteadOfMasking);
}

WDYT?

Contributor
sgtm, thanks!

+                             std::optional<Value> padValue = std::nullopt,
+                             bool useInBoundsInsteadOfMasking = false);
+
 Value createReadOrMaskedRead(OpBuilder &builder, Location loc, Value source,
                              ArrayRef<int64_t> inputVectorSizes,
                              std::optional<Value> padValue = std::nullopt,
26 changes: 14 additions & 12 deletions mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1890,9 +1890,8 @@ vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
 
   // Create masked TransferReadOp.
   auto maskedRead = vector::createReadOrMaskedRead(
-      rewriter, loc, packOp.getSource(), readVecType.getShape(), padValue,
-      useInBoundsInsteadOfMasking,
-      /*inputScalableVecSizes=*/{});
+      rewriter, loc, packOp.getSource(), readVecType, padValue,
+      useInBoundsInsteadOfMasking);
 
   // Create ShapeCastOp.
   auto shapeCastOp = vector::ShapeCastOp::create(
@@ -1977,9 +1976,12 @@ vectorizeAsTensorUnpackOp(RewriterBase &rewriter, linalg::UnPackOp unpackOp,
   }
 
   // -- Generate the read operation --
+  VectorType readVecType =
+      VectorType::get(readVectorSizes, unpackTensorType.getElementType(),
+                      readScalableVectorFlags);
   Value readResult = vector::createReadOrMaskedRead(
-      rewriter, loc, unpackOp.getSource(), readVectorSizes, std::nullopt,
-      useInBoundsInsteadOfMasking, readScalableVectorFlags);
+      rewriter, loc, unpackOp.getSource(), readVecType, std::nullopt,
+      useInBoundsInsteadOfMasking);
 
   // -- Generate the transpose operation --
   PackingMetadata packMetadata;
@@ -2025,9 +2027,10 @@ vectorizeAsTensorPadOp(RewriterBase &rewriter, tensor::PadOp padOp,
           .reifyResultShapes(rewriter, reifiedReturnShapes);
   (void)status; // prevent unused variable warning on non-assert builds
   assert(succeeded(status) && "failed to reify result shapes");
+  auto readType = VectorType::get(inputVectorSizes, padValue.getType());
   auto maskedRead = vector::createReadOrMaskedRead(
-      rewriter, loc, padOp.getSource(), inputVectorSizes, padValue,
-      /*useInBoundsInsteadOfMasking=*/false, /*inputScalableVecSizes=*/{});
+      rewriter, loc, padOp.getSource(), readType, padValue,
+      /*useInBoundsInsteadOfMasking=*/false);
 
   // Create Xfer write Op
   Value dest = tensor::EmptyOp::create(rewriter, loc, reifiedReturnShapes[0],
@@ -2222,9 +2225,9 @@ vectorizeAsLinalgContraction(RewriterBase &rewriter, VectorizationState &state,
         state.getCanonicalVecType(elemType, readMap.compose(indexingMap));
 
     Value read = mlir::vector::createReadOrMaskedRead(
-        rewriter, loc, opOperand.get(), readType.getShape(),
+        rewriter, loc, opOperand.get(), readType,
         /*padding=*/arith::getZeroConstant(rewriter, loc, elemType),
-        /*useInBoundsInsteadOfMasking=*/false, readType.getScalableDims());
+        /*useInBoundsInsteadOfMasking=*/false);
     vecOperands.push_back(read);
   }

@@ -3165,9 +3168,8 @@ vectorizeAsInsertSliceOp(RewriterBase &rewriter, tensor::InsertSliceOp sliceOp,
   SmallVector<Value> readIndices(
       vecType.getRank(), arith::ConstantIndexOp::create(rewriter, loc, 0));
   Value read = mlir::vector::createReadOrMaskedRead(
-      rewriter, loc, source, vecType.getShape(), padValue,
-      /*useInBoundsInsteadOfMasking=*/inputVectorSizes.empty(),
-      /*inputScalableVecSizes=*/{});
+      rewriter, loc, source, vecType, padValue,
+      /*useInBoundsInsteadOfMasking=*/inputVectorSizes.empty());
 
   // Create write
   auto writeIndices =
43 changes: 29 additions & 14 deletions mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
@@ -322,46 +322,61 @@ Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
                                      std::optional<Value> padValue,
                                      bool useInBoundsInsteadOfMasking,
                                      ArrayRef<bool> inputScalableVecDims) {
-  assert(!llvm::is_contained(inputVectorSizes, ShapedType::kDynamic) &&
+  VectorType vecToReadTy = VectorType::get(
+      inputVectorSizes, cast<ShapedType>(source.getType()).getElementType(),
+      inputScalableVecDims);
+
+  return createReadOrMaskedRead(builder, loc, source, vecToReadTy, padValue,
+                                useInBoundsInsteadOfMasking);
+}
+
+Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
+                                     Value source,
+                                     const VectorType &vecToReadTy,
+                                     std::optional<Value> padValue,
+                                     bool useInBoundsInsteadOfMasking) {
+  assert(!llvm::is_contained(vecToReadTy.getScalableDims(),
+                             ShapedType::kDynamic) &&
          "invalid input vector sizes");
   auto sourceShapedType = cast<ShapedType>(source.getType());
   auto sourceShape = sourceShapedType.getShape();
-  assert(sourceShape.size() == inputVectorSizes.size() &&
+
+  int64_t vecToReadRank = vecToReadTy.getRank();
+  auto vecToReadShape = vecToReadTy.getShape();
+
+  assert(sourceShape.size() == static_cast<size_t>(vecToReadRank) &&
          "expected same ranks.");
-  auto vectorType =
-      VectorType::get(inputVectorSizes, sourceShapedType.getElementType(),
-                      inputScalableVecDims);
   assert((!padValue.has_value() ||
           padValue.value().getType() == sourceShapedType.getElementType()) &&
          "expected same pad element type to match source element type");
-  int64_t readRank = inputVectorSizes.size();
+
   auto zero = arith::ConstantIndexOp::create(builder, loc, 0);
-  SmallVector<bool> inBoundsVal(readRank, true);
+  SmallVector<bool> inBoundsVal(vecToReadRank, true);
+
   if (useInBoundsInsteadOfMasking) {
     // Update the inBounds attribute.
     // FIXME: This computation is too weak - it ignores the read indices.
-    for (unsigned i = 0; i < readRank; i++)
-      inBoundsVal[i] = (sourceShape[i] == inputVectorSizes[i]) &&
+    for (unsigned i = 0; i < vecToReadRank; i++)
+      inBoundsVal[i] = (sourceShape[i] == vecToReadShape[i]) &&
                        ShapedType::isStatic(sourceShape[i]);
   }
   auto transferReadOp = vector::TransferReadOp::create(
       builder, loc,
-      /*vectorType=*/vectorType,
+      /*vectorType=*/vecToReadTy,
       /*source=*/source,
-      /*indices=*/SmallVector<Value>(readRank, zero),
+      /*indices=*/SmallVector<Value>(vecToReadRank, zero),
       /*padding=*/padValue,
       /*inBounds=*/inBoundsVal);
 
-  if (llvm::equal(inputVectorSizes, sourceShape) || useInBoundsInsteadOfMasking)
+  if (llvm::equal(vecToReadTy.getShape(), sourceShape) ||
+      useInBoundsInsteadOfMasking)
     return transferReadOp;
   SmallVector<OpFoldResult> mixedSourceDims =
       isa<MemRefType>(source.getType())
          ? memref::getMixedSizes(builder, loc, source)
          : tensor::getMixedSizes(builder, loc, source);
 
-  auto maskType = VectorType::get(inputVectorSizes, builder.getI1Type(),
-                                  inputScalableVecDims);
+  auto maskType = vecToReadTy.cloneWith(/*shape=*/{}, builder.getI1Type());
   Value mask =
       vector::CreateMaskOp::create(builder, loc, maskType, mixedSourceDims);
   return mlir::vector::maskOperation(builder, transferReadOp, mask)
Expand Down