Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASAN build fixes for TTNNLayoutAttr class #1566

Closed
wants to merge 1 commit into from

Conversation

mmanzoorTT
Copy link
Contributor

  • TTNNLayoutAttr::getShardShape() uses llvm::ArrayRef to create llvm::SmallVector object and crashes for ASAN build due to non-owning nature of llvm::ArrayRef. This change uses llvm::ArrayRef directly to avoid address sanitization error.

fixes #1565

* TTNNLayoutAttr::getShardShape() uses llvm::ArrayRef to create llvm::SmallVector
object and crashes for ASAN build due to non-owning nature of llvm::ArrayRef.
This change uses llvm::ArrayRef directly to avoid address sanitization error.
Copy link
Contributor

@azecevicTT azecevicTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks rather suspicious to me. All SmallVector does is 'intercepting' ArrayRef and making an owning copy of its content. I don't see how this can cause an error. On the other hand it doesn't hurt changing it to ArrayRef in this case since lifetime of this ArrayRef is the same as the underlying TTNNLayoutAttr object.

azecevicTT added a commit that referenced this pull request Dec 11, 2024
This PR introduces some changes
#1566, but it seems that it
doesn't fix the majority of failing tests. Both bugs are result of
dangling reference of `ArrayRef` objects, because they receive temporary
`SmallVector` objects.

I've rebased this branch on #1562, for the ability to build with ASAN flag.

Fixes #1565
azecevicTT added a commit that referenced this pull request Dec 11, 2024
This PR introduces some changes #1566, but it seems that it
doesn't fix the majority of failing tests. Both bugs are result of
dangling reference of `ArrayRef` objects, because they receive temporary
`SmallVector` objects.

I've rebased this branch on #1562, for the ability to build with ASAN flag.

Fixes #1565
@@ -180,8 +180,8 @@ uint64_t TTNNLayoutAttr::getElementSizeBytes() const {
// Example: memref<2x3!tt.tile<32x32xf32>> -> { 2, 3 }
//
// return The shape of the shard.
llvm::SmallVector<int64_t> TTNNLayoutAttr::getShardShape() const {
Copy link
Contributor

@uazizTT uazizTT Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it requires to use a different constructor to make SmallVector to make an owning SmallVector, the following should resolve it:

llvm::SmallVector<int64_t> TTNNLayoutAttr::getShardShape() const {
  return SmallVector<int64_t>(getMemref().getShape().begin(), getMemref().getShape().end());
}

We saw a similar issue before. https://github.com/tenstorrent/tt-mlir/pull/882/files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm look in the source code include/llvm/ADT/SmallVector.h:1232 and this doesn't seem like an issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok you are right. In that case, I wonder the cause of this issue as it appears to be already making an owning copy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked into that PR, difference is that options.meshShape is ListOption, but the same effect could've been achieved with implicitDeviceOptions.meshShape = ::llvm::SmallVector<int64_t>(*options.meshShape); My guess is that the big part behind the decision that all SmallVector constructors are marked as explicit it exactly for the reason where you say "Yes, I know SmallVector will a acquire new memory and make a copy, and I agree to that."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding issue in this PR, the root cause is in thelib/Dialect/TTNN/Transforms/Passes.cpp:235. In input.shardShape = inputLayoutAttr.getShardShape();, inputLayoutAttr.getShardShape() returns SmallVector. That SmallVector is an unnamed and temporary object, so it dies at the end of the statement, and input.shardShape as an ArrayRef gets a dangling reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks, yeah that was my question that how the changes in this PR are resolving the issue as the output for getShardShape() is already owning. But from your explanation it seems the issue propagated from lib/Dialect/TTNN/Transforms/Passes.cpp:235.

@uazizTT
Copy link
Contributor

uazizTT commented Dec 11, 2024

This looks rather suspicious to me. All SmallVector does is 'intercepting' ArrayRef and making an owning copy of its content. I don't see how this can cause an error. On the other hand it doesn't hurt changing it to ArrayRef in this case since lifetime of this ArrayRef is the same as the underlying TTNNLayoutAttr object.

I think it is not really making an owning copy, thus causing the issue. We should use a different constructor #1566 (comment)

azecevicTT added a commit that referenced this pull request Dec 12, 2024
This PR introduces some changes #1566, but it seems that it
doesn't fix the majority of failing tests. Both bugs are result of
dangling reference of `ArrayRef` objects, because they receive temporary
`SmallVector` objects.

I've rebased this branch on #1562, for the ability to build with ASAN flag.

Fixes #1565
azecevicTT added a commit that referenced this pull request Dec 12, 2024
This PR introduces some changes #1566, but it seems that it
doesn't fix the majority of failing tests. Both bugs are result of
dangling reference of `ArrayRef` objects, because they receive temporary
`SmallVector` objects.

I've rebased this branch on #1562, for the ability to build with ASAN flag.

Fixes #1565
azecevicTT added a commit that referenced this pull request Dec 13, 2024
This PR introduces some changes #1566, but it seems that it
doesn't fix the majority of failing tests. Both bugs are result of
dangling reference of `ArrayRef` objects, because they receive temporary
`SmallVector` objects.

I've rebased this branch on #1562, for the ability to build with ASAN flag.

Fixes #1565
@mmanzoorTT
Copy link
Contributor Author

#1569 solved this ASAN bug

@mmanzoorTT mmanzoorTT closed this Dec 13, 2024
@mmanzoorTT mmanzoorTT deleted the mmanzoor/fix-shard-shape-address-sanitizer branch February 10, 2025 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Address Sanitizer bug for TTNN Dialect
3 participants