Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ttnn runtime layout comparisons in debug mode, minor runtime build cleanup #2338

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jnie-TT
Copy link
Contributor

@jnie-TT jnie-TT commented Feb 28, 2025

Ticket

closes #1313
Part of #2125

Problem description

Currently we have don't have a consistency checker that verifies runtime intermediates are aligned with compiler generated descriptors.

What's changed

This PR adds a debug API (only enabled with -DTT_RUNTIME_DEBUG=ON) that compares every runtime tensor with its corresponding flatbuffer descriptor, checking that layouts, memory configs, data types match.

Couple mismatches were found, needed to update the following to fix the mismatches:

  • In TTNNDecomposeLayouts, remove all direct calls to createOp, use the dedicated op creation API instead so that the intermediate layouts are correct
  • For conv2d, since we currently don't populate the conv2dconfig, the output layout at runtime will be tilized by default. Thus in TTNNLayouts, set the default layout of the dps operand of conv2d op to tilized instead of forcing to row major.

Also added some minor updates in the cmake files under runtime

Checklist

  • New/Existing tests provide coverage for changes

Copy link
Contributor

@azecevicTT azecevicTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dialect changes looks good. I peeked at runtime changes as well, this looks like a really welcome addition!

Copy link
Contributor

@svuckovicTT svuckovicTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great change!

Comment on lines +1 to +4
if (NOT TTNN_RUNTIME_ENABLED)
add_library(TTRuntimeTTNNOps INTERFACE)
return()
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, why the NOT in the condition?

Copy link
Contributor Author

@jnie-TT jnie-TT Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ttnn runtime is not enabled, we won't build TTRuntimeTTNNOps lib, we'll just add a placeholder interface library that's empty. Only if ttnn runtime is enabled will we actually build the static library

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment please?

@jnie-TT jnie-TT force-pushed the jnie/runtime_debug_tensor_compare branch 5 times, most recently from c5cd973 to c4b879e Compare March 3, 2025 17:45
@jnie-TT jnie-TT force-pushed the jnie/runtime_debug_tensor_compare branch from c4b879e to 6d634ab Compare March 3, 2025 17:47
Copy link
Contributor

@kmabeeTT kmabeeTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks @jnie-TT - A lot in here, some of which I don't fully understand, but don't want to block this welcome addition, so approving. Will aim to look closer later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add asserts for each operation in RT and also add check for setting the output
5 participants