Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Afuller/metalium api reorg #16578

Merged
merged 2 commits into from
Jan 16, 2025
Merged

Afuller/metalium api reorg #16578

merged 2 commits into from
Jan 16, 2025

Conversation

afuller-TT
Copy link
Contributor

@afuller-TT afuller-TT commented Jan 9, 2025

Ticket

#16577

Problem description

TT-Metalium (tt_metal in the repo) has no clear distinction between public headers vs private headers. Public headers are scattered everywhere and intermixed with private headers.

Consequences:

Consumers reach for them by many different paths.
Authors and code reviewers are unclear what's exposed to consumers and what is internal to the library.
API refinements (eg: for TT-Distributed) cannot be certain that nobody is using an intended-to-be-private header.
Packaging is blocked.
Customers have a bad experience.

What's changed

I suggest reviewing per-commit:

The first commit is manual changes done to adjust include paths and such in preparation for the new home for public header files.

The second commit is the result of running a script that moves the identified files to new homes and the By The Power Of Greyskull Sed updating the #include directives. The scripts in question are here: https://github.com/tenstorrent/tt-metal/compare/afuller/metalium-api-reorg-scripts?expand=1 No manual edits are done in this commit.

3rd party migrations

To ease migration of other codebases, customers (internal and external) can re-use the reorg-api.consumer.sed script to update their include paths.

TODO:

  • Docs generation needs to be fixed
  • Verify everything runs clean
  • Rebase the first commit on latest main, and re-run the script before merging; I expect many conflicts if attempting to do a Git Merge.

Checklist

@@ -7,7 +7,7 @@
#include <cstdint>
#include "common_values.hpp"
#include "dev_mem_map.h"
#include "noc/noc_parameters.h"
#include <tt-metalium/noc_parameters.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't find this header

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah.. I guess it is intentional. I little confusing, but I guess makes sense
Screenshot 2025-01-09 at 2 35 02 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When Archangel @blozano-tt rescues us from ARCH_NAME, this should vanish (at least from the public API -- tucked nicely behind a HAL).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expose stuff in tt_stl in a different way? Technically not part of "metalium API", and seeing the headers in the same pile seems odd to me. In general, wdyt about adding more structure here, perhaps incrementally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure. I'd rather do it incrementally. This is adding transparency but without changing behaviour. If tt_stl isn't "technically part of Metalium API", then it would follow that it should be outside of Metalium and just another dependency. I consider it a win that it has become clear whereas previously it was obscure that it was coming from Metalium even when it wasn't actually meant to be part of Metalium. 💪

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the incremental approach for adding structure. For tt_stl though, I think we can already make https://github.com/tenstorrent/tt-metal/tree/main/tt_metal/tt_stl public in the same way as tt_metal/api/tt-metalium/? The library should be self sufficient and not require anything from metal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's close.
Except for this and this.

If someone wants to snip those two dependencies, then the only thing remaining is to decide what scope to expose it as, if tt-metalium isn't the right grouping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, arguably assert.hpp and logger.hpp should also be part of "tt stl". Those 2 headers don't have any metal dependencies.

Incidentally, "tt stl" is not a good name (specifically having namespace stl is a bad idea), I'd propose shortening to "ttsl".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look into officially defining some type of util folder
Big part of code in UMD would also fall into this camp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where this is going. If someone wants to drive it before this goes in, I'll adjust this PR to accomodate. Else we can do it afterwards. I don't believe we should lump two such changes into one PR, though. That gets messy very quickly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should get this in

#include "tools/profiler/profiler_state.hpp"
#include "tt_metal/impl/dispatch/command_queue_interface.hpp"
#include "tt_metal/impl/kernels/kernel.hpp"
#include <common.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should move includes with <> pattern before includes with ""

like
*stl includes
*api includes
*internal includes

Somethink like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second this comment.

Hopefully there's an automatic way of doing that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

My personal approach is the opposite. The "nearby" headers first, and then expand farther and farther out. This is a logic sort order that follows from "include your own header first" <-- the closest one.

It also serves to sus out mistakes in headers where they use something without including it. If everyone includes before including FancyThing.hpp, then if FancyThing's header forgets to include it'll go unnoticed for longer because many times it will accidentally succeed.

I know of tooling that will handle sorting alphabetically. So this "bucketization" needs a blank line to keep the division and alphasort within each. Tooling to enforce proximity buckets is less mature, I believe. But worth digging deeper on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though you would say that :), yeah it's even better to reverse the order, but any structure would be better.

Copy link
Contributor

@bbradelTT bbradelTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It seems like a lot of files from multiple different directories are being moved to the tt-metalium directory. I just want to double check, that won't cause any issues for working with the code, right?
  2. What is the purpose of adding header files such as tt_metal/api/blackhole/tt-metalium/noc_parameters.h and is there any code duplication?

@afuller-TT afuller-TT force-pushed the afuller/metalium-api-reorg branch from 2df5583 to c23260a Compare January 13, 2025 22:06
@afuller-TT afuller-TT requested a review from a team as a code owner January 13, 2025 22:06
@afuller-TT
Copy link
Contributor Author

  1. It seems like a lot of files from multiple different directories are being moved to the tt-metalium directory. I just want to double check, that won't cause any issues for working with the code, right?
  2. What is the purpose of adding header files such as tt_metal/api/blackhole/tt-metalium/noc_parameters.h and is there any code duplication?

@bbradelTT:

  1. No technical issues. Only friction that the header is not adjacent to the implementation. This is arguably beneficial friction as it drives awareness of what's part of the public API and what's not, and appropriate design decisions to have a clean separation between public API vs. private impl. (eg: RIMPL).
  2. The purpose is to not break the build -- our public API is leaking those details. At least until @blozano-tt 's last PR merges. After that I'll rebase this PR on top and I should be able to remove any ARCH_NAME-dependent includes from the public API. A glorious day that will be, indeed!

${PROJECT_SOURCE_DIR}/tt_metal/common
${CMAKE_CURRENT_SOURCE_DIR}
)
target_include_directories(${TEST_TARGET} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YESSSSSSSSS

@afuller-TT afuller-TT force-pushed the afuller/metalium-api-reorg branch 2 times, most recently from 6722c4d to 32d3843 Compare January 15, 2025 21:58
@afuller-TT afuller-TT force-pushed the afuller/metalium-api-reorg branch from 32d3843 to e61b12d Compare January 16, 2025 00:04
@afuller-TT afuller-TT force-pushed the afuller/metalium-api-reorg branch from e61b12d to d6f013b Compare January 16, 2025 02:14
@afuller-TT afuller-TT merged commit fa45b9b into main Jan 16, 2025
214 of 216 checks passed
@afuller-TT afuller-TT deleted the afuller/metalium-api-reorg branch January 16, 2025 06:15
@giordano giordano mentioned this pull request Jan 16, 2025
6 tasks
tt-rkim pushed a commit that referenced this pull request Jan 16, 2025
### Problem description

The change link is currently broken.  I believe it was broken in #16578.

### What's changed

Link now works.

### Checklist
- [ ] Post commit CI passes
- [ ] Blackhole Post commit (if applicable)
- [ ] Model regression CI testing passes (if applicable)
- [ ] Device performance regression CI testing passes (if applicable)
- [ ] **(For models and ops writers)** Full [new
models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml)
tests passes
- [ ] New/Existing tests provide coverage for changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants