test_bmm dprint corruption #16434

Closed
tt-dma opened this issue Jan 3, 2025 · 2 comments

@tt-dma
Contributor

tt-dma commented Jan 3, 2025

No description provided.

@tt-dma tt-dma self-assigned this Jan 3, 2025
@miawangTT
Contributor

Instructions to reproduce the issue we discussed during the week of Dec. 16 are included here.

  • branch tmp/wmang/exp, based on main before Nov. 15

  • branch tmp/mwang/exp-new, based on main as of the week of Dec. 9

  • to reproduce

    • ./build_metal.sh --build-metal-tests
    • TT_METAL_DPRINT_CORES=all TT_METAL_DPRINT_ONE_FILE_PER_RISC=1 TT_METAL_SLOW_DISPATCH_MODE=1 ./build/test/tt_metal/test_bmm
  • Example output is in the figure below.
    Image

  • NOTE:

    • you may need to use tt-smi to reset the chip after the exception occurs. Otherwise the next run will hang.
    • the problem does not occur in a debug build: ./build_metal.sh --build-metal-tests --debug
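
The reproduction steps above can be scripted. The following is a minimal sketch, assuming it is run from the repository root after the release build step has completed. The helper names `dprint_repro_env` and `run_repro` are hypothetical and not part of tt-metal; only the environment variables and the binary path come from the instructions above.

```python
import os
import subprocess

# Environment variables taken verbatim from the repro instructions.
DPRINT_REPRO_VARS = {
    "TT_METAL_DPRINT_CORES": "all",
    "TT_METAL_DPRINT_ONE_FILE_PER_RISC": "1",
    "TT_METAL_SLOW_DISPATCH_MODE": "1",
}

# Test binary path from the repro instructions (relative to the repo root).
TEST_BINARY = "./build/test/tt_metal/test_bmm"


def dprint_repro_env(base_env=None):
    """Return a copy of the environment with the dprint repro variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(DPRINT_REPRO_VARS)
    return env


def run_repro(binary=TEST_BINARY):
    """Run the test once and return its exit code (hypothetical helper)."""
    return subprocess.run([binary], env=dprint_repro_env()).returncode
```

If `run_repro()` is called repeatedly, the chip may need a reset with tt-smi after a failing run, as noted above; that step is intentionally left out of the sketch.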

tt-dma added a commit that referenced this issue Jan 9, 2025
Previously read twice (with the first read being eight bytes). Switching
to single read appears to alleviate some specific race conditions where
both dprint and risc-to-risc dependencies are present.
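
To illustrate why two separate reads of a shared buffer can disagree, here is a small deterministic model. It is a generic sketch, not the actual dprint code: the 8-byte header layout (write position, read position), the `SharedBuffer` class, and both reader functions are assumptions made for illustration. The snapshot in `read_once` is atomic only within this model.

```python
import struct

# Assumed header layout: two little-endian 32-bit fields (write pos, read pos).
HEADER = struct.Struct("<II")


class SharedBuffer:
    """Toy stand-in for a print buffer shared between a device and the host."""

    def __init__(self, size=32):
        self.mem = bytearray(size)

    def device_write(self, payload):
        """Device side: overwrite the payload area and publish the new length."""
        self.mem[HEADER.size:HEADER.size + len(payload)] = payload
        HEADER.pack_into(self.mem, 0, len(payload), 0)

    def read(self, nbytes):
        """Host side: copy the first nbytes of the buffer."""
        return bytes(self.mem[:nbytes])


def read_twice(buf, between_reads=None):
    """Two-read pattern: read the 8-byte header, then read the whole buffer."""
    wpos, _ = HEADER.unpack(buf.read(HEADER.size))
    if between_reads:
        between_reads()  # simulated device activity between the two reads
    full = buf.read(len(buf.mem))
    return full[HEADER.size:HEADER.size + wpos]


def read_once(buf, between_reads=None):
    """Single-read pattern: take one snapshot, then parse header and payload."""
    full = buf.read(len(buf.mem))
    if between_reads:
        between_reads()  # later device activity cannot affect the snapshot
    wpos, _ = HEADER.unpack_from(full, 0)
    return full[HEADER.size:HEADER.size + wpos]


if __name__ == "__main__":
    a = SharedBuffer()
    a.device_write(b"hello world")
    # Stale length from the first read combined with newer payload bytes.
    print(read_twice(a, lambda: a.device_write(b"bye")))  # b'byelo world'

    b = SharedBuffer()
    b.device_write(b"hello world")
    # Header and payload come from the same snapshot.
    print(read_once(b, lambda: b.device_write(b"bye")))  # b'hello world'
```

In the two-read case the length comes from the old header while the payload bytes come from the newer write, so the result mixes both messages. Whether this is the exact failure mode in test_bmm is not stated in the issue.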
johanna-rock-tt pushed a commit that referenced this issue Jan 16, 2025
tt-dma added a commit that referenced this issue Jan 16, 2025
tt-dma added a commit that referenced this issue Jan 17, 2025
tt-dma added a commit that referenced this issue Jan 17, 2025
@tt-dma
Contributor Author

tt-dma commented Jan 17, 2025

PR #16586, which contains the fix for this, is merged. Let me know if the problem persists.

@tt-dma tt-dma closed this as completed Jan 17, 2025
johanna-rock-tt added a commit that referenced this issue Jan 21, 2025