pickup specialization microkernels for gfx950 3.8.0rc20250909#2242

Closed
dezhiAmd wants to merge 9 commits into nod-ai:main from dezhiAmd:gfx950_ukernel

Contributor

@dezhiAmd dezhiAmd commented Sep 12, 2025

Pick up specialization microkernels for gfx950.
Refer to this IREE commit

Test results on gfx950 show that passing the following option to iree-compile gives better performance:
--iree-hip-enable-tensor-ukernels
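
As a rough sketch of how the option might be passed, the snippet below assembles an iree-compile command line in Python. Only `--iree-hip-enable-tensor-ukernels` comes from this PR; the helper name `build_iree_compile_cmd`, the file names, and the target flags (`--iree-hal-target-device=hip`, `--iree-hip-target=gfx950`) are assumptions added for illustration and may not match the flags used in the actual test runs.

```python
import shlex


def build_iree_compile_cmd(mlir_path, vmfb_path, enable_tensor_ukernels=True):
    """Return an iree-compile argument list (hypothetical helper, not from the PR)."""
    cmd = [
        "iree-compile",
        mlir_path,
        # Assumed target flags for a gfx950 HIP build; not stated in the PR.
        "--iree-hal-target-device=hip",
        "--iree-hip-target=gfx950",
        "-o",
        vmfb_path,
    ]
    if enable_tensor_ukernels:
        # The option this PR reports as improving performance on gfx950.
        cmd.append("--iree-hip-enable-tensor-ukernels")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so no IREE install is required.
    print(shlex.join(build_iree_compile_cmd("model.mlir", "model.vmfb")))
```

Building the argument list separately from running it makes it easy to compare builds with and without the option, for example by passing `enable_tensor_ukernels=False` for a baseline.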

Signed-off-by: dezhliao <dezhi.liao@amd.com>
…Perplexity[False] tests

Signed-off-by: dezhliao <dezhi.liao@amd.com>

codecov-commenter commented Sep 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@0ce9072). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2242   +/-   ##
=======================================
  Coverage        ?   78.01%           
=======================================
  Files           ?      228           
  Lines           ?    22032           
  Branches        ?        0           
=======================================
  Hits            ?    17188           
  Misses          ?     4844           
  Partials        ?        0           

Signed-off-by: dezhliao <dezhliao@amd.com>
dezhiAmd and others added 4 commits September 16, 2025 10:23
Signed-off-by: dezhliao <dezhi.liao@amd.com>
Signed-off-by: dezhliao <dezhi.liao@amd.com>
Signed-off-by: dezhliao <dezhliao@amd.com>
Signed-off-by: dezhliao <dezhliao@amd.com>
@dezhiAmd dezhiAmd changed the title from "test 3.8.0rc20250909" to "pickup specialization microkernels for gfx950 3.8.0rc20250909" Sep 16, 2025
@dezhiAmd dezhiAmd marked this pull request as ready for review September 16, 2025 17:47
@dezhiAmd dezhiAmd requested a review from rsuderman September 16, 2025 17:58
@dezhiAmd dezhiAmd enabled auto-merge (squash) September 16, 2025 17:58
),
),
False,
pytest.param(
Collaborator

We should not be bumping the version if this perplexity test is failing. If it is going to be xfailed, this needs more details.

Contributor Author

The related IREE issue is here.

fail-fast: false
matrix:
include:
- name: cpu
Collaborator

This shouldn't be removed - smoke test on CPU is still important. Only the batcher tests make sense to be removed.

Contributor Author

@dezhiAmd dezhiAmd Sep 16, 2025

The same IREE issue, iree-org/iree#22007, breaks the smoke test on CPU.

I am curious about the scenarios where compiling MLIR to a VMFB for a CPU target would be beneficial. From my understanding, AMD's strengths lie in GPU hardware, and AI inference workloads are typically GPU-accelerated, so I'm trying to better understand the rationale or use cases behind targeting the CPU in this context.

Comment thread .github/workflows/pkgci_shark_ai.yml
Signed-off-by: dezhliao <dezhi.liao@amd.com>
@dezhiAmd dezhiAmd requested a review from rsuderman September 17, 2025 00:08
Signed-off-by: dezhliao <dezhi.liao@amd.com>
@dezhiAmd
Contributor Author

Replacing this PR with #2205.

@dezhiAmd dezhiAmd closed this Sep 17, 2025
auto-merge was automatically disabled September 17, 2025 21:57

Pull request was closed

@dezhiAmd dezhiAmd deleted the gfx950_ukernel branch September 17, 2025 21:58