fix: Add tests for the AutoTuner and fix bug in _find_nearest_profile #2617
yzh119 merged 7 commits into flashinfer-ai:main
Conversation
Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds two new test modules: a parametrized FP8 GEMM autotuning test targeting bmm_fp8, and a core autotuner test suite covering cache hits/misses, dynamic dimension bucketization, and constraint application.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello @danisereb, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the test coverage for the AutoTuner.

Highlights
Activity
Code Review
The pull request introduces new unit tests for the AutoTuner functionality, specifically for bmm_fp8 and core autotuner logic. This is a positive step towards improving code quality and reliability, especially given the identified bug in _find_nearest_profile. The tests cover various scenarios like cache hits/misses, dynamic dimension bucketization, and constraint application. The use of pytest.mark.xfail for known bugs is appropriate. Overall, the changes enhance the test coverage and provide a solid foundation for future bug fixes and feature development.
```python
@XFAIL_LINKED_DYNAMIC
def test_find_nearest_profile_moe_same_bucket_same_profile():
    """MoE inputs mapping to the same bucket should share an identical profile."""
    config = TuningConfig(
```
This xfail mark indicates a known bug. While it's good to acknowledge it, the test itself is verifying a behavior that is currently incorrect. It might be more beneficial to have a separate test that explicitly fails due to the bug, and then this test could be uncommented/enabled once the bug is fixed. This approach makes the bug's impact clearer and tracks its resolution more directly.
We plan to fix the bug soon and remove the xfail.
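For context on how that removal can stay a one-line change: a shared marker such as `XFAIL_LINKED_DYNAMIC` is usually defined once at module scope and applied to every affected test. The sketch below is illustrative, not the definition from this PR; the reason string and `strict=True` are assumptions.

```python
import pytest

# Illustrative sketch (not the PR's actual definition): one module-level marker
# means every affected test is re-enabled by deleting a single line post-fix.
XFAIL_LINKED_DYNAMIC = pytest.mark.xfail(
    reason="AutoTuner._find_nearest_profile does not map all linked dynamic dims",
    strict=True,  # with strict=True, an unexpected pass fails the suite,
                  # so the marker cannot be forgotten once the bug is fixed
)

@XFAIL_LINKED_DYNAMIC
def test_placeholder_linked_dims():
    # Stand-in body; the real tests live in tests/autotuner/test_autotuner_core.py.
    raise AssertionError("known bug: linked dims are not bucketed together")
```

With `strict=True`, the suite itself flags the moment the fix lands, which addresses the tracking concern raised above.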
```python
@XFAIL_LINKED_DYNAMIC
def test_find_nearest_profile_maps_all_linked_dims():
    """One logical dynamic axis should update every linked tensor/dimension."""
    tuning_config = TuningConfig(
```
The test should be fixed along with the bugfix in the AutoTuner class.
```python
pytest.param(1000, 512, marks=XFAIL_LINKED_DYNAMIC),
pytest.param(4000, 2048, marks=XFAIL_LINKED_DYNAMIC),
pytest.param(8000, 4096, marks=XFAIL_LINKED_DYNAMIC),
pytest.param(12000, 8192, marks=XFAIL_LINKED_DYNAMIC),
```
@aleozlx these test cases triggered the fallback tactic when I used the trtllm MoE NVFP4 in vLLM.
The fallback tactic is used only when the number of tokens is not a power of 2 (I used FLASHINFER_LOGGING_LEVEL=DEBUG to see these prints).
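The expected buckets in the parametrized cases above (1000 → 512, 4000 → 2048, 8000 → 4096, 12000 → 8192) imply rounding the token count down to the nearest power of two; power-of-two counts map to themselves, so only non-power-of-two counts exercise the remapping path, which is consistent with the fallback observation. A minimal sketch of that rounding, with illustrative names (not FlashInfer's actual implementation):

```python
def round_down_to_power_of_two(n: int) -> int:
    """Largest power of two <= n, for n >= 1."""
    return 1 << (n.bit_length() - 1)

# Matches the expected buckets in the parametrized cases above.
for num_tokens in (1000, 4000, 8000, 12000):
    print(num_tokens, "->", round_down_to_power_of_two(num_tokens))
# 1000 -> 512, 4000 -> 2048, 8000 -> 4096, 12000 -> 8192
```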
@wenscarl can you review this PR?
Actionable comments posted: 1

🧹 Nitpick comments (3)

tests/autotuner/test_autotuner_bmm_fp8.py (1)

Line 55: `input` shadows the Python built-in. Rename it to `mat1` or `input_mat` to avoid masking the built-in.

✏️ Suggested rename

```diff
-    input = torch.randn([b, m, k], device="cuda", dtype=torch.bfloat16)
-    input_fp8, input_inv_s = to_float8(input, dtype=input_dtype)
+    mat1 = torch.randn([b, m, k], device="cuda", dtype=torch.bfloat16)
+    input_fp8, input_inv_s = to_float8(mat1, dtype=input_dtype)
```

tests/autotuner/test_autotuner_core.py (2)

Lines 253-254: Silence Ruff ARG001/ARG002 on the monkeypatched stub. Same pattern as `DummyRunner`: all four parameters are required by the method signature being patched but unused.

✏️ Proposed fix

```diff
-    def fake_profile(self, runner_obj, prof_inputs, tactic, **kwargs):
+    def fake_profile(_self, _runner_obj, _prof_inputs, tactic, **_kwargs):
         return {0: 5.0, 1: 1.0, 2: 3.0}[tactic]
```

Lines 39-47: Silence Ruff ARG002 warnings on stub interface methods. The unused parameters are interface requirements, but Ruff flags them. Prefix them with `_` to satisfy the linter without disabling rules globally.

✏️ Proposed fix

```diff
 class DummyRunner(TunableRunner):
     def __init__(self, valid_tactics=(0, 1, 2)):
         self.valid_tactics = valid_tactics

-    def get_valid_tactics(self, inputs, profile):
+    def get_valid_tactics(self, _inputs, _profile):
         return self.valid_tactics

-    def forward(self, inputs, tactic: int = -1, do_preparation: bool = False, **kwargs):
+    def forward(self, inputs, tactic: int = -1, do_preparation: bool = False, **_kwargs):
         return inputs[0]
```
Inline comment: tests/autotuner/test_autotuner_core.py, lines 101-142

The test test_find_nearest_profile_moe_shared_num_tokens_axis gives false confidence, because the exact power-of-two cases pass even when the linkage logic is broken. Update the test so it actually verifies linked-dimension behavior in AutoTuner._find_nearest_profile / DynamicTensorSpec: either (A) split the exact power-of-two param sets into a separate test that links only a single tensor (e.g., input_idx=(0,), shapes = (_moe_input_shapes(num_tokens)[0],)) to validate single-tensor bucketing, or (B) keep the linked six-tensor spec but add an explicit assertion after calling AutoTuner._find_nearest_profile that checks tensors 1..5 were adjusted (their first dimension != the original shapes[1..5][0], or equal to expected_bucket), so the test ensures the linkage actually updated all linked indices rather than relying on coincidental original values.
🧹 Nitpick comments (3)

tests/autotuner/test_autotuner_core.py (2)

Lines 39-47: Silence Ruff ARG002 with `@typing.override` (or `_`-prefixed params). Per Ruff's ARG002 rule, intentionally unused arguments should be prefixed with `_`. However, Ruff also exempts methods decorated with `@typing.override`, which is a cleaner choice for an interface implementation since it avoids renaming the public parameter names:

♻️ Proposed fix using `@typing.override`

```diff
+from typing import override

 class DummyRunner(TunableRunner):
     def __init__(self, valid_tactics=(0, 1, 2)):
         self.valid_tactics = valid_tactics

+    @override
     def get_valid_tactics(self, inputs, profile):
         return self.valid_tactics

+    @override
     def forward(self, inputs, tactic: int = -1, do_preparation: bool = False, **kwargs):
         return inputs[0]
```

Lines 272-273: Silence Ruff ARG001 by prefixing unused params with `_`. `tactic` is the only used parameter; the rest need the `_` prefix.

tests/autotuner/test_autotuner_bmm_fp8.py (1)

Lines 92-108: Expose the "fp8_gemm" op name as a constant from flashinfer.gemm.gemm_base to avoid fragile coupling. The hard-coded "fp8_gemm" string at line 97 creates maintenance risk: if the internal op name in gemm_base.py:949 ever changes, the cache lookup will silently miss and the is_cache_hit assertion at line 110 will fail. Following the existing pattern of exposing tuning configs like _FP8_GEMM_SM100_TUNING_CONFIG, define an op-name constant in gemm_base and import it in the test.
🧹 Nitpick comments (1)

tests/autotuner/test_autotuner_core.py (1)

Lines 310-319: The profile-order assumption makes the assertions fragile. Lines 312-313 address profiles[0] and profiles[1] by position, implicitly assuming _generate_optimization_profiles emits profiles in gen_tuning_buckets=(8, 16) insertion order. If that implementation detail changes, the test silently validates the wrong profile (or asserts the wrong shape) without a clear failure message.

♻️ Proposed refactor: key by bucket value instead of index

```diff
-    assert len(profiles) == 2
-    assert profiles[0].get_opt_shapes()[0] == (8, 4)
-    assert profiles[1].get_opt_shapes()[0] == (16, 8)
-
-    prepared = tuner._prepare_input_tensors(profiles[0], inputs)
+    assert len(profiles) == 2
+    profile_by_bucket = {p.get_opt_shapes()[0][0]: p for p in profiles}
+    assert set(profile_by_bucket) == {8, 16}
+    assert profile_by_bucket[8].get_opt_shapes()[0] == (8, 4)
+    assert profile_by_bucket[16].get_opt_shapes()[0] == (16, 8)
+
+    prepared = tuner._prepare_input_tensors(profile_by_bucket[8], inputs)
```
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/autotuner/test_autotuner_bmm_fp8.py (1)
110-114:runner_id == 0couples the test to runner list ordering inbmm_fp8internals.If
bmm_fp8's internal runner list is ever reordered (e.g., a new runner is prepended), this assertion silently breaks the test's correctness guarantee. Consider adding a comment explaining why index 0 is expected, or asserting on a named/constant rather than a positional index.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/autotuner/test_autotuner_bmm_fp8.py` around lines 110 - 114, The test currently asserts runner_id == 0 which couples it to internal runner ordering; change this to reference a stable identifier instead: either assert runner_id equals a named constant (e.g., EXPECTED_RUNNER_INDEX) or, better, assert on the runner name or other stable property returned by the bmm_fp8 autotuner (e.g., compare runner_name or runner_identifier to an expected string/constant) and add a short comment explaining why that specific runner is expected; update the assertion around runner_id in test_autotuner_bmm_fp8.py to use that stable symbol (or replace it with a comment if positional index must remain).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/autotuner/test_autotuner_bmm_fp8.py`:
- Around line 110-114: The test currently asserts runner_id == 0 which couples
it to internal runner ordering; change this to reference a stable identifier
instead: either assert runner_id equals a named constant (e.g.,
EXPECTED_RUNNER_INDEX) or, better, assert on the runner name or other stable
property returned by the bmm_fp8 autotuner (e.g., compare runner_name or
runner_identifier to an expected string/constant) and add a short comment
explaining why that specific runner is expected; update the assertion around
runner_id in test_autotuner_bmm_fp8.py to use that stable symbol (or replace it
with a comment if positional index must remain).
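A minimal sketch of the decoupled assertion the comment suggests; the runner list and names below are illustrative stand-ins, not the real `bmm_fp8` internals:

```python
# Illustrative only: a stand-in runner registry; the real bmm_fp8 runner
# list and its names are assumptions here.
runners = ["cublas_default", "cutlass_experimental"]

# Name the expectation instead of hard-coding a bare 0 in the assertion.
EXPECTED_RUNNER = "cublas_default"

runner_id = 0  # value the autotuner would report for the chosen runner

# The check survives reordering of `runners` as long as the name is stable.
assert runner_id == runners.index(EXPECTED_RUNNER)
```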
```
)
p1 = AutoTuner._find_nearest_profile(_moe_input_shapes(1000), config)
p2 = AutoTuner._find_nearest_profile(_moe_input_shapes(1023), config)
assert p1 == p2
```
The values of p1 and p2:

```
test_autotuner_core.py::test_find_nearest_profile_moe_same_bucket_same_profile
p1=((512, 4096), (1000, 256), (1000, 8), (1000, 8), (1000, 4096), (1000, 32))
p2=((512, 4096), (1023, 256), (1023, 8), (1023, 8), (1023, 4096), (1023, 32))
     ^            ^            ^          ^          ^             ^
```

Only the first tuple had its first dim rounded to 512 (`last_positive_power_of_2`). The correct result should be:

```
p1=((512, 4096), (512, 256), (512, 8), (512, 8), (512, 4096), (512, 32))
p2=((512, 4096), (512, 256), (512, 8), (512, 8), (512, 4096), (512, 32))
```
Dim 0 is num_tokens; the relevant code can be found in the flashinfer function `get_trtllm_moe_sm100_module`:

```python
# their first dimension is num_tokens which will be tuned
tuning_config_with_hidden_states_scales = TuningConfig(
    dynamic_tensor_specs=(
        DynamicTensorSpec(
            (0, 1, 2, 3, 4, 5),
            (0, 0, 0, 0, 0, 0),
            get_last_power_of_2_num_tokens_buckets(8192, 1),
            lambda x: min(last_positive_power_of_2(x), 8192),
            dynamic_tensor_initializers,
        ),
    )
)
```
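The corrected behavior (every linked tensor's num_tokens dimension mapped through the same bucket rounding) can be sketched independently of the real AutoTuner; `find_nearest_profile` below is an illustrative reimplementation, not flashinfer's actual code:

```python
def last_positive_power_of_2(x: int) -> int:
    """Largest power of two <= x, e.g. 1000 -> 512 and 1023 -> 512."""
    p = 1
    while p * 2 <= x:
        p *= 2
    return p


def find_nearest_profile(shapes, linked_tensors, linked_dim, cap=8192):
    """Round `linked_dim` of *every* linked tensor to one shared bucket.

    The buggy behavior rounded only the first linked tensor, leaving the
    rest at their raw sizes, so equal buckets produced unequal profiles.
    """
    bucket = min(last_positive_power_of_2(shapes[linked_tensors[0]][linked_dim]), cap)
    profile = []
    for i, shape in enumerate(shapes):
        dims = list(shape)
        if i in linked_tensors:
            dims[linked_dim] = bucket
        profile.append(tuple(dims))
    return tuple(profile)


# 1000 and 1023 tokens both fall into the 512 bucket, so the two
# profiles must coincide (the p1 == p2 assertion from the test above).
p1 = find_nearest_profile(((1000, 4096), (1000, 256), (1000, 8)), (0, 1, 2), 0)
p2 = find_nearest_profile(((1023, 4096), (1023, 256), (1023, 8)), (0, 1, 2), 0)
assert p1 == p2 == ((512, 4096), (512, 256), (512, 8))
```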
I added the actual fix and removed the xfail markers from the tests; now all the tests pass, and we can see the expected behavior.
wenscarl left a comment:
LGTM. In the future, we may want to add a guard to ensure the tensor is 2D and that all tensors share the same 0th dimension.
aleozlx left a comment:
lgtm
pls run pre-commit (see instructions in the PR desc template)
there seem to be "whitespace-only" changes in init.py that shouldn't be possible going through the formatters in the pre-commit hooks
Maybe update the title of this PR to include words like "fix".

@aleozlx I ran pre-commit. I don't see an issue with the new init.py; there is no diff between the files.
/bot run

[FAILED] Pipeline #44647621: 14/20 passed
tagged as ready to merge |
## 📌 Description

PR #2617 added a fix that solves "using fallback tactic" for TRTLLM MoE kernels. But after running more tests (`lm_eval`) with flashinfer v0.6.5, another issue was found: an error from the C++ file `csrc/trtllm_fused_moe_kernel_launcher.cu` (key not found in `launchers_map.at(tile_N)`). Fixing this is probably not simple; more details are in this draft PR (**NOT** for v0.6.6): #2695

In order to prevent the crash, the change in `_find_nearest_profile` will be reverted (to match flashinfer v0.6.4). The relevant AutoTuner tests were marked with "skip":

```
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_moe_shared_num_tokens_axis[1000-512] SKIPPED (_find_nearest_profile linked-dimension mapping was reverted;...)
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_moe_shared_num_tokens_axis[4000-2048] SKIPPED (_find_nearest_profile linked-dimension mapping was reverted...)
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_moe_shared_num_tokens_axis[8000-4096] SKIPPED (_find_nearest_profile linked-dimension mapping was reverted...)
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_moe_shared_num_tokens_axis[12000-8192] SKIPPED (_find_nearest_profile linked-dimension mapping was reverte...)
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_moe_same_bucket_same_profile SKIPPED (_find_nearest_profile linked-dimension mapping was reverted; re-enab...)
tests/autotuner/test_autotuner_core.py::test_find_nearest_profile_maps_all_linked_dims SKIPPED (_find_nearest_profile linked-dimension mapping was reverted; re-enable when ...)
```

The rest of the AutoTuner tests are all successful:

```
pytest --tb short tests/autotuner/
============================== test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0
rootdir: /my_home/workspace/dani_flashinfer
configfile: pytest.ini
plugins: anyio-4.12.1
collected 39 items

tests/autotuner/test_autotuner_bmm_fp8.py ............                    [ 30%]
tests/autotuner/test_autotuner_core.py ...........ssssss..........        [100%]

========================= 33 passed, 6 skipped in 0.95s =========================
```

**Using this branch, the failure from `trtllm_fused_moe_kernel_launcher.cu` does not happen.**

**vLLM main still uses flashinfer v0.6.4 (that does not include PR #2617).**

**This change should be included in flashinfer v0.6.6 (for use by vLLM).**

## Summary by CodeRabbit

* **Tests**
  * Temporarily disabled three autotuner tests pending restoration of linked-dimension bucket propagation functionality. Tests will be re-enabled once related features are restored.
…lashinfer-ai#2617 Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
📌 Description
I've found a possible bug in the AutoTuner (in the function `_find_nearest_profile`). Before attempting to fix the bug, we need unit tests for the AutoTuner to verify its functionality.
The new tests pass (with xfails for the known bug):
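The xfail marking mentioned here follows the standard pytest pattern; the test name matches this PR's suite, but the body and reason string below are illustrative:

```python
import pytest

# A known-bug test stays collected but is expected to fail until the fix
# lands; with strict=True an unexpected pass fails the suite. The reason
# string and the trivial body are illustrative stand-ins.
@pytest.mark.xfail(
    reason="_find_nearest_profile rounds only the first linked dimension",
    strict=True,
)
def test_find_nearest_profile_moe_same_bucket_same_profile():
    assert 1 == 2  # stand-in for the real p1 == p2 assertion
```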
The `xxxxxx` are the **xfail** tests.

🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

🧪 Tests
- [x] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

Reviewer Notes
Summary by CodeRabbit