Skip to content

Conversation

@alexcovington
Copy link
Contributor

This is a small change that removes the redundant PopCountOperator.Vectorizable guards from the TrailingZeroCount.Invoke paths for Vector256 and Vector512. The Invoke functions inlines better by removing these guards and results in a performance gain.

| Type                                      | Method            | Job        | Toolchain                   | BufferLength | Mean       | Error     | StdDev    | Median     | Min        | Max        | Ratio | Allocated | Alloc Ratio |
|------------------------------------------ |------------------ |----------- |---------------------------- |------------- |-----------:|----------:|----------:|-----------:|-----------:|-----------:|------:|----------:|------------:|
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 128          |  39.416 ns | 0.3226 ns | 0.3017 ns |  39.400 ns |  38.873 ns |  39.953 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 128          |   5.588 ns | 0.0560 ns | 0.0467 ns |   5.603 ns |   5.494 ns |   5.647 ns |  0.14 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 128          | 110.643 ns | 0.2528 ns | 0.2365 ns | 110.560 ns | 110.404 ns | 111.067 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 128          |  14.518 ns | 0.0879 ns | 0.0822 ns |  14.492 ns |  14.430 ns |  14.677 ns |  0.13 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 3079         | 204.174 ns | 0.5751 ns | 0.5098 ns | 204.422 ns | 202.991 ns | 204.618 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 3079         |  62.966 ns | 0.2158 ns | 0.1802 ns |  62.942 ns |  62.719 ns |  63.306 ns |  0.31 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 3079         | 800.225 ns | 7.7147 ns | 7.2163 ns | 800.139 ns | 790.602 ns | 814.042 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 3079         | 301.138 ns | 1.0371 ns | 0.8097 ns | 300.876 ns | 300.098 ns | 302.453 ns |  0.38 |         - |          NA |

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 16, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 16, 2025
@MihaZupan MihaZupan added area-System.Numerics.Tensors and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 16, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

@tannergooding
Copy link
Member

/ba-g unrelated infra issues

@tannergooding tannergooding merged commit 3ae4edb into dotnet:main Jun 18, 2025
86 of 90 checks passed
@tannergooding
Copy link
Member

Thanks for the improvement!

@MihaZupan
Copy link
Member

Should we have a JIT tracking issue for this? Seems like quite the difference for something that should have been stripped out already.

@tannergooding
Copy link
Member

I don't think there's much the JIT can do here. It's namely due to it requiring 3 total inlining steps to see that it ultimately becomes a constant. It's a bit of a niche case and the removed check here wasn't actually needed anyways

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Numerics.Tensors community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants