
remove HAS_FAST_INTEGER_VECTORS checks from panama support. #14901

Merged
rmuir merged 3 commits into apache:main from rmuir:panama_avx2
Jul 11, 2025

Conversation

@rmuir
Member

@rmuir rmuir commented Jul 6, 2025

For any integer vectorization code, the developer must remember to make this check, or suffer a 10x+ slowdown if AVX2 is unavailable. This can happen in virtual environments (default QEMU, virtualbox, etc).

It isn't worth the benefit of supporting floating point vectors on such machines, just remove this trap completely.

@github-actions
Contributor

github-actions bot commented Jul 6, 2025

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@rmuir rmuir requested a review from uschindler July 6, 2025 16:24
@rmuir
Member Author

rmuir commented Jul 6, 2025

@uschindler this one may impact jenkins randomization (simplify it, I think).

Contributor

@ChrisHegarty ChrisHegarty left a comment

LGTM

Contributor

@jpountz jpountz left a comment

I like the simplification.

@msokolov
Contributor

msokolov commented Jul 7, 2025

so now we throw an exception instead to guard the unwary from falling into the trap? Makes sense to me.

@rmuir
Member Author

rmuir commented Jul 7, 2025

@msokolov previously, on x86 machines without AVX2 (e.g. AVX1 or SSSE3), the vector API was still "enabled". But it is a trap, as many operations are not supported, and run 10x slower (or more).

With this change, we just disable vectors completely on such setups (floating point too). It means developers don't need to guard all the integer functions with if (HAS_FAST_INTEGER_VECTORS).
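The difference can be sketched as follows. This is a minimal illustration under assumptions: the class and method names (`VectorGateSketch`, `dotProductGuarded`, `chooseImplementation`) are hypothetical and not Lucene's actual code, and the "vectorized" kernel is only a stand-in. The single up-front condition mirrors the `amd64` / bit size check quoted later in this thread.

```java
// Hypothetical names for illustration; this is not Lucene's actual code.
final class VectorGateSketch {

  /** Before: every integer kernel has to remember to consult the flag. */
  static int dotProductGuarded(boolean hasFastIntegerVectors, byte[] a, byte[] b) {
    if (hasFastIntegerVectors) {
      return vectorizedDot(a, b);
    }
    return scalarDot(a, b);
  }

  /** After: one decision when the provider is chosen; kernels need no guard. */
  static String chooseImplementation(String osArch, int preferredVectorBitSize) {
    boolean amd64WithoutAvx2 = osArch.equals("amd64") && preferredVectorBitSize < 256;
    return amd64WithoutAvx2 ? "scalar" : "vectorized";
  }

  static int scalarDot(byte[] a, byte[] b) {
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Stand-in for a Panama kernel; here it simply reuses the scalar loop. */
  static int vectorizedDot(byte[] a, byte[] b) {
    return scalarDot(a, b);
  }

  public static void main(String[] args) {
    System.out.println(chooseImplementation("amd64", 128));
    System.out.println(chooseImplementation("amd64", 256));
  }
}
```

With the single gate, a developer who adds a new integer kernel cannot forget the check, because the vectorized implementation is never selected on such hardware in the first place.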

Goal is to remove the trap: these days you are most likely to encounter such a setup via virtualization that isn't set up in a fully optimized way (e.g. casual desktop user with virtualbox or qemu or something).

@rmuir
Member Author

rmuir commented Jul 7, 2025

The bug Uwe found is truly crazy. Need to step back here and fix SOMETHING to fail on it...

No java linter finds the issue at compile time, which is really surprising to me. With clang the problem is caught immediately by -Wformat-insufficient-args, it is the kind of silly mistake I'd expect to be caught at compile time.

Screen_Shot_2025-07-07_at_09 15 24

At runtime, the test passes, UNLESS you run it with -Dtests.verbose, then it fails.

Screen_Shot_2025-07-07_at_09 20 49

In the short-term, I think this is the easiest to address. I know we don't want tests to print but maybe we can hack the test so it fails without -Dtests.verbose
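For readers who have not hit this class of bug: a format string with more specifiers than arguments compiles without complaint in Java and only fails when the call is actually executed. A minimal sketch, where the class name `FormatArgsSketch` and the message text are made up for illustration and are not the actual Lucene message:

```java
import java.util.Locale;
import java.util.MissingFormatArgumentException;

// Hypothetical demo class; the format strings are invented for illustration.
final class FormatArgsSketch {

  /** Returns true if formatting the pattern with the given args throws at runtime. */
  static boolean failsAtRuntime(String pattern, Object... args) {
    try {
      String.format(Locale.ENGLISH, pattern, args);
      return false;
    } catch (MissingFormatArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Two specifiers, one argument: compiles fine, throws only when executed.
    System.out.println(failsAtRuntime("vector bitsize=%d; detail=%s", 256));
    // Matching counts: no exception.
    System.out.println(failsAtRuntime("vector bitsize=%d; detail=%s", 256, "ok"));
  }
}
```

Because the exception is raised only on the code path that reaches the call, a test run that never takes that path will not notice the mistake.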

@rmuir
Member Author

rmuir commented Jul 7, 2025

There's a subfolder of related checks in error-prone to investigate: https://github.com/google/error-prone/tree/d17e312d982badf2070404dde11b88c2538a5222/core/src/main/java/com/google/errorprone/bugpatterns/formatstring

Need to reaudit all the available checks anyway (as clearly these were missed by me before), but this might be a good compile-time way to find it.

@uschindler
Contributor

At runtime, the test passes, UNLESS you run it with -Dtests.verbose, then it fails.

Actually it does not throw the exception visible to the end user. Only the provider's constructor throws it, so the SPI lookup falls back to the default implementation. This disables Panama Vectors completely.

Previously the downstream code in the Panama SPI had to do the constant check at many places.
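The fallback described here can be sketched roughly like this. All names (`ProviderLookupSketch`, `PanamaLikeProvider`, `DefaultProvider`, `lookup`) are hypothetical, and the real lookup in `VectorizationProvider` does more than this; the sketch only shows a constructor that refuses to run and a caller that falls back instead of propagating the exception.

```java
// Hypothetical sketch of a provider lookup with fallback; not Lucene's actual code.
final class ProviderLookupSketch {

  interface Provider {
    String name();
  }

  static final class DefaultProvider implements Provider {
    @Override
    public String name() {
      return "default";
    }
  }

  static final class PanamaLikeProvider implements Provider {
    PanamaLikeProvider(boolean cpuIsSuitable) {
      // The single bail-out point: refuse to construct on unsuitable hardware.
      if (cpuIsSuitable == false) {
        throw new UnsupportedOperationException("CPU is not suitable for vectorization");
      }
    }

    @Override
    public String name() {
      return "panama";
    }
  }

  /** Tries the vectorized provider and falls back to the default one. */
  static Provider lookup(boolean cpuIsSuitable) {
    try {
      return new PanamaLikeProvider(cpuIsSuitable);
    } catch (UnsupportedOperationException e) {
      // The caller only sees a warning, never the exception itself.
      System.err.println("WARNING: vectorization disabled: " + e.getMessage());
      return new DefaultProvider();
    }
  }

  public static void main(String[] args) {
    System.out.println(lookup(false).name());
    System.out.println(lookup(true).name());
  }
}
```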

@uschindler
Contributor

uschindler commented Jul 7, 2025

At runtime, the test passes, UNLESS you run it with -Dtests.verbose, then it fails.

I still don't understand why it does not fail the tests. The info logging is always executed. Maybe it fails before due to randomization sometimes?

We want to test Panama Vector code also on non-supported hardware, so it creates a "Test instance" of the provider, which is slow but is not used in production.

I will check more closely later, just update the code first. It is a good thing to remove complexity for rare CPU flag combinations. Actually in KVM w/ QEMU (when using libvirt) it never disables if you have selected CPU type "Host Passthrough" (like Policeman Jenkins).

@uschindler
Contributor

@uschindler this one may impact jenkins randomization (simplify it, I think).

Should not affect Jenkins, we don't randomize that flag, it is/was done by Gradle itself.

@dweiss
Contributor

dweiss commented Jul 7, 2025

At runtime, the test passes, UNLESS you run it with -Dtests.verbose, then it fails.

What's the repro line for this? I don't think this should matter (gradle-wise); LuceneTestCase uses this flag - to set the verbose constant, maybe something depends on this.

public static final boolean VERBOSE = systemPropertyAsBoolean("tests.verbose", false);

}

if (PanamaVectorConstants.HAS_FAST_INTEGER_VECTORS == false) {
  throw new UnsupportedOperationException("No integer vector support");
Contributor

I'd make this message a bit more verbose, as it is logged to the end user in a warning (see VectorizationProvider around line 169).

Maybe "CPU flags do not guarantee fast integer vectors".

@uschindler
Contributor

At runtime, the test passes, UNLESS you run it with -Dtests.verbose, then it fails.

What's the repro line for this? I don't think this should matter (gradle-wise); LuceneTestCase uses this flag - to set the verbose constant, maybe something depends on this.

public static final boolean VERBOSE = systemPropertyAsBoolean("tests.verbose", false);

I think the test should fail due to the IllegalArgumentException anyways. There is some stupid bullshit going on. Verbose or not should both create the same result.

As said before, I think this did not fail on the github runner, because the virtualization does not enable panama at all. And a "test only" enforcement of bitsize is done randomly, so it won't fail all the time.

The problem with the above String#format() is that it is only executed if everything is sane with your CPU/virtualization and Hotspot flags.

@uschindler
Contributor

Hi, for me the test failed without verbose:

$ gradlew :lucene:core:test --tests TestVectorUtil

> Task :lucene:core:test
WARNING: Using incubator modules: jdk.incubator.vector

TestVectorUtil > classMethod FAILED
    java.util.MissingFormatArgumentException: Format specifier '%s'
        at java.base/java.util.Formatter.format(Formatter.java:2760)
        at java.base/java.util.Formatter.format(Formatter.java:2698)
        at java.base/java.lang.String.format(String.java:4509)
        at org.apache.lucene.internal.vectorization.PanamaVectorizationProvider.<init>(PanamaVectorizationProvider.java:56)
        at org.apache.lucene.internal.vectorization.VectorizationProvider.lookup(VectorizationProvider.java:168)
        at org.apache.lucene.internal.vectorization.BaseVectorizationTestCase.maybePanamaProvider(BaseVectorizationTestCase.java:39)
        at org.apache.lucene.internal.vectorization.BaseVectorizationTestCase.<clinit>(BaseVectorizationTestCase.java:25)
        at org.apache.lucene.util.TestVectorUtil.<clinit>(TestVectorUtil.java:404)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:543)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:631)

org.apache.lucene.util.TestVectorUtil > test suite's output saved to C:\Users\Uwe Schindler\Projects\lucene\lucene\lucene\core\build\test-results\test\outputs\OUTPUT-org.apache.lucene.util.TestVectorUtil.txt, copied below:
   >     java.util.MissingFormatArgumentException: Format specifier '%s'
   >         at java.base/java.util.Formatter.format(Formatter.java:2760)
   >         at java.base/java.util.Formatter.format(Formatter.java:2698)
   >         at java.base/java.lang.String.format(String.java:4509)
   >         at org.apache.lucene.internal.vectorization.PanamaVectorizationProvider.<init>(PanamaVectorizationProvider.java:56)
   >         at org.apache.lucene.internal.vectorization.VectorizationProvider.lookup(VectorizationProvider.java:168)
   >         at org.apache.lucene.internal.vectorization.BaseVectorizationTestCase.maybePanamaProvider(BaseVectorizationTestCase.java:39)
   >         at org.apache.lucene.internal.vectorization.BaseVectorizationTestCase.<clinit>(BaseVectorizationTestCase.java:25)
   >         at org.apache.lucene.util.TestVectorUtil.<clinit>(TestVectorUtil.java:404)
   >         at java.base/java.lang.Class.forName0(Native Method)
   >         at java.base/java.lang.Class.forName(Class.java:543)
   >         at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:631)

:lucene:core:test (FAILURE): 1 test(s), 1 failure(s)

@dweiss
Contributor

dweiss commented Jul 7, 2025

I think the test should fail due to the IllegalArgumentException anyways. There is some stupid bullshit going on. Verbose or not should both create the same result.

It's not about tests.verbose - it's about the seed. This code is either hit or not, depending on the main tests.seed - some other randomized, derived, parameters are affecting it.

@dweiss
Contributor

dweiss commented Jul 7, 2025

This:

    // hotspot misses some SSE intrinsics, workaround it
    // to be fair, they do document this thing only works well with AVX2/AVX3 and Neon
    boolean isAMD64withoutAVX2 =
        Constants.OS_ARCH.equals("amd64") && PREFERRED_VECTOR_BITSIZE < 256;
    HAS_FAST_INTEGER_VECTORS = isAMD64withoutAVX2 == false;

the vector size is randomized from this pool:

    String randomVectorSize =
        RandomPicks.randomFrom(
            new Random(buildGlobals.getProjectSeedAsLong().get()),
            List.of("default", "128", "256", "512"));

if it's 128, that code path is never reached.
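To make the seed dependence concrete, here is a small sketch that uses plain `java.util.Random` as a stand-in for the `RandomPicks` call in the build. The class name `SeededVectorSizeSketch`, the helper names, and the seed values are assumptions for illustration; how the build derives its project seed is not shown in this thread.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch: a seeded pick from the same pool of vector sizes.
final class SeededVectorSizeSketch {

  static final List<String> POOL = List.of("default", "128", "256", "512");

  /** The same seed always yields the same choice, so a given seed either hits the code path or never does. */
  static String pick(long seed) {
    return POOL.get(new Random(seed).nextInt(POOL.size()));
  }

  /** Mirrors the quoted condition: on amd64, a bit size below 256 rules out the fast path. */
  static boolean hasFastIntegerVectors(String osArch, int preferredVectorBitSize) {
    boolean isAMD64withoutAVX2 = osArch.equals("amd64") && preferredVectorBitSize < 256;
    return isAMD64withoutAVX2 == false;
  }

  public static void main(String[] args) {
    // Arbitrary seeds; each one prints a stable choice from the pool.
    for (long seed = 0; seed < 5; seed++) {
      System.out.println("seed " + seed + " -> " + pick(seed));
    }
  }
}
```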

@uschindler
Contributor

I think the test should fail due to the IllegalArgumentException anyways. There is some stupid bullshit going on. Verbose or not should both create the same result.

It's not about tests.verbose - it's about the seed. This code is either hit or not, depending on the main tests.seed - some other randomized, derived, parameters are affecting it.

Exactly! When running it multiple times it fails about 1/3 of the time. If the randomly assigned vector bitsize does not match expectations, it fails.

The problem with this PR is: We don't test the integer vectors anymore because Robert removed the "forceIntegerVectors" flag. I think it should stay alive (so please revert removal of the sysprop).

@dweiss
Contributor

dweiss commented Jul 7, 2025

So this will never fail:

./gradlew -p lucene/core test  --tests TestVectorUtil -Ptests.seed=6EA802BC0CAA73E2

but this will always fail:

./gradlew -p lucene/core test  --tests TestVectorUtil -Ptests.seed=6EA802BC0CAA73E2 -Ptests.vectorsize=256

@uschindler
Contributor

Ok, so the vectorSize is missing in the test reproducer line.

Anyway, that's an old issue. My problem with Robert's patch is that he removed the second system property tests.forceintegervectors. This allowed us to test the correctness of integer vectors.

To keep the sysprop away, my proposal would be: Whenever somebody sets the "bitsize" we should NOT bail out and execute the tests in TestVectorUtil.

@uschindler
Contributor

We should only apply this PR on main branch, because I don't want people to get sudden slowdowns in 10.x branch.

@rmuir
Member Author

rmuir commented Jul 7, 2025

We should only apply this PR on main branch, because I don't want people to get sudden slowdowns in 10.x branch.

Who would get a slowdown? If they are impacted by this PR, they've probably got a 10x slowdown happening somewhere already :)

I think the ongoing risk would be, that developer adds a new function to PanamaVectorUtilSupport.java, backports it, and then it causes a new 10x-20x slowdown.

That's the motivation behind this PR anyway. It shouldn't slow anyone down, just requires AVX2 on amd64 machines before we try to vectorize. IMO best to disable vectorization completely in such cases, it will actually prevent a 10x slowdown. "Supporting" some SSSE3-based 2x speedup for a couple of floating point methods isn't worth it?

         Constants.OS_ARCH.equals("amd64") && PREFERRED_VECTOR_BITSIZE < 256;
-    HAS_FAST_INTEGER_VECTORS =
-        VectorizationProvider.TESTS_FORCE_INTEGER_VECTORS || (isAMD64withoutAVX2 == false);
+    HAS_FAST_INTEGER_VECTORS = isAMD64withoutAVX2 == false;
Contributor

See above and let's do it like that:

HAS_FAST_INTEGER_VECTORS = isAMD64withoutAVX2 == false || TESTS_VECTOR_SIZE.isPresent()

Contributor

In the special test mode I'd like to test our code, although CPU is wrong.
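A minimal sketch of the proposed condition, under assumptions: `TestModeOverrideSketch` is a hypothetical class, and the enforced test vector size is modeled as an `Optional<Integer>` parameter (the actual type of `TESTS_VECTOR_SIZE` is not shown here). Setting any test vector size keeps the vectorized code under test, even on a CPU that would otherwise be rejected.

```java
import java.util.Optional;

// Hypothetical sketch of the proposed test-mode override; not Lucene's actual code.
final class TestModeOverrideSketch {

  static boolean hasFastIntegerVectors(
      String osArch, int preferredVectorBitSize, Optional<Integer> testsVectorSize) {
    boolean isAMD64withoutAVX2 = osArch.equals("amd64") && preferredVectorBitSize < 256;
    // An explicitly enforced test vector size switches on "test mode".
    return isAMD64withoutAVX2 == false || testsVectorSize.isPresent();
  }

  public static void main(String[] args) {
    // Production on an amd64 CPU without AVX2: no vectorization.
    System.out.println(hasFastIntegerVectors("amd64", 128, Optional.empty()));
    // Same CPU, but a test run enforces a vector size: code is still exercised.
    System.out.println(hasFastIntegerVectors("amd64", 128, Optional.of(128)));
  }
}
```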

Member Author

if we can better contain this constant we can also give it a better name. The idea is: if you are on x86, you need AVX2 as a minimum baseline for us to support vectors.

Contributor

That's OK. I just want the condition to stay alive here.

Contributor

@uschindler uschindler left a comment

see my comments

@uschindler
Contributor

uschindler commented Jul 7, 2025

if it's 128, that code path is never reached.

That's the reason why I want the test mode to keep testing integer vectors also on Intel CPUs without AVX2.

@rmuir
Member Author

rmuir commented Jul 7, 2025

There's nothing to test. This change makes it so developers only support avx2 and avx512. That's the idea.

I will fix the name of the confusing constant and consider inverting the logic to call it something like "ancientIntel". It's not related to anything about integers anymore. If you want vectors on x86 you need AVX2 or AVX-512.

@uschindler
Contributor

There's nothing to test. This change makes it so developers only support avx2 and avx512. That's the idea.

I will fix the name of the confusing constant and consider inverting the logic to call it something like "ancientIntel". It's not related to anything about integers anymore. If you want vectors on x86 you need AVX2 or AVX-512.

Please read again what I said: I want the test mode (where we also randomize the preferred bitsize) to always check the correctness of the integer code. Testing can be slow, but the integer code has to be tested. Therefore please remove the UOE in test mode. It's plain and simple.

If you don't agree with that let's remove all test-mode code. Now it is incomplete.

@rmuir
Member Author

rmuir commented Jul 7, 2025

I understand but the terminology is killing us. There's no more integer-specific stuff anymore, just 128bit case. And we should test it... Slowly, and maybe throw in 64 too!

But the purpose of this change is to remove the confusion around integers and floats. Instead it's just 128, 256, 512. If you are on Intel and only have 128, you get no special sauce.

@uschindler
Contributor

That's all fine, then remove the constant. It is no longer risky because the bail-out happens in one single place, in production.

Maybe add a constant called "PANAMA_TEST_MODE" that is enabled once you set bitsize. Let's remove the second sysprop, just keep the "bitsize".

Once any bitsize is enabled, let's stay in test mode and therefore the tests for integers should run. That is a one-line change, so it does not add confusion, just rename the constant.

I am just waiting for you to finish and then add my changes, I did not want to interfere with you.

Sorry: I wrote the original code with the system properties. It's a bit complicated, so I am only trying to help. And it was only because of the formatter problem that we noticed the issue at all: the GitHub runner never executes Panama tests anymore, because it has no AVX2.

@rmuir
Member Author

rmuir commented Jul 7, 2025

It's helpful, it points out the confusion. The constant needs renaming, to really improve this.

Separately, testing needs a "backdoor" (which may not be working correctly with this PR in its current state), otherwise nothing will reproduce for anyone, because everyone has different hardware.

@uschindler
Contributor

I can apply my modifications. I'd suggest to not add a constant and remove the property as you did.

I will just change the code to test integer vectors when an enforced bitsize is also given by -Ptests.vectorsize=...

@uschindler
Contributor

I would also remove the constant so people don't even think about checking HAS_FAST_INTEGER_VECTORS.

@rmuir
Member Author

rmuir commented Jul 7, 2025

if you have the time, please feel free. I won't get to this until later.

@uschindler
Contributor

OK changes pushed.

@uschindler
Contributor

When verifying the PanamaVectorConstants file I found out that PRERERRED_LONG_SPECIES is unused field. Should we remove?

@rmuir
Member Author

rmuir commented Jul 8, 2025

When verifying the PanamaVectorConstants file I found out that PRERERRED_LONG_SPECIES is unused field. Should we remove?

+1 to nuke it. this is the last place we want to have dead code...

@uschindler
Contributor

When verifying the PanamaVectorConstants file I found out that PRERERRED_LONG_SPECIES is unused field. Should we remove?

+1 to nuke it. this is the last place we want to have dead code...

done.

@rmuir rmuir merged commit 41a4a1d into apache:main Jul 11, 2025
8 checks passed
@rmuir
Member Author

rmuir commented Jul 11, 2025

I merged this (hopefully not prematurely) because I'm looking into options to enforce that the correct checks are happening, and this makes the code confusing.
