Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MutableHttpFields.asImmutable avoids copy #10651

Merged
merged 19 commits into from
Oct 16, 2023

Conversation

gregw
Copy link
Contributor

@gregw gregw commented Oct 2, 2023

Avoid a copy in MutableHttpFields.asImmutable if the mutable is never mutated again.
Other minor improvements are included:

  • Better iterator usage
  • Use StringUtil.asciiEqualsIgnoreCase for HttpHeader#is(String)
  • Case sensitive HttpMethod#is(String)

Avoid a copy in MutableHttpFields.asImmutable if the mutable is never mutated again.
@gregw gregw requested review from sbordet and lorban October 2, 2023 12:03
@@ -59,7 +58,7 @@ public HttpFields asImmutable()
public int hashCode()
{
int hash = 0;
Copy link
Contributor

@joakime joakime Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the hash function start with an arbitrary prime number? (not 0)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, since we do xor's in the loop, starting with a hash of 31 would improve the spreading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregw this should be a prime number.

@lorban
Copy link
Contributor

lorban commented Oct 2, 2023

@gregw FYI the testBasicAuthenticationWithAuthenticationRemoved test failures are caused by your MutableHttpFields.asImmutable() changes.

Avoid a copy in MutableHttpFields.asImmutable if the mutable is never mutated again.
Do not persist a defaulted charset used in the request.
Throw UnsupportedEncodingException from getReader
Improve performance with asciiEqualsIgnoreCase
HttpMethod is case-insensitive
* @return true if the string ends with the substring, ignoring {@link StandardCharsets#US_ASCII} case differences.
* @deprecated Use {@link #asciiEndsWithIgnoreCase(String, String)}
*/
@Deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Deprecated(since = "12.0.3") ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against the since attribute, but either we are consistent, run through the code add it to all deprecations, and we have a policy, otherwise it's not so useful and inconsistent.

@gregw gregw requested review from lorban and joakime October 3, 2023 12:27
Copy link
Contributor

@joakime joakime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some open thoughts

lorban
lorban previously approved these changes Oct 4, 2023
@gregw gregw requested review from sbordet and joakime October 4, 2023 12:57
joakime
joakime previously approved these changes Oct 4, 2023
Copy link
Contributor

@joakime joakime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hashCode() implementation in ImmutableHttpFields is non-ideal.
But that hints at a larger issue in the codebase.
Probably should review that across the codebase in a future PR.

@@ -59,7 +58,7 @@ public HttpFields asImmutable()
public int hashCode()
{
int hash = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregw this should be a prime number.

@gregw
Copy link
Contributor Author

gregw commented Oct 4, 2023

@joakime this is already a bit of an omnibus PR, so I have fixed the other hash calculations I can see that are based on 0.

}
else if (fields != null)
{
_fields = new HttpField[fields.size() + INITIAL_SIZE];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_fields = new HttpField[fields.size() + INITIAL_SIZE];
_fields = new HttpField[fields.size() + SIZE_INCREMENT];

@lorban
Copy link
Contributor

lorban commented Oct 5, 2023

@gregw I have an extra set of small but potentially controversial changes worth 1-2% of perf in my perf/jetty-12-HttpURI-violations branch. Would you like me to push them here?

@@ -479,10 +479,10 @@ private static class RetainedBucket
private RetainedBucket(int capacity, int poolSize)
{
if (poolSize <= ConcurrentPool.OPTIMAL_MAX_SIZE)
_pool = new ConcurrentPool<>(ConcurrentPool.StrategyType.THREAD_ID, poolSize, true);
_pool = new ConcurrentPool<>(ConcurrentPool.StrategyType.THREAD_ID, poolSize, false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorban I like this change as THREAD_ID and the ThreadLocal cache are kind of trying to do the same thing anyway.
However, I wonder if we could improve the THREAD_ID algorithm a little bit. Currently it is used as:

            case THREAD_ID -> (int)(Thread.currentThread().getId() % size);

The problem here is that typically the threads in use will be consecutive IDs, so if there are 50 threads in use and the pool is 100 in size, it may be that 50 consecutive entries are all in use and that an acquire by a thread that already has used one buffer will now need to scan over 49 others to find an empty slot. Perhaps we should be treating it a little more like an initial hash and spreading the thread ID over the available size more. Perhaps the following would work a little bit better:

            case THREAD_ID -> (int)((Thread.currentThread().getId() * 31) % size);

(31 selected because it is prime)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the THREAD_ID strategy to multiply by 31, ran the benchmark, and measured no noticeable change.

Using the thread id as an index combined with the randomness of the OS' scheduler is already spreading the threads sufficiently well, it seems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still it might be best to keep the * 31. It is probably rare that all threads need buffers at exactly the same time, but I also think it could occur. It may be the benchmark is not the right pattern to make it happen. So you saw no cost to doing this, so let's keep it.

Copy link
Contributor

@lorban lorban Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a micro benchmark to see exactly what's going on and the * 31 costs exactly two cheap instructions: shift left by 5, then subtract once.

No big deal, so I'm okay to keep this change.

@gregw gregw requested a review from sbordet October 10, 2023 01:46
@gregw
Copy link
Contributor Author

gregw commented Oct 11, 2023

@sbordet nudge

@gregw gregw dismissed stale reviews from joakime and lorban via dc97245 October 13, 2023 13:29
@gregw gregw merged commit ffe80cd into jetty-12.0.x Oct 16, 2023
2 checks passed
@gregw gregw deleted the experiment/12/HttpFields_asImmutable branch October 16, 2023 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

4 participants