Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #8769 - Introduce new Compression Handler with support for gzip, brotli, and zstandard #12075

Merged
merged 1 commit into from
Oct 30, 2024

Conversation

joakime
Copy link
Contributor

@joakime joakime commented Jul 22, 2024

Introduces new jetty-compression libs.

  • jetty-compression-common is the common location for Compression classes (neither server or client specific, just compression)
  • jetty-compression-gzip is the implementation of the generic compression lib to support Content-Encoding: gzip
  • jetty-compression-brotli is the implementation of the generic compression lib to support Content-Encoding: br (brotli)
  • jetty-compression-zstandard is the implementation of the generic compression lib to support Content-Encoding: zstd (zstandard)
  • jetty-compression-server is the implementation of the DynamicCompressionHandler (and associated Request/Response wrappers)
  • jetty-compression-tests is the integration tests of features.

Fixes: #8769

@joakime joakime changed the title Issue #8769 - Implement Dynamic Compression Handler with support for gzip, brotli, and zstd Issue #8769 - Implement Dynamic Compression Handler with support for gzip, brotli, and zstandard Jul 22, 2024
@gregw
Copy link
Contributor

gregw commented Aug 20, 2024

@joakime what is the status of this? Should we be looking closer at it?

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some initial thoughts... all reading well so far!

@joakime joakime changed the title Issue #8769 - Implement Dynamic Compression Handler with support for gzip, brotli, and zstandard Issue #8769 - Introduce new Compression Handler with support for gzip, brotli, and zstandard Aug 21, 2024
@joakime joakime marked this pull request as ready for review October 2, 2024 21:10
@joakime joakime requested a review from gregw October 3, 2024 11:59
Copy link
Contributor

@lorban lorban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is my first batch of comments.

private final String etagSuffixQuote;
private ByteBufferPool byteBufferPool;
private Container container;
private int bufferSize = 2048;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to use 2K? That looks a bit small to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few buffer size calculations present here.
There's the default size, typically used for non-compression related stuff, this is the 2kb you are seeing here.
There's Encoder specific buffer sizes.
There's Decoder specific buffer sizes.

See:

  • gzip - GzipDecoderConfig - 512 bytes as default - taken from java.util.zip.GZIPInputStream codebase
  • gzip - GzipEncoderConfig - 512 bytes as default - taken from java.util.zip.GZIPOutputStream codebase
  • zstd - ZstandardDecoderConfig - 64k typical - calculated from ZstdInputStreamNoFinalizer.recommendedDOutSize()
  • zstd - ZstandardEncoderConfig - 256k typical - calculated from ZstdOutputStreamNoFinalizer.recommendedCOutSize()
  • brotli - BrotliDecoderConfig - 16384 bytes - taken from Brotli4j codebase
  • brotli - BrotliEncoderConfig - 4096 bytes - taken from Brotli4j codebase.

As you can see, the value for "buffer size" is quite varied, depends on the implementation, and if you are encoding or decoding.

* @param length the requested size of the buffer
* @return the ByteBuffer suitable for this compression implementation.
*/
public abstract RetainableByteBuffer acquireByteBuffer(int length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think internally wrapping a given ByteBufferPool.Sized with one that does all the magic about byte order, directness and size would be clearer instead of introducing these acquire methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that originally, but the sheer amount of code to do that was just off-putting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the zero length acquire rules with retainable were quite confusing, and the likely reason that we do a zero sized check in so many places we use ByteBufferPool.acquire still exists.

private final String encodingName;
private final String etagSuffix;
private final String etagSuffixQuote;
private ByteBufferPool byteBufferPool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to use a ByteBufferPool.Sized (which is a new Jetty 12.1 concept)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only talk about the decision to use this vs ByteBufferPool.Sized at the moment.
I'll change it to ByteBufferPool.Sized now, and see what happens.
The history: The ByteBufferPool.Sized was quite buggy when I wrote this. It didn't behave properly at that point in time.
Also, at the time, (no longer true), the raw ByteBuffer was going through internals of the compression libs and using the RetainableByteBuffer concepts it made it difficult to track the specific buffer that was held in the compression lib to the buffer from this pool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also aligns with the default ArrayByteBufferPool from the server, which isn't a ByteBufferPool.Sized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a ByteBufferPool is unconfigured here, then the default ByteBufferPool.NON_POOLING is used, and since not all ByteBufferPool implementations are ByteBufferPool.Sized, this was the best and most flexible solution. If the implementation wants to use ByteBufferPool.Sized, then a call to .setByteBufferPool(new ByteBufferPool.Sized()) will still work here.

* @return the {@link InputStream} implementation for this compression.
* @throws IOException if unable to create InputStream
*/
public abstract InputStream newDecoderInputStream(InputStream in, DecoderConfig config) throws IOException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should expose any method offering InputStream usage as Content.Source already offers a InputStream asInputStream(Content.Source source) static helper for this purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses the compression implementation version of the InputStream, not the wrapped version of a wrapped of an extension. It's mostly used for tests and demonstration purposes. It will come to be especially handy for Jetty Client users in the future (when this is integrated there) as well.


// Can Brotli4J use direct byte buffers?
RetainableByteBuffer buffer = getByteBufferPool().acquire(length, false);
buffer.getByteBuffer().order(getByteOrder());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change the pooled buffers' byte ordering, this should be reset once the buffer is released.

Either that's a job for yet another wrapping pool, or maybe build that into our existing pools.

Copy link
Contributor Author

@joakime joakime Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our existing pools do that by calling BufferUtil.reset() - tho the entire new set of pools should probably have beefed up testing to ensure that this is the case. (btw, this order change behavior is also present in GzipHandler's GzipResponseAndCallback)

if (_buffer == null)
{
_buffer = getRequest().getComponents().getByteBufferPool().acquire(_bufferSize, false);
ByteBuffer byteBuffer = _buffer.getByteBuffer();
// Per RFC-1952, GZIP is LITTLE_ENDIAN
byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
BufferUtil.flipToFill(byteBuffer);
// Add GZIP Header
byteBuffer.put(GZIP_HEADER, 0, GZIP_HEADER.length);

@Override
protected void release()
{
decoder.destroy();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like making DecoderSource Closeable would be more representative as you're not releasing reusable resources but cleaning up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, but then the users of this Decoder.Source have to be mindful to call close() on it, a new concept that Content.Source doesn't have. Also, only brotli has this requirement, gzip and zstd do not. (well, gzip has a variation of this requirement)

throw new IllegalStateException("ByteBufferPool does not return zstd-jni required direct ByteBuffer");
}
// We rely on the ByteBufferPool.release(ByteBuffer) performing a ByteBuffer order reset to default (big-endian).
// Typically, this is done with a BufferUtil.reset(ByteBuffer) call on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a test to the current pool's suite to ensure byte ordering is properly reset sounds advisable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since GzipHandler already does this ByteBuffer.order change (after acquire), I don't see this as a big deal.

If we decide to address this with more testing, then we should do it for all of our new ByteBufferPool implementations and also review other places where this order change occurs.

That would make for a good (but separate) PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened a separate (initial) PR for this at #12432

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm primarily reviewing for naming and structure at the moment.
Mostly good, but a few comments so far...

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good enough to merge as an experimental handler.

with gzip, brotli, and zstandard.

* New experimental CompressionHandler that is
  an eventual replacement for GzipHandler
@joakime joakime force-pushed the fix/12.1.x/dynamic-compression-handler branch from 045c094 to c3565d5 Compare October 30, 2024 12:27
@joakime joakime merged commit 8dbd278 into jetty-12.1.x Oct 30, 2024
5 of 8 checks passed
@joakime joakime deleted the fix/12.1.x/dynamic-compression-handler branch October 30, 2024 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

Introduce CompressionHandler to support compression from gzip, brotli, and zstandard
3 participants