
Conversation

@Tim-Brooks
Contributor

This is related to #27051. Essentially, we want to use big arrays for
byte reuse when reading from and writing to channels. Unfortunately, big
arrays are currently designed to expand forever. That does not fit
reading from a channel, where you periodically want to drop bytes once a
message has been fully read.

First, this commit refactors big arrays to keep pages and recyclers in
a single array, rather than the two separate arrays used today. This
simplifies the array manipulation that is necessary when dropping bytes
from the head.

Second, it adds a dropFromHead method that removes and releases pages
up to the provided index. This also requires introducing optional
"offsets" for big arrays.
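To illustrate what the offset bookkeeping amounts to, here is a minimal sketch (the class name PagedByteBuffer, the 16 KB page size, and the list-based storage are all hypothetical, not the code in this PR):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the offset/dropFromHead idea, not the actual patch.
class PagedByteBuffer {
    static final int PAGE_SHIFT = 14;            // pageSize == 1 << PAGE_SHIFT (16 KB, assumed)
    static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    static final int PAGE_MASK = PAGE_SIZE - 1;

    private final List<byte[]> pages = new ArrayList<>();
    private long offset;                         // logical index of the first byte still retained

    void set(long index, byte b) {
        long relative = index - offset;
        while ((relative >>> PAGE_SHIFT) >= pages.size()) {
            pages.add(new byte[PAGE_SIZE]);      // a real implementation would take pages from the recycler
        }
        pages.get((int) (relative >>> PAGE_SHIFT))[(int) (relative & PAGE_MASK)] = b;
    }

    byte get(long index) {
        long relative = index - offset;
        assert relative >= 0 : "index was already dropped from the head";
        return pages.get((int) (relative >>> PAGE_SHIFT))[(int) (relative & PAGE_MASK)];
    }

    /** Releases every page that lies entirely before {@code index}. */
    void dropFromHead(long index) {
        while (!pages.isEmpty() && index - offset >= PAGE_SIZE) {
            pages.remove(0);                     // a real implementation would return the page to the recycler
            offset += PAGE_SIZE;
        }
    }
}
```

The key point is that indices stay logical positions in the stream: dropFromHead releases whole pages and advances the offset, so later indices keep working without any copying.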

@Tim-Brooks
Contributor Author

I implemented a test for dropping bytes from a ByteArray. I still need to replicate this test for the other array types, but I wanted to push this PR up now to get some feedback in case anyone has an issue with my approach.

@jpountz
Contributor

jpountz commented Nov 2, 2017

To me big arrays are a bit too complex already so I'm a bit on the fence about making them even more complex. I'm wondering whether we should build another abstraction on top of the page cache recycler instead of reusing big arrays here?

@s1monw
Contributor

s1monw commented Nov 2, 2017

> To me big arrays are a bit too complex already so I'm a bit on the fence about making them even more complex. I'm wondering whether we should build another abstraction on top of the page cache recycler instead of reusing big arrays here?

I am ok with this, @tbrooks8 WDYT?

@Tim-Brooks
Contributor Author

> To me big arrays are a bit too complex already so I'm a bit on the fence about making them even more complex. I'm wondering whether we should build another abstraction on top of the page cache recycler instead of reusing big arrays here?

I have a few thoughts.

  1. The PageCacheRecycler is not really something that we expose. We pass BigArrays to the transport upon creation, so that is what I have access to. We can change that, but that is what exists now.

  2. I imagine that we are still going to want to use BigArrays for the outbound byte messages for now, which means that we still need access to BigArrays in the transport (although we can reconsider that at some point). One limitation of passing a ByteReference as the message is that we must wait until the message is completely written before we start releasing pages. Obviously, with a specialized nio transport and a different data structure, we could release pages incrementally as we write a message (maybe an improvement for large messages).

  3. I'm not sure I completely follow when you say this makes BigArrays more complicated. There are two parts to this PR: (1) some unification between the different big array types (same code paths for page allocation and resizing, same page array, etc.) in AbstractBigArray, which I'm not sure is more complicated so much as different; and (2) the introduction of offsets and dropping pages from the front of an array, which is more complicated.

I guess what I'm seeing is that I can create a different data structure, but there is still some stuff in AbstractBigArray that is valuable (all of the power-of-two indexing work). Do we prefer that I essentially rip out the common offset and "drop from head" work and implement a new AbstractBigArray (hypothetically CircularBigByteArray) that has the specialized offset and dropping logic? In this scenario I would need to keep the unification of the pages and recycler arrays, as the old AbstractBigArray is not designed to release pages in a world where the pages have moved around. Essentially I would want to keep some of the work related to part 1 of this PR.

Or should I create an accessor for the PageCacheRecycler from BigArrays and when BigArrays is passed to a transport I access the PageCacheRecycler and create a new allocator thing (hypothetically ChannelBuffers). In this scenario I would probably look to extract a new super class of AbstractBigArray that shares the power of two logic (hypothetically PowerOfTwoArray).
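As a very rough sketch of what that second option could look like (the recycler interface, the ChannelBuffers shape, and the page size below are all hypothetical placeholders, not the existing PageCacheRecycler API):

```java
import java.io.Closeable;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical minimal recycler interface, standing in for whatever we would actually pass around.
interface PageRecycler {
    Page acquire();                                  // closing the page returns it to the pool

    interface Page extends Closeable {
        byte[] bytes();
        @Override void close();
    }
}

// Hypothetical channel-side buffer built directly on recycled pages instead of BigArrays.
class ChannelBuffers {
    private static final int PAGE_SIZE = 1 << 14;    // assumed 16 KB pages

    private final PageRecycler recycler;
    private final Deque<PageRecycler.Page> pages = new ArrayDeque<>();
    private long released;                           // bytes already dropped from the head

    ChannelBuffers(PageRecycler recycler) {
        this.recycler = recycler;
    }

    /** Grow the tail by one recycled page before reading more bytes from the channel. */
    byte[] addTailPage() {
        PageRecycler.Page page = recycler.acquire();
        pages.addLast(page);
        return page.bytes();
    }

    /** Once a message has been fully decoded, return its pages to the recycler. */
    void releaseUpTo(long index) {
        while (!pages.isEmpty() && index - released >= PAGE_SIZE) {
            pages.pollFirst().close();
            released += PAGE_SIZE;
        }
    }
}
```

The structure only ever appends recycled pages at the tail and closes them from the head, which matches the read-side usage a channel needs.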

@jpountz
Contributor

jpountz commented Nov 2, 2017

> I'm not sure I completely follow when you say this makes BigArrays more complicated.

I think my main concern is the introduction of a new way of consuming big arrays via the addition of dropFromHead and offset/size to AbstractBigArray.

> Do we prefer that I essentially rip out the common offset and "drop from head" work and implement a new AbstractBigArray (hypothetically CircularBigByteArray) that has the specialized offset and dropping logic?

That would work for me.

> Or should I create an accessor for the PageCacheRecycler from BigArrays and when BigArrays is passed to a transport I access the PageCacheRecycler and create a new allocator thing (hypothetically ChannelBuffers).

I don't like exposing the internals of BigArrays. Could we pass the PageCacheRecycler in addition to BigArrays to the transport? If yes, then that would work for me too.

> In this scenario I would probably look to extract a new super class of AbstractBigArray that shares the power of two logic (hypothetically PowerOfTwoArray).

That logic is simple enough that I wouldn't mind it being duplicated.
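For reference, the power-of-two logic in question boils down to a shift and a mask; a minimal sketch, assuming a 16 KB page size rather than BigArrays' actual constants:

```java
// Hypothetical helper showing the power-of-two page indexing being discussed.
class PowerOfTwoIndexing {
    static final int PAGE_SHIFT = 14;                  // pageSize == 1 << PAGE_SHIFT (assumed)
    static final int PAGE_MASK = (1 << PAGE_SHIFT) - 1;

    static int pageIndex(long index) {
        return (int) (index >>> PAGE_SHIFT);           // which page holds the element
    }

    static int indexInPage(long index) {
        return (int) (index & PAGE_MASK);              // position of the element within that page
    }
}
```

With a power-of-two page size both helpers compile down to a single instruction each, which is why duplicating them is cheap.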

@Tim-Brooks
Contributor Author

Thanks @jpountz. Your last comment gives me some approaches to work with.

@s1monw
Contributor

s1monw commented Nov 22, 2017

@tbrooks8 is this still on or can we close it?

@Tim-Brooks
Contributor Author

Closed. We are going to go with a different approach.
