Skip to content
This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

Removed Old BufferReader #2025

Merged
merged 20 commits into from
Jan 9, 2018

Conversation

KrzysztofCwalina
Copy link
Member

@KrzysztofCwalina KrzysztofCwalina commented Jan 5, 2018

I changed all members of the old reader into extension methods over the pipelines reader, and deleted the old reader.

@davidfowl, @pakrym

bytes = default;
return false;
}
bytes = new ReadOnlyBuffer(reader.Position, position.Value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we discussed, we need to make it correct, or remove the ctor.

@@ -42,7 +44,9 @@ public BufferReader(TSequence buffer)

public Position Position => _currentPosition + _index;

public ReadOnlySpan<byte> Span => _currentSpan;
public ReadOnlySpan<byte> CurrentSegment => _currentSpan;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about Span. We don't call it Segment in IMemoryList why call it segment here.

Copy link
Member Author

@KrzysztofCwalina KrzysztofCwalina Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cause IMemoryList is a list of Memory<T> instances. ReadOnlyBuffer is a sequence of segments, not sequence of spans.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a list of things you can get Memory from, same as here, list of things you can get Span from.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call it Span

_index = 0;
}

public Position? PositionOf(byte value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is same logic in ROB.Seek

Copy link
Member Author

@KrzysztofCwalina KrzysztofCwalina Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but I don't have access to End

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't mean we have to duplicate everything on reader now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We should keep this API minimal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to eliminate the old reader. To do this I need a method roughly like the following:

        public static bool TryReadUntill(ref this BufferReader<ReadOnlyBuffer> reader, out ReadOnlyBuffer bytes, byte delimiter)
        {
            var position = reader.PositionOf(delimiter);
            if (position == null)
            {
                bytes = default;
                return false;
            }
            bytes = new ReadOnlyBuffer(reader.Position, position.Value);
            reader.SkipTo(position.Value);
            reader.Skip(1);
            return true;
        }

Even if I change the this parameter to reader over ROB, I am not sure how to implement it. Can you suggest a way to implement it without adding PositionOf to the reader?

Copy link
Contributor

@pakrym pakrym Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var start = reader.Position;
while (!reader.IsEnd || reader.Read() != delimiter) {}
return new ROB(start, reader.Position);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method is a Try, but Read has side effects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy reader to local and copy back if needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok as a workaround for missing APIs, but it's not a good long term solution. The current implementation is super fast: does not copy large structs, uses vectorized IndexOf to find matches, in common cases where the match is found in the first segment, it simply adjusts the _index (once). Your implementation (workaround) is not even in the same ballpark.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, I'm not claiming that it is efficient. But we have fast one in ROB and we need to decide how to reuse it and avoid code duplication.

}
}

public static int CopyTo(BufferReader<TSequence> bytes, Span<byte> buffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is same logic on ROB. Lets not duplicate it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but I don't have access to ROB. I agree that it would be good to somehow refactor to reuse the logic. Any ideas for how to refactor?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided some time ago that we won't try to implement everything on top of ISequence and now we are trying to do it.

We should expose Sequence and so allow access to all these implementations on ROB. We can expand ISequence in the future to accommodate new features but we shouldn't go into rabbit hole right now. Especially considering these implementations are not used in production code and may not be thoroughly tested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can expose SliceBefore() and SliceAfter() extension methods in the way I talked in other thread so the consumer can get ROB of either and call existing methods on it.

Copy link
Member Author

@KrzysztofCwalina KrzysztofCwalina Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how it helps here. I don't have BufferReader<ROB>; I have BufferReader<TSequence>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, in a previous meeting we decided not to try building everything on top of ISequence, especially while we have only one real implementation. We can live with supporting these things only on ROB and see what other things would implement ISequence<ReadOnlyMemory<byte>> if there would be any other at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe moving the reader to ISequence was a mistake too.

Copy link
Member Author

@KrzysztofCwalina KrzysztofCwalina Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have several problems in various abstractions: ISequence, Position, ROB. But I would not throw baby with the bathwater. I implemented many extensions over the APIs we have now (the no merge PR) and it seems to be almost working. I think we need to solve a couple of small problems to enable both reading over sequences and slicing. Let's chat more about it today.

public ReadOnlySpan<byte> Span => _currentSpan;
public ReadOnlySpan<byte> CurrentSegment => _currentSpan;

public ReadOnlySpan<byte> UnreadSegment => _currentSpan.Slice(_index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a useful property 😄 . I think we should expose this as the RemainingSpan and we should also have RemainingBytes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining sounds like you are talking about entire buffer, especially when RemainingBytes is absolute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general calling these two concepts "Span" has the same issue: it's not clear what the span contains. Segment is very clear: the type is a list of segments and it's one of these.

return default;
}

public Position? PositionOf(ReadOnlySpan<byte> value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this for now? I'd like to keep this API minimal and fully tested. Expose whatever needs to be exposed to implement this in the experimental APIs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it can be removed, and I don't think we can sweep the issue under the rug. The remaining abstraction is not powerful enough to implement reading of variable length data (based on searches). We would have to expose many fields of the reader (in addition to the sequence) to implement for example ReadUntill, at which point using the reader would be very error prone as the callers could mess up its state too easily.

@pakrym
Copy link
Contributor

pakrym commented Jan 5, 2018

I think moving reader to use ISequence and treating it as an abstraction was a mistake, we should concentrate our time on finishing things that around ROB, not trying to add more stuff that supports ISequence especially considering we have exactly one implementation of it.

We may also want to consider not shipping interface at all until we find a real need and at least another implementer for it because at the current state it clearly does not allow all the scenarios we are trying to use it for (lack of Slice is one of the bigger problems).

@KrzysztofCwalina
Copy link
Member Author

OK, I removed all but Skip(Position) and CopyTo. Are we good with these two?

@pakrym
Copy link
Contributor

pakrym commented Jan 5, 2018

They both exist in ROB in forms of Slice and CopyTo, why duplicate?

@pakrym
Copy link
Contributor

pakrym commented Jan 5, 2018

There are also no tests for them.

@KrzysztofCwalina
Copy link
Member Author

KrzysztofCwalina commented Jan 8, 2018

They both exist in ROB in forms of Slice and CopyTo, why duplicate?

Because I believe we cannot develop hundreds of BCL APIs operating over ROB; it's too policy heavy. I think we need a ROB abstraction. There are many ways to design a multi-segment buffer and ROB is just one of them. In the BCL we don't like to take dependencies on one particular design (unless the design is super simple). Whenever we do, we endup having to redesign and reimplement these APIs operating over such policy heavy types every other release.

There are also no tests for them.

Thus "NO MERGE". I said in the PR message that I will add tests once we agree on the APIs.

@KrzysztofCwalina
Copy link
Member Author

OK, I added tests to CopyTo, the only method we agreed to keep on the reader.

@KrzysztofCwalina KrzysztofCwalina changed the title Removed Old BufferReader [DO NOT MERGE] Removed Old BufferReader Jan 8, 2018
@@ -17,6 +17,8 @@ namespace System.Buffers
internal readonly Position BufferStart;
internal readonly Position BufferEnd;

public static readonly ReadOnlyBuffer Empty = new ReadOnlyBuffer(new byte[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.Empty<>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not available in NS 1.1

{
public static BufferReader<TSequence> Create<TSequence>(TSequence buffer) where TSequence : ISequence<ReadOnlyMemory<byte>>
{
return new BufferReader<TSequence>(buffer);
}

public static int CopyTo<TSequence>(BufferReader<TSequence> reader, Span<byte> destination)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it here and not on the instance?

Copy link
Member Author

@KrzysztofCwalina KrzysztofCwalina Jan 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a limitation of the C# compiler. The Span can be stack allocated but the instance CopyTo method could store the passed in span into the _currentSegment field. See dotnet/roslyn#23433. It's quite unfortunate though.

cc: @VSadov, @jaredpar

{
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.length);
}
}

internal static int CopyTo(BufferReader<TSequence> bytes, Span<byte> destination)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't follow usual CopyTo conventions about destination sizes. Maybe we should name it differently? Read(span)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not advance the index, so it cannot be called Read, but I will rename it to Peek

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should advance? Wouldn't it be the most common use case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or TryRead(Span).

@KrzysztofCwalina
Copy link
Member Author

@dotnet-bot test Innerloop OSX10.12 Debug Build and Test

@KrzysztofCwalina
Copy link
Member Author

@dotnet-bot test Innerloop Ubuntu16.04 Debug Build and Test

private TSequence _sequence;
private Position _currentPosition;
private Position _nextPosition;

private int _consumedBytes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long?

@KrzysztofCwalina KrzysztofCwalina merged commit 1fecfe7 into dotnet:master Jan 9, 2018
@KrzysztofCwalina KrzysztofCwalina deleted the OldReaderRemoved branch January 10, 2018 01:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants