Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Adding the helper method to read the array of events #189

Open
oskardudycz opened this issue Jul 29, 2021 · 7 comments
Open
Labels
kind/enhancement Issues which are a new feature

Comments

@oskardudycz
Copy link
Contributor

After changing readStream and readAll methods to return the NodeJS stream instead of an array, it's impossible to get an array of events without using the handcrafted helper method or external library (e.g. https://www.npmjs.com/package/iter-tools).

Currently you either have to iterate through it, e.g.:

const resultEvents = client.readStream(streamName, {
  fromRevision: START,
  direction: FORWARDS,
  maxCount: 10,
});

const events = [];
for await (const event of resultEvents) {
  events.push(event);
}
return events;

It's impossible to get the array immediately or use methods like map or other transformations without using, e.g. NodeJS stream transformation.

I think that we should add the helper method toArray or collect or other name that will wrap the

const events = [];
for await (const event of resultEvents) {
  events.push(event);
}
return events;

(or do it the smarter way).

The other option is to have the take method with the number of events we'd like to take, but I don't see that being useful. From my experience, 99% of the time, you just want to read the whole stream to rebuild the state. You won't know what's the number of events you have in the stream. Plus, we already removed this need by making maxCount optional (as in other clients). If we add the limitation, people would still need to write their own helpers to page the event streams.

I understand that we should have the memory effective option to fulfil the needs of people that have longer streams. However, I also believe that we should also enable people who have proper short streams to do it easily. Of course, other packages do it more efficiently. We can mention them in the docs or use them internally. I think that we should not optimise our API for edge cases. The entry point to using EventStoreDB is already high enough. We should lower it.

This feature request is not a breaking change. People can use the NodeJS streams, as they're now in v2. It's just to make the usage more accessible.

@oskardudycz oskardudycz added the kind/enhancement Issues which are a new feature label Jul 29, 2021
@George-Payne
Copy link
Member

The four options are:

Separate method

await client.readStreamToArray('my-stream');

Extension 1

await client.readStream('my-stream').toArray();

Helper

import { streamToArray } from '@eventstore/db-client'; 
const eventStream = client.readStream('my-stream');
const eventArray = await streamToArray(eventStream);

Use a library (current)

import { asyncToArray } from 'iter-tools'; 
const events = client.readStream('my-stream');
const eventArray = await asyncToArray(events);

1: We need to be careful to name it something that wont break with future versions of JS (so not toArray). https://github.com/tc39/proposal-iterator-helpers#toarray

@George-Payne
Copy link
Member

Proposal:

Extend streaming read with collect:

export interface StreamingRead<E> extends Readable {
  // ...
  collect(): Promise<E[]>;
  collect<T>(
    reducer: (acc: T, event: E, i: number, self: this) => T,
    initialValue: T
  ): Promise<T>;
}

This allows you to:

Easily collect to an array:

const eventArray = await client.readStream("my-stream").collect();
Implementation with current api
const eventArray = [];
for await (const event of client.readStream("my-stream")) { 
    eventArray.push(event);   
}

Collect to something other than array:

const eventSet = await client
  .readStream("my-stream")
  .collect((acc, event) => acc.add(event), new Set());
Implementation with current api
const eventSet = new Set();
for await (const event of client.readStream("my-stream")) { 
    eventSet.add(event);   
}

Lazy Map:

const ids = await client
  .readStream("my-stream")
  .collect((acc, { event }) => [...acc, event.id], []);
Implementation with current api
const ids = [];
for await (const { event } of client.readStream("my-stream")) { 
    ids.push(event.id);   
}

Lazy Filter:

const wanted = await client.readStream("my-stream").collect((acc, { event }) => {
  if (event.data.isWanted) {
    acc.push(event);
  }
  return acc;
}, []);
Implementation with current api
const wanted = [];
for await (const { event } of client.readStream("my-stream")) { 
    if (event.data.isWanted) {
        wanted.push(event);
    }
}

@oskardudycz
Copy link
Contributor Author

oskardudycz commented Jul 29, 2021

I'm good with the collect method proposed above 👍

@yordis
Copy link

yordis commented Feb 5, 2022

I would suggest trying to leverage Array.from since it is meant to be used for such cases if possible. Otherwise, I would rather read toArray that takes no arguments.

And also, add .reduce as well if you want to keep people the opportunity to reduce the events. Feels odd to reuse collect or use the word collect for such a use case.

But Array.from already allows you to do such a thing as well, if possible.

@yordis
Copy link

yordis commented Feb 5, 2022

Also, it seems to be that there was some interest at some point about .toArray being a thing in JavaScript (Stage 2)

@George-Payne
Copy link
Member

Hi @yordis ,

Thanks for the input, we're still discussing this so any feedback is helpful / appreciated.

I would suggest trying to leverage Array.from since it is meant to be used for such cases if possible.

As far as I am aware, Array.from is synchronous only, so takes either something with a length attribute, or with an @@iterator method. readStream returns a nodejs ReadableStream, which provides an asyncIterator, so won't work with Array.from. Unless I am missing something?

Feels odd to [..] use the word collect [..]

[..] .toArray being a thing in JavaScript (Stage 2)

This was the reason for suggesting collect as a term, as mentioned in the footnote of this comment. I don't want to clash with any apis that end up being added to ReadableStream at some point. collect is maybe a bit of a rust-ism, perhaps fold would be a better choice.

@yordis
Copy link

yordis commented Feb 7, 2022

If I am not mistaken, a lot of conversations around making async everything lately, the information from TC39 is all over the place sometimes, so excuse if I don't link enough info.

Collect does the work, honestly, I don't mind collect.

They are opening the APIs and extending it just so people align around it, so could be worth seeing what TC39 is up to and maybe following the guidelines so far, it is stage 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Issues which are a new feature
Projects
None yet
Development

No branches or pull requests

3 participants