-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to mark double-ended iterator #5
Comments
If so, how do we mark a higher-order iterator? |
Higher-order iterator need to set its It seems we can't achieve that using generator function directly, we need to wrap it. Object.assign(IteratorProto, {
map(mapper) {
const result = this._map(mapper)
result.back = this.back
return result
},
*_map(mapper) {
for (;;) {
const {done, value} = this.next(function.sent)
if (done) return
yield mapper(value)
}
},
}) Again, decorator may help us. |
An alternative: rather than overloading This makes it a little harder to use with generators, but I don't think it really makes sense to try to make generators double-ended: the values yielded by generators follow the control flow of the body of the function, and you can't usually move to an earlier point in the control flow in JavaScript. |
My problem with that as a library maintainer is that if I want to write a In addition, I'd have to create an iterator to have any idea if double ended iteration is supported, and creating an iterator is allowed to have side effects, making it difficult to impossible to write good defensive code around this feature. |
Disposable resources have even more problems than that too. As mentioned above creating an iterator is allowed to create a disposable resource, which can then be disposed in one of two ways, either abruptly when |
Also as regards the idea of quietly failing over requests for reverse iteration with forwards iteration, please oh please no. The longer I do this the less I know why anyone would think the following behavior is desirable in a program: "What you asked for wasn't possible, so I quietly did something else. Oh and by the way you won't even be able to tell if I did what you asked for or not!". As a programmer and as a user, no!! If I asked to do something that can't be done, the correct response is an error. |
@bakkot You may misunderstand what double-ended mean. It doesn't mean move to early point (that is bidirectional). Double-ended means u can consume the sequence in both end, it's more close to the concept of deque (double-ended queue), deiter could be seen as a deque only support removing items, So in many cases (for example, Array), it's as easy as normal generator to write a double-ended generator. |
@conartist6 First, pls notice my previous comment, deiter is not bidirectional, it will never be "calling [Symbol.iterator] is really two iterators", and with current design, u could just use generator (with
Yes I agree this is so bad, this is why we have this issue: "How to mark double-ended iterator". |
@hax the problem is that the way that generators represent sequences is with control flow. So you write So unless I'm missing something, it still seems to me that generators are not a natural fit for double-ended iterators. |
@bakkot Yeah, the point is the control flow . The control flow work in particular way to represent the logic of how to generate the next value. So yes, if the iterator/generator is intend to be double-ended, the developer of coz need to deal with the parameter which represent which end (front-end or back-end) will be needed for the next value and yield the next value according to that. And |
@bakkot Array.prototype.values = function *values() {
let next = 0
while (next < this.length) {
yield this[next++]
}
} This is a double-ended version Array.prototype.values = function *values() {
let next = 0, nextBack = this.length
while (next < nextBack) {
if (function.sent === 'back') yield this[--nextBack]
else yield this[next++]
}
} |
// this is a special case which already double-ended
function *repeat(x) {
for(;;) yield x
} // not double-ended
function *range(start, end) {
for (let v = start; v < end; ++v) {
yield v
}
}
// double-ended version
function *range(start, end) {
while (start < end) {
if (function.sent === 'back') yield --end
else yield start++
}
} // string code points iterator
function *codepoints(s) {
for (let i = 0; i < s.length; ++i) {
const v = s.codePointAt(i)
yield v
if (v > 0xffff) ++i
}
}
// double-ended version of string code points iterator
// a little bit tricky, but not so hard (not tested, hope I write it correctly :-)
function *codepoints(s) {
let start = 0, end = s.length
while (start < end) {
if (function.sent === 'back') {
let v = s.codePointAt(--end)
if (v >= 0xdc00 && v < 0xe000) v = s.codePointAt(--end)
yield v
} else {
const v = s.codePointAt(start)
yield v
start += v > 0xffff ? 2 : 1
}
}
} |
As a developer why would I want to have to to write the body of my codepoints generator twice? Again I'd just want to write My experience with code suggests that asking developers to write everything twice to prepare for a use case they don't have yet will lead to a lot of very poor quality code. |
Actually with deiter, u don't write it twice, u only write On the other side, Also, [1] It's very unlikely JS will have such api, because generally speaking reverse a string doesn't make sense. |
What would a A current forward iterator can use this implementation: function* map(source, mapper) {
for (const value of source) yield mapper(value);
} How would you express a map generator over a deiter? |
You could say that map just always works on forwards iterators, but then the map implementation would need to be: function* map(source, mapper) {
for (const value of source) {
if (function.sent === 'back') throw new IterationError('unsupported');
yield mapper(value);
}
} All the existing copies of the map function implemented in the naive way would cause silent failures. |
You don't need to throw error, because it just work (if keep passing param semantic). See #1 (comment) .
This is why this issue "How to mark double-ended iterator" is open, we need a both backward and forward compatible way to make it work. It's hard but possible. |
As the table of #1 (comment) , it's clear that we need to mark the "type" of a iterators/generators:
Solution 1: Introducing new protocol ("interface" as current spec) and well-known symbolsCurrently, we have As this solution, we add Note, It would be better to have first-class protocol which could provide default methods based on other methods. protocol Iterable {
[Symbol.iterator]
}
protocol ReverseIterable {
[Symbol.reverseIterator]
}
protocol DoubleEndedIterable extends Iterable, ReverseIterable {
[Symbol.doubleEndedIterable]
get [Symbol.iterator]() { return this[Symbol.doubleEndedIterable] }
[Symbol.reverseIterator]() { return this[Symbol.doubleEndedIterable]().reversed() }
}
class UserlandDeIterable implements DoubleEndedIterable {
[Symbol.doubleEndedIterator]() {
return {
__proto__: %DoubleEndedIteratorPrototype%,
next(dir) { ... }
}
}
// no need to implement [Symbol.iterator] and [Symbol.reverseIterator] methods,
// use default methods from DoubleEndedIterable protocol
} // current %IteratorPrototype%, for reference
%IteratorPrototype% = {
[@@iterator]() { return this },
map() {},
filter() {},
...
}
%DoubleEndedIteratorPrototype% = {
__proto__: %IteratorPrototype%
[@@doubleEndedIterator]() { return this }
[@@reverseIterator]() { return this.reversed() }
reversed() {},
takeLast(n) {},
...
}
There are two possible algorithms for destructing. A. let [x, ...] = a // $it = a[@@iterator](); $it.next(); $it.return()
let [..., y] = a // $it = a[@@reverseIterator](); $it.next(); $it.return()
let [x, ..., y] = a // $it = a[@@doubleEndedIterator](); $it.next(); $it.next('back'); $it.return() or, it could let [x, ...] = a // $it = (a[@@doubleEndedIterator] ?? a[@@iterator])()
let [..., y] = a // if (a[@@doubleEndedIterator]) { ... } else { $it = a[@@reverseIterator](); $it.next(); $it.return() }
let [x, ..., y] = a // $it = a[@@doubleEndedIterator]() Prefer B ( Iterator helpers need to be updated to:
The left question is how generator express it's a doubleEndedIterator, seems double star syntax could be sufficient . Solution 2 (to be continued...) |
I don't think You'd also need The thing I worry about is if there's ever any other way the iteration protocol needs to be changed (and personally I have another change in mind already) then you'd end up with 9 protocols, and then some unforeseen change might make it 18, and eventually you have a real mess. |
The additional protocols I am imagining all stem from the idea of reducing overhead where it is not needed in order to enable processing iterables of characters. Characters are obviously very small and documents made of characters may be very large, so loops which work with characters must be efficient. Yet |
No it's not a real exist For example: let x = [backOnly, frontOnly]
let iter = x.values().flatMap(x => x)
// neither next() nor next('back') could work, so any next() invoke just throw.
I can't say that there will be no new iteration protocol (but as I understand, not likely to happen, if double-ended iteration won't advance, I can't imagine any other could meet). But, I am sure syncAndAsyncItertor have no chance to be accepted by the committee. Mixing of sync and async is forbidden from the first day of introducing promise, or even node.js callback convention deny that and name it as Zalgo. |
In my opinion one specific use case can not prove that it need to be added into language. Unless such use case is very strong. For your specific case, iteration of huge of characters, IMO can't be solved by adding any new iteration protocol. You should first figure out what task you want to do for long text. Because text is complex, many tasks need high-level api like grapheme cluster, word, sentence segment... and it's by nature heavy task and iteration itself do not cost much. On the other hand, if u just want to do byte-level or unicode code point level operations, the essential issue of processing string in JS, is the mistake in the early days that JS use UTF-16 encoding. So what u need may be operate buffer or buffer-based CodePointIterator.
We can't change |
It's an interesting case with |
As for sync and async iteration, I do not think it angers Zalgo. In fact that article talks about exactly what I'm discussing. It has its own subheading: "If the result is usually available right now, and performance matters a lot:". It says:
Well, there you go. This is talking about callbacks, but when you're working with promises it's even simpler. Either return the value, or return a Promise so that the caller knows that it must wait for the value. As he notes in the next section:
|
I also don't buy your argument that I need to know one specific operation to optimize. Indeed optimizing one specific operation is already possible. What we're discussing is creating a way to define operations once and have them work efficiently and correctly for any input that implements iteration protocols. Right now someone working with chunked data would have two choices:
|
We need to mark whether an iterator is a double-ended iterator. So
let [..., last] = singleEndedIterator
could throw (or fallback to normal iteration --- this should be a separate issue).As current README,
Symbol.deIterator
is introduced (followSymbol.asyncIterator
). For built-in iterators/generators, we could just spec them, but userland generators by default do not have it. Programmers need add it manually:But it shows that
Symbol.deIterator
may be unnecessary because in almost every cases a double-ended iterator should have same value for[Symbol.deIterator]
and[Symbol.iterator]
. (A double-ended iterator always can be used as single-ended iterator.)Another way is let
g
haveDoubleEndedIteratorPrototype
in its prototype chain. But this also requireDoubleEndedGenerateFunction
which seems not a good idea.So it seems the simplest solution is just a boolean property on iterator.
In either cases, such boilerplate code could be solved by decorator in the future:
Or we could introduce a new syntax for it:
The text was updated successfully, but these errors were encountered: