Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std/lists: O(1) concatenation of singly- and doubly linked lists. #16362

Merged
merged 35 commits into from
Dec 20, 2020

Conversation

salvipeter
Copy link
Contributor

There is also new toSinglyLinkedList and toDoublyLinkedList
functions for conversion from openArrays, similarly
to toHashSet or toTable.

@Araq
Copy link
Member

Araq commented Dec 15, 2020

Thanks! Please add .since annotations, tests (edit tests/stdlib/tlists.nim) and a changelog entry.

@Clyybber
Copy link
Contributor

I think these should be overloads of append/prepend.

@salvipeter
Copy link
Contributor Author

Added the annotations (I hope they're OK), the tests, and the changelog.
As for the naming... I also first searched for this functionality as append (and yes, a prepend version can also be done), but on the other hand, sequtils has concat.

@Clyybber
Copy link
Contributor

Clyybber commented Dec 15, 2020

Yes but sequtils.concat is not an inplace operation (and has varargs).

@salvipeter
Copy link
Contributor Author

Okay, I renamed it. But thinking about it, I found that prepend is not really a good match, because always the first member changes, and it would be strange to say l1.prepend(l2), modifying l2 while l1 stays the same. We should just use l2.append(l1).

@Araq
Copy link
Member

Araq commented Dec 15, 2020

Instead of "concat" you can use "&", but I don't like using prepend/append for it.

@salvipeter
Copy link
Contributor Author

I personally like the append idea, it is very straightforward naming, and consistent with other languages. But I'm open to other suggestions, and an operator (even if only as an alias) is useful. How are these questions decided? By votes?

@Araq
Copy link
Member

Araq commented Dec 15, 2020

Well & is Nim's concat operator, nothing controversial about it. We tend to vote for controversial things.

@salvipeter
Copy link
Contributor Author

The problem with & that it gives the wrong impression - that it is a pure function returning a new instance. I would certainly not expect l1 & l2 to change l1 (but I would expect it to return a value!).

@timotheecour
Copy link
Member

timotheecour commented Dec 15, 2020

Instead of "concat" you can use "&", but I don't like using prepend/append for it.

& is outplace, not inplace, so you probably mean &=, but it's much less standard than add and an operator has no benefit here.

Almost all of stdlib uses add for inplace append, so we should use that instead of this PR's append.

ropes: add(a: var Rope; b: Rope)
ropes: add(a: var Rope; b: string)
system: add[T](x: var seq[T]; y: openArray[T])
system: add(x: var string; y: cstring)
unicode: add(s: var string; c: Rune)
macros: add(father, child: NimNode): NimNode
macros: add(father: NimNode; children: varargs[NimNode]): NimNode
...

it's also what's recommended in nep1 explicitly: https://nim-lang.github.io/Nim/nep1.html
use add instead of append

(note: append is currently only used inside lists module and the (AFAIK unused) std/chains module, we can {.deprecated: [append: add].} in future work)

@salvipeter
Copy link
Contributor Author

@timotheecour has a point there... if the others agree, I'll rename it to add then.

@Araq
Copy link
Member

Araq commented Dec 16, 2020

& is outplace, not inplace, so you probably mean &=, but it's much less standard than add and an operator has no benefit here.

Indeed. However, why not instead add & which is outplace. People who are into lists are likely more into FP so "consistency with the rest of Nim" where we have inplace + dup doesn't really convince me.

@salvipeter
Copy link
Contributor Author

why not instead add & which is outplace

Because the whole point of this PR is to create an O(1) concatenation :)
We can add an outplace & operator, but it will be O(n).

@salvipeter
Copy link
Contributor Author

salvipeter commented Dec 16, 2020

By the way, dup does not work on lists, because it is just an assignment under the hood, IIUC, not a deepCopy, but we could have

import lists, sugar
func `&`(L1, L2: SinglyLinkedList): SinglyLinkedList = L1.deepCopy.dup(add(L2.deepCopy))

... or L2 may be without deepCopy, which would be what I would expect (as someone coming from Lisp), but it does not match the functionality of & elsewhere in Nim.

@Araq
Copy link
Member

Araq commented Dec 16, 2020

Because the whole point of this PR is to create an O(1) concatenation :)
We can add an outplace & operator, but it will be O(n).

Ah, good one.

@salvipeter
Copy link
Contributor Author

I've just realized that this operation is not really meaningful for doubly linked lists, as it will always leave one in an inconsistent state. Consider

var
  a = [1, 2, 3].toDoublyLinkedList
  b = [4, 5, 6].toDoublyLinkedList
  c = a.add(b)

If I set the prev pointer of the first node in b to the last node of a, after the add operation c will be consistent, but b won't be. OTOH if b is not changed (as in my implementation), then c will be inconsistent.

I can't see a really good solution, so I removed the DoublyLinkedList implementation.

@@ -81,3 +81,39 @@ block tlistsToString:
l.append('2')
l.append('3')
doAssert $l == """['1', '2', '3']"""

block SinglyLinkedListConversion:
Copy link
Member

@timotheecour timotheecour Dec 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name the block after the symbols you're testing: toSinglyLinkedList

it makes it easier to search for tests for a given symbol
(note that merely using a symbol in some test isn't the same as a dedicated block testing it)

if a block tests multiple symbols, use:

block: # foo1, foo2
  

@@ -81,3 +81,39 @@ block tlistsToString:
l.append('2')
l.append('3')
doAssert $l == """['1', '2', '3']"""

block SinglyLinkedListConversion:
let l: seq[int] = @[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let l = seq[int].default


block SinglyLinkedListConversion:
let l: seq[int] = @[]
doAssert $l.toSinglyLinkedList == "[]"
Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use toSeq for tests; in general it's more robust and independent of how something is rendered

doAssert [1].toSinglyLinkedList.toSeq == [1]

@timotheecour
Copy link
Member

I've just realized that this operation is not really meaningful for doubly linked lists, as it will always leave one in an inconsistent state.

Actually there's a similar problem with SinglyLinkedList: b is left in inconsistent state after being added twice, and other problems, see RFC nim-lang/RFCs#303

I can't see a really good solution, so I removed the DoublyLinkedList implementation.

I have one, see RFC

But thinking about it, I found that prepend is not really a good match, because always the first member changes, and it would be strange to say l1.prepend(l2), modifying l2 while l1 stays the same. We should just use l2.append(l1).

see RFC, which explains that can't work and prepend still needs to be implemented

@salvipeter
Copy link
Contributor Author

Revised the tests according to @timotheecour's comments.

var a = [1, 2, 3].toSinglyLinkedList
let b = [4, 5].toSinglyLinkedList
let a = [1, 2, 3].toSinglyLinkedList
assert $a == $a.copy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand on this example to illustrate the semantics of shallow copying? eg using a ref object as in nim-lang/RFCs#303 (comment) or something better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK


proc add*[T](L1, L2: var SinglyLinkedList[T]) {.since: (1, 5).} =
## Moves `L2` to the end of `L1`. Efficiency: O(1).
## Note that `L2` becomes empty after the operation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to explain what happens for a.add a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It becomes empty, with the current implementation.

Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty sucks though;
how about raising instead? ideally would be disallowed at CT (like rust) but for now raising might be best we can do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but I don't know ... somehow I feel that we concentrate on a special case too much. A lot worse than empty can happen when someone starts shuffling heads and tails, even when sticking with the API's append. (You can do a.append(a.head), after all, which results in a list of just the first element.)
I'm just wondering how far we should intervene... I'm all for liberty :)

b = a.copy
f.x = 42
assert a.head.value.x == 42
assert b.head.value.x == 42
Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this or similar to emphasize the copy semantics. Makes behavior even more crystal clear.
(requiers changing a to var)

a.add [f].toSinglyLinkedList
doAssert a.toSeq == [f, f]
doAssert b.toSeq == [f] # b isn't modified...

f.x = 42
assert a.head.value.x == 42
assert b.head.value.x == 42 # ... but elements are not deep copied

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you insist... but (at least for me) the immutable argument in the declaration is enough to be sure that b is not modified.

Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the immutable argument in the declaration is enough to be sure that b is not modified

unfortunately that's not the case, see #16312 (comment)

Copy link
Contributor Author

@salvipeter salvipeter Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand ... from that thread I see that an assignment can make a (bad) shallow copy, and mess up things. But surely a func such as this copy cannot do that?
... or maybe it can. But - to quote you - this is really bad.

l2.add(l1)
doAssert l3.toSeq == [4, 5, 6]
doAssert l2.toSeq == [2, 3, 1]
block: # DoublyLinkedList
Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a ton of duplication here. instead use this:

template testCommon(initList, toList) =
  var
    l0 = initDoublyLinkedList[int]()
    l1 = [1].toList
    l2 = [2, 3].toList
    l3 = [4, 5, 6].toList
    l4 = l3.copy
    l5 = l3.copy
  l0.add(l3)
  l1.add(l4)
  l2.add(l5)

testCommon(initDoublyLinkedList, toDoublyLinkedList)
testCommon(initSinglyLinkedList, toSinglyLinkedList)

it guarantees equal coverage, and simplifies maintenance etc

tests specific to doubly or singly can be in a different block, or, if it's a small diff, you can always use:

when type(l3) is DoublyLinkedList: ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that will improve the code a lot, I was also bothered by the code duplication, but templates didn't occur to me.

doAssert l3.toSeq == [4, 5, 6]
doAssert l2.toSeq == [2, 3, 1]
testCommon(initSinglyLinkedList, toSinglyLinkedList)
Copy link
Member

@timotheecour timotheecour Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you'd have less boilerplate with a single testCommon, and sub-blocks for each common proc

block: # SinglyLinkedList, DoublyLinkedList
  template testCommon(initList, toList) =
    block copy: ...
    block add: ...
    ...
  testCommon(initSinglyLinkedList, toSinglyLinkedList)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also considered that, but I assumed that a single top-level block for each tested procedure was the standard :)

changelog.md Outdated
@@ -65,6 +65,27 @@

- `echo` and `debugEcho` will now raise `IOError` if writing to stdout fails. Previous behavior
silently ignored errors. See #16366. Use `-d:nimLegacyEchoNoRaise` for previous behavior.
- Added `lists.toSinglyLinkedList` and `lists.toDoublyLinkedList` for conversion from
Copy link
Member

@timotheecour timotheecour Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • changelog needs to be fixed (reason for duplication is /changelog.md merge=union, but it's there for a good reason)

  • group everything for this PR under just 1 entry

  • newline between preceding entry (echo and debugEcho...) and your new entry

@@ -178,6 +179,28 @@ proc newSinglyLinkedNode*[T](value: T): <//>(SinglyLinkedNode[T]) =
new(result)
result.value = value

func toSinglyLinkedList*[T](elems: openArray[T]): SinglyLinkedList[T] {.since: (1, 5, 1).} =
## Creates a new `SinglyLinkedList` with the members of the
## given collection (seq, array, or string) `elems`.
Copy link
Member

@timotheecour timotheecour Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not the place to describe what openArray is; it would violate "one definition rule" for docs; note that openArray definition is 1 click away thanks to docgen
=> simply:

Creates a new `SinglyLinkedList` from members of `elems`

(+ elsewhere)

doAssert [1].toList.toSeq == [1]
doAssert [1, 2, 3].toList.toSeq == [1, 2, 3]

block copy:
Copy link
Member

@timotheecour timotheecour Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test doesn't actually test correctness of copy (2 parts: source container is not modified; but elements are references)

(Remember that runnableExamples is only run for 1 specific backend, unlike this which will be tested on all backends (eventually, once #16384 is fixed))

EDIT: how about this:

  type Foo = ref object
    x: int
  let f0 = Foo(x: 0)
  let f1 = Foo(x: 1)
  var a = [f0]. toList
  var b = a.copy
  b.append f1
  doAssert a.toSeq == [f0]
  doAssert b.toSeq == [f0, f1]

a.tail = b.tail
if a.head == nil:
a.head = b.head
b.head = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to:

  if a.addr != b.addr:
    b.head = nil
    b.tail = nil

and add this test:

  import lists, sequtils
  import std/enumerate
  import std/sugar
  var a = [0,1].toSinglyLinkedList # actualy, `toList` in your test
  a.addMoved a
  let s = collect:
    for i, ai in enumerate(a):
      if i >= 6: break
      ai
  doAssert s == [0,1,0,1,0,1]

and edit doc comment to:

  ## Note that `b` becomes empty after the operation.
  ## Self-adding results in an empty list.

=>

  ## Note that `b` becomes empty after the operation unless it has same address as `a`.

that's IMO a saner behavior

  • ditto with SinglyLinkedList

let c = [1, 2, 3].toSinglyLinkedList
assert $c == $c.copy
result = initSinglyLinkedList[T]()
for x in a:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salvipeter
Copy link
Contributor Author

Done... btw, why can't I write a.enumerate instead of enumerate(a) ? This seems to be against the general logic of the language.

@timotheecour
Copy link
Member

btw, why can't I write a.enumerate instead of enumerate(a) ? This seems to be against the general logic of the language.

=> followup in timotheecour#473 to avoid hijacking this

Copy link
Member

@timotheecour timotheecour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for your patience!

@salvipeter
Copy link
Contributor Author

And thank you for your comments, I learned a lot in the process :)

@Araq Araq merged commit 051477b into nim-lang:devel Dec 20, 2020
@timotheecour timotheecour changed the title O(1) concatenation of singly- and doubly linked lists. std/lists: O(1) concatenation of singly- and doubly linked lists. Jan 6, 2021
mildred pushed a commit to mildred/Nim that referenced this pull request Jan 11, 2021
* O(1) concatenation of singly- and doubly linked lists.

There is also new `toSinglyLinkedList` and `toDoublyLinkedList`
functions for conversion from `openArray`s, similarly
to `toHashSet` or `toTable`.

* Add `sequtils` import to runnable examples with `toSeq`.

* Added missing call to runnable examples.

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* Improved tests for lists.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* More compact tests for lists with templates.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Added extended example to `lists.copy`.

* Corrected .since annotations to 1.6

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Since declarations changed to (1,5,1).

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Changelog update.

* Fix rebasing errors.

* Self-adding with `lists.addMove` results in a cycle now.
Added some extra tests.
@salvipeter salvipeter deleted the list-concat branch February 8, 2021 21:27
ardek66 pushed a commit to ardek66/Nim that referenced this pull request Mar 26, 2021
* O(1) concatenation of singly- and doubly linked lists.

There is also new `toSinglyLinkedList` and `toDoublyLinkedList`
functions for conversion from `openArray`s, similarly
to `toHashSet` or `toTable`.

* Add `sequtils` import to runnable examples with `toSeq`.

* Added missing call to runnable examples.

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* Improved tests for lists.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* More compact tests for lists with templates.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Added extended example to `lists.copy`.

* Corrected .since annotations to 1.6

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Since declarations changed to (1,5,1).

* Add .since annotation, changelog, and tests.

* Rename `lists.concat` as an overload to `lists.append`.

* Renamed `append` to `add` in lists.

* Remove faulty `add` for `DoublyLinkedList`s.

* `lists.add` moves the second list; added `lists.copy` for shallow copy.

* More tests for `lists.add` and `lists.copy`.

* List concatenation with copying (`add`) and moving (tentatively `addMove`)

* Renamed `addMove` to `addMoved`; renamed arguments according to the style guide.

* Changelog update.

* Fix rebasing errors.

* Self-adding with `lists.addMove` results in a cycle now.
Added some extra tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants