bisect.insort should return the index #96571

Serpens66 · 2022-09-05T04:46:05Z

Feature or enhancement

https://docs.python.org/3/library/bisect.html#bisect.insort
bisect.insort (and of course also insort_left/right) should return the index they inserted the "x" in "a".

Pitch

Usecase:
One wants to insort something and also needs the index it was insorted.

Problem:
The current way would be to use "insort" (which finds the index and inserts it, but does not return the index) and afterwards use bisect_right to find the index again. So this is ~double of the work that is needed.
I assume the user could also instead call bisect_right and afterwards insert "x" in "a" themself and return the index, but I'm sure this also would be much slower than the C implementation of bisect.insort.

Solution:
Simply let "insort" return the index it already found. This should be backwards compatible (since it returns None currently) and not cost any extra time. So the usecase would be fullfilled in ~half of the time that is needed right now.

Disadvantage:
I see no harm in doing so, only advantage for everyone who needs this.

Projects I know which would profit:
grantjenks/python-sortedcontainers#195

edit::
I was a bit confused because the python source code has a python and a C version of bisect... But both versions lack the return of the index, so this request is valid for both.

rhettinger · 2022-09-05T05:26:40Z

The norm for Python is for mutating methods is to return None. This is different from the norms in other languages. We don't really want to deviate from our norms without good reason.

The current way would be to use "insort" (which finds the index and inserts it, but does not return the index) and afterwards use bisect_right to find the index again.

That doesn't make sense to me. Why wouldn't you just do the bisection first and then insert?

i = bisect(arr, val)
arr.insert(i, val)           # Or perhaps arr[i:i] = val

I'll send @grantjenks a note to find out more about what he was thinking.

Serpens66 · 2022-09-05T07:01:42Z

Thank you for your response.

Yes, one can do replace insort with these two steps (like I already wrote, maybe I edited that line after you read it), but without measuring I think I can safely assume that doing this in python is much slower than the C implementation of insort.
...
And the whole purpose of bisect (and using C) is to optimize the insertion process.
So to argue "you can do this in two steps" sounds for me like "why implement something like bisect, you can iterate over every single value in the list and compare it one by one to find the index". So why did python implement bisect and even in C at all?
I think this invalidates your argument about the "two steps".

So the only argument left is "we always return None for mutating methods".
Ok, there is nothing I can do to invalidate this. It is just not efficient and quite sad.

edit:
But I admit I'm no professional programmer, I just need this functionality as fast as possible and thought there is not a single disadvantage in returning the index. But maybe better programmers/grantjenks want to add something to my view..

rhettinger · 2022-09-05T08:13:39Z

So this is ~double of the work that is needed

If insort was followed by bisect, that would be double work. In contrast, running bisect followed by list.insert (not insort) is not double work. There is no overlap between the two steps. The only extra work is a bit of method call overhead.

Serpens66 · 2022-09-05T12:05:16Z

So this is ~double of the work that is needed

If insort was followed by bisect, that would be double work. In contrast, running bisect followed by list.insert (not insort) is not double work. There is no overlap between the two steps. The only extra work is a bit of method call overhead.

Yes, but just as I already wrote twice, I assume that it still would be slower than the C implementation of insort.
If list.insert(...) is exactly as fast (or faster) as/than the insertion code used in the C implementation of insort, then everything is fine and no need to change anything.
But I doubt it and I'm sure grantjenks has more insights in regards to performance, so lets wait what he says...

But does it even matter? If everything I write was correct, would this be enough to break the rule to always return None?

pochmann · 2022-09-05T13:05:45Z

What if there's code that relies on the None result, for example similar to this one relying on set.add returning None? You'd break that.

Wouldn't it create (and at some point delete) a Python int object, which it doesn't now, and that would make it slower? I.e., hurt everyone who doesn't want the index?

rhettinger · 2022-09-05T14:18:44Z

If list.insert(...) is exactly as fast (or faster) as/than the insertion code
used in the C implementation of insort, then everything is fine and no need to change anything.

They both share the same underlying code, ins1(). Aside from calling overhead, the speed is identical.

Yes, but just as I already wrote twice, I assume that it still would be slower than the C implementation of insort.

Your mental model seems to be inaccurate. The Python call list.insert() is implemented in C and runs at C speed.

Serpens66 · 2022-09-05T14:55:13Z

Okay, thank you very much for your responses. You convinced me.
Lets wait a day or two if grantjenks wants to add something. If not I will close this request.

Thank you for your time and explanations.

matthiasgoergens · 2022-09-06T07:55:10Z

I would suggest, you implement a version of what you like as a third party library. Cython is a really convenient way to write C-extensions, if you need that.

You can call your library 'better-bisect' if you want to.

Look at 'more-itertools' for inspiration.

grantjenks · 2022-09-06T18:39:23Z

@rhettinger wrote:

That doesn't make sense to me. Why wouldn't you just do the bisection first and then insert?

And indeed, that's what the code in SortedKeyList.add() does (sortedlist.py):

                idx = bisect_right(_keys[pos], key)
                _lists[pos].insert(idx, value)
                _keys[pos].insert(idx, key)

SortedKeyList.add() is a very hot function and shows up in nearly every benchmark and real-world use case. So removing a function call would be helpful. But it is only the function call overhead cost.

@pochmann raises two points:

What if there's code that relies on the None result, for example similar to this one relying on set.add returning None? You'd break that.

That's a good point. I'm not sure it's worth a breaking change.

Wouldn't it create (and at some point delete) a Python int object, which it doesn't now, and that would make it slower? I.e., hurt everyone who doesn't want the index?

Also a good point. And in SortedContainers, maintaining the positional index is much more expensive which is why it's not done by default.

Thanks @Serpens66 for asking here and to all the helpful replies.

Serpens66 added the type-feature A feature request or enhancement label Sep 5, 2022

Serpens66 mentioned this issue Sep 5, 2022

Feature request: parameterize SortedList used in SortedDict through inheritance grantjenks/python-sortedcontainers#204

Open

rhettinger self-assigned this Sep 5, 2022

hauntsaninja added the pending The issue will be closed if no feedback is provided label Sep 6, 2022

rhettinger closed this as not planned Won't fix, can't repro, duplicate, stale Sep 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bisect.insort should return the index #96571

bisect.insort should return the index #96571

Serpens66 commented Sep 5, 2022 •

edited

Loading

rhettinger commented Sep 5, 2022 •

edited

Loading

Serpens66 commented Sep 5, 2022 •

edited

Loading

rhettinger commented Sep 5, 2022

Serpens66 commented Sep 5, 2022

pochmann commented Sep 5, 2022 •

edited

Loading

rhettinger commented Sep 5, 2022

Serpens66 commented Sep 5, 2022

matthiasgoergens commented Sep 6, 2022

grantjenks commented Sep 6, 2022

bisect.insort should return the index #96571

bisect.insort should return the index #96571

Comments

Serpens66 commented Sep 5, 2022 • edited Loading

Feature or enhancement

Pitch

rhettinger commented Sep 5, 2022 • edited Loading

Serpens66 commented Sep 5, 2022 • edited Loading

rhettinger commented Sep 5, 2022

Serpens66 commented Sep 5, 2022

pochmann commented Sep 5, 2022 • edited Loading

rhettinger commented Sep 5, 2022

Serpens66 commented Sep 5, 2022

matthiasgoergens commented Sep 6, 2022

grantjenks commented Sep 6, 2022

Serpens66 commented Sep 5, 2022 •

edited

Loading

rhettinger commented Sep 5, 2022 •

edited

Loading

Serpens66 commented Sep 5, 2022 •

edited

Loading

pochmann commented Sep 5, 2022 •

edited

Loading