Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in overallocation for literal list initialization in v3.9+ #87740

Closed
chadnetzer mannequin opened this issue Mar 21, 2021 · 6 comments
Closed

Regression in overallocation for literal list initialization in v3.9+ #87740

chadnetzer mannequin opened this issue Mar 21, 2021 · 6 comments
Labels
3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@chadnetzer
Copy link
Mannequin

chadnetzer mannequin commented Mar 21, 2021

BPO 43574
Nosy @rhettinger, @gpshead, @methane, @serhiy-storchaka, @chadnetzer, @corona10, @brandtbucher, @Zeturic
PRs
  • bpo-43574: Dont overallocate list literals #24954
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-03-21.02:31:54.743>
    labels = ['interpreter-core', '3.11', '3.10', 'performance']
    title = 'Regression in overallocation for literal list initialization in v3.9+'
    updated_at = <Date 2022-03-08.07:27:36.247>
    user = 'https://github.com/chadnetzer'

    bugs.python.org fields:

    activity = <Date 2022-03-08.07:27:36.247>
    actor = 'methane'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2021-03-21.02:31:54.743>
    creator = 'Chad.Netzer'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43574
    keywords = ['patch']
    message_count = 3.0
    messages = ['389207', '390174', '414728']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'gregory.p.smith', 'methane', 'serhiy.storchaka', 'Chad.Netzer', 'corona10', 'brandtbucher', 'Zeturic']
    pr_nums = ['24954']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue43574'
    versions = ['Python 3.10', 'Python 3.11']

    @chadnetzer
    Copy link
    Mannequin Author

    chadnetzer mannequin commented Mar 21, 2021

    In Python v3.9+ there was a regression in the amount of used memory for
    list-literals, due to switching to using list_extend() to allocate memory for
    the new list to accomodate the literal elements.

    Example, in Python v3.8.x (and before):

    $ python38
    Python 3.8.5 (default, Sep  4 2020, 02:22:02)
    >>> [1].__sizeof__()
    48
    >>> [1,2].__sizeof__()
    56
    >>> [1,2,3].__sizeof__()
    64
    

    whereas for v3.9 (and later):

    $ python39
    Python 3.9.2 (default, Feb 19 2021, 17:09:53)
    >>> [1].__sizeof__()
    48
    >>> [1,2].__sizeof__()
    56
    >>> [1,2,3].__sizeof__()
    104  # a 60% increase in memory allocated
    

    However, this seems like an unintented regression, and is a side-effect of the
    new way of building the lists from literals, using the list_extend() function (via
    list_resize(), which overallocates). In particular, a consequence is that
    making a copy of the list that's initialized from a literal can end up using
    less memory:

    $ python39
    Python 3.9.2 (default, Feb 19 2021, 17:09:53)
    >>> a = [1,2,3]
    >>> b = list(a)  # Same behavior if list.copy() or slice copy is performed
    >>> a.__sizeof__()
    104
    >>> b.__sizeof__()
    64
    

    Prior to v3.9, the byte-code for making a list from a literal had the
    "BUILD_LIST" opcode with an explicit length argument, allowing allocation of
    the exact amount of memory needed for the literal. As of v3.9, the
    LIST_EXTEND opcode is used, instead. I believe the simplest way of restoring
    the old behavior is to change list_extend() to not overallocate when the list
    being extended currently has 0 elements.

    Ie. a minimal-change patch to restore the previous behavior (though with a
    side-effect of removing the overallocaton of a list that is initialzed empty,
    and then immediately extended):

    diff --git a/Objects/listobject.c b/Objects/listobject.c
    index e7987a6d35..7820e033af 100644
    --- a/Objects/listobject.c
    +++ b/Objects/listobject.c
    @@ -75,8 +75,9 @@ list_resize(PyListObject *self, Py_ssize_t newsize)
    if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize))
    new_allocated = ((size_t)newsize + 3) & ~(size_t)3;

    -    if (newsize == 0)
    -        new_allocated = 0;
    +    /* Don't overallocate for lists that start empty or are set to empty. */
    +    if (newsize == 0 || Py_SIZE(self) == 0)
    +        new_allocated = newsize;
         num_allocated_bytes = new_allocated * sizeof(PyObject *);
         items = (PyObject **)PyMem_Realloc(self->ob_item, num_allocated_bytes);
         if (items == NULL) {

    Relevant/related bugs/PRs:
    # Switched to initializing list literals w/ LIST_EXTEND
    https://bugs.python.org/issue39320
    #17984

    # Commit where over-allocation of list literals first appeared
    https://bugs.python.org/issue38328
    #17114
    6dd9b64

    https://bugs.python.org/issue38373
    #18952
    2fe815e

    @chadnetzer chadnetzer mannequin added 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Mar 21, 2021
    @chadnetzer
    Copy link
    Mannequin Author

    chadnetzer mannequin commented Apr 4, 2021

    For bpo-43574, my initial plan was to change list_resize() so that it wouldn't
    overallocate empty lists that were resized to be bigger, thus restoring the
    overallocation behavior for list-literals to be like that of earlier Python
    releases.

    However, the list_resize() helper is also used by list appends and inserts, as
    well as other list methods that change list sizes. This proposed strategy of
    not-overallocating any resize that starts with an empty list also affects
    their behavior, and with the list-overallocation calculation that was
    introduced with bpo-38373, this causes unexpected side-effects on the amount
    of overallocation used for appending/inserting with an empty list.

    Therefore, my updated proposal handles this by changing the overallocation
    behavior in list_resize() only when empty lists are increased by an amount
    greater than 1 during list_resize(). This basically retains the behavior of
    not overallocating list-literals, while not changing the behavior for list
    append/insert on an empty list.

    To understand why this updated proposed change won't also cause overallocation
    for length-1 literals, see below how length-1 and length-2 list-literals are
    compiled using BUILD_LIST instead of LIST_EXTEND:

    Python 3.9.2 (default, Mar 26 2021, 23:27:12)
    >>> import dis
    >>> def foo():
    ...   [1]
    ...
    >>> dis.dis(foo)
      2           0 LOAD_CONST               1 (1)
                  2 BUILD_LIST               1
                  4 POP_TOP
                  6 LOAD_CONST               0 (None)
                  8 RETURN_VALUE
    >>> def bar():
    ...   [1,2]
    ...
    >>> dis.dis(bar)
      2           0 LOAD_CONST               1 (1)
                  2 LOAD_CONST               2 (2)
                  4 BUILD_LIST               2
                  6 POP_TOP
                  8 LOAD_CONST               0 (None)
                 10 RETURN_VALUE
    >>> def baz():
    ...   [1,2,3]
    ...
    >>> dis.dis(baz)
      2           0 BUILD_LIST               0
                  2 LOAD_CONST               1 ((1, 2, 3))
                  4 LIST_EXTEND              1
                  6 POP_TOP
                  8 LOAD_CONST               0 (None)
                 10 RETURN_VALUE
    

    Hence, the change to list_resize(), which is called by list_extend() but not
    the BUILD_LIST opcode, won't impact list-literals of length 1 and 2.

    And to show how the originally proposed change (no overallocation for
    list_resize() of empty lists) inadvertently changed how list append/insert
    overallocation worked, let's first take a look at how current Python (3.9+)
    works:

    >>> l = []
    >>> l.__sizeof__()
    40
    >>> l.append("a")  # First append will overallocate to capacity 4
    >>> l.__sizeof__()
    72
    >>> l.append("b")
    >>> l.__sizeof__()
    72
    >>> l.append("c")
    >>> l.__sizeof__()
    72
    >>> l.append("d")
    >>> l.__sizeof__()
    72
    >>> l.append("e")
    >>> l.__sizeof__()
    104
    >>> l2 = ["a"]  # Length 1 (and 2) literals don't overallocate
    >>> l2.__sizeof__()
    48
    >>> # However, note that the first append will overallocate to capacity 8
    >>> l2.append("b")
    >>> l2.__sizeof__()
    104
    

    Note that the list-literal of length 1 isn't overallocated, and that
    appending to it skips the capacity 4 list and goes straight to capacity 8.

    However, with my originally proposed change, this overallocation behavior is
    duplicated even for lists that start empty, because the first append then
    effectively becomes a non-overallocated list of length and capacity 1, which
    means that the second append overallocates to capacity 8.

    Here was the overallocation behavior of my first proposal, when an empty list
    was appended:

    Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals:7436223a71, Apr  3 2021, 16:32:22) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
    >>> l = []
    >>> l.__sizeof__()
    40
    >>> l.append("a")
    >>> l.__sizeof__()  # No overallocation on first append
    48
    >>> l.append("b")
    >>> l.__sizeof__()  # Second append jumps to capacity 8, skipping 4
    104
    

    This is unexpected and likely undesirable behavior, as lists being created
    empty and then appended to should be expected to follow the same documented 4,
    8, 16, 24, ... growth pattern for bpo-38373.

    Fixing this is fairly simple because of the special-casing for length 1 and 2
    list-literals, just by checking for the use of append/insert on an empty list.
    Because they both change the size of a length 0 list to length 1, and the
    list_resize() for literals always kicks in for changing to lengths 3 or more,
    this updates proposal will retain the current empty-list insert/append
    overallocation behavior, while still allowing list-literals of length 1 or
    more to not overallocate.

    With this updated proposal, avoiding a change to the behavior of an empty list
    append/insert, the overallocation for list-literals is still removed, and the
    current overallocation behavior for empty lists being appended is preserved:

    Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals-v2:56b361221d, Apr  3 2021, 17:35) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
    >>> l = []
    >>> l.__sizeof__()
    40
    >>> l.append("a")
    >>> l.__sizeof__()
    72
    >>> l.append("b")
    >>> l.__sizeof__()
    72
    >>> l2 = ["a"]
    >>> l2.__sizeof__()
    48
    >>> l2 = ["a", "b", "c"] # list_extend() literal setup case starts w/ 3 elements
    >>> l2.__sizeof__()
    64
    

    So, with this variant on my original proposal, using the current compilation
    behavior of setting up list-literals of length 3 or greater using
    list_extend(), we regain the previous Python behavior of not overallocating
    list-literals, and we also retain the current overallocation behavior for
    empty lists that are appended to.

    Just to document it, here are the functions in Objects/listobject.c that are
    impacted by a change in list_resize(), and the proposed bpo-43574 change in
    particular:

    ins1()  # insert 1 element in list, including empty list.
    app1()  # append 1 element to list.  Same impact as with ins1()
    list_ass_slice()  # can assign a sequence to a list slice, expanding list
    list_inplace_repeat()  # No impact, early exit for len == 0
    list_extend()  # Now used for list-literals of len > 2
    list_pop() # No impact, same behavior for list becoming len == 0
    list_ass_subscript()  # No impact, list_resize() only called for deletion

    The list_ass_slice() case is changed by this update, for an empty list, as
    overallocation will not occur on the first list_extend() with more than 1
    element (ie. just like list-literals, the overallocation will be delayed
    until a future list_resize()). Here's an example of the change, first for
    current Python 3.9+ behavior:

    
    Python 3.9.2 (default, Mar 26 2021, 23:27:12)
    >>> l = []
    >>> l.__sizeof__()
    40
    >>> l[:] = ["a"]
    >>> l.__sizeof__()
    72
    >>> l = []
    >>> l[:] = ["a", "b"]
    >>> l.__sizeof__()
    104
    >>> l.append("c")
    >>> l.__sizeof__()
    104
    

    and the behavior with my revised proposal for non-overallocation of
    list-literals:

    Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals-v2:56b361221d, Apr  3 2021, 17:35) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
    >>> l = []
    >>> l.__sizeof__()
    40
    >>> l[:] = ["a"]
    >>> l.__sizeof__()
    72
    >>> l = []
    >>> l[:] = ["a", "b"]
    >>> l.__sizeof__()
    56
    >>> l.append("c")
    >>> l.__sizeof__()
    104
    

    @methane
    Copy link
    Member

    methane commented Mar 8, 2022

    Relating issue: https://twitter.com/nedbat/status/1489233208713437190
    Current overallocation strategy is rough. We need to make it more smooth.

    @methane methane added 3.11 only security fixes and removed 3.9 only security fixes labels Mar 8, 2022
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @sweeneyde
    Copy link
    Member

    I think this is no longer present in Python 3.11, thanks to #31816.

    py -3.9 -c "print(([1,2,3,4,5,6,7,8,9].__sizeof__() - [].__sizeof__())//8)"
    12
    py -3.10 -c "print(([1,2,3,4,5,6,7,8,9].__sizeof__() - [].__sizeof__())//8)"
    12
    py -3.11 -c "print(([1,2,3,4,5,6,7,8,9].__sizeof__() - [].__sizeof__())//8)"
    9

    Can this be closed?

    @methane
    Copy link
    Member

    methane commented May 18, 2022

    I backported #31816 to 3.10 branch. Close this issue for now.

    methane pushed a commit that referenced this issue May 18, 2022
    …31816)
    
    (cherry picked from commit 2153daf)
    
    This patch fixes gh-87740 too.
    
    Co-authored-by: Crowthebird <[email protected]>
    @methane methane closed this as completed May 18, 2022
    @serhiy-storchaka
    Copy link
    Member

    This change introduced a regression. See #92914.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants