Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop when using threadpool in notebook #1269

Open
jacksonriley opened this issue Oct 2, 2024 · 0 comments
Open

Infinite loop when using threadpool in notebook #1269

jacksonriley opened this issue Oct 2, 2024 · 0 comments

Comments

@jacksonriley
Copy link

jacksonriley commented Oct 2, 2024

Hi there, I've come across an infinite loop inside ipykernel which I can reliably reproduce when using multiple concurrent.futures.ThreadPoolExecutors inside a notebook with ipykernel >= 6.29.0. This results in the execution of the relevant cell hanging forever.

This looks to have been introduced in #1186, which added the handling around self._thread_to_parent. I believe this will still be present on main.

The infinite loop occurs here:

while identity in self._thread_to_parent:
identity = self._thread_to_parent[identity]

when there's a cycle in self._thread_to_parent.

Origin of cycle

ipykernel uses threading.get_ident() to grab a (temporarily!) unique identifier for a threading.Thread thread when it runs, and stash it in a self._thread_to_parent dict on stdout/stderr:

stream._thread_to_parent[self.ident] = parent

The threading.get_ident() function states

Thread identifiers may be recycled when a thread exits and another thread is created.

Imagine a thread A with ident 1 having a child thread B with ident 2. Once those threads exit, there's nothing preventing a thread C with ident 2 having a child thread D with ident 1.

As the stdout and stderr streams are long-lived, you can therefore end up with a cycle in self._thread_to_parent, which causes this infinite loop. I've proved that this is the case with some hacked logging in the relevant places:

                while identity in self._thread_to_parent:
+                   import datetime
+                   with open("debug.log", "a") as f:
+                       f.write(f"{datetime.datetime.now()}: {identity} is in self._thread_to_parent. Value is {self._thread_to_parent[identity]}\n")
                    identity = self._thread_to_parent[identity]

and

                    else:
+                       import datetime
+                       is_stdout = stream is stdout
+                       is_stderr = stream is stderr
+                       with open("debug.log", "a") as f:
+                           f.write(f"{datetime.datetime.now()}: Writing {self.ident}: {parent} into stream._thread_to_parent. Stream is {stream}. {is_stdout=} {is_stderr=} stream._thread_to_parent is {stream._thread_to_parent}\n")
                        stream._thread_to_parent[self.ident] = parent

This gives something like the below, where we end up with an infinite loop due to the cycle of [140278168012352, 140277658416704]:

The first link (stderr) ----------> 2024-10-02 13:34:26.043559: Writing 140277658416704: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.043658: Writing 140278142834240: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.043746: Writing 140278151226944: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.043984: 140278159619648 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.044021: Writing 140277650024000: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.044236: Writing 140278151226944: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
The first link (stderr) ----------> 2024-10-02 13:34:26.044457: Writing 140277658416704: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.044587: Writing 140277650024000: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.044749: 140278142834240 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045033: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045135: 140278142834240 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045221: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045318: 140278159619648 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045531: 140277650024000 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045678: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045846: 140277650024000 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.045929: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.147930: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.148022: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.148076: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.148108: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.155890: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
Completing the cycle (stdout) ----> 2024-10-02 13:34:26.155942: Writing 140278168012352: 140277658416704 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
Completing the cycle (stderr) ----> 2024-10-02 13:34:26.156027: Writing 140278168012352: 140277658416704 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
                                    2024-10-02 13:34:26.156131: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156171: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156212: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156244: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156281: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156308: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156334: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156358: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156392: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156459: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156501: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156525: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
                                    2024-10-02 13:34:26.156547: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
                                    2024-10-02 13:34:26.156570: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
...infinite loop

Reproduction

Apologies, it's probably not the most minimal reproduction, but this cell will reliably reproduce this infinite loop. It might take a couple of minutes to do so, but you'll see the output stop when it happens (and if you use the above added debugging, will see it clearly start infinite-looping)

import time
import threading
from concurrent import futures
import datetime

def func(*args, **kwargs):
    # Need this first layer of indirection to get into the `else` of https://github.com/ipython/ipykernel/blob/0fa543936d3f9ee69f6b9f8bd5a316692fde9aaa/ipykernel/ipkernel.py#L742
    print(f"Launching with {threading.get_ident()}")
    with futures.ThreadPoolExecutor(5) as executor:
        print(list(executor.map(innermost_func, [1, 2, 3, 4, 5])))

def innermost_func(*args, **kwargs):
    print(f"Hello from thread {threading.get_ident()}")
    time.sleep(0.1)

count = 0
while True:
    count += 1
    print(f"{datetime.datetime.now()} Iteration number {count}")
    threads = 1
    print(f"Launching with {threading.get_ident()}")
    with futures.ThreadPoolExecutor(threads) as executor:
        print(list(executor.map(func, [1])))

My environment

Python 3.10.12
ipykernel==6.29.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant