Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Free Threading in Python 3.13 #608

Open
8 of 11 tasks
ronaldoussoren opened this issue May 10, 2024 · 11 comments
Open
8 of 11 tasks

Support Free Threading in Python 3.13 #608

ronaldoussoren opened this issue May 10, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@ronaldoussoren
Copy link
Owner

ronaldoussoren commented May 10, 2024

Is your feature request related to a problem? Please describe.
Python 3.13 introduces an (experimental) free threading option (aka "NO_GIL"). PyObjC should support this feature.

Describe the solution you'd like

TL/DR: This is a lot more work than "build extension modules using a free threaded build"

Adding this requires a number of changes to PyObjC and the build process:

  • Ensure that all extension build in a free threaded configuration
  • Migrate all framework extension modules to 2 phase init
  • Migrate pyobjc-core to 2 phase init
  • Check all framework extension modules for possible threading issues and enable free threading for them (look for Py_MOD_GIL_USED)
  • Add fine grained locking in pyobjc-core, and then enable free threading
  • Update the testing harness to also test a free threading build for 3.13 (both architectures).
  • Update the distribution building script to also build free threaded extension modules
    This one and the previous one are implemented by updating _common_definitions.py
  • Add tests that check that the GIL is actually disabled (explicit tests in pyobjc-core, include check in framework binding sanity check as well).
  • Add documentation about support for free threading
  • Add test scripts that stress tests the free threading support and run those with TSAN enabled
  • Review all critical section to ensure they are minimal and don't invoke arbitrary code.

The hard part is updating pyobjc-core, that code relies on module global state that's protected by the GIL. A somewhat easy option is to switch to a "global pyobjc lock" in the short term and slowly peel of bits that can use more fine grained locking.

See also:

@ronaldoussoren
Copy link
Owner Author

ronaldoussoren commented Jul 14, 2024

Various changes:

  • Drop usage of PySequence_Fast, this API can already be problematic with the GIL (when the argument is a list it is returned as-is, which means the borrowed reference returned from ..._GET_ITEM can get stale when the list is mutated; there is no alternative API that avoids the borrowed reference)
  • Drop usage of PyModule_GetDict: Returns a borrowed reference. Use of this API turned out to be unnecessary in the first place.
  • Lock shared global state, even if that state is only used by APIs that are limited to the main thread according to the API contract in Apple's frameworks.
  • Don't return borrowed references from internal APIs.
  • Use Py_BEGIN_CRITICAL_SECTION to protect object state for mutable values implemented in C.
  • Use PyDict_GetItemRef instead of PyDict_GetItem/PyDict_GetItemWithError
  • Don't use PyList_ macro's
  • Switch to PyList_GetItemRef (with backward compatibility stub)
  • NSMapTable is not thread-safe according to Apple's documentation (https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Multithreading/ThreadSafetySummary/ThreadSafetySummary.html)

For the framework bindings there is in general no global state to protected (other than the ObjC runtime, which is thread safe and is accessed through pyobjc-core).

There are a number of frameworks with extension contexts for which the implementation uses python tuples to store information. Those should be as safe without the GIL as with the GIL because the context info is owned by whatever object performs the callback. There is a very small risk at crashes when the owning object gets released, that's unchanged with and without the GIL and I'm not yet convinced that there's a window for a race condition in the first place.

ronaldoussoren added a commit that referenced this issue Jul 14, 2024
ronaldoussoren added a commit that referenced this issue Jul 14, 2024
Primary reason is that ownership of the result
of PySequence_Fast_GET_ITEM is a borrowed reference
that may be in a list. That's problematic in regular
builds, and even more so in free threaded builds because
the list can be mutated while using the item.

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 14, 2024
Still far removed from actually supporting GIL-less operations.

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 15, 2024
Check all extensions in a the various framework bindings for safety
in a free threaded build, and add locking where necessary.

Issue #608
@ronaldoussoren
Copy link
Owner Author

With changeset 5058104 the extension modules in the various framework bindings should be thread safe even in free-threading mode.

That's still only a first step toward supporting that mode, the core bridge still needs to be converted and that's more involved that the fairly trivial extension modules in framework bindings.

ronaldoussoren added a commit that referenced this issue Jul 15, 2024
This also audits the module for free-threaded use, which
required using PyDict_GetItemRef. That's the cause for updating
the various "pyobjc-compat.h" files: Add a backward compat
implementation for older python versions.

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 16, 2024
Note that the extension is not (yet) compatible with free threading.

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 16, 2024
All of them should also be safe for use with free threading
(with some ad-hoc atomics to protect global counters).

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 16, 2024
ronaldoussoren added a commit that referenced this issue Jul 16, 2024
Switch to the function API, and in particular to
PyList_GetItemRef instead of PyList_GET_ITEM because
the latter is problematic in free threaded mode.

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 23, 2024
First steps in migrating to PyDict_GetItemRef for fetching
items from a dict because the older APIs return a borrowed
reference.

Also stop using "PyDict.*String" APIs, explicitly construct
a string object where needed and use *PyObjCNM* constants
when using constant strings.

The changeset also marks unwanted APIs as unavailable in
"python-api-used.h" to ensure that there will be compile errors
when building using the static analyser.

Finally disable optimization for now to easy debugging while
reworking pyobjc-core (will be renabled later).

Issue #608
ronaldoussoren added a commit that referenced this issue Jul 24, 2024
There's a couple of other calls to PyDict_GetItem left, those
are in functions that return the borrowed reference and need
further restructuring to switch to returning a new reference.

Issue #608
ronaldoussoren added a commit that referenced this issue Aug 6, 2024
Also drop usage of PyDict_SetItemString.

Issue #608
ronaldoussoren added a commit that referenced this issue Aug 6, 2024
This enabled switching to PyDict_GetItemRef in that function.

Issue #608.
@ronaldoussoren
Copy link
Owner Author

Also dropped usage of PyDict_*String APIs that make it easier to convert to PyDict_GetItemRef (needed for PyDict_GetItemString, other String APIs were changed for consistency).

ronaldoussoren added a commit that referenced this issue Aug 7, 2024
The only calls to PyDict_GetItem left are
those in a couple of functions using borrowed
references.

Issue #608.
ronaldoussoren added a commit that referenced this issue Aug 7, 2024
Also adds os.PathLike support to the FSRef constructor.

Issue #608
ronaldoussoren added a commit that referenced this issue Aug 7, 2024
These are updates based on the audit of a number
of source files in pyobjc-core. Slowly getting
closer to the parts of the implementation that
need more design work w.r.t. locking.

Issue #608
ronaldoussoren added a commit that referenced this issue Aug 11, 2024
PySequence_Fast has semantics is not ideal in general, and
more so in free threaded mode: when the argument is a python
list it is returned as is, which means the borrowed references
returned from ..._GET_ITEM are problematic due to concurrent
modification of the list.

Switch to a new ``PyObjCSequence_Tuple`` API that is a version
of ``PySequence_Tuple`` with a message that gives more context
to th user. The error raised by ``PySequene_Tuple`` is used as the
``__cause__`` of the exception raised by the new API.

Issue #608
@jaraco
Copy link
Contributor

jaraco commented Aug 25, 2024

Thanks for working on this. I'm eager for there to be something that builds and installs, even if it crashes or is unstable when it runs.

I've opted into Python 3.13t as my main Python, including my shell and test runners, etc, but I've had to fall back to another Python in many cases because pyobjc simply fails to build. In many cases, the dependency is unused, but its presence blocks the depending project on installing.

Installing from source, I'm also affected by #620, so it's possible once that's addressed, I'll be able to build from source and get an unstable install.

I share all of this just so you're aware of my situation. Unless there's something trivially easy you can do, I'm not looking for any support or changes. I'm mainly aiming to share the status from my perspective. I appreciate your work.

@ronaldoussoren
Copy link
Owner Author

I'll push out a release for #620 soonish, but other than that I won't release wheels for the free threaded build of Python until I'm somewhat sure that pyobjc-core actually supports running without the GIL.

I'm getting closer to that, but need to finish work on a locking strategy and internal API for the two mappings between original objects and their proxy (from Python to ObjC and v.v.). Those mappings are needed to preserve identity, which is necessary for correctness.

The hardest part at this point is finding the time to actually do the work (famous last words...).

ronaldoussoren added a commit that referenced this issue Sep 6, 2024
Issue #608

Slowly getting closer to the core bridge being ready
to enable GIL-less operation.
ronaldoussoren added a commit that referenced this issue Sep 7, 2024
@ronaldoussoren
Copy link
Owner Author

FYI: The upcoming 10.3.2 release will include wheels for the free threaded variant of Python, but will still require the GIL. Even this required changes to the code due to using the limited API in framework bindings.

ronaldoussoren added a commit that referenced this issue Nov 29, 2024
This changeset does two things:

1. Add locking to the functions for updating and quering
   the registry (for the free-threaded build only)

2. Switch the mapping from Python to Objective-C proxies to
   using weak values for the Objective-C proxies.

   That makes clearing the mapping thread safe for both
   build variants, and allows removing the implementation of
   ``-release`` in the ``OC_Python*`` classes that acquired
   the GIL to remove a race condition between the invocation of
   ``-dealloc``  in a just released Objective-C value and removing
   that value from the mapping in a different thread.
@ronaldoussoren
Copy link
Owner Author

Big step towards finishing this work: the proxy-registry (mapping from a Python object to its Objective-C proxy, and mapping an Objective-C object to its Python proxy) is now compatible with free-threading.

(Still 19 files to go for the audit)

@ronaldoussoren
Copy link
Owner Author

OC_Python* classes intentionally do not use locking for accessing their state. That's primarily because these values are immutable. There's a small window for seeing incomplete values in the iniWithCoder: implementation, but that's not something one can fix with locking.

ronaldoussoren added a commit that referenced this issue Nov 29, 2024
The various OC_Python* classes are now compatible with free threading,
most did not require changes because the proxy objects themself are
immutable (e.g. the fields of an OC_PythonArray don't change after
initialization, even if the list is a proxy for is mutated)
ronaldoussoren added a commit that referenced this issue Dec 6, 2024
this changeset makes updating the option values thread safe in
a free-threaded build.

Also switch the usage of PyObjC_Encoder and PyObjC_Decoder values
to a safe pattern for free-threading. This also simplifies those
usages (e.g. a win regardless of free-threading).

Finally... the default value for PyObject* options is now Py_None
instead of NULL, which can simplify there usage (although I haven't
touched those yet).
ronaldoussoren added a commit that referenced this issue Dec 6, 2024
Keyvalue coding support is now compaible with free threading
ronaldoussoren added a commit that referenced this issue Dec 7, 2024
The various type checking options are now safe in the free threaded
build.
ronaldoussoren added a commit that referenced this issue Dec 7, 2024
The usage of these options is now compatible with free threading.
ronaldoussoren added a commit that referenced this issue Dec 7, 2024
All usage of options.m is now compatible with free-threading.
ronaldoussoren added a commit that referenced this issue Dec 7, 2024
"module.m" is now compatible with free-threading.
ronaldoussoren added a commit that referenced this issue Dec 8, 2024
This is a first step in making objc-object.m safe to use
in a free-threaded build.

The change to method-accessor.m fixes a real bug: a PyObjCClass
instance was used as a PyObjCObject one. This happens to work
due to both having the Objective-C value as their first field,
but is undefined behaviour in C.
ronaldoussoren added a commit that referenced this issue Dec 15, 2024
objc-object is now almost ready for free-threading, with the
exception of usage of '->flags': I haven't audited the code
yet to see if that's safe or needs more work (most flags are static,
but not all).

The rest of the code is not 100% safe with free-threading, but the
safeness issues should also be present in when using the GIL: calling
an 'init' method on an UNINITIALIZED value in two threads at once
has a race condition where one of the threads uses a no longer valid
value. That's close to impossible to guard against.
ronaldoussoren added a commit that referenced this issue Dec 20, 2024
The most impacting changes:
- Internal API no longer uses borrowed references
- The internal registry was reworked to make it thread-safe

Also changes ``#if Py_GIL_DISABLED`` to ``#ifdef Py_GIL_DISABLED``
in a number of places.
ronaldoussoren added a commit that referenced this issue Dec 20, 2024
This changeset fixes a number of small issues in the free-threading
support, and enables GIL-less operation by default.

This is the first point where tests in pyobjc-core pass with the
free-threaded build.

Next up: actually testing threaded usage.
@ronaldoussoren
Copy link
Owner Author

ronaldoussoren commented Dec 21, 2024

Gettting closer, but not quite there yet a fairly trivial script using threading fails without the GIL and passes with it enabled (basically iterating over an NSArray instance concurrently in a couple of threads).

I've added some standalone test script that demonstrate problems.

ronaldoussoren added a commit that referenced this issue Dec 21, 2024
There are not integrated in the main test suite because
that test suite can hide problems by way of doing a lot
of work before ending up at these tests.
ronaldoussoren added a commit that referenced this issue Dec 22, 2024
The logic in objc_metaclass_register was off, resulting
in a race condition when using PyObjC in multiple threads
concurrently.

The two test scripts in Tools/free-threading now work
properly.
@ronaldoussoren
Copy link
Owner Author

Gettting closer, but not quite there yet a fairly trivial script using threading fails without the GIL and passes with it enabled (basically iterating over an NSArray instance concurrently in a couple of threads).

I've added some standalone test script that demonstrate problems.

Those problems are now fixed, feeling better about this...

@ronaldoussoren
Copy link
Owner Author

Hmmm.... The overhead for free-threading is large enough to slow things down a lot when running without a GIL. That said, when I don't convert the list of an NSArray I get a similar slowdown)

% time python -Xgil=1 concurrent-sum.py; time python -Xgil=0 concurrent-sum.py
python -Xgil=1 concurrent-sum.py  11.29s user 22.32s system 192% cpu 17.436 total
python -Xgil=0 concurrent-sum.py  15.10s user 32.63s system 534% cpu 8.926 total

The script:

import threading
from Cocoa import NSObject, NSArray
import objc

N_THREAD=8

v = NSObject.alloc().init()

t_list = []
r_list = []

def f(a, b):
    b.wait()

    s = 0
    for i in a:
        s += i


bar = threading.Barrier(N_THREAD)
arr = NSArray.arrayWithArray_(list(range(1000000)))
for _ in range(N_THREAD):
     t = threading.Thread(target=f, args=(arr, bar,))
     t_list.append(t)
     t.start()

for t in t_list:
   t.join()

ronaldoussoren added a commit that referenced this issue Dec 30, 2024
Removing ObjC proxies from the registry was broken due
to the switch to weak references.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants