Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement Py2Borrowed #3654

Closed
wants to merge 1 commit into from
Closed

Conversation

davidhewitt
Copy link
Member

This PR adds Py2Borrowed<'a, 'py, T> which is similar to &'a Py2<'py, T> but directly stores NonNull<ffi::PyObject>, making it possible to transmute gil-refs to it (see discussion in #3651).

Additionally this is necessary if we want PyTupleMethods::get_item to return a borrowed object, which is safe as tuples are immutable.

Finally, in #3606 I am able to leverage this type in our extract_argument logic to avoid needing to manipulate reference counts, which gives quite a significant performance improvement.

@davidhewitt davidhewitt added the CI-skip-changelog Skip checking changelog entry label Dec 14, 2023
/// The advantage of this over `&Py2` is that it avoids the need to have a pointer-to-pointer, as Py2
/// is already a pointer to a PyObject.
#[repr(transparent)]
pub(crate) struct Py2Borrowed<'a, 'py, T>(NonNull<ffi::PyObject>, PhantomData<&'a Py2<'py, T>>); // TODO is it useful to have the generic form?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what is the verdict on the TODO? Is there already a use case for that? Otherwise, I'd suggest leaving it non-generic for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #3651 might be the first use case, if we do actually think the thing I just pushed there has value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, but then I'd suggest removing the TODO comment now (as the whole PR is in anticipation of future uses).

#[repr(transparent)]
pub(crate) struct Py2Borrowed<'a, 'py, T>(NonNull<ffi::PyObject>, PhantomData<&'a Py2<'py, T>>); // TODO is it useful to have the generic form?

impl<'a, 'py> Py2Borrowed<'a, 'py, PyAny> {
Copy link
Member

@adamreichold adamreichold Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should at least this block or even the struct definition itself have a bound ensuring that 'py outlives 'a to avoid all too obvious lifetime errors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an edge case; isn't this a valid construction where 'a outlives 'py:

let any: Py<PyAny> = /* ... */;
let any = &any;  // start borrow 'a
Python::with_gil(|py| Py2Borrowed::from(any.attach(py)));  // 'py shorter than 'a

Maybe in practice this is never useful though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case, the borrow of any could always be shortened to at most 'py by reborrowing. So if the bound does not prevent the above but prevents some illegal uses, I would prefer to have it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it turns out that exactly in the Py<PyBytes>::as_bytes case we're looking at in #3651 it's the case that we want 'a to outlive 'py: d6f18b5

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't attach_borrow sort of require the immutability guarantee of Py<PyBytes>::as_bytes now to avoid unsafely extending references into e.g. mutable pyclasses beyond the duration for which the GIL is held? Wouldn't this imply that attach_borrow needs to be unsafe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am also sort of confused now whether we discuss #3651 or #3654. Maybe these should be folded into #3651 with Py2Borrowed just being a separate commit?)

}
}

impl<T> ToPyObject for Py2Borrowed<'_, '_, T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be reachable via the Deref impl or are these used in generic context somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they were only used in extract_argument.rs and I can probably work around them not existing, so can remove for now.

@@ -214,6 +225,123 @@ unsafe impl<T> AsPyPointer for Py2<'_, T> {
}
}

/// A borrowed equivalent to `Py2`.
Copy link
Member

@adamreichold adamreichold Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just give up and call this Py3. ;-) But more seriously, I think we should get rid off Py2 sooner or later and commit to something reasonable. By now the pain of reading Py2 has made my mind pliable and I will probably accept almost anything even remotely sane.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, maybe I will open a PR to rotate Py<T> -> PyDetached<T> (or PySend<T>) and Py2<'py, T> -> Py<'py, T>. 🙈

@adamreichold
Copy link
Member

adamreichold commented Dec 14, 2023

The type is admittedly thin but it has also no tests here. I guess the coverage will come with the PR that make use of it?

EDIT: Maybe a doc test would be helpful also just for future readers wanting to use this?

Comment on lines +319 to +327
impl<'py, T> Deref for Py2Borrowed<'_, 'py, T> {
type Target = Py2<'py, T>;

#[inline]
fn deref(&self) -> &Py2<'py, T> {
// safety: Py2 has the same layout as NonNull<ffi::PyObject>
unsafe { &*(&self.0 as *const _ as *const Py2<'py, T>) }
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main frustration I found with using this type is the deref here forces us to reduce the lifetime; it's not the original 'awhich thePy2Borrowed` was created with, but a reduced lifetime which is the duration of the deref.

I think this generally only matters in interactions with the gil-ref APIs, where the reduced lifetime means that this type isn't a solution to #3651 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would almost say this is make-or-break or are the performance improvements for argument extraction themselves sufficient to motivate the type? Meaning if this does not help to solve that problem, I would suggest avoiding the additional complexity for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand it, the only think really working would be to implement all Methods traits directly on Py2Borrowed and make the implementations of these traits on Py2 forward to those on Py2Borrowed?

Since these would not be user-visible, I could live with that from an API perspective. But the amount of internal boiler plate would be staggering...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think another alternative is to make Py2 generic over ownership, something like

struct Py2<'py, T, O: Ownership>(Python<'py>, ManuallyDrop<Py<T>>, PhantomData<O>);

trait Ownership { /* I think this just needs to describe how to clone and drop the reference */ } 
struct Owned;
struct Borrowed<'a>(PhantomData<&'a Py<PyAny>>);

// implement `Ownership` for each of `Owned` and `Borrowed`

In which case we can then make all the Methods traits generic over O: Ownership, and not need to have a ton of internal boilerplate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like pest versus cholera: Internal trait level complexity which might be short but needs to be "expanded in one's head" versus internal boiler plate which is trivial but needs to be read to "sift from the syntax noise".

If the lifetimes can work without this, I would prefer not generalizing the method traits and have only the Py2-based trait implementations.

/// The advantage of this over `&Py2` is that it avoids the need to have a pointer-to-pointer, as Py2
/// is already a pointer to a PyObject.
#[repr(transparent)]
pub(crate) struct Py2Borrowed<'a, 'py, T>(NonNull<ffi::PyObject>, PhantomData<&'a Py2<'py, T>>); // TODO is it useful to have the generic form?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #3651 might be the first use case, if we do actually think the thing I just pushed there has value.

#[repr(transparent)]
pub(crate) struct Py2Borrowed<'a, 'py, T>(NonNull<ffi::PyObject>, PhantomData<&'a Py2<'py, T>>); // TODO is it useful to have the generic form?

impl<'a, 'py> Py2Borrowed<'a, 'py, PyAny> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an edge case; isn't this a valid construction where 'a outlives 'py:

let any: Py<PyAny> = /* ... */;
let any = &any;  // start borrow 'a
Python::with_gil(|py| Py2Borrowed::from(any.attach(py)));  // 'py shorter than 'a

Maybe in practice this is never useful though?

}
}

impl<T> ToPyObject for Py2Borrowed<'_, '_, T> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they were only used in extract_argument.rs and I can probably work around them not existing, so can remove for now.

@@ -214,6 +225,123 @@ unsafe impl<T> AsPyPointer for Py2<'_, T> {
}
}

/// A borrowed equivalent to `Py2`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, maybe I will open a PR to rotate Py<T> -> PyDetached<T> (or PySend<T>) and Py2<'py, T> -> Py<'py, T>. 🙈

@davidhewitt
Copy link
Member Author

Yes, agreed this needs coverage before it can be considered mergeable.

@davidhewitt davidhewitt marked this pull request as draft December 14, 2023 22:29
@davidhewitt
Copy link
Member Author

So, summarising what I think I've learned from this review plus #3651:

  • From an internal implementation perspective, this type is useful e.g. as_bytes() in implement PyBytesMethods and PyByteArrayMethods #3651
    • The extract_argument logic can also be entirely internal, so does not require this API to be public.
  • For a user, the only places I've seen where they can reasonably encounter this type:
    • tuple iteration and tuple get_item, where this type avoids the need for users to pay a refcount cycle performance penalty.
    • similarly frozenset iteration

One option which I'm now considering is that for the first version of the Py2 API, we do not expose this Py2Borrowed type, and accept the tuple and frozenset APIs to be known cases where the new API will bring a slight slowdown over what we currently have. We can then keep this type purely as an internal implementation detail.

Later on, when the dust has settled and we have removed the pool / gil ref API, thus making things simpler, we can reconsider adding this type either as a standalone type or as a new type parameter to Py2 as per #3654 (comment)

@davidhewitt
Copy link
Member Author

davidhewitt commented Dec 15, 2023

@adamreichold if you agree with the above suggestion to keep this type internal for now, I suggest that what I do is close this review and instead merge a cut-down form of Py2Borrowed as part of #3651. Then, as I make further progress with the new API (namely tuple and frozenset methods, plus the extract_argument performance increase which I think can land once tuple and dict is complete), we can add additional internal infrastructure to this type as needed in those reviews.

We can then punt the choice whether to make this public down the road.

@adamreichold
Copy link
Member

Yes, I would certainly prefer to keep this internal as I think this somewhat meandering discussion between the two PR shows that we still quite a lot to learn about how the new API will pan out. Let's discuss the specifics in #3651...

@davidhewitt
Copy link
Member Author

Great, will close here 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-skip-changelog Skip checking changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants