Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference python objects directly in perspective tables #975

Merged
merged 2 commits into from
May 5, 2020

Conversation

timkpaine
Copy link
Member

@timkpaine timkpaine commented Mar 14, 2020

Adds support for storing python objects in perspective tables by consolidating DTYPE_OBJECT and DTYPE_PTR into just DTYPE_OBJECT.

Python objects are stored by pointer (e.g. PyObject *) as uint64_t then materialized from the underlying scalar, e.g. if scalar.m_type == DTYPE_OBJECT we cast the uint64_t back to a PyObject *

Also added is support for value extraction via the _psp_dtype_ and _psp_repr_ magic methods, the presence of which indicates how perspective should treat the returned value, along with the ability to return arbitrary values (__repr__ is limited to strings, so if you want to pull an integer out of your object you can define _psp_dtype_ to return int and have _psp_repr_ return an int).

Along these lines, type inference is slightly modified to support promotions, e.g. if you return a custom object that supports __int__ or __float__, we'll look for those methods if you tell perspective to store as an integer or floating type.

Example behaviors in the test file

Some vestigial code was removed as well, and some hardcoded py::object were converted to t_val for consistency.

Todos:

  • id(val) won't increment the reference counter, so it will be possible for a scalar to store a pointer that has been cleaned up. When we store a DTYPE_OBJECT, we'll need to make sure to INCREF the object.
  • Add documentation about _psp_dtype_ and _psp_repr_
  • Decide whether default behavior should be to extract (current behavior) or store reference
  • turn TBB back on

@timkpaine timkpaine added C++ enhancement Feature requests or improvements Python labels Mar 14, 2020
@timkpaine
Copy link
Member Author

@sc1f can you take a look and let me know your thoughts

@timkpaine timkpaine force-pushed the custom_repr_python branch from 2dae090 to 17f7670 Compare March 19, 2020 15:13
@timkpaine timkpaine requested review from sc1f and texodus March 22, 2020 16:07
@timkpaine
Copy link
Member Author

@texodus ready for you to review

timkpaine added 2 commits May 5, 2020 00:29
cleanup vestigial code, copypasta

checkpoint work

handle ref count in process columns

make accessor transient, don't try to incref/decref in column scope

add test for update sequence, remove debug prints from stuff i understand better now

implement remove

update docs
@texodus texodus force-pushed the custom_repr_python branch from 8a785ff to 1b1e8e1 Compare May 5, 2020 04:29
Copy link
Member

@texodus texodus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Great test and documentation support as well, much appreciated.

This is a really exciting new feature! There were a few conflicts here with @sc1f computed columns work which I resolved through rebase. I think we should follow this up with a more rigorous examination of our perceived use case for this - there is quite alot of potential here, but Perspective's multi-runtime nature insists some "obvious" applications that are anything-but-obvious to implement.

Specifically, how do we extend our limited knowledge of an object's lifetime and identity to a more general purpose embedding, e.g. in Jupyterlab?

} break;
case DTYPE_BOOL:
case DTYPE_UINT8:
case DTYPE_INT8: {
std::uint8_t v = 0;
set_nth<std::uint8_t>(idx, v, status);
set_nth<std::uint8_t>(idx, 0, status);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason these need to be zeroed? Just curious.

// auto col = table->get_column(colname);
// for (auto idx = 0; idx < col->size(); ++idx) {
// arr[idx] = py::cast(col->get_scalar(idx));
// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deadbeef

@@ -191,13 +191,19 @@ def marshal(self, cidx, ridx, dtype):
if val is None:
return val

# if item implements custom repr, use it
if hasattr(val, "_psp_repr_"):
val = val._psp_repr_()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a Python expert - is this canonical? __perspective_repr__?

@texodus texodus merged commit a273fc3 into master May 5, 2020
@texodus texodus deleted the custom_repr_python branch May 5, 2020 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ enhancement Feature requests or improvements Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants