
[RFC] TF + DLPack, why and how #3

Open
jermainewang opened this issue Nov 8, 2019 · 6 comments

@jermainewang
Collaborator

jermainewang commented Nov 8, 2019

This project grew out of the discussion in this issue about providing a way to convert between TensorFlow tensors and DLPack tensors. This RFC covers the user experience and the technical solutions we adopt.

User experience

We plan to release a Python package, tfdlpack, containing two APIs:

  • to_dlpack: given a TensorFlow tensor, returns a DLPack tensor wrapped in a Python capsule.
  • from_dlpack: given a DLPack-compatible Python capsule, returns a TensorFlow tensor.

Example code for converting a TensorFlow tensor to a Torch tensor and back using DLPack:

import numpy as np
import tensorflow as tf
import torch.utils.dlpack as thdlpack
import tfdlpack

t1 = tf.constant([1, 2, 3], dtype=np.float32)
dlpack = tfdlpack.to_dlpack(t1)  # tf tensor -> dlpack
t2 = thdlpack.from_dlpack(dlpack)  # dlpack -> th tensor
print(t2)
dlpack = thdlpack.to_dlpack(t2)  # th tensor -> dlpack
t3 = tfdlpack.from_dlpack(dlpack)  # dlpack -> tf tensor
print(t3)

You will find that t1, t2, and t3 all have the same values, shape, and device context.

Package dependency: tensorflow>=2.0

How it works

The first design consideration is that we want to avoid any modification to the main TensorFlow library, so as to sidestep the potentially long PR, code-review, and release cycle of the TensorFlow main package. Inspired by the solution in https://github.com/tobegit3hub/tftvm, we decided to implement the functionality as two custom tensor ops: to_dlpack and from_dlpack.

Besides, we want this feature to be easy to plug into other projects. For example, any project that relies on this feature should be able to run without compiling against TensorFlow's header files. An extra dependency usually means extra effort, and such maintenance is repetitive and should be handled by the feature developer (i.e., us) alone. To this end, we release it as a Python package. The question, then, is how to invoke the two custom tensor ops from Python. The challenge is that TensorFlow's custom-op interface supports only a limited set of argument and return types, while to_dlpack and from_dlpack need to accept or return a DLPack object. We work around this by encoding the address of the DLPack object as an integer, which the custom-op interface can accept and return. We then decode it in Python or C depending on whether we return it (to_dlpack) or consume it (from_dlpack).
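The address-encoding trick can be illustrated from the Python side. This is only a sketch: NumPy's DLPack support stands in for the custom op (tfdlpack would do this encoding in its C++ kernel), and the CPython PyCapsule C API is used to pull the raw pointer out of the capsule so it can travel through an integer-typed op output.

```python
import ctypes
import numpy as np

# Sketch of the encode step: a DLPack capsule is just a named wrapper
# around a DLManagedTensor*, so the pointer can be extracted and passed
# around as a plain integer (e.g. via an int64 op output).
PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

capsule = np.arange(3, dtype=np.float32).__dlpack__()  # a "dltensor" capsule
addr = PyCapsule_GetPointer(capsule, b"dltensor")      # encode: pointer -> int
print(hex(addr))  # an ordinary integer, representable as an int64 scalar
```

The decode direction (rebuilding a capsule from the integer so a consumer framework can ingest it) is the mirror image, done with PyCapsule_New on whichever side consumes the handle.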

Finally, for maximal efficiency, we want the conversion to happen without any memory copy.

  • For to_dlpack, the returned DLPack tensor shares the memory of the input TensorFlow tensor and holds a reference to it. Upon destruction, the DLPack tensor releases that reference, so the memory can be reclaimed by TensorFlow's memory management (inspired by PyTorch's DLPack implementation).
  • For from_dlpack, we first create an allocator object (subclassing TensorFlow's allocator interface) that holds a reference to the DLPack tensor. Its AllocateRaw function returns the memory it already holds without allocating a new buffer, and upon destruction its DeallocateRaw function simply calls the deleter of the DLPack tensor (inspired by TensorFlow's immutable_constant_op).
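The zero-copy property described in both bullets is observable from Python. Again NumPy stands in for tfdlpack here (so the snippet runs without TensorFlow); the design above aims for the same behavior, with producer and consumer seeing one buffer.

```python
import numpy as np

# Zero-copy check: from_dlpack should return a view over the producer's
# buffer rather than a copy.
a = np.arange(4, dtype=np.float32)
b = np.from_dlpack(a)  # round trip through the DLPack protocol

# Both arrays point at the same underlying buffer: no memory was copied.
print(a.ctypes.data == b.ctypes.data)  # True
```

The same pointer-equality check is a quick way to validate a to_dlpack/from_dlpack implementation once the TensorFlow ops exist.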
@JanuszL

JanuszL commented Nov 8, 2019

How is it going to work with non-eager execution mode when one wants to build a graph?

@VoVAllen
Owner

VoVAllen commented Nov 9, 2019

@JanuszL Since TensorFlow doesn't support returning a non-tensor type from symbolic execution, it would be tricky. One solution is to define a new protocol. A DLPack capsule is really just a handle, which can be treated as a number representing an address.

For example, we could define a tensor that represents a DLPack capsule with the following protocol:

  • The tensor's dtype is int64 or uint64.
  • The first element (or the first ten elements) is a magic number, indicating that the tensor follows our own protocol.
  • The second element is the handle.

Then each op uses this tensor to represent the DLPack capsule. I'm not sure which interface is better. Feel free to add your thoughts!
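A minimal sketch of that protocol, with NumPy arrays standing in for the graph-mode tensors. The magic number and two-element layout are hypothetical placeholders, not a settled spec:

```python
import numpy as np

# Hypothetical magic number marking "this uint64 tensor carries a DLPack handle".
MAGIC = 0x7FD1AC42

def encode_handle(handle: int) -> np.ndarray:
    # Element 0: magic number; element 1: the raw capsule address.
    return np.array([MAGIC, handle], dtype=np.uint64)

def decode_handle(t: np.ndarray) -> int:
    if t.dtype != np.uint64 or int(t[0]) != MAGIC:
        raise ValueError("tensor does not follow the handle protocol")
    return int(t[1])

print(hex(decode_handle(encode_handle(0xDEADBEEF))))  # 0xdeadbeef
```

In graph mode, each op would accept and emit such uint64 tensors in place of the capsule itself, with the magic prefix guarding against ordinary integer tensors being misread as handles.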

@JanuszL

JanuszL commented Nov 11, 2019

@VoVAllen - sure, I just wanted to check that I hadn't missed anything. It would be nice to have, but the current approach sounds sufficient for most users.

@alextp

alextp commented Dec 2, 2019

Can you elaborate on why you chose to use a TF OpKernel to create DLPack objects from tensors (and vice versa), instead of using TF_TensorData in the C API to get a TF_Tensor's pointer, as we do in the TF-to-NumPy bridge?

@VoVAllen
Owner

VoVAllen commented Dec 3, 2019

@alextp Actually I would prefer using the C API directly rather than a TF op. However, the op is a good starting point since it's easy to compile and install without building the whole of TensorFlow.
Ultimately I think the implementation could use the C API, but I will need some guidance on how to build C API functions like TF_TensorData. TF's FFI/C++ binding is a bit complex to me.

@alextp

alextp commented Dec 3, 2019 via email

4 participants