tf.load_op_library unable to load manylinux2010 repaired custom ops #31807

seanpmorgan · 2019-08-20T16:55:22Z

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No -- using https://github.com/tensorflow/custom-op (But it breaks for addons too)
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu16.04
TensorFlow installed from (source or binary): Binary
TensorFlow version (use command below): tf-nightly & tf-nightly-2.0-preview

Describe the current behavior
When I build a custom op in the tensorflow/tensorflow:custom-op-ubuntu16 Docker image using the documented steps, I get an installable pip package: tensorflow_zero_out-0.0.1-cp27-cp27mu-linux_x86_64.whl

This works fine. However, if I repair that wheel to be manylinux2010 compliant, tf.load_op_library fails to find the custom op.

python -c "import tensorflow as tf; print(dir(tf.load_op_library('manylinux/tensorflow_zero_out/python/ops/_zero_out_ops.so')))"

['LIB_HANDLE', 'OP_LIST', 'ZeroOut', '_IS_TENSORFLOW_PLUGIN', 
'_InitOpDefLibrary', '__builtins__', '__doc__', '__name__', '__package__', 
'_collections', '_common_shapes', '_context', '_core', '_dispatch', '_doc_controls', 
'_dtypes', '_errors', '_execute', '_kwarg_only', '_op_def_lib', '_op_def_library', 
'_op_def_pb2', '_op_def_registry', '_ops', '_pywrap_tensorflow', '_six', 
'_tensor_shape', 'deprecated_endpoints', 'tf_export', 'zero_out',
 'zero_out_eager_fallback']

python -c "import tensorflow as tf;print(dir(tf.load_op_library('manylinux2010/tensorflow_zero_out/python/ops/_zero_out_ops.so')))"

['LIB_HANDLE', 'OP_LIST', '_IS_TENSORFLOW_PLUGIN', 
'_InitOpDefLibrary', '__builtins__', '__doc__', '__name__', '__package__', 
'_collections', '_common_shapes', '_context', '_core', 
'_dispatch', '_doc_controls', '_dtypes', '_errors', '_execute', '_kwarg_only', 
'_op_def_lib', '_op_def_library', '_op_def_pb2', '_op_def_registry', '_ops', 
'_pywrap_tensorflow', '_six', '_tensor_shape', 'deprecated_endpoints', 'tf_export']

Notice that 'ZeroOut', 'zero_out' and 'zero_out_eager_fallback' are missing from the library loaded from the manylinux2010 wheel.
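Rather than comparing the two dir() listings by eye, the difference can be computed as a set difference. This is only a sketch: `missing_names` is a hypothetical helper, and the two lists are abbreviated copies of the listings above rather than live output from tf.load_op_library.

```python
def missing_names(original, repaired):
    """Return the names present in `original` but absent from `repaired`, sorted."""
    return sorted(set(original) - set(repaired))


# Abbreviated copies of the dir() listings above (not live output).
original_names = [
    "LIB_HANDLE", "OP_LIST", "ZeroOut", "_IS_TENSORFLOW_PLUGIN",
    "_InitOpDefLibrary", "zero_out", "zero_out_eager_fallback",
]
repaired_names = [
    "LIB_HANDLE", "OP_LIST", "_IS_TENSORFLOW_PLUGIN", "_InitOpDefLibrary",
]

print(missing_names(original_names, repaired_names))
# ['ZeroOut', 'zero_out', 'zero_out_eager_fallback']
```

In a real session the two arguments would be the dir() results of the modules returned by tf.load_op_library for each .so file.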

Code to reproduce the issue

git clone https://github.com/tensorflow/custom-op.git && cd custom-op
docker run -it --rm -v ${PWD}:/workspace -w /workspace tensorflow/tensorflow:custom-op-ubuntu16 /bin/bash

pip install tf-nightly
./configure.sh
bazel build build_pip_pkg
bazel-bin/build_pip_pkg artifacts

# Installed auditwheel is too old for manylinux2010
pip3 install --upgrade auditwheel

# Libtensorflow framework needs to be on LD path
export LD_LIBRARY_PATH="/usr/local/lib/python2.7/dist-packages/tensorflow_core"

# Repair logs look more or less okay
auditwheel -v repair --plat manylinux2010_x86_64 artifacts/tensorflow_zero_out-0.0.1-cp27-cp27mu-linux_x86_64.whl &> repair.txt

Other info / logs
Here are the auditwheel repair logs:
repair.txt

Here are the readelf inspections of the so files:
readelf.txt
readelf-manylinux2010.txt

Here are the so files:
so-files.zip

cc @perfinion @gunan @yifeif

--------------------------EDIT--------------------
Here are the extracted whl directories, which work with the python tf.load_op_library commands from above. (The manylinux2010 repair makes the custom op depend on a newly copied libtensorflow_framework.so that is bundled in the new whl):
custom-op-dirs.zip


yongtang · 2019-08-20T20:18:14Z

I remember I encountered an issue when there is a collision of names for added kernel ops. (used to be fine for 1.14, not with new tf-nightly) Wondering if there are multiple versions of zero_out kernel ops?

seanpmorgan · 2019-08-20T21:46:04Z

> I remember I encountered an issue when there is a collision of names for added kernel ops. (used to be fine for 1.14, not with new tf-nightly) Wondering if there are multiple versions of zero_out kernel ops?

Thanks! Looking at the binaries' symbols I'm not seeing any duplication that isn't present in the .so before auditwheel repair though:
https://www.diffchecker.com/pfJbJX8g

Is there a way to increase the verbosity of the load_library call so we could see if there is a conflict or something else?

The only major difference I see is that the repaired binary requires the newly copied libtensorflow_framework-65610c2c.so.1 instead of the libtensorflow_framework.so.1 that would get picked up from the TF install. I'm not sure what the implications of that are, though, and without being able to step through the load_library function it's a bit tough to tell.
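One way to see which shared libraries each .so requires is to pull the NEEDED entries out of `readelf -d` output. The sketch below assumes binutils' usual output format; `needed_libraries` and `read_dynamic_section` are hypothetical helpers, and the sample text is illustrative, not copied from the attached readelf files.

```python
import re
import subprocess

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]")


def needed_libraries(readelf_output):
    """Extract the NEEDED library names from `readelf -d` text, in order."""
    return NEEDED_RE.findall(readelf_output)


def read_dynamic_section(so_path):
    """Run `readelf -d` on a shared object (requires binutils on PATH)."""
    result = subprocess.run(
        ["readelf", "-d", so_path], check=True, capture_output=True, text=True
    )
    return result.stdout


# Illustrative sample only; real output will contain more entries.
sample = """\
 0x0000000000000001 (NEEDED)             Shared library: [libtensorflow_framework-65610c2c.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [_zero_out_ops.so]
"""

print(needed_libraries(sample))
```

Running `needed_libraries(read_dynamic_section(path))` on both the original and the repaired `_zero_out_ops.so` would show whether the framework dependency was renamed by the repair.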

yongtang · 2019-08-20T22:40:42Z

My previous issue was with LMDBDataset. I initially implemented LMDBDataset (C++) in TF's core repo (tf.contrib) some time ago. Later on, as we tried to modularize, LMDBDataset was moved to tensorflow/io. So there are two copies if both tensorflow and tensorflow-io are loaded.

That used to be fine. However, very recently I noticed that LMDBDataset in tensorflow/io is not working anymore with tf-nightly (couldn't remember which version but must be very recent), and I have to change the name in tensorflow/io to LMDBDatasetV2 to get around it.

Don't know if this could be related as well.

yongtang · 2019-08-20T23:14:00Z

Ah the libtensorflow_framework.so.1 is a known limitation of auditwheel.

I wrote a patch for auditwheel to get around the issue: tensorflow/io@02dcf4a

seanpmorgan · 2019-08-20T23:54:24Z

@yongtang Amazing, thanks so much! Could you explain what that file edit does and why the patch works? (I'm assuming it somehow tricks auditwheel into thinking the shared lib is a common one on all systems.)

We should probably describe this and include the patch in the custom-op repo.

seanpmorgan · 2019-08-21T00:31:41Z

EDIT -- Found out which policy.json was being edited:
https://github.com/pypa/auditwheel/blob/master/auditwheel/policy/policy.json

Thanks again for the patch @yongtang!
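A minimal sketch of that kind of policy edit follows. It rests on two assumptions not confirmed in this thread: that the patch works by adding the framework library to the `lib_whitelist` entries of policy.json, and that the file is a JSON list of policy objects that each carry a `lib_whitelist` array. `whitelist_library` is a hypothetical helper and the sample policy is made up for illustration.

```python
import json


def whitelist_library(policies, lib_name):
    """Add `lib_name` to every policy's lib_whitelist (no duplicates)."""
    for policy in policies:
        libs = policy.setdefault("lib_whitelist", [])
        if lib_name not in libs:
            libs.append(lib_name)
    return policies


# Made-up stand-in for the contents of auditwheel's policy.json.
sample_policies = json.loads("""
[
  {"name": "manylinux2010_x86_64",
   "priority": 90,
   "lib_whitelist": ["libc.so.6", "libresolv.so.2"]}
]
""")

patched = whitelist_library(sample_policies, "libtensorflow_framework.so.1")
print(json.dumps(patched, indent=2))
```

A whitelisted library is treated as provided by the target system, so auditwheel would neither copy it into the wheel nor rewrite the dependency on it.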

njzjz · 2022-11-08T21:27:10Z

As of the recently released auditwheel 5.2.0, one can use the --exclude option instead of editing policy.json:

auditwheel repair --exclude libtensorflow_framework.so.2 --exclude libtensorflow_framework.so.1 --exclude libtensorflow_framework.so some_wheel.whl

If one uses cibuildwheel, add the following option to pyproject.toml.

[tool.cibuildwheel.linux]
repair-wheel-command = "auditwheel repair --exclude libtensorflow_framework.so.2 --exclude libtensorflow_framework.so.1 --exclude libtensorflow_framework.so -w {dest_dir} {wheel}"
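For build scripts that assemble this command programmatically, the same invocation can be built as an argument list. This is a sketch only: `repair_command` is a hypothetical helper, and the only auditwheel options it uses are the `--exclude` and `-w` flags shown above.

```python
import shlex

# Library names taken from the command above.
FRAMEWORK_LIBS = (
    "libtensorflow_framework.so.2",
    "libtensorflow_framework.so.1",
    "libtensorflow_framework.so",
)


def repair_command(wheel, dest_dir=None, excludes=FRAMEWORK_LIBS):
    """Build the argv list for `auditwheel repair` with --exclude flags."""
    cmd = ["auditwheel", "repair"]
    for lib in excludes:
        cmd += ["--exclude", lib]
    if dest_dir is not None:
        cmd += ["-w", dest_dir]
    cmd.append(wheel)
    return cmd


# Print the shell form; pass the list itself to subprocess.run to execute it.
print(" ".join(shlex.quote(part) for part in repair_command("some_wheel.whl")))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems when wheel paths contain spaces.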

seanpmorgan mentioned this issue Aug 20, 2019

Make whls manylinux2010 compatible tensorflow/addons#119

Closed

seanpmorgan mentioned this issue Aug 20, 2019

Revert back to manylinux1 package which breaks spec tensorflow/addons#434

Closed

seanpmorgan closed this as completed Aug 21, 2019

gadagashwini-zz self-assigned this Aug 21, 2019

gadagashwini-zz added the TF 2.0, comp:ops, and type:bug labels Aug 21, 2019

This was referenced Feb 10, 2022

Whitelist tensorflow dependency for auditwheel making manylinux wheel flink-extended/dl-on-flink#698

Merged

File system scheme 'queue' not implemented flink-extended/dl-on-flink#696

Closed
