Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: promote Pip wheels to Bazel target #2363

Merged
merged 22 commits into from
Jun 24, 2019
Merged
Show file tree
Hide file tree
Changes from 21 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ script:
- elapsed && bazel fetch //tensorboard/...
- elapsed && bazel build //tensorboard/...
- elapsed && bazel test //tensorboard/... --test_tag_filters="${test_tag_filters}"
- elapsed && bazel run //tensorboard/pip_package:build_pip_package -- --tf-version "${TF_VERSION_ID}" --smoke
- elapsed && bazel run //tensorboard/pip_package:test_pip_package -- --default-python-only --tf-version "${TF_VERSION_ID}"
# Run manual S3 test
- elapsed && bazel test //tensorboard/compat/tensorflow_stub:gfile_s3_test
- elapsed && bazel test //tensorboard/summary/writer:event_file_writer_s3_test
Expand Down
30 changes: 24 additions & 6 deletions tensorboard/pip_package/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,20 @@ package(default_visibility = ["//visibility:private"])

licenses(["notice"]) # Apache 2.0

# rm -rf /tmp/tensorboard
# bazel run //tensorboard/pip_package:build_pip_package
# pip install -U /tmp/tensorboard/*py2*.pip
genrule(
name = "pip_package",
# Pip packages are provided in a tarball because we can't statically
# know the file names: they must contain the TensorBoard version,
# which is only defined in a Python module. Dependents of this
# target can extract the tarball and glob its contents.
outs = ["pip_packages.tar.gz"],
cmd = "$(execpath :build_pip_package) $@",
tools = [
":build_pip_package",
],
visibility = ["//tensorboard:internal"],
)

sh_binary(
name = "build_pip_package",
srcs = ["build_pip_package.sh"],
Expand All @@ -17,6 +28,7 @@ sh_binary(
"README.rst",
"setup.cfg",
"setup.py",
":deterministic_tar_gz",
"//tensorboard", # Main tensorboard binary and everything it uses
"//tensorboard:lib", # User-facing overall TensorBoard API
"//tensorboard:version", # Version module (read by setup.py)
Expand All @@ -26,9 +38,15 @@ sh_binary(
"//tensorboard/plugins/projector", # User-facing projector API
"//tensorboard/summary:tf_summary", # tf.summary API for TF 2.0
],
tags = [
"local",
"manual",
)

# Not an `sh_test`: accepts command-line flags to select Python and
# TensorFlow versions against which to test. Pass `--help` for details.
sh_binary(
name = "test_pip_package",
srcs = ["test_pip_package.sh"],
data = [
":pip_package",
],
)

Expand Down
224 changes: 56 additions & 168 deletions tensorboard/pip_package/build_pip_package.sh
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,31 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
if [ -z "${RUNFILES}" ]; then
RUNFILES="$(CDPATH= cd -- "$0.runfiles" && pwd)"

set -eu

usage() {
cat <<EOF
usage: build_pip_package OUTPUT_DIR

Build TensorBoard Pip packages and store the resulting wheel files
into OUTPUT (a \`.tar.gz\` path or directory).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd say a directory or a \`.tar.gz\` path to make it clear that that there is no need for the directory name to end in .tar.gz since it's otherwise strictly speaking ambiguous

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, absolutely. (I usually try to do this, but missed this one.)


Arguments:
OUTPUT: A path ending in \`.tar.gz\` for the output archive, or else a
path to an existing directory into which to store the wheel files
directly.
EOF
}

if [ $# -ne 1 ]; then
usage 2>&1
exit 1
fi
output="$1"

if [ -z "${RUNFILES+set}" ]; then
RUNFILES="$(CDPATH="" cd -- "$0.runfiles" && pwd)"
fi

if [ "$(uname)" = "Darwin" ]; then
Expand All @@ -23,69 +45,36 @@ else
sedi="sed -i"
fi

tf_version="tf-nightly"
if [ -n "${TF_VERSION}" ]; then
tf_version="tensorflow==${TF_VERSION}"
fi
smoke="all"
while [ "$#" -gt 0 ]; do
case "$1" in
"--tf-version")
tf_version="$2"
shift
shift
;;
"--smoke")
smoke=1
shift
;;
"--smoke-all")
smoke="all"
shift
;;
"--no-smoke")
smoke=0
shift
;;
*)
echo >&2 'fatal: unknown argument:' "$1"
exit 1
;;
esac
done

set -x
command -v curl >/dev/null
command -v perl >/dev/null
command -v virtualenv >/dev/null
[ -d "${RUNFILES}" ]

dest=/tmp/tensorboard
if [ ! -e $dest ]; then
mkdir $dest
if [ "$(uname)" = "Darwin" ]; then
workdir="$(mktemp -d -t tensorboard-pip)"
else
if [ "$(uname)" == "Darwin" ]; then
dest="$(mktemp -d -t tensorboard-pip)"
else
dest="$(mktemp -d -p /tmp -t tensorboard-pip.XXXXXXXXXX)"
fi
workdir="$(mktemp -d -p /tmp -t tensorboard-pip.XXXXXXXXXX)"
fi
cd "${dest}"
original_wd="${PWD}"
cd "${workdir}"

cleanup() {
rm -r "${workdir}"
}
trap cleanup EXIT

cp -LR "${RUNFILES}/org_tensorflow_tensorboard/tensorboard" .
mv -f "tensorboard/pip_package/LICENSE" .
mv -f "tensorboard/pip_package/MANIFEST.in" .
mv -f "tensorboard/pip_package/README.rst" .
mv -f "tensorboard/pip_package/setup.cfg" .
mv -f "tensorboard/pip_package/setup.py" .
rm -rf tensorboard/pip_package
rm -rf "tensorboard/pip_package"

rm -f tensorboard/tensorboard # bazel py_binary sh wrapper
chmod -x LICENSE # bazel symlinks confuse cp
find . -name __init__.py | xargs chmod -x # which goes for all genfiles
rm -f tensorboard/tensorboard # bazel py_binary sh wrapper
chmod -x LICENSE # bazel symlinks confuse cp
find . -name __init__.py -exec chmod -x {} + # which goes for all genfiles

mkdir -p tensorboard/_vendor
touch tensorboard/_vendor/__init__.py
>tensorboard/_vendor/__init__.py
cp -LR "${RUNFILES}/org_html5lib/html5lib" tensorboard/_vendor
cp -LR "${RUNFILES}/org_mozilla_bleach/bleach" tensorboard/_vendor
# Vendor tensorflow-serving-api because it depends directly on TensorFlow.
Expand All @@ -94,140 +83,39 @@ cp -LR "${RUNFILES}/org_tensorflow_serving_api/tensorflow_serving" tensorboard/_

chmod -R u+w,go+r .

find tensorboard -name \*.py |
xargs $sedi -e '
find tensorboard -name \*.py -exec $sedi -e '
s/^import html5lib$/from tensorboard._vendor import html5lib/
s/^from html5lib/from tensorboard._vendor.html5lib/
s/^import bleach$/from tensorboard._vendor import bleach/
s/^from bleach/from tensorboard._vendor.bleach/
s/from tensorflow_serving/from tensorboard._vendor.tensorflow_serving/
'
' {} +

virtualenv venv
virtualenv -q venv
export VIRTUAL_ENV=venv
export PATH="$PWD/venv/bin:${PATH}"
export PATH="${PWD}/venv/bin:${PATH}"
unset PYTHON_HOME

# Require wheel for bdist_wheel command, and setuptools 36.2.0+ so that
# env markers are handled (https://github.com/pypa/setuptools/pull/1081)
export PYTHONWARNINGS=ignore:DEPRECATION # suppress Python 2.7 deprecation spam
pip install -qU wheel 'setuptools>=36.2.0'

python setup.py bdist_wheel --python-tag py2 >/dev/null
python setup.py bdist_wheel --python-tag py3 >/dev/null

smoke() {
py_major_version="$1"
if [ -z "${py_major_version}" ]; then
py_major_version="$(python -c 'import sys; print(sys.version_info[0])')"
fi
smoke_python="python$1"
smoke_venv="smoke-venv$1"
smoke_tf="$2"
set +x
printf '\n\n%70s\n' | tr ' ' '='
if [ -z "${smoke_tf}" ]; then
echo "Smoke testing with ${smoke_python} and no tensorflow..."
export TENSORBOARD_NO_TF=1
else
echo "Smoke testing with ${smoke_python} and ${smoke_tf}..."
fi
printf '\n'
set -x
command -v "${smoke_python}" >/dev/null
virtualenv -qp "${smoke_python}" "${smoke_venv}"
cd "${smoke_venv}"
. bin/activate
pip install -qU pip

if [ -n "${smoke_tf}" ]; then
pip install -qU "${smoke_tf}"
pip uninstall -qy tensorboard tb-nightly # Drop any conflicting packages
fi
pip install -qU ../dist/*"py${py_major_version}"*.whl >/dev/null
pip freeze # Log the results of pip installation

# Test TensorBoard application
[ -x ./bin/tensorboard ] # Ensure pip package included binary
mkfifo pipe
tensorboard --port=0 --logdir=smokedir 2>pipe &
perl -ne 'print STDERR;/http:.*:(\d+)/ and print $1.v10 and exit 0' <pipe >port
curl -fs "http://localhost:$(cat port)" >index.html
grep '<tf-tensorboard' index.html
curl -fs "http://localhost:$(cat port)/data/logdir" >logdir.json
grep 'smokedir' logdir.json
curl -fs "http://localhost:$(cat port)/data/plugin/projector/runs" >projector_runs.json
# logdir does not contain any checkpoints and thus an empty runs.
grep '\[\]' projector_runs.json
curl -fs "http://localhost:$(cat port)/data/plugin/projector/projector_binary.html" >projector_binary.html
grep '<vz-projector-dashboard' projector_binary.html
kill $!

# Test TensorBoard APIs
export TF_CPP_MIN_LOG_LEVEL=1 # Suppress spammy TF startup logging.
python -c "
import tensorboard as tb
assert tb.__version__ == tb.version.VERSION
from tensorboard.plugins.projector import visualize_embeddings
tb.notebook.start # don't invoke; just check existence
from tensorboard.plugins.hparams import summary_v2 as hp
hp.hparams_pb({'optimizer': 'adam', 'learning_rate': 0.02})
"
if [ -n "${smoke_tf}" ]; then
# Only test summary scalar, beholder, and mesh summary
python -c "
import tensorboard as tb
tb.summary.scalar_pb('test', 42)
from tensorboard.plugins.beholder import Beholder, BeholderHook
from tensorboard.plugins.mesh import summary
"
fi

if [ -n "${smoke_tf}" ]; then
# Exhaustively test various sequences of importing tf.summary.
test_tf_summary() {
# First argument is subpath to test, e.g. '' or '.compat.v2'.
import_attr="import tensorflow as tf; a = tf${1}.summary; a.write; a.scalar"
import_as="import tensorflow${1}.summary as b; b.write; b.scalar"
import_from="from tensorflow${1} import summary as c; c.write; c.scalar"
printf '%s\n' "${import_attr}" "${import_as}" "${import_from}" | python -
printf '%s\n' "${import_attr}" "${import_from}" "${import_as}" | python -
printf '%s\n' "${import_as}" "${import_attr}" "${import_from}" | python -
printf '%s\n' "${import_as}" "${import_from}" "${import_attr}" | python -
printf '%s\n' "${import_from}" "${import_attr}" "${import_as}" | python -
printf '%s\n' "${import_from}" "${import_as}" "${import_attr}" | python -
}
test_tf_summary '.compat.v2'
is_tf_2() {
python -c "import tensorflow as tf; assert tf.__version__[:2] == '2.'" \
>/dev/null 2>&1
}
if is_tf_2 ; then
test_tf_summary ''
fi
fi

deactivate
cd ..
rm -rf "${smoke_venv}"
}

case "${smoke}" in
"all")
smoke 2 "${tf_version}"
smoke 3 "${tf_version}"
;;
"1")
# Empty string indicates to use the default "python".
smoke "" "${tf_version}"
;;
"0")
printf "\nSkipping smoke test\n\n"
cd "${original_wd}" # Bazel gives "${output}" as a relative path >_>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just resolve output to an absolute path while we're still in the original working directory, rather than having to cd back to it later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know a portable way to do that. GNU readlink -e/-f works
perfectly on Linux, but on macOS readlink(1) is basically just
readlink(2). realpath(1) is also not POSIX. (Here is a handy list
of portable utilities
—it’s not large, especially when you look
at (e.g.) xargs and see that there’s no portable -0 option!)

I mean, there’s always

path="$(python -c 'print(__import__("os").path.realpath(__import__("sys").argv[1]))' "${path}")"

which has the downside of being a bit hideous (and also not correct
because command substitutions strip whitespace); just cding back is
significantly simpler imho.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If there is a portable way to do this, I would love to know!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about

case "${output}" in 
  /*) ;;
  *) output="${PWD}/${output}" ;;
esac

case "${output}" in
*.tar.gz)
mkdir -p "$(dirname "${output}")"
"${RUNFILES}/org_tensorflow_tensorboard/tensorboard/pip_package/deterministic_tar_gz" \
"${output}" "${workdir}"/dist/*.whl
;;
*)
echo >&2 'fatal: unknown smoke value:' "${smoke}"
exit 1
if ! [ -d "${output}" ]; then
printf >&2 'fatal: no such output directory: %s\n' "${output}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more straightforward to just mkdir -p "${output}" here if it doesn't exist. I don't see any particular reason why it should require the directory to already exist, and this makes it more consistent with the .tar.gz behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring the output directory to exist is more consistent with other applications
that write files to specified directories—

$ tar czf - a.c | tar xzf - -C ./nonexistent/
tar: nonexistent: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
$ cp a.c ./nonexistent/
cp: cannot create regular file './nonexistent/': Not a directory
$ mv a.c ./nonexistent/
mv: cannot move 'a.c' to './nonexistent/': Not a directory
$ ln -s a.c ./nonexistent/
ln: target './nonexistent/' is not a directory: No such file or directory
$ gcc a.c -o ./nonexistent/
/usr/bin/ld: cannot open output file ./nonexistent/: Is a directory
collect2: error: ld returned 1 exit status
$ gcc a.c -o ./nonexistent/a.c
/usr/bin/ld: cannot open output file ./nonexistent/a.c: No such file or directory
collect2: error: ld returned 1 exit status
$ printf 'shell redirections\n' >./nonexistent/out
bash: ./nonexistent/out: No such file or directory

—so I’d prefer to keep it as is unless you feel strongly. (The explicit
error is just to have a nicer message.)

exit 1
fi
mv "${workdir}"/dist/*.whl "${output}"
;;
esac

# Print the wheel files we built.
du -hs "$PWD"/dist/*
Loading