
Check whether values are constant before smooth #1698

Merged (3 commits) on Dec 18, 2018

Conversation


@stephanwlee commented Dec 17, 2018

Due to IEEE 754 floating-point precision, multiplying a floating-point
smoothing factor into a value can cause a discrepancy in an otherwise
mathematically constant value.

Please see #786 for examples of weird spikes/messed up y-scale.

Partially addresses #786.
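The guard can be sketched in Python (illustrative only: TensorBoard's real smoothing lives in TypeScript and also applies a debias correction; `smooth` and its parameters here are hypothetical names, not the actual identifiers):

```python
def smooth(values, weight=0.6):
    """Exponential-moving-average smoothing with a constant-series guard.

    The guard is the essence of this PR: if every value is bitwise
    identical, return the series unchanged rather than running the EMA,
    whose rounding error would otherwise perturb a constant series.
    """
    if not values:
        return []
    if all(v == values[0] for v in values):
        return list(values)  # constant input: skip the lossy arithmetic
    smoothed, last = [], values[0]
    for v in values:
        last = last * weight + (1 - weight) * v
        smoothed.append(last)
    return smoothed
```

With the guard, `smooth([2.1] * 4)` returns exactly `[2.1, 2.1, 2.1, 2.1]`; without it, the EMA arithmetic can flip the last bit of an otherwise constant value and produce the spikes reported in #786.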

@wchargin wchargin self-requested a review December 17, 2018 17:30
@wchargin left a comment


I’m really hesitant about this code. It’s hacking around a problem
without actually identifying or fixing the root cause. Checking whether
a sequence of floating-point numbers is bitwise equal doesn’t have any
useful meaning; reals are not an equality type. It is brittle: sure, it
solves the problem for exactly-constant time series, but it does not
solve the problem for time series that have even the tiniest
oscillations.

However, due to an unrelated series of unfortunate events, it happens
that the code is approximately equivalent to a less objectionable
version. In particular, our public APIs only emit single-precision
floating point numbers, so the problematic time series with tiny
oscillations cannot actually occur (unless people make their own tensor
protos and summary protobufs manually). The dubious identity-check is
then equivalent to a more reasonable range-check due to the increased
relative precision of JavaScript numbers. So, as much as I dislike this
code, I’m okay with merging it (modulo inline), because plotting
constant summaries really is something that we should support correctly,
and I don’t believe that it’s possible to construct an example where
this code does the wrong thing.
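The brittleness is easy to demonstrate: a series that is constant except for a one-ulp perturbation defeats a bitwise-equality guard (a minimal sketch; `is_constant` is an illustrative name, not the actual code):

```python
def is_constant(values):
    # Bitwise-equality check, as in this PR's guard.
    first = values[0]
    return all(v == first for v in values)

# 2e-16 exceeds half the spacing between doubles near 1.0 (2**-53),
# so 1.0 + 2e-16 rounds up to the next representable double.
data = [1.0] * 16 + [1.0 + 2e-16] * 16
print(is_constant(data))  # False: the guard does not fire
```

The guard not firing means the smoothing arithmetic still runs on this nearly constant series, with the same distortion the PR set out to fix.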

d.smoothed = nextVal;
} else {
// This arithmetic causes IEEE 754 floating-point precision error and

Please remove this comment. It doesn’t contain any useful information.

@wchargin left a comment


Also update: tensorboard/components/vz_line_chart/vz-line-chart.ts

@stephanwlee

@wchargin in the case of the tiniest oscillations, what is the expected extent of the chart? Since TensorBoard makes no assumptions about the values it receives, it cannot detect such oscillations, and even if it could, what should the scale be: 10% of the spread, or some minimum integer? Only the former makes sense to me, and that is what we get with this code.

Also, we plan on deprecating vz-line-chart and have no plans to fix it.

@wchargin

Also, we plan on deprecating vz-line-chart and have no plans to fix it.

Discussed offline. We do indeed plan to remove vz-line-chart (see #1700
and Google-internal http://b/121158875). In the meantime, we’ll make
the corresponding change to vz-line-chart in this PR. The code is
duplicated verbatim across the two modules; there’s no need to cause any
further regressions or bifurcations.

what is the expected extent of the chart? Since TensorBoard makes no
assumptions about the values it receives, it cannot detect such
oscillations

I’m not saying that TensorBoard should attempt to detect and hide small
oscillations. I’m saying that the code provided in this commit does not
actually solve the general case of #786. In particular, if I have a plot
that is constant almost everywhere (say, piecewise constant), then the
special case in this commit will not apply and the smoothing will still
distort the plot. Below is a script to generate such problematic
summaries, and here is a screenshot of this commit’s behavior on that
data:

[Screenshot: this commit's behavior on a piecewise-constant monotonic function, still rendered with spikes due to floating-point imprecision in the smoothing logic]

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from tensorboard.plugins.scalar import metadata


LOGDIR = "/tmp/constantish"
STEPS = 32


def float64pb(name, scalar):
  """Create a scalar summary with a float64-tensor payload."""
  nparray = np.array(scalar, dtype=np.float64)
  tensor = tf.make_tensor_proto(nparray.astype(np.float64))
  summary = tf.Summary()
  summary_metadata = metadata.create_summary_metadata(
      display_name=name, description="")
  tf_summary_metadata = tf.SummaryMetadata.FromString(
      summary_metadata.SerializeToString())
  summary.value.add(tag='%s/scalar_summary' % name,
                    metadata=tf_summary_metadata,
                    tensor=tensor)
  return summary


def main():
  x1 = 1.0
  x2 = 1.0 + 2e-16
  assert x1 != x2, (repr(x1), repr(x2))
  with tf.summary.FileWriter(LOGDIR) as writer:
    for step in range(STEPS):
      x = x1 if step < STEPS / 2 else x2
      summ = float64pb("constantish", x)
      writer.add_summary(summ, global_step=step)
  print("Done.")


if __name__ == "__main__":
  main()

This is what I meant about not solving the underlying problem: the exact
same user issue appears before and after this commit; this commit merely
prevents one common occurrence of it.

As I mentioned previously, the only reason that I’m not concerned about
this is that none of our public APIs offer the ability to easily create
such problematic data, because the scalar plugin only emits
single-precision data, and the difference in precision between single-
and double-precision floating point means that the rendering errors are
vastly overshadowed by the actual differences in the data. The commit is
still a hack, it’s just that it’s likely to solve most users’ problems,
so I’m okay with accepting the hack.
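The single-precision claim can be checked directly: rounding the perturbed value through IEEE 754 binary32 erases the oscillation entirely (a sketch using Python's `struct` module):

```python
import struct

def to_float32(x):
    # Round a Python float (binary64) through single precision (binary32).
    return struct.unpack('<f', struct.pack('<f', x))[0]

# A one-ulp-of-double perturbation is far below single precision's
# resolution near 1.0 (about 1.2e-7), so it vanishes on the round trip.
print(to_float32(1.0 + 2e-16) == 1.0)  # True
```

This is why data emitted through the scalar plugin's single-precision path cannot exhibit the tiny oscillations that defeat the constant check.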

@stephanwlee

In particular, if I have a plot that is constant almost everywhere
(say, piecewise constant), then the special case in this commit will
not apply and the smoothing will still distort the plot

[…]

This is what I meant about not solving the underlying problem: the exact
same user issue appears before and after this commit; this commit merely
prevents one common occurrence of it.

At least in my opinion, if the data is constant except for one value perturbed by 2e-16, as in your example, it should draw as it does today. TensorBoard makes no assumptions about the values and cannot tell that perturbation apart from tiny Gaussian noise added to a large constant (e.g., input data whose variance is close to the precision limit of IEEE 754). Our chart should NOT show a straight line in either case; it should show the spiked values with a very small extent to convey that there is some discrepancy. Perhaps I am reading between the lines, but I don't think #786 is asking us to show those as straight lines.

Anyway, your concern is noted, and I understand there is no action item.

@wchargin

At least in my opinion, if the data is constant except for one value
perturbed by 2e-16, as in your example […] Our chart should NOT show a
straight line in either case

Absolutely. I agree. It should not show a single straight line; it
should show the data exactly as it is.

But this is not what is happening. We’re rendering the data as is but
then introducing additional spikes in the smoothing process. This is
broken:

[Screenshot: the chart renders the data faithfully, but smoothing introduces additional spikes]

@stephanwlee

Argh, right. I forgot that the arithmetic error satisfies e ≫ Number.MIN_VALUE. I imagined the error only mattering when the variance is within a few orders of magnitude of Number.MIN_VALUE, and I was very wrong. How interesting!

@wchargin

I imagined the error only mattering when the variance is within a few
orders of magnitude of Number.MIN_VALUE, and I was very wrong. How
interesting!

Right. The actual value of MIN_VALUE (2^−1074 ≈ 5e−324) suggests
a degree of precision far greater than what you’ll see in practice,
because the ulp varies approximately linearly with the magnitude of the
value itself. As a rough rule of thumb, double-precision floating-point
numbers give you precision of about 1 part in 1e16, and single-precision
about 1 part in 1e7, which is really not that much! For normal numbers
this rule is accurate to within a factor of 2; for subnormals the ratio
of course tapers off toward unity (the ulp of MIN_VALUE is MIN_VALUE itself).
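Python's `math.ulp` (3.9+) makes these numbers concrete:

```python
import math

# Near 1.0, one ulp is 2**-52: about 1 part in 1e16, as stated above.
print(math.ulp(1.0))          # 2.220446049250313e-16

# The ulp grows roughly linearly with magnitude: at 2**53 it is 2.0.
print(math.ulp(2.0 ** 53))    # 2.0

# For the smallest subnormal (Number.MIN_VALUE in JavaScript), the
# ulp is the value itself.
print(math.ulp(5e-324) == 5e-324)  # True
```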

Due to IEEE 754 floating-point precision, multiplying a floating-point
smoothing factor into a value can cause a discrepancy in an otherwise
mathematically constant value.

Please see tensorflow#786 for examples of weird spikes/messed up y-scale.

Fixes tensorflow#786.
@stephanwlee stephanwlee merged commit 927fa1b into tensorflow:master Dec 18, 2018
@stephanwlee stephanwlee deleted the smooth branch December 18, 2018 21:32