
Check whether values are constant before smooth #1698

Merged (3 commits) on Dec 18, 2018

Conversation


@stephanwlee commented Dec 17, 2018

Due to IEEE 754 floating-point precision, multiplying a floating-point
smoothing factor into a value can cause a discrepancy in an otherwise
mathematically constant value.

Please see #786 for examples of weird spikes/messed up y-scale.

Partially addresses #786.
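The guard can be sketched in Python (illustrative only: TensorBoard's real smoothing lives in TypeScript and also applies a debias correction; `smooth` and its parameters here are hypothetical names, not the actual identifiers):

```python
def smooth(values, weight=0.6):
    """Exponential-moving-average smoothing with a constant-series guard.

    The guard is the essence of this PR: if every value is bitwise
    identical, return the series unchanged rather than running the EMA,
    whose rounding error would otherwise perturb a constant series.
    """
    if not values:
        return []
    if all(v == values[0] for v in values):
        return list(values)  # constant input: skip the lossy arithmetic
    smoothed, last = [], values[0]
    for v in values:
        last = last * weight + (1 - weight) * v
        smoothed.append(last)
    return smoothed
```

With the guard, `smooth([2.1] * 4)` returns exactly `[2.1, 2.1, 2.1, 2.1]`; without it, the EMA arithmetic can flip the last bit of an otherwise constant value and produce the spikes reported in #786.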

@wchargin wchargin self-requested a review December 17, 2018 17:30
@wchargin left a comment


I’m really hesitant about this code. It’s hacking around a problem
without actually identifying or fixing the root cause. Checking whether
a sequence of floating-point numbers is bitwise equal doesn’t have any
useful meaning; reals are not an equality type. It is brittle: sure, it
solves the problem for exactly-constant time series, but it does not
solve the problem for time series that have even the tiniest
oscillations.

However, due to an unrelated series of unfortunate events, it happens
that the code is approximately equivalent to a less objectionable
version. In particular, our public APIs only emit single-precision
floating point numbers, so the problematic time series with tiny
oscillations cannot actually occur (unless people make their own tensor
protos and summary protobufs manually). The dubious identity-check is
then equivalent to a more reasonable range-check due to the increased
relative precision of JavaScript numbers. So, as much as I dislike this
code, I’m okay with merging it (modulo inline), because plotting
constant summaries really is something that we should support correctly,
and I don’t believe that it’s possible to construct an example where
this code does the wrong thing.
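The brittleness is easy to demonstrate: a series that is constant except for a one-ulp perturbation defeats a bitwise-equality guard (a minimal sketch; `is_constant` is an illustrative name, not the actual code):

```python
def is_constant(values):
    # Bitwise-equality check, as in this PR's guard.
    first = values[0]
    return all(v == first for v in values)

# 2e-16 exceeds half the spacing between doubles near 1.0 (2**-53),
# so 1.0 + 2e-16 rounds up to the next representable double.
data = [1.0] * 16 + [1.0 + 2e-16] * 16
print(is_constant(data))  # False: the guard does not fire
```

The guard not firing means the smoothing arithmetic still runs on this nearly constant series, with the same distortion the PR set out to fix.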

d.smoothed = nextVal;
} else {
// This arithmetic causes IEEE 754 floating-point precision error and

Please remove this comment. It doesn’t contain any useful information.

@wchargin left a comment


Also update: tensorboard/components/vz_line_chart/vz-line-chart.ts

@stephanwlee

@wchargin in the case of the tiniest oscillations, what is the expected extent of the chart? Since TensorBoard makes no assumptions about the values it receives, it cannot detect such oscillations, and even if it could, what should the scale be: 10% of the spread, or some minimum integer? Only the former makes sense to me, and that is what we get with this code.

Also, we plan on deprecating vz-line-chart and have no plans to fix it.

@wchargin

Also, we plan on deprecating vz-line-chart and have no plans to fix it.

Discussed offline. We do indeed plan to remove vz-line-chart (see #1700
and Google-internal http://b/121158875). In the meantime, we’ll make
the corresponding change to vz-line-chart in this PR. The code is
duplicated verbatim across the two modules; there’s no need to cause any
further regressions or bifurcations.

what is the expected extent of the chart? Since TensorBoard makes no
assumptions about the values it receives, it cannot detect such
oscillations

I’m not saying that TensorBoard should attempt to detect and hide small
oscillations. I’m saying that the code provided in this commit does not
actually solve the general case of #786. In particular, if I have a plot
that is constant almost everywhere (say, piecewise constant), then the
special case in this commit will not apply and the smoothing will still
distort the plot. Below is a script to generate such problematic
summaries, and here is a screenshot of this commit’s behavior on that
data:

[Screenshot: this commit's behavior on a piecewise-constant monotonic function, still rendered with spikes due to floating-point imprecision in the smoothing logic]

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from tensorboard.plugins.scalar import metadata


LOGDIR = "/tmp/constantish"
STEPS = 32


def float64pb(name, scalar):
  """Create a scalar summary with a float64-tensor payload."""
  nparray = np.array(scalar, dtype=np.float64)
  tensor = tf.make_tensor_proto(nparray.astype(np.float64))
  summary = tf.Summary()
  summary_metadata = metadata.create_summary_metadata(
      display_name=name, description="")
  tf_summary_metadata = tf.SummaryMetadata.FromString(
      summary_metadata.SerializeToString())
  summary.value.add(tag='%s/scalar_summary' % name,
                    metadata=tf_summary_metadata,
                    tensor=tensor)
  return summary


def main():
  x1 = 1.0
  x2 = 1.0 + 2e-16
  assert x1 != x2, (repr(x1), repr(x2))
  with tf.summary.FileWriter(LOGDIR) as writer:
    for step in range(STEPS):
      x = x1 if step < STEPS / 2 else x2
      summ = float64pb("constantish", x)
      writer.add_summary(summ, global_step=step)
  print("Done.")


if __name__ == "__main__":
  main()

This is what I meant about not solving the underlying problem: the exact
same user issue appears before and after this commit; this commit merely
prevents one common occurrence of it.

As I mentioned previously, the only reason that I’m not concerned about
this is that none of our public APIs offer the ability to easily create
such problematic data, because the scalar plugin only emits
single-precision data, and the difference in precision between single-
and double-precision floating point means that the rendering errors are
vastly overshadowed by the actual differences in the data. The commit is
still a hack, it’s just that it’s likely to solve most users’ problems,
so I’m okay with accepting the hack.
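The single-precision claim can be checked directly: rounding the perturbed value through IEEE 754 binary32 erases the oscillation entirely (a sketch using Python's `struct` module):

```python
import struct

def to_float32(x):
    # Round a Python float (binary64) through single precision (binary32).
    return struct.unpack('<f', struct.pack('<f', x))[0]

# A one-ulp-of-double perturbation is far below single precision's
# resolution near 1.0 (about 1.2e-7), so it vanishes on the round trip.
print(to_float32(1.0 + 2e-16) == 1.0)  # True
```

This is why data emitted through the scalar plugin's single-precision path cannot exhibit the tiny oscillations that defeat the constant check.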

@stephanwlee

In particular, if I have a plot that is constant almost everywhere
(say, piecewise constant), then the special case in this commit will
not apply and the smoothing will still distort the plot

[…]

This is what I meant about not solving the underlying problem: the exact
same user issue appears before and after this commit; this commit merely
prevents one common occurrence of it.

At least in my opinion, if the data is constant except for one value perturbed by 2e-16, as in your example, it should draw as it does today. TensorBoard makes no assumptions about the values and cannot tell that perturbation apart from tiny Gaussian noise added to a large constant (e.g., input data whose variance is close to the precision limit of IEEE 754). Our chart should NOT show a straight line in either case; it should show the spiked values with a very small extent to convey that there is some discrepancy. Perhaps I am reading between the lines, but I don't think #786 is asking us to show those as straight lines.

Anyway, your concern is noted, and I understand there is no action item.

@wchargin

At least in my opinion, if the data is constant except for one value
perturbed by 2e-16, as in your example […] Our chart should NOT show a
straight line in either case

Absolutely. I agree. It should not show a single straight line; it
should show the data exactly as it is.

But this is not what is happening. We’re rendering the data as is but
then introducing additional spikes in the smoothing process. This is
broken:

[Screenshot: the chart renders the data faithfully, but smoothing introduces additional spikes]

@stephanwlee

Argh, right. I forgot that the arithmetic error satisfies e ≫ Number.MIN_VALUE. I imagined the error only mattering when the variance is within a few orders of magnitude of Number.MIN_VALUE, and I was very wrong. How interesting!

@wchargin

I imagined the error only mattering when the variance is within a few
orders of magnitude of Number.MIN_VALUE, and I was very wrong. How
interesting!

Right. The actual value of MIN_VALUE (2^−1074 ≈ 5e−324) suggests
a degree of precision far greater than what you’ll see in practice,
because the ulp varies approximately linearly with the magnitude of the
value itself. As a rough rule of thumb, double-precision floating-point
numbers give you precision of about 1 part in 1e16, and single-precision
about 1 part in 1e7, which is really not that much! For normal numbers
this rule is accurate to within a factor of 2; for subnormals the ratio
of course tapers off toward unity (the ulp of MIN_VALUE is MIN_VALUE itself).
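Python's `math.ulp` (3.9+) makes these numbers concrete:

```python
import math

# Near 1.0, one ulp is 2**-52: about 1 part in 1e16, as stated above.
print(math.ulp(1.0))          # 2.220446049250313e-16

# The ulp grows roughly linearly with magnitude: at 2**53 it is 2.0.
print(math.ulp(2.0 ** 53))    # 2.0

# For the smallest subnormal (Number.MIN_VALUE in JavaScript), the
# ulp is the value itself.
print(math.ulp(5e-324) == 5e-324)  # True
```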

Due to IEEE 754 floating-point precision, multiplying a floating-point
smoothing factor into a value can cause a discrepancy in an otherwise
mathematically constant value.

Please see tensorflow#786 for examples of weird spikes/messed up y-scale.

Fixes tensorflow#786.
@stephanwlee stephanwlee merged commit 927fa1b into tensorflow:master Dec 18, 2018
@stephanwlee stephanwlee deleted the smooth branch December 18, 2018 21:32