Improvements to Double/Float conversion #121

Open
zapov opened this issue Mar 30, 2019 · 7 comments

Comments

@zapov
Member

zapov commented Mar 30, 2019

Grisu3 works most of the time, but it could be improved or replaced with a different, faster algorithm.
Parsing doubles does not match the Java algorithm in all cases (unless it is configured with the Exact precision option; High precision probably gives the same numbers as Java, but this is not guaranteed).
Float uses the double conversion, which can lead to bit loss.

Look into the suggested replacement for some of these problems: #120

@plokhotnyuk
Contributor

plokhotnyuk commented Mar 31, 2019

Currently, parsing of float primitives has a small error of ~1 ULP. I think this should be documented properly until an "exact" parsing option (with precision of ~0.5 ULP) is available.

BTW, here is a post about a Rust project that tries to push the performance limits without losing precision:
https://www.reddit.com/r/rust/comments/a6j5j1/making_rust_float_parsing_fast_and_correct/

@zapov
Member Author

zapov commented Mar 31, 2019

I'm actually not aware of any float examples which lead to a wrong result. Do you have any?
If I knew of them I would probably have looked into it already, or added a configuration option for floats too.

I saw that article but didn't have time to look into the code ;(

Also, I find it interesting that Java does not behave as expected either: https://www.exploringbinary.com/java-doesnt-print-the-shortest-strings-that-round-trip/

@plokhotnyuk
Contributor

plokhotnyuk commented Mar 31, 2019

The rounding error can be easily reproduced by parsing the string representation of some double values.

scala> "1.00000017881393432617187499".toFloat
res0: Float = 1.0000001

scala> "1.00000017881393432617187499".toDouble.toFloat
res1: Float = 1.0000002

The detailed explanation is in this comment
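
To make the double-rounding effect in the REPL session above easier to see, here is a minimal sketch in plain Scala (no DSL-JSON involved). It assumes only the JDK's `java.math.BigDecimal` and `Math.nextUp`; the object and value names are made up for illustration. The input string lies just below the midpoint between two adjacent floats, so a direct conversion rounds down, while a conversion through double first lands exactly on the midpoint and then rounds to the neighbour with the even mantissa.

```scala
import java.math.BigDecimal

// Hypothetical demo object; names are illustrative only.
object DoubleRoundingDemo {
  val input = "1.00000017881393432617187499"

  // Two adjacent floats just above 1.0: 1 + 2^-23 and 1 + 2^-22.
  val lo: Float = java.lang.Math.nextUp(1.0f)
  val hi: Float = java.lang.Math.nextUp(lo)

  // Exact decimal value of the midpoint between lo and hi.
  // new BigDecimal(double) keeps the exact binary value, and dividing by 2 terminates.
  val midpoint: BigDecimal =
    new BigDecimal(lo.toDouble).add(new BigDecimal(hi.toDouble)).divide(new BigDecimal(2))

  def main(args: Array[String]): Unit = {
    // The decimal input is strictly below the midpoint ...
    println(s"input < midpoint    = ${new BigDecimal(input).compareTo(midpoint) < 0}")
    // ... but the nearest double to the input is the midpoint itself.
    println(s"double == midpoint  = ${new BigDecimal(input.toDouble).compareTo(midpoint) == 0}")
    // Direct conversion rounds once; going through double rounds twice.
    println(s"direct toFloat      = ${input.toFloat}")
    println(s"via double, toFloat = ${input.toDouble.toFloat}")
  }
}
```

The last two lines correspond to `res0` and `res1` in the session above. Whether a given parser hits this case depends on how it accumulates digits, which this sketch does not model.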

@zapov
Member Author

zapov commented Mar 31, 2019

Sure, but that will not trigger a rounding error in DSL-JSON for floats.
The number is not parsed into its exact double equivalent first; rather, after a certain number of significant digits, the rest are ignored. Thus this rounding error on double is not hit.

@plokhotnyuk
Contributor

plokhotnyuk commented Apr 1, 2019

The following code prints many numbers that are affected by rounding when parsed with DSL-JSON:

import java.util.concurrent.ThreadLocalRandom
import com.dslplatform.json.{DslJson, NumberConverter}

val reader = new DslJson[Any](new DslJson.Settings[Any]()).newReader()
(1 to 100000).foreach { _ =>
  val n = ThreadLocalRandom.current().nextLong()
  // Clear the low 28 bits so the double sits on or near a boundary between two floats
  val x = java.lang.Double.longBitsToDouble(n & ~0xFFFFFFFL)
  if (java.lang.Double.isFinite(x)) checkAndPrint(x.toString)
}

// Parses the input as a one-element JSON array of floats with DSL-JSON
// and prints it when the result differs from Java's own float parsing
def checkAndPrint(input: String): Unit = {
  val bs = ("[" + input + "]").getBytes
  reader.process(bs, bs.length)
  reader.read()
  val actualOutput = NumberConverter.FLOAT_ARRAY_READER.read(reader)(0)
  val expectedOutput = input.toFloat
  if (actualOutput != expectedOutput) {
    println(s"input = $input, expectedOutput =$expectedOutput, actualOutput = $actualOutput")
  }
}

Below are samples from its output:

input = -269.91502380371094, expectedOutput =-269.91504, actualOutput = -269.915
input = -0.46754591166973114, expectedOutput =-0.4675459, actualOutput = -0.46754593
input = -7.665778767318443E-8, expectedOutput =-7.665779E-8, actualOutput = -7.6657784E-8
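
As a rough way to relate these samples to the ~1 ULP estimate mentioned earlier in the thread, the sketch below counts how many representable floats lie between the expected and the actual value. It is plain Scala on top of `java.lang.Float.floatToIntBits`; the object and function names (`UlpDistance`, `ulpDistance`) are hypothetical and not part of DSL-JSON or jsoniter-scala.

```scala
// Hypothetical helper; names are illustrative only.
object UlpDistance {
  // Maps a float to an Int whose ordering matches the float's numeric ordering.
  // Negative floats have the sign bit set, so their magnitude bits are mirrored below zero.
  private def ordered(f: Float): Int = {
    val bits = java.lang.Float.floatToIntBits(f)
    if (bits < 0) Int.MinValue - bits else bits
  }

  // Number of representable float steps between a and b (not meaningful for NaN).
  def ulpDistance(a: Float, b: Float): Long =
    math.abs(ordered(a).toLong - ordered(b).toLong)

  def main(args: Array[String]): Unit = {
    // (expectedOutput, actualOutput) pairs copied from the sample output above
    val samples = Seq(
      (-269.91504f, -269.915f),
      (-0.4675459f, -0.46754593f),
      (-7.665779E-8f, -7.6657784E-8f)
    )
    samples.foreach { case (expected, actual) =>
      println(s"expected = $expected, actual = $actual, distance = ${ulpDistance(expected, actual)} ULP")
    }
  }
}
```

A distance of 1 means the parser returned the float right next to the correctly rounded one. The same function could be dropped into `checkAndPrint` to report the size of each mismatch.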

@plokhotnyuk
Contributor

plokhotnyuk commented Jun 7, 2020

@zapov you can peek at solutions for parsing and serialization of floats and decimals right away in the jsoniter-scala-coreJVM sub-project:

  1. fast and mid paths for parsing of floats and doubles

  2. the Schubfach algorithm

Feel free to translate all of them into Java, from my code or from the original code by the algorithms' authors, as long as you adhere to the copyright notices for the writing and the code in the authors' repositories and/or include appropriate attribution.

Below are screenshots of benchmark results that compare the approaches used in jsoniter-scala with different JSON parsers for Scala on different JVMs. They measure the throughput (ops/sec) of parsing and serialization of arrays with 128 floats or doubles:

[4 images: screenshots of the benchmark results]

@zapov
Member Author

zapov commented Jun 8, 2020

It would be nice to improve this, I "just" need to find some time to work on it :D
