@@ -127,16 +127,37 @@ class FractionTimestampFormatter(zoneId: ZoneId)
   override protected lazy val formatter = DateTimeFormatterHelper.fractionFormatter
 
   // The new formatter will omit the trailing 0 in the timestamp string, but the legacy formatter
-  // can't. Here we borrow the code from Spark 2.4 DateTimeUtils.timestampToString to omit the
-  // trailing 0 for the legacy formatter as well.
+  // can't. Here we use the legacy formatter to format the given timestamp up to seconds fractions,
+  // and a custom implementation to format the fractional part without trailing zeros.
   override def format(ts: Timestamp): String = {
-    val timestampString = ts.toString
     val formatted = legacyFormatter.format(ts)
-
-    if (timestampString.length > 19 && timestampString.substring(19) != ".0") {
-      formatted + timestampString.substring(19)
-    } else {
+    var nanos = ts.getNanos
+    if (nanos == 0) {
       formatted
+    } else {
+      // Formats non-zero seconds fraction w/o trailing zeros. For example:
Contributor:

Does perf matter that much here? It's much simpler to write:

while (nanos % 10 == 0) {
  nanos /= 10
}
formatted + "." + nanos.toString

Member Author:

This code is wrong. For example, if nanos = 000001000, the loop strips the trailing zeros and leaves just 1, so the result is .1, which is wrong. It should be .000001.

+      // formatted = '2020-05-27 15:55:30'
+      // nanos = 001234000
+      // Counts the length of the fractional part: 001234000 -> 6
+      var fracLen = 9
+      while (nanos % 10 == 0) {
+        nanos /= 10
+        fracLen -= 1
+      }
+      // Places `nanos` = 1234 after '2020-05-27 15:55:30.'
+      val fracOffset = formatted.length + 1
+      val totalLen = fracOffset + fracLen
+      // The buffer for the final result: '2020-05-27 15:55:30.001234'
+      val buf = new Array[Char](totalLen)
+      formatted.getChars(0, formatted.length, buf, 0)
+      buf(formatted.length) = '.'
+      var i = totalLen
+      do {
+        i -= 1
+        buf(i) = ('0' + (nanos % 10)).toChar
+        nanos /= 10
+      } while (i > fracOffset)
+      new String(buf)
     }
   }
 }
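The disagreement in the review thread can be reproduced in isolation. The sketch below is not the code from this PR: `FractionFormatSketch`, `naiveFraction`, and `paddedFraction` are hypothetical names, and the padded variant uses plain string padding instead of the PR's char buffer. It only illustrates why the fraction has to be treated as a fixed-width, 9-digit field before trailing zeros are stripped.

```scala
object FractionFormatSketch {
  // Variant suggested in the review: strips trailing zeros from the Int value,
  // which also loses the leading zeros of the 9-digit fraction.
  def naiveFraction(formatted: String, nanosOfSecond: Int): String =
    if (nanosOfSecond == 0) formatted
    else {
      var nanos = nanosOfSecond
      while (nanos % 10 == 0) nanos /= 10
      formatted + "." + nanos.toString
    }

  // Fixed-width variant: left-pads to 9 digits first, then strips trailing zeros.
  // Assumes 0 <= nanosOfSecond <= 999999999, the range of Timestamp.getNanos.
  def paddedFraction(formatted: String, nanosOfSecond: Int): String =
    if (nanosOfSecond == 0) formatted
    else {
      val raw = nanosOfSecond.toString
      val digits = ("0" * (9 - raw.length)) + raw
      formatted + "." + digits.reverse.dropWhile(_ == '0').reverse
    }
}
```

With `formatted = "2020-05-27 15:55:30"` and `nanosOfSecond = 1000`, `naiveFraction` yields `2020-05-27 15:55:30.1` while `paddedFraction` yields `2020-05-27 15:55:30.000001`. The PR's version reaches the same result by counting `fracLen` and writing digits into a preallocated buffer, which avoids the intermediate strings used here.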
@@ -135,6 +135,9 @@ class TimestampFormatterSuite extends SparkFunSuite with SQLHelper with Matchers
test("format fraction of second") {
val formatter = TimestampFormatter.getFractionFormatter(UTC)
Seq(
-999999 -> "1969-12-31 23:59:59.000001",
-999900 -> "1969-12-31 23:59:59.0001",
-1 -> "1969-12-31 23:59:59.999999",
0 -> "1970-01-01 00:00:00",
1 -> "1970-01-01 00:00:00.000001",
1000 -> "1970-01-01 00:00:00.001",
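The negative cases are easier to check by hand once the input is split into whole seconds and a non-negative fraction. The sketch below assumes the keys in the `Seq` are microseconds since the epoch, which is consistent with the expected strings but is not stated in the excerpt. `NegativeMicrosSketch`, `split`, and `wholeSeconds` are hypothetical helpers, not part of the test suite or of Spark.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object NegativeMicrosSketch {
  private val secondsFormat =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC)

  // Floor division keeps the fraction non-negative for pre-epoch values:
  // -1 microsecond becomes second -1 plus 999999000 nanoseconds.
  def split(micros: Long): (Long, Int) = {
    val seconds = java.lang.Math.floorDiv(micros, 1000000L)
    val nanos = (java.lang.Math.floorMod(micros, 1000000L) * 1000L).toInt
    (seconds, nanos)
  }

  // Formats only the whole-second part in UTC.
  def wholeSeconds(micros: Long): String =
    secondsFormat.format(Instant.ofEpochSecond(split(micros)._1))
}
```

For example, `split(-999900L)` gives `(-1, 100000)`: the whole-second part formats as `1969-12-31 23:59:59`, and the 9-digit fraction `000100000` without trailing zeros is `0001`, matching the expected `1969-12-31 23:59:59.0001`.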