String decode using 'new String' is slow #1532

asfimport · 2014-08-20T19:55:22Z

The Binary class has three implementations, but only one of them uses the faster 'UTF8.decode' in its 'toStringUsingUTF8' method. This change makes all three use 'UTF8.decode'.

As noted in the comments, the 'new String' approach creates a new decoder each time, which is slower than the cached instance used by 'UTF8.decode'.

#1. ByteArraySliceBackedBinary <-- UTF8.decode
#2. ByteArrayBackedBinary <-- new String
#3. ByteBufferBackedBinary <-- new String
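
For illustration, the two decoding styles listed above can be sketched roughly as follows. This is not the parquet-mr source: the class `DecodeSketch`, its method names, and the `UTF8` constant defined here are hypothetical stand-ins, assuming `UTF8` in the issue refers to a shared `java.nio.charset.Charset` instance. The sketch only shows that both paths yield the same string; it does not measure speed, and any difference will depend on the JVM in use.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of parquet-mr.
final class DecodeSketch {
    // Assumed stand-in for the shared UTF8 charset constant mentioned in the issue.
    static final Charset UTF8 = StandardCharsets.UTF_8;

    // The 'new String' style: decode a slice of a byte array directly.
    static String decodeWithNewString(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, UTF8);
    }

    // The 'UTF8.decode' style: wrap the slice and decode through the Charset.
    static String decodeWithCharset(byte[] bytes, int offset, int length) {
        return UTF8.decode(ByteBuffer.wrap(bytes, offset, length)).toString();
    }

    // ByteBuffer-backed variant: Charset.decode advances the buffer position,
    // so decode a duplicate to leave the caller's buffer untouched.
    static String decodeBuffer(ByteBuffer buffer) {
        return UTF8.decode(buffer.duplicate()).toString();
    }
}
```

Decoding a `duplicate()` in the ByteBuffer variant is a design choice of this sketch, so that calling the method repeatedly on the same buffer returns the same string.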

https://github.com/apache/incubator-parquet-mr/pull/40

Reporter: Daniel Weeks / @danielcweeks
Assignee: Daniel Weeks / @danielcweeks

Related issues:

Use thread local decoder cache in Binary toStringUsingUTF8() (is duplicated by)

Note: This issue was originally created as PARQUET-75. Please see the migration documentation for further details.

asfimport · 2014-08-20T20:02:03Z

Daniel Weeks / @danielcweeks:
https://issues.apache.org/jira/browse/PARQUET-75

asfimport · 2014-08-28T18:31:24Z

Julien Le Dem / @julienledem:
Issue resolved by pull request 40
https://github.com/apache/incubator-parquet-mr/pull/40

asfimport closed this as completed Aug 28, 2014

asfimport mentioned this issue Jun 23, 2024

Use thread local decoder cache in Binary toStringUsingUTF8() #1531

Closed
