PARQUET-1128: [Java] Upgrade the Apache Arrow version to 0.8.0 for SchemaConverter #443

Closed
masayuki038 wants to merge 2 commits

masayuki038
Contributor

When I converted a Parquet (1.9.1-SNAPSHOT) schema to Arrow (0.4.0) with SchemaConverter, the following exception was raised:

java.lang.NoClassDefFoundError: org/apache/arrow/vector/types/pojo/ArrowType$Struct_

	at net.wrap_trap.parquet_arrow.ParquetToArrowConverter.convertToArrow(ParquetToArrowConverter.java:67)
	at net.wrap_trap.parquet_arrow.ParquetToArrowConverter.convertToArrow(ParquetToArrowConverter.java:40)
	at net.wrap_trap.parquet_arrow.ParquetToArrowConverterTest.parquetToArrowConverterTest(ParquetToArrowConverterTest.java:27)

The reason is that SchemaConverter refers to Apache Arrow 0.1.0.
I upgraded the Apache Arrow version to 0.8.0 (the latest) for SchemaConverter.

…hemaConverter

Upgrade Apache Arrow version to 0.8.0.

Author: Masayuki Takahashi <[email protected]>
Member

@xhochy xhochy left a comment

+1, LGTM

case TIMESTAMP_MILLIS:
return field(new ArrowType.Timestamp(org.apache.arrow.vector.types.TimeUnit.MILLISECOND, "UTC"));
case TIME_MILLIS:
return field(new ArrowType.Date(DateUnit.MILLISECOND));
Member

Why not Time(MILLISECOND)?

Contributor Author

@julienledem I made a mistake. I will fix it. Thanks.

@julienledem
Member

Thanks for your contribution. See my question above.

…hemaConverter

Fix the wrong conversion from TIME_MILLIS(Parquet) to Time(Arrow).

Author: Masayuki Takahashi <[email protected]>
@@ -471,6 +469,7 @@ public TypeMapping convertINT64(PrimitiveTypeName primitiveTypeName) throws Runt
case LIST:
case MAP:
case MAP_KEY_VALUE:
case TIME_MILLIS:
throw new IllegalArgumentException("illegal type " + type);
}
Contributor Author

@masayuki038 masayuki038 Jan 13, 2018

@julienledem @xhochy I also made the change above in SchemaConverter#convertINT64, because TIME_MILLIS is only valid for int32.
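
For readers following the thread, here is a minimal sketch of the rule behind both review points: TIME_MILLIS annotates a 32-bit integer and should become a time-of-day type (not a date), so pairing it with INT64 is rejected. All class, enum, and method names below are hypothetical stand-ins written in plain Java; this is not the SchemaConverter code and not the Arrow API. With Arrow on the classpath, the corresponding type would presumably be built as `new ArrowType.Time(TimeUnit.MILLISECOND, 32)`.

```java
public class TimeMappingSketch {
    // Hypothetical stand-ins for Parquet physical types and time annotations.
    enum PhysicalType { INT32, INT64 }
    enum Annotation { TIME_MILLIS, TIME_MICROS }
    enum Unit { MILLISECOND, MICROSECOND }

    // Hypothetical stand-in for an Arrow time-of-day type (unit plus bit width).
    static final class TimeTypeSketch {
        final Unit unit;
        final int bitWidth;

        TimeTypeSketch(Unit unit, int bitWidth) {
            this.unit = unit;
            this.bitWidth = bitWidth;
        }

        @Override
        public String toString() {
            return "Time(" + unit + ", " + bitWidth + ")";
        }
    }

    // Maps an annotated Parquet integer to a time type, rejecting invalid pairs.
    static TimeTypeSketch convert(PhysicalType physical, Annotation annotation) {
        switch (physical) {
            case INT32:
                // TIME_MILLIS is stored as a 32-bit integer.
                if (annotation == Annotation.TIME_MILLIS) {
                    return new TimeTypeSketch(Unit.MILLISECOND, 32);
                }
                break;
            case INT64:
                // TIME_MICROS is stored as a 64-bit integer; TIME_MILLIS is not valid here.
                if (annotation == Annotation.TIME_MICROS) {
                    return new TimeTypeSketch(Unit.MICROSECOND, 64);
                }
                break;
        }
        throw new IllegalArgumentException("illegal type " + physical + " (" + annotation + ")");
    }

    public static void main(String[] args) {
        System.out.println(convert(PhysicalType.INT32, Annotation.TIME_MILLIS));
        try {
            convert(PhysicalType.INT64, Annotation.TIME_MILLIS);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch throws for any pair it does not recognize, mirroring the `IllegalArgumentException("illegal type " + type)` branch shown in the diff above.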

Contributor Author

@julienledem @xhochy Could you please check this?

@xhochy I would like this too, please

@xhochy xhochy closed this in af977ad Apr 21, 2018

3 participants