@aihuaxu aihuaxu commented Aug 16, 2025

Rationale for this change

This PR generates test files by reading the files from apache/parquet-testing#91 and writing the same files back out through the Go implementation. The generated files are in apache/parquet-testing#93. A few invalid cases are not generated by the Go implementation.

I then tested Parquet-Java against those generated files.

  • Overall, the implementation is compatible.
  • Issues:
  1. The Variant logical type should be written as VARIANT(1) instead of VARIANT(0), since the Variant spec version should be 1.
  2. The type for time should be TIME(MICROS,false) per the spec, not TIME(MICROS,true).
aixu@K7YJWY4PK6 go_variant % parquet2 meta case-032.parquet

File path:  case-032.parquet
Created by: parquet-go version 18.4.0
Properties: (none)
Schema:
message schema {
  required int32 id (INTEGER(32,true)) = 1;
  optional group var (VARIANT(0)) = 2 {
    required binary metadata;
    optional binary value;
    optional int64 typed_value (TIME(MICROS,true));
  }
}
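
The two issues above can be spotted mechanically in a schema dump like this one. Below is a rough sketch, not part of parquet-go or Parquet-Java: `find_annotation_issues` is a hypothetical helper, and the regular expressions assume the annotation text is printed exactly as in the dump above. It only flags `TIME(...,true)` on `typed_value` columns, on the assumption that the dump comes from a shredded Variant test file.

```python
import re

# Hypothetical helper (not from any Parquet library): scans the text of a
# schema dump for the two annotation issues described in this PR.
def find_annotation_issues(schema_text):
    issues = []
    for lineno, line in enumerate(schema_text.splitlines(), start=1):
        # Issue 1: the Variant annotation carries a spec version other than 1.
        m = re.search(r"VARIANT\((\d+)\)", line)
        if m and m.group(1) != "1":
            issues.append(
                f"line {lineno}: VARIANT({m.group(1)}), expected VARIANT(1)"
            )
        # Issue 2: a shredded time column is marked as adjusted to UTC.
        # "TIME\(" does not match TIMESTAMP(...), which is a different type.
        m = re.search(r"TIME\((\w+),\s*true\)", line)
        if m and "typed_value" in line:
            issues.append(
                f"line {lineno}: TIME({m.group(1)},true), "
                f"expected TIME({m.group(1)},false)"
            )
    return issues

# Schema text copied from the case-032.parquet dump above.
SCHEMA = """\
message schema {
  required int32 id (INTEGER(32,true)) = 1;
  optional group var (VARIANT(0)) = 2 {
    required binary metadata;
    optional binary value;
    optional int64 typed_value (TIME(MICROS,true));
  }
}
"""

if __name__ == "__main__":
    for issue in find_annotation_issues(SCHEMA):
        print(issue)
```

Run against the schema above, this should report one finding for the `var` group and one for `typed_value`. A text scan like this is only a quick triage aid; reading the file footer through a Parquet reader would be the more reliable check.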

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@zeroshade
Member

I had something for this locally that I just hadn't gotten around to uploading yet. I'll finish it and get it up by Monday.
