Closed
13 changes: 12 additions & 1 deletion LogicalTypes.md
@@ -32,13 +32,24 @@ This file contains the specification for all logical types.
The parquet format's `ConvertedType` stores the type annotation. The annotation
may require additional metadata fields, as well as rules for those fields.

### UTF8 (Strings)
## String Types

### UTF8

`UTF8` may only be used to annotate the binary primitive type and indicates
that the byte array should be interpreted as a UTF-8 encoded character string.

The sort order used for `UTF8` strings is `UNSIGNED` byte-wise comparison.
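
To illustrate why the `UNSIGNED` qualifier matters, here is a minimal Python sketch comparing UTF-8 encoded strings byte by byte, once treating bytes as unsigned and once as signed. The helper names `unsigned_key` and `signed_key` are hypothetical and exist only for this illustration; they are not part of any Parquet library.

```python
def unsigned_key(s: str) -> bytes:
    # Python compares bytes objects lexicographically using unsigned
    # byte values (0..255), which matches UNSIGNED byte-wise comparison.
    return s.encode("utf-8")


def signed_key(s: str) -> list:
    # Reinterpret each byte as a signed 8-bit integer (-128..127), as a
    # naive signed comparison would. Shown only for contrast.
    return [b - 256 if b >= 128 else b for b in s.encode("utf-8")]


values = ["z", "é", "a"]  # "é" encodes to the bytes 0xC3 0xA9

unsigned_sorted = sorted(values, key=unsigned_key)
signed_sorted = sorted(values, key=signed_key)

print(unsigned_sorted)
print(signed_sorted)
```

Under unsigned comparison the non-ASCII string sorts after the ASCII ones, because its lead byte 0xC3 is greater than any ASCII byte. Under signed comparison the same byte becomes negative and the string sorts first, so the two orders disagree as soon as a value contains a byte at or above 0x80.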

### ENUM

`ENUM` annotates the binary primitive type and indicates that the value
was converted from an enumerated type in another data model (e.g. Thrift, Avro, Protobuf).
Applications using a data model lacking a native enum type should interpret an
`ENUM`-annotated field as a UTF-8 encoded string.

The sort order used for `ENUM`s is `UNSIGNED` byte-wise comparison.
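
The fallback rule above can be sketched in Python as follows. This is an illustration only: `interpret_enum` and the `Color` enum are hypothetical names, and the sketch assumes the stored bytes hold the enum symbol's name, which the text above does not state explicitly.

```python
from enum import Enum
from typing import Optional, Type, Union


class Color(Enum):
    # Hypothetical enumerated type from the reader's data model.
    RED = 1
    GREEN = 2


def interpret_enum(
    raw: bytes, enum_type: Optional[Type[Enum]] = None
) -> Union[Enum, str]:
    # An ENUM-annotated value is a binary field; decode it as UTF-8.
    text = raw.decode("utf-8")
    if enum_type is None:
        # No native enum type available: treat the value as a string.
        return text
    # Assumption: the stored text is the symbol name, so look it up by name.
    return enum_type[text]


as_string = interpret_enum(b"RED")
as_enum = interpret_enum(b"GREEN", Color)
```

A reader with no matching enum type falls back to the plain string, while a reader that knows the type maps the same bytes to a native enum member.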

## Numeric Types

### Signed Integers