@okumin
Contributor

@okumin okumin commented Dec 21, 2024

Description

A trivial doc update describing how users can explicitly specify the sort direction and null ordering when creating a sorted Apache Iceberg table.

Additional context and related issues

While developing Apache Hive, we found no Trino documentation covering the advanced parameters of sorted_by. As a result, we initially assumed that Trino did not support them. I believe others would find this information useful.

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Dec 21, 2024
@github-actions github-actions bot added the docs label Dec 21, 2024
Contributor Author

I tested this SQL on my local machine.

trino> CREATE TABLE iceberg.default.orders (
    ->     order_id BIGINT,
    ->     order_date DATE,
    ->     account_number BIGINT,
    ->     customer VARCHAR,
    ->     country VARCHAR)
    -> WITH (sorted_by = ARRAY['order_date DESC NULLS FIRST', 'order_id ASC NULLS LAST'])
    -> ;
CREATE TABLE
trino> 
$ hdfs dfs -cat /user/hive/warehouse/orders-5fae815b97d54add91c0100370e56f0c/metadata/00000-cf46fb4f-226b-42dc-b640-c14d9e000c09.metadata.json
...
  "schemas" : [ {
    "type" : "struct",
    "schema-id" : 0,
    "fields" : [ {
      "id" : 1,
      "name" : "order_id",
      "required" : false,
      "type" : "long"
    }, {
      "id" : 2,
      "name" : "order_date",
      "required" : false,
      "type" : "date"
    }, {
      "id" : 3,
      "name" : "account_number",
      "required" : false,
      "type" : "long"
    }, {
      "id" : 4,
      "name" : "customer",
      "required" : false,
      "type" : "string"
    }, {
      "id" : 5,
      "name" : "country",
      "required" : false,
      "type" : "string"
    } ]
  } ],
...
  "default-sort-order-id" : 1,
  "sort-orders" : [ {
    "order-id" : 1,
    "fields" : [ {
      "transform" : "identity",
      "source-id" : 2,
      "direction" : "desc",
      "null-order" : "nulls-first"
    }, {
      "transform" : "identity",
      "source-id" : 1,
      "direction" : "asc",
      "null-order" : "nulls-last"
    } ]
  } ],
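The mapping between the `sorted_by` entries and the metadata above can be checked mechanically. The following is a minimal sketch, not part of Trino or Iceberg: the function name `describe_default_sort_order` is hypothetical, the embedded JSON is a trimmed copy of the output above, and it assumes every sort field uses the `identity` transform and that field IDs are unique across the listed schemas.

```python
import json

# Trimmed copy of the metadata shown above; only the keys used below are kept.
METADATA = json.loads("""
{
  "schemas": [{"schema-id": 0, "fields": [
    {"id": 1, "name": "order_id"},
    {"id": 2, "name": "order_date"},
    {"id": 3, "name": "account_number"},
    {"id": 4, "name": "customer"},
    {"id": 5, "name": "country"}
  ]}],
  "default-sort-order-id": 1,
  "sort-orders": [{"order-id": 1, "fields": [
    {"transform": "identity", "source-id": 2,
     "direction": "desc", "null-order": "nulls-first"},
    {"transform": "identity", "source-id": 1,
     "direction": "asc", "null-order": "nulls-last"}
  ]}]
}
""")


def describe_default_sort_order(metadata):
    """Render the default sort order as sorted_by-style strings.

    Hypothetical helper: assumes identity transforms only and
    unique field IDs across all listed schemas.
    """
    # Map each field ID to its column name.
    names = {
        field["id"]: field["name"]
        for schema in metadata["schemas"]
        for field in schema["fields"]
    }
    # Pick the sort order referenced by default-sort-order-id.
    default_id = metadata["default-sort-order-id"]
    order = next(
        o for o in metadata["sort-orders"] if o["order-id"] == default_id
    )
    rendered = []
    for field in order["fields"]:
        direction = field["direction"].upper()
        # "nulls-first" -> "NULLS FIRST"
        null_order = field["null-order"].replace("-", " ").upper()
        rendered.append(f"{names[field['source-id']]} {direction} {null_order}")
    return rendered


print(describe_default_sort_order(METADATA))
# prints ['order_date DESC NULLS FIRST', 'order_id ASC NULLS LAST']
```

The printed list matches the `sorted_by` array passed in the `CREATE TABLE` statement, which is what the metadata is expected to reflect.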

@okumin
Contributor Author

okumin commented Dec 21, 2024

This is the part I updated:
[image]

@okumin okumin force-pushed the iceberg-sort-directions branch from 131360e to c67245c Compare December 21, 2024 12:02
@okumin okumin changed the title Explain how to specify sort directions and null ordering using the sorted_by property Document sort direction and null order in Iceberg Dec 21, 2024
@okumin
Contributor Author

okumin commented Dec 21, 2024

Thanks. I made two changes.

  1. Accepted your suggestion
  2. Squashed the two commits into a single commit and shortened the commit message

@ebyhr ebyhr merged commit a401e37 into trinodb:master Dec 21, 2024
8 checks passed
@github-actions github-actions bot added this to the 469 milestone Dec 21, 2024
@okumin okumin deleted the iceberg-sort-directions branch December 21, 2024 15:12
@okumin
Contributor Author

okumin commented Dec 21, 2024

Thanks for reviewing and merging my PR. It will hopefully help future users 🙏

2 participants