Skip to content

Conversation

@lvyanquan
Copy link
Contributor

In the pr of #4516, we added specId for partitions metadata table, but the document of query(https://iceberg.apache.org/docs/latest/spark-queries/#partitions ) hasn't been changed to adapt to that.
What's more, the ResultSet structures are not the same between partitioned and not-partitioned tables.

query example:

0: jdbc:hive2://host:10000> create table p (id int, age int) using iceberg partitioned by (id);
0: jdbc:hive2://host:10000> create table np (id int, age int) using iceberg;
0: jdbc:hive2://host:10000> insert into p(id, age) values (1,1),(2,2),(3,3);
0: jdbc:hive2://host:10000> insert into np(id, age) values (1,1),(2,2),(3,3);
0: jdbc:hive2://host:10000> select * from spark_catalog.kunni_db.p.partitions;
+------------+---------------+-------------+----------+
| partition  | record_count  | file_count  | spec_id  |
+------------+---------------+-------------+----------+
| {"id":2}   | 1             | 1           | 0        |
| {"id":3}   | 1             | 1           | 0        |
| {"id":1}   | 1             | 1           | 0        |
+------------+---------------+-------------+----------+
3 rows selected (0.185 seconds)
0: jdbc:hive2://host:10000> select * from spark_catalog.kunni_db.np.partitions;
+---------------+-------------+
| record_count  | file_count  |
+---------------+-------------+
| 3             | 3           |
+---------------+-------------+
1 row selected (0.126 seconds)

@github-actions github-actions bot added the docs label Aug 29, 2022
@lvyanquan lvyanquan changed the title Doc: Add doc to display the results of the table partitions query Doc: Update doc to display the results of the table partitions query Aug 29, 2022
Copy link
Member

@szehon-ho szehon-ho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the docs make sense overall, one comment.

| {20211002, 11}| 1| 1|
| {20211001, 10}| 1| 1|
| {20211002, 10}| 1| 1|
If this table is not partitioned
Copy link
Member

@szehon-ho szehon-ho Aug 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure about adding non-partitioned example for "partitions" table , as I think its something few users would need, but it is right under the header so most people have to scroll past to see what they really want to see (the schema). Id suggest removing it for now, which would be consistent with the other tables, if it sounds reasonable to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I I keep the result of partitioned table and display the result of non-partitioned table using note to make it more understandable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed it.

@lvyanquan lvyanquan force-pushed the doc_sparkquery branch 2 times, most recently from 893cdde to ad65635 Compare August 30, 2022 03:24
Copy link
Member

@szehon-ho szehon-ho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK its fine with me, left one small nit

| {20211002, 10}| 1| 1| 0|

Note:
If this table is non-partitioned, it will contain only the record_count and file_count columns.
Copy link
Member

@szehon-ho szehon-ho Aug 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized "this table... it" will be wrong because we mean different tables (data table and metadata table)

How about this fix?

"For unpartitioned tables, the partitions table will contain only the record_count and file_count columns."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, addressed it.

Copy link
Member

@szehon-ho szehon-ho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I re-read this and just one small thing to address, then I think we can merge it.

@szehon-ho szehon-ho merged commit 4a16fb1 into apache:master Sep 2, 2022
@szehon-ho
Copy link
Member

Merged, thanks @lvyanquan

@lvyanquan lvyanquan deleted the doc_sparkquery branch September 5, 2022 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants