@jcralmeida
Contributor

@jcralmeida jcralmeida commented Dec 20, 2021

This adds an auxiliary class, FlightSqlColumnMetadata (Java) and ColumnMetadata (C++), meant to read and write known metadata for Arrow schema fields, such as:

  • CATALOG_NAME
  • SCHEMA_NAME
  • TABLE_NAME
  • PRECISION
  • SCALE
  • IS_AUTO_INCREMENT
  • IS_CASE_SENSITIVE
  • IS_READ_ONLY
  • IS_SEARCHABLE
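
As a rough illustration of the idea (not the FlightSqlColumnMetadata class from this PR), the sketch below stores a few of these values in a plain string-to-string map, which is the shape Arrow field metadata takes. The class name, method names, and sample values are hypothetical; the `ARROW:FLIGHT:SQL:` key prefix and the "1"/"0" boolean encoding follow the Flight SQL doc comments quoted further down in this review.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the class added by this PR.
final class ColumnMetadataSketch {
    // Key names follow the ARROW:FLIGHT:SQL:* pattern quoted in the review.
    static final String CATALOG_NAME = "ARROW:FLIGHT:SQL:CATALOG_NAME";
    static final String TABLE_NAME = "ARROW:FLIGHT:SQL:TABLE_NAME";
    static final String IS_READ_ONLY = "ARROW:FLIGHT:SQL:IS_READ_ONLY";

    private ColumnMetadataSketch() {}

    // Boolean flags are stored as the strings "1" (true) or "0" (false).
    static String encodeBoolean(boolean value) {
        return value ? "1" : "0";
    }

    // Returns null when the key is absent, so callers can tell "unknown" from "false".
    static Boolean decodeBoolean(Map<String, String> metadata, String key) {
        String raw = metadata.get(key);
        return raw == null ? null : Boolean.valueOf("1".equals(raw));
    }

    // Builds metadata for an imaginary column; "sales" and "orders" are made-up values.
    static Map<String, String> exampleMetadata() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put(CATALOG_NAME, "sales");
        metadata.put(TABLE_NAME, "orders");
        metadata.put(IS_READ_ONLY, encodeBoolean(true));
        return metadata;
    }
}
```

Keeping the values as strings mirrors the fact that Arrow field metadata is a string map, so every typed accessor has to parse on read.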

@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@jcralmeida
Contributor Author

@lidavidm Here are the changes related to the column metadata that @jduo was talking about.

Could you take a look and share your thoughts, please?

Member

@lidavidm lidavidm left a comment

Is the metadata here sufficient to cover at least JDBC and ODBC?

@jcralmeida
Contributor Author

jcralmeida commented Dec 20, 2021

Is the metadata here sufficient to cover at least JDBC and ODBC?

Yes, we chose these values with both JDBC and ODBC in mind.

@lidavidm
Member

Just a heads up, this needs rebasing now (sorry for the churn).

Given there are some format changes here, we can bundle this with the type info method vote.

@jcralmeida jcralmeida force-pushed the flight-sql-column-metadata branch from 70d11ea to 5ff5df7 Compare December 23, 2021 10:27
@jcralmeida
Contributor Author

No problem. I've already rebased both PRs.

@jcralmeida jcralmeida force-pushed the flight-sql-column-metadata branch from 5ff5df7 to b172c92 Compare January 10, 2022 19:52
@jcralmeida jcralmeida changed the base branch from flight-sql to master January 10, 2022 19:53
@jcralmeida jcralmeida changed the title [JAVA] Add missing metadata on Arrow schemas returned by Flight SQL Add missing metadata on Arrow schemas returned by Flight SQL Jan 10, 2022
@jcralmeida
Contributor Author

Sure @lidavidm, let's combine both implementations and hold one vote.

By the way, we don't have a Jira ticket for this. Where can I create one?

@lidavidm
Member

It can be created here: https://issues.apache.org/jira/secure/Dashboard.jspa

Then just update the PR title as described above: #11999 (comment)

@jcralmeida jcralmeida changed the title Add missing metadata on Arrow schemas returned by Flight SQL ARROW-15314: [Flight SQL] Add missing metadata on Arrow schemas returned by Flight SQL Jan 12, 2022
@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@lidavidm lidavidm changed the title ARROW-15314: [Flight SQL] Add missing metadata on Arrow schemas returned by Flight SQL ARROW-15314: [C++][Java][FlightRPC] Add missing metadata on Arrow schemas returned by Flight SQL Jan 12, 2022
@lidavidm
Member

Thanks for linking this up. BTW, are integration tests planned here too?

@jcralmeida
Contributor Author

@lidavidm Sorry, I've found an error here. I'm fixing it and will let you know when I finish.

Member

@lidavidm lidavidm left a comment

Thanks. Just one little thing I noticed on a last pass.

@jcralmeida
Contributor Author

@lidavidm I've added the integration test for both PRs and fixed the nits on both.

* - return 4 (\b100) => [SQL_SUBQUERIES_IN_INS];
* - return 5 (\b101) => [SQL_SUBQUERIES_IN_COMPARISONS, SQL_SUBQUERIES_IN_INS];
* - return 6 (\b110) => [SQL_SUBQUERIES_IN_INS, SQL_SUBQUERIES_IN_EXISTS];
* - return 6 (\b110) => [SQL_SUBQUERIES_IN_COMPARISONS, SQL_SUBQUERIES_IN_EXISTS];
Member

Is this change valid? Judging by the above, SQL_SUBQUERIES_IN_COMPARISONS is 0b001 while SQL_SUBQUERIES_IN_INS is 0b100.

Member

Yes, this seems wrong. Additionally the enum values are sequential (0, 1, 2, 3) not bitmasks (unless those values are intended to be bit indices in which case we should document them as such).

Member

@jcralmeida any comment here? I forgot about this, let's fix these last few things and then kick off the vote

Contributor Author

Oh sorry @lidavidm, I've changed it back to how it was, which was correct. I don't know how or why I changed it.
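
To make the bit-index reading concrete, here is a small sketch that assumes the enum values are sequential and that each value is the index of a bit in the returned integer. The enum is a hypothetical stand-in covering only the three values named in this thread, with ordinals inferred from the comments above (SQL_SUBQUERIES_IN_COMPARISONS as 0b001, SQL_SUBQUERIES_IN_INS as 0b100); it is not the generated protobuf type.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the enum under discussion; ordinals act as bit indices.
enum SubqueryOption {
    SQL_SUBQUERIES_IN_COMPARISONS, // ordinal 0 -> bit 0b001
    SQL_SUBQUERIES_IN_EXISTS,      // ordinal 1 -> bit 0b010
    SQL_SUBQUERIES_IN_INS          // ordinal 2 -> bit 0b100
}

final class SubqueryBitmask {
    private SubqueryBitmask() {}

    // Sets one bit per option, using the ordinal as the bit position.
    static int encode(Set<SubqueryOption> options) {
        int mask = 0;
        for (SubqueryOption option : options) {
            mask |= 1 << option.ordinal();
        }
        return mask;
    }

    // Recovers the set of options whose bit is set in the mask.
    static Set<SubqueryOption> decode(int mask) {
        EnumSet<SubqueryOption> result = EnumSet.noneOf(SubqueryOption.class);
        for (SubqueryOption option : SubqueryOption.values()) {
            if ((mask & (1 << option.ordinal())) != 0) {
                result.add(option);
            }
        }
        return result;
    }
}
```

Under these assumptions, 5 (0b101) decodes to the comparisons and INs options, and 6 (0b110) decodes to the EXISTS and INs options, which matches the original doc comment that was restored.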

* - ARROW:FLIGHT:SQL:DB_SCHEMA_NAME - Database schema name
* - ARROW:FLIGHT:SQL:TABLE_NAME - Table name
* - ARROW:FLIGHT:SQL:PRECISION - Column precision/size
* - ARROW:FLIGHT:SQL:SCALE - Column scale/decimal digits
Member

@pitrou pitrou Feb 7, 2022

Why are precision and scale useful? Can they annotate non-decimal columns?

Contributor Author

This is also requested by JDBC and ODBC, which extract it via ResultSetMetaData.

Member

Should we annotate this as (if applicable for the column type) or something?

Contributor Author

That makes sense. Added.
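
A hedged sketch of what "if applicable for the column type" could mean for a consumer: the reader below treats a missing PRECISION or SCALE key as "not applicable" rather than as an error. The class and method names are hypothetical, and it assumes the values are stored as base-10 integer strings, which this thread does not spell out.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical reader for illustration; assumes values are base-10 integer strings.
final class NumericMetadataSketch {
    static final String PRECISION = "ARROW:FLIGHT:SQL:PRECISION";
    static final String SCALE = "ARROW:FLIGHT:SQL:SCALE";

    private NumericMetadataSketch() {}

    // Returns an empty OptionalInt when the key is absent (not applicable to the column type).
    static OptionalInt readInt(Map<String, String> metadata, String key) {
        String raw = metadata.get(key);
        if (raw == null) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(raw));
    }
}
```

A JDBC driver built on top of this could map the empty case to 0 with `orElse(0)`, which is the value `ResultSetMetaData.getPrecision` and `getScale` document for types where the attribute does not apply.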


/*
* Represents a SQL query. Used in the command member of FlightDescriptor
* for the following RPC calls:
Member

Why does this have the same description as CommandGetTables? Do we have two different messages for doing the same thing?

Contributor Author

The message itself is different; what is similar is the response, since the column metadata can be retrieved via both commands.

@lidavidm
Member

There was one doc lint error hidden in a pipeline:

/arrow/cpp/src/arrow/flight/sql/column_metadata.h:139: error: argument 'IsAutoIncrement' of command @param is not found in the argument list of arrow::flight::sql::ColumnMetadata::ColumnMetadataBuilder::IsAutoIncrement(bool is_auto_increment) (warning treated as error, aborting now)

I think other failures are flakes/unrelated.

@jcralmeida
Contributor Author

Sorry @lidavidm. There were some errors related to the casing of the parameter names. I've already fixed them.

@jcralmeida
Contributor Author

Hi @lidavidm, I've got failures on some CI jobs, but from what I'm seeing they are unrelated to the code. Is that right?

@lidavidm
Member

I believe they're unrelated, I'll kick them to try again (the Java one is definitely unrelated)

Comment on lines +937 to +939
* - ARROW:FLIGHT:SQL:CATALOG_NAME - Table's catalog name
* - ARROW:FLIGHT:SQL:DB_SCHEMA_NAME - Database schema name
* - ARROW:FLIGHT:SQL:TABLE_NAME - Table name
Member

Is there a reason these three are duplicated from the fields above?

Contributor Author

Just because they should be unified in every place they appear.

* - ARROW:FLIGHT:SQL:IS_CASE_SENSITIVE - "1" indicates if the column is case sensitive, "0" otherwise.
* - ARROW:FLIGHT:SQL:IS_READ_ONLY - "1" indicates if the column is read only, "0" otherwise.
* - ARROW:FLIGHT:SQL:IS_SEARCHABLE - "1" indicates if the column is searchable via WHERE clause, "0" otherwise.
* The returned data should be ordered by catalog_name, db_schema_name, table_name, then table_type, followed by table_schema if requested.
Member

Ordering by table_schema is just a bytewise lexicographic ordering?

Contributor Author

Yes
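
For reference, a minimal sketch of the ordering described in the doc comment, with table_schema compared bytewise as the final tie-breaker. The `TableRow` class is hypothetical, the string columns are assumed non-null for brevity, and treating bytes as unsigned (0 to 255) is an assumption, since the thread only says "bytewise lexicographic". `Arrays.compareUnsigned` requires Java 9 or later.

```java
import java.util.Arrays;

// Hypothetical row type for illustration; string columns are assumed non-null.
final class TableRow {
    final String catalogName;
    final String dbSchemaName;
    final String tableName;
    final String tableType;
    final byte[] tableSchema; // serialized schema bytes

    TableRow(String catalogName, String dbSchemaName, String tableName,
             String tableType, byte[] tableSchema) {
        this.catalogName = catalogName;
        this.dbSchemaName = dbSchemaName;
        this.tableName = tableName;
        this.tableType = tableType;
        this.tableSchema = tableSchema;
    }

    // Orders by catalog_name, db_schema_name, table_name, table_type, then table_schema.
    static int compare(TableRow a, TableRow b) {
        int c = a.catalogName.compareTo(b.catalogName);
        if (c != 0) return c;
        c = a.dbSchemaName.compareTo(b.dbSchemaName);
        if (c != 0) return c;
        c = a.tableName.compareTo(b.tableName);
        if (c != 0) return c;
        c = a.tableType.compareTo(b.tableType);
        if (c != 0) return c;
        // Bytewise lexicographic comparison, treating each byte as unsigned (assumption).
        return Arrays.compareUnsigned(a.tableSchema, b.tableSchema);
    }
}
```

The unsigned choice matters in Java because `byte` is signed: with a signed comparison, 0x80 would sort before 0x01.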

Member

@pitrou pitrou left a comment

Just some questions below.

@lidavidm lidavidm closed this in c2fac05 Mar 25, 2022
@ursabot

ursabot commented Mar 25, 2022

Benchmark runs are scheduled for baseline = acc6c2e and contender = c2fac05. c2fac05 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.67% ⬆️0.34%] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.04% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…emas returned by Flight SQL

Closes apache#11999 from jcralmeida/flight-sql-column-metadata

Authored-by: Jose Almeida <[email protected]>
Signed-off-by: David Li <[email protected]>