Skip to content

Conversation

@rafael-telles
Copy link
Contributor

This implements a JDBC driver able to communicate to Flight SQL sources.

So far this covers:

  • Metadata retrieval by DatabaseMetadata, ResultSetMetadata, etc.
  • Query execution by statements and prepared statements

Yet to be done:

  • Parameter binding on prepared statements

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@rafael-telles rafael-telles changed the title [FlightRPC][Java] JDBC driver for Flight SQL ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL Jan 25, 2022
@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@lidavidm lidavidm self-requested a review January 25, 2022 21:05
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've done an initial pass but haven't looked at things too deeply.

Does having benchmarks make sense? (Maybe for particular components like accessor/transformer?)

java/pom.xml Outdated
<artifactId>spotbugs-maven-plugin</artifactId>
<version>4.2.3</version>
</plugin>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to keep these plugins/dependencies in the driver's pom.xml? (Though it would be nice to enable static analyzers project-wide, that should be done separately.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@@ -0,0 +1,295 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to assume any changes under flight-sql are from the other PRs…let me know if that's not the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's from the ColumnMetadata PR, we needed it to continue implementing some missing stuff like proper ResultSetMetaData for PreparedStatement metadata, and other metadata for ResultSet#getColumns.

@@ -0,0 +1,29 @@
-----BEGIN CERTIFICATE-----
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have certificates in /testing/data/flight, could those be used? (I guess we need to make a keystore for them.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use them, we just need to get it in flight-jdbc-driver, and see how we can make a keystore for them.

return new ArrowFlightJdbcNullVectorAccessor(setCursorWasNull);
}

throw new UnsupportedOperationException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include the type in the exception message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to have changed

} else if (vector instanceof DenseUnionVector) {
return new ArrowFlightJdbcDenseUnionVectorAccessor((DenseUnionVector) vector, getCurrentRow,
setCursorWasNull);
} else if (vector instanceof NullVector || vector == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a null vector ever occur?

Copy link
Contributor

@vfraga vfraga Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could occur if the Flight SQL Server Endpoint defines a column with it. It could represent java.sql.Types.NULL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean specifically null, not NullVector, sorry

/**
* Auxiliary wrapper class for {@link Connection}, used on {@link ArrowFlightJdbcPooledConnection}.
*/
public class ConnectionWrapper implements Connection {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this appears to be a pure proxy, it might be worth noting that fact (aka this is purely to serve as a base class and factor out the boilerplate elsewhere)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want us to document that this is to remove boilerplate?

*
* @return the exception.
*/
public static UnsupportedOperationException getOperationNotSupported(final Class<?> type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is used purely in the base Accessor class, maybe it could be a private static method of that class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

* @deprecated not updatable to match dinamic server allocation.
*/
@Deprecated
public enum UrlSample {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we adding a new deprecated class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're cleaning up deprecated classes/methods.

* @see FlightServerTestRule
* @deprecated not updatable to match dinamic server allocation.
*/
@Deprecated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, why add a class only to immediately deprecate it? (I guess things are in the middle of a refactoring here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This is still work in progress and we will have to clean up

try {
queue.next();
} catch (final IllegalStateException e) {
expectedExceptionOnNextIfClosed = Optional.of(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems other parts of the codebase use assertThrows for this type of check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright.

@rafael-telles
Copy link
Contributor Author

Thank you so much @lidavidm ! We will be working on your feedback soon

@wesm
Copy link
Member

wesm commented Feb 4, 2022

I just wanted to echo my comment from the mailing list that I think it would be best if we held a code donation vote and IP clearance for this piece of work since it is substantial and was developed outside of the Arrow community. I don't expect there will be any issues but it is good due diligence. As a reminder we welcome the development of large new projects in feature branches, where the development process otherwise is conducted according to the Apache Way (public discussions, issues, PRs, etc.)

I believe this will involve a Software Grant from Dremio (correct me if I'm wrong) for the work and ICLAs for the individual contributors.

@lidavidm
Copy link
Member

lidavidm commented Feb 4, 2022

Thanks for chiming in Wes - I'll look into the process and follow up on the mailing list @rafael-telles.

@lidavidm
Copy link
Member

BTW, please let us know when you think this is ready for a full review

* @return a {@code FlightStream} of results.
*/
public List<FlightStream> getStreams(final FlightInfo flightInfo) {
return flightInfo.getEndpoints().stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct. This will try to get every ticket using the current client. However each endpoint is a pair <Ticket, List>.

What we actually need to do is for each endpoint, create a FlightClient at each location and call getStream() with the associated ticket. See:

Note that if the list of locations is empty, it means the data is only available at the current location:

Can optimize this to some extent if the current location is used once, use the current client, though that limits the driver to only running one Statement at a time.

We should consider pooling FlightClients. I think Apache Pools is a good choice here, and use KeyedObjectPool to model the problem: https://commons.apache.org/proper/commons-pool/apidocs/org/apache/commons/pool2/KeyedObjectPool.html

@lidavidm
Copy link
Member

This can be rebased against the accepted proposals now.

jcralmeida and others added 14 commits April 22, 2022 16:32
Implement toString method for JdbcArray class
* Add toString to Time obj in Time#toString

* Improve Time toString

* Fix maven plugins

* Revert "Update java/flight/flight-jdbc-driver/src/test/java/org/apache/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessorTest.java"

This reverts commit 00808c0.

* Revert "Merge pull request #29 from rafael-telles/Timestamp_fix"

This reverts commit 7924e7b, reversing
changes made to f6ac593.

* Fix DateTime for negative epoch

* Remove unwanted change

* Fix negative timestamp shift

* Fix coverage

* Refator DateTimeUtilsTest
…ormat

Change the behavior to return the correct format for day and year intervals.
…etadata-type-name

[Java] [JDBC] Flight jdbc driver create metadata type name.
@pitrou
Copy link
Member

pitrou commented Jun 7, 2022

@rafael-telles It seems this needs rebasing on git master and then fixing conflicts, would you like to do that?

@lidavidm
Copy link
Member

lidavidm commented Aug 3, 2022

We should probably close this PR in favor of a new PR based on the cleared code

@amol-
Copy link
Member

amol- commented Mar 30, 2023

Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍

@amol- amol- closed this Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.