Reuse JDBC connection in JDBC connectors #14653
It does not work somehow. Things do not get committed.
There is no need to make it functional. Just make the class not static.
52e86b3 to 1ee9091
So these are cached in each worker for the duration of a query and after the query is done all connection(s) are released...?
This is in progress. I don't want to hold connections open for a long time. I have now also added a change to keep a connection for up to 2 seconds when inactive.
0c40b64 to 187cbe2
good catch
Can you move it to a separate PR?
i have some comments in #14702.
please apply & rebase
5742255 to c07c9b6
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/ReusingConnectionFactory.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
improve commit message grammar
lib/trino-collect/src/test/java/io/trino/collect/cache/TestSafeCaches.java
This should also test that these make the cache empty.
I don't think so. SafeCaches provides a forwarding implementation of the cache, so it is up to the delegate cache to clear things. Testing whether these methods are properly implemented belongs at a different level, where the delegate cache implementation is.
What's the rationale to have NonLoadableCache interface and NonLoadableCacheImpl as separate classes?
I.e. would anything bad happen if the implementation class was called NonLoadableCache, was public and had a package-private constructor?
The cache eviction problem isn't limited to the get(K key, Callable<? extends V> loader) method.
Even if I call asMap().putIfAbsent etc., I insert into the cache a value that's "fresh" from the calling thread's perspective, but can already be stale from some other thread's perspective.
Thus this safety is an illusion: the NonLoadableCache isn't any safer than any other Guava cache. It's just less convenient to use.
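The interleaving described in this comment can be replayed deterministically without threads. This is only a minimal sketch: plain `ConcurrentHashMap` instances stand in for the remote source and for the cache's `asMap()` view, and the class name and keys are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo; the maps are stand-ins, not Guava or Trino classes.
final class StalePutIfAbsentDemo
{
    private StalePutIfAbsentDemo() {}

    // Returns the value left in the cache after the interleaving below.
    static String run()
    {
        ConcurrentMap<String, String> source = new ConcurrentHashMap<>(); // stand-in for the remote system
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();  // stand-in for cache.asMap()
        source.put("table", "v1");

        // Reader: loads the value outside of the cache.
        String loaded = source.get("table");

        // Writer (conceptually another thread): updates the source, then invalidates.
        source.put("table", "v2");
        cache.remove("table");

        // Reader: publishes the value it still believes is fresh.
        cache.putIfAbsent("table", loaded);

        return cache.get("table");
    }

    public static void main(String[] args)
    {
        System.out.println(run()); // prints v1, although the source already holds v2
    }
}
```

The invalidation ran before the reader's insert, so nothing removes the stale entry afterwards.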
findepi left a comment
"Introduce QueryEventListener to JDBC connectors"
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/QueryEventListener.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/DefaultJdbcMetadata.java
For some reason....
/trino master$ git grep '.in(SINGLETON)' | wc -l
69
/trino master$ git grep '.in(Scopes.SINGLETON)' | wc -l
675
findepi left a comment
"Test JDBC connection creations"
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
My IDE shows me a warning here.
'DriverConnectionFactory' used without 'try'-with-resources statement
I'd suggest writing this like
return new ConnectionFactory()
{
    private final DriverConnectionFactory delegate = new DriverConnectionFactory(new Driver(), config, credentialProvider);
and then adding
    @Override
    public void close()
            throws SQLException
    {
        delegate.close();
    }
at the bottom
> My IDE shows me a warning here.
Thank you. I had this disabled somehow.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
USING may prevent JOIN pushdown, which looks like something you want to test for #6781
BTW it would be good to assert the pushdown actually happened.
It won't happen under default configuration due to lack of stats, right?
I will add something with an even simpler predicate, and something without a predicate at all. I don't want to go too deep into what is supported and what is not.
add SHOW STATS
also, we should test with a connector that supports table stats
> also, we should test with a connector that supports table stats
SHOW STATS is interesting as it uses JDBI, which double-closes connections. However, I would prefer not to use the PostgreSQL connector here. Also, different connectors may open different numbers of connections. Maybe I should not count them at all?
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
java.lang.AssertionError:
Expecting:
<14>
to be less than or equal to:
<11>
I think it is related to reusing connections.
assert that connectionCreations is empty.
also, rename the field to openConnections
It is verified after setup of test class and after each test already.
ssheikin left a comment
This is a first pass of review, with comments I left here mostly for myself for the second pass :)
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
I don't really follow why the cache has to be represented as a map. It looks very similar to how Set is backed by Map, but I'd say it's not a necessary abstraction here.
Chain of not very structured thoughts:
What if we try to add something to the map? Would it mean that the map is read-only? Logically, that implies Cache has to implement the Map interface (even when read-only).
You need to reverse your thinking. It is not a cache represented as a map; it is a map that is presented to you as a cache. Internally the cache is implemented as a map. Basically I change the API here; the behavior is the same.
.remove: I'd expect the connections holder (this class) to be called a pool.
it's not a typical pool
This is effectively a queryId-keyed cache of 1-element pools.
It should be documented why keying by queryId is used here. This is not obvious, especially since we don't reuse any "dirty" connections.
Looks like the dirty path could be improved even more.
Do we have statistics on how many connections end up in the dirty bucket?
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
findepi left a comment
"Reuse JDBC connection in JDBC connectors"
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcModule.java
I think this should be enough:
innerBinder.bind(ReusingConnectionFactory.class).in(SINGLETON);
innerBinder.bind(ConnectionFactory.class).to(Key.get(ReusingConnectionFactory.class)).in(SINGLETON);
innerBinder.bind(ConnectionFactory.class).to(ReusingConnectionFactory.class).in(SINGLETON);
but i cannot verify that since this code isn't exercised by tests
It is tested by TestJdbcConnectionCreation. The Key is key here; otherwise you would get two different instances.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestReusingConnectionFactory.java
assert that now one is retained in the cache and one underlying has been closed already
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/ReusingConnectionFactory.java
Add separate test coverage for planning and execution
e.g.
assertOpenConnections(String sql, int expectedPlanningConnections, int expectedExecutionConnections)
this would do
assertOpenConnections("EXPLAIN " + sql, expectedPlanningConnections);
assertOpenConnections(sql, expectedPlanningConnections + expectedExecutionConnections);
-- this would help understand why we're not reaching 1 connection per query, even with reuse
:-)
Nice try. It is way beyond the scope of this commit (PR). A connector, and even more so a ConnectionFactory, has no notion of planning and execution. There is plenty of plumbing already, and testing SHOW STATS for a real connector would require even more plumbing. Let's not go crazy with feature requests :)
Extracted #14732
kokosing left a comment
I am going to reimplement this a bit. Having this "dirty" thing is very fragile; it is hard to tell which connection will be reused and when. Instead, I will make it part of the JdbcClient implementation, used only for metadata queries.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
c07c9b6 to ef89cb9
a340a10 to 271d634
It matches the order in which fields are defined now.
Is it possible that a connection returned from the cache is already closed? Is that harmful?
I don't think it is possible.
Duration.ofSeconds(5) -> const FOREVER = Duration.ofDays(5)
10 -> const VERY_LARGE = 1000
If I understood correctly, Connection ignored is not yet closed, but connectionFactory.openConnection(ALICE) tries to open a new connection in the same session.
Does it mean that ReusableConnectionFactory is designed not only for metadata retrieval, but as a general implementation? (I guess yes)
Is it a real situation? (I guess not for metadata)
Does it mean that metadata retrieval is parallel? (I guess no)
The point is that JDBC connections are only so-so thread-safe, so it is better not to share a single connection between different threads. The next step is to verify that we don't open nested connections in the same thread.
Yes. Yes. Not for the same Trino query.
what if
Duration.ofMillis(0) ?
10 -> 0 ?
Is it covered by tests?
...o-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachedBasedReusableConnectionFactory.java
What does session.getQueryId() return for ALICE?
It looks like this relies heavily on metadata requests not being parallel.
This test could be extended:
@Test
public void testConnectionIsNotShared()
        throws Exception
{
    try (MockConnectionFactory mockConnectionFactory = new MockConnectionFactory();
            ReusableConnectionFactory connectionFactory = new CacheBasedReusableConnectionFactory(mockConnectionFactory, Duration.ofSeconds(5), 10);
            Connection ignored = connectionFactory.openConnection(ALICE)) {
        connectionFactory.openConnection(ALICE).close();
        assertThat(mockConnectionFactory.openedConnections).isEqualTo(2);
    }
}
Maybe close it explicitly?
That is the point: the connection is still open.
76c71d2 to c12c852
consider previous connection to be closed and add test
putIfAbsent?
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcModule.java
6309924 to 8ad2293
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/QueryEventListener.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcModule.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/ReusableConnectionFactory.java
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
still need to see this in detail
Test something like:
aliceConnection1 = connectionFactory.openConnection(ALICE);
aliceConnection2 = connectionFactory.openConnection(ALICE);
aliceConnection1.close();
aliceConnection2.close();
Added as testSingleUserCanCreateMultipleConnections
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/ReusableConnectionFactory.java
Statements executed during operations like commitCreateTable change the state of the remote database, so auto-commit mode is expected.
8ad2293 to 673a92d
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/ReusableConnectionFactory.java
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestJdbcConnectionCreation.java
This assumes connections are going to be acquired from a single thread, which is probably true for metadata queries.
To elaborate: this assumes that metadata queries happen in a single-threaded manner, so for the same queryId we'll only ever have 1 connection.
That is not necessary. This factory can be used by multiple threads. The point is that only one connection can be stored for future use and no connection is shared between threads.
There are no issues if two threads want to create a connection for the same queryId, or if a single thread wants to create two connections. This factory is not optimized for such use cases, so the hit rate will be low, but there will be no regression and no correctness issues.
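The behavior described in this reply can be sketched as follows. This is not the PR's implementation: every name is hypothetical, `FakeConnection` stands in for `java.sql.Connection`, and the sketch omits any time-based expiry or size bound. It only shows that at most one idle connection is kept per queryId and that a connection in use is never handed to a second caller.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names are the PR's actual classes.
final class QueryKeyedReuseSketch
{
    private QueryKeyedReuseSketch() {}

    // Stand-in for java.sql.Connection, reduced to what the sketch needs.
    static final class FakeConnection
    {
        volatile boolean closed;
    }

    // Handle given to callers; close() offers the connection back for reuse.
    static final class Handle
            implements AutoCloseable
    {
        private final ReusingFactory factory;
        private final String queryId;
        private final AtomicBoolean released = new AtomicBoolean();
        final FakeConnection connection;

        Handle(ReusingFactory factory, String queryId, FakeConnection connection)
        {
            this.factory = factory;
            this.queryId = queryId;
            this.connection = connection;
        }

        @Override
        public void close()
        {
            // Guard against a double close putting the same connection back twice.
            if (released.compareAndSet(false, true)) {
                factory.release(queryId, connection);
            }
        }
    }

    static final class ReusingFactory
    {
        private final ConcurrentMap<String, FakeConnection> idle = new ConcurrentHashMap<>();
        final AtomicInteger opened = new AtomicInteger();

        Handle open(String queryId)
        {
            // remove() is atomic, so an idle connection goes to exactly one caller.
            FakeConnection connection = idle.remove(queryId);
            if (connection == null) {
                connection = new FakeConnection();
                opened.incrementAndGet();
            }
            return new Handle(this, queryId, connection);
        }

        void release(String queryId, FakeConnection connection)
        {
            // At most one idle connection is kept per query; extras are closed.
            if (idle.putIfAbsent(queryId, connection) != null) {
                connection.closed = true;
            }
        }

        // Would be called once the query finishes, e.g. from a query event listener.
        void queryFinished(String queryId)
        {
            FakeConnection connection = idle.remove(queryId);
            if (connection != null) {
                connection.closed = true;
            }
        }
    }

    public static void main(String[] args)
    {
        ReusingFactory factory = new ReusingFactory();
        Handle first = factory.open("query_1");
        Handle second = factory.open("query_1"); // first is still in use, so a new one is opened
        first.close();  // kept idle for reuse
        second.close(); // slot already taken, so this one is closed
        Handle third = factory.open("query_1"); // reuses the connection released by first
        System.out.println(factory.opened.get() + " " + (third.connection == first.connection));
        third.close();
        factory.queryFinished("query_1");
    }
}
```

Two overlapping opens for the same queryId cost one extra underlying connection, which lowers the hit rate but never shares a connection between callers.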
JdbcQueryEventListener will be notified when a query is started or finished.
Otherwise data processing queries may happen on the coordinator and sometimes they may reuse connections. That makes it difficult to keep the test deterministic.
673a92d to dc02f1a
AC
hashhar left a comment
Looks good to me (for the goal of connection reuse only for metadata queries).
All @findepi comments got addressed. I am happy to address more in follow-up PRs.
@Config("query.reuse-connection")
@ConfigDescription("Enables reusing JDBC connection for metadata queries to data source within a single Trino query")
public BaseJdbcConfig setReuseConnection(boolean reuseConnection)
{
    this.reuseConnection = reuseConnection;
    return this;
}
This needs to be documented. Will submit a follow-up PR.
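Until that documentation exists, a catalog file using the property might look roughly like the fragment below. Only the `query.reuse-connection` name comes from the setter above; the connector name and connection URL are placeholder assumptions, and the value shown is illustrative rather than a statement of the default.

```properties
# Hypothetical catalog file; connector.name and connection-url are placeholders.
connector.name=postgresql
connection-url=jdbc:postgresql://example.net:5432/database
# Property introduced by this PR: reuse one JDBC connection for metadata queries within a single Trino query.
query.reuse-connection=true
```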