Skip to content

Thread-local Database Connection Pooling#1276

Merged
daschuer merged 59 commits intomixxxdj:masterfrom
uklotzde:dbconnection_fixes_and_cleanup
Jun 16, 2017
Merged

Thread-local Database Connection Pooling#1276
daschuer merged 59 commits intomixxxdj:masterfrom
uklotzde:dbconnection_fixes_and_cleanup

Conversation

@uklotzde
Copy link
Copy Markdown
Contributor

@uklotzde uklotzde commented Jun 4, 2017

Well, here's now the full story :) The connection pool makes everything safer and easier to handle.

But it's still a long way to go for safe, multi-threaded database access. This work just provides the groundwork we can build upon.

Purpose:

  • Associate database connections with threads
  • Restrict the lifetimes of database connections to their thread , i.e. properly close them before exiting a thread
    • Explicitly by DbConnectionPool::ThreadLocalScope
    • Implicitly by QThreadStorage + ~DbConnection
  • Use DbConnectionPoolPtr instead of QSqlDatabase and dynamically obtain the actual connection for the current thread by calling threadLocalConnection()
    • Within the lifetime of the corresponding DbConnectionPool::ThreadLocalScope
    • Violations of the lifetime restrictions are properly detected

[Outdated]
Lots of small fixes and cleanups more or less related to accessing the Mixxx SQLite database.

Outlook: I have already implemented a managed pool of thread-local database connections on top of these changes. It will help us taming the redundant and chaotic creation and sharing of database connections in the existing code base and lays the ground for safe, multi-threaded database access in the future.

@Be-ing
Copy link
Copy Markdown
Contributor

Be-ing commented Jun 4, 2017

QueryUtilTest.FieldEscaperEscapesQuotes is failing on AppVeyor, but strangely I do not see it being run in the log.

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 4, 2017

The previous failure might have revealed another flaw in FieldEscaper! Let's wait for the results...

Comment thread src/library/library.cpp Outdated
m_pLibraryControl(new LibraryControl(this)),
m_pRecordingManager(pRecordingManager),
m_scanner(m_pTrackCollection, pConfig) {
if (!m_pTrackCollection->initDatabaseSchema()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're working on this, it may be a good time to speed up the library tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen your bug report ;) But I don't have a strategy that works for all tests, this one is an exception.

@uklotzde uklotzde changed the title DB Connections: Fixes and cleanup [WIP] DB Connections: Fixes and cleanup Jun 5, 2017
@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 7, 2017

I'm considering to push the managed pool of thread-local database connection right away. Now it's only a small commit and afterwards we can eliminate the remaining unmanaged connections in LibraryScanner and AnalyzerWaveform very easily. Tabula rasa :)

@daschuer
Copy link
Copy Markdown
Member

daschuer commented Jun 7, 2017

I have read the first commits. I hope I find time to read the latest as well soon. Is this still suitable to be part of 2.1? For the first part it would work for me.

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 7, 2017

I've just eliminated the remaining unmanaged database connections. This was possible even without introducing the pool as the final step.

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 7, 2017

I just noticed that the database connections for AnalyzerWaveform are still opened on the wrong thread! The constructor where the connection is opened runs on the main thread whereas the opened database connection is accessed later from the analysis thread.

I didn't break it, but I will try to fix it! ;)

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 8, 2017

Old threading issues fixed.

But the connection pool will lift this work to a higher level, by

  • sharing a single database connection within the same host thread and
  • ensuring that database connections are closed reliably before exiting the host thread

Stay tuned...

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 8, 2017

@daschuer I strongly recommend to plan this PR for the 2.1 release. After discovering a whole bunch of threading issues and violations I have no other option ;)

@daschuer
Copy link
Copy Markdown
Member

daschuer commented Jun 8, 2017

Since this is now quite big, could you help me reviewing, by splitting that into two PRs? I mean keeping this PR as it is an create a second one that point to a reasonable commit on half of the commit history?

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 8, 2017

I just wanted to submit some quick, obvious fixes before starting the actual rework. Then I found so many issues that I couldn't resist to fix them immediately ;)

Reverting all commits into a single diff and splitting off some small changes as separate commits may reduce the amount. But in the end we may end up with one big remaining change set that is even more difficult to split into separate steps. I'm not sure how to go forward?!

What I won't do is spending another half a day or even more just for dissecting and recombining commits although I totally agree with you and understand your concerns @daschuer ;)

@daschuer
Copy link
Copy Markdown
Member

daschuer commented Jun 9, 2017 via email

Copy link
Copy Markdown
Member

@daschuer daschuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First part done up to 043f3b0

Comment thread src/repository/repository.cpp Outdated

} // anonymous namespace

Repository::Repository(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the name?

The intermediated name database/mixxxdb was not too bad.
If a new contributor wants to look how mixxxdb.sqlite is handled, he will likely not search for repository in the first place.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's a valid argument. I wanted to be more generic, but it turns out that this class simply host a generic connection (pool) that is initialized with custom parameters related to mixxxdb.sqlite. We might rename it in some final step.

};

class CrateStorage: public SqlStorage {
class CrateStorage: public virtual /*implements*/ SqlStorage {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need here a virtual inheritance?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use virtual inheritance as a default for interface inheritance, i.e. no implementation. A class should only have a single non-virtual parent. In this case it has non, i.e. no dedicated base class. SqlStorage is not a base class, but just a collection of pure virtual functions that define an interface. You may have a base class and still implement SqlStorage.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK that makes sense.

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 9, 2017

@daschuer I tried to split some stuff like the AnalyzerQueue, but it is just too entangled.

@uklotzde uklotzde changed the title [WIP] DB Connections: Fixes and cleanup [WIP] Thread-local Database Connection Pooling Jun 10, 2017
@uklotzde uklotzde changed the title [WIP] Thread-local Database Connection Pooling Thread-local Database Connection Pooling Jun 10, 2017
@uklotzde
Copy link
Copy Markdown
Contributor Author

Not a single issue so far given the fact that I inserted lots of assertions for checking invariants and consistent state. Looks like we have a sound concept. The only thing that stands in the way now is the entangled and sometimes abstruse design of various Mixxx components.

All the LibraryFeature classes need to be reworked now, but I expect too many conflicts with the library redesign branch.

@uklotzde
Copy link
Copy Markdown
Contributor Author

@daschuer There are some conflicts with your merge support library-redesign branch, but nothing serious on first sight. I might step in and resolve those conflicts when needed. Just want to make sure that we don't move into the wrong direction ;)

@daschuer
Copy link
Copy Markdown
Member

Did you accidentally rebased #1282 here? I think that can be merged first, I am just waiting for CI.
Than we will have the same changes in two commits in the tree.

@uklotzde
Copy link
Copy Markdown
Contributor Author

Forgot not to rebase this branch after review has started! I had in mind that those commits would vanish after rebasing this branch. But anyway, some commits were already on this branch, because it was the starting point.

How about rebasing the entire branch before finally merging it? All redundant commits will vanish. I also feel more safe when doing a final rebase and validation on top of the current development head for long running, parallel branches instead of just merging them and hoping for the best ;)

@daschuer
Copy link
Copy Markdown
Member

OK, I have just merged #1282 so you can rebase this now.

uklotzde added 6 commits June 12, 2017 07:30
...otherwise someone would be able to replace the implicitly shared
instance behind it!
...and reorder parameters in SchemaManager
...to reduce the number of strayed DB connections prior to some
bigger refactoring regarding thread-safety.
Comment thread src/test/schemamanager_test.cpp Outdated
QSqlDatabase m_db;
private:
MixxxDb m_mixxxDb;
const mixxx::DbConnectionPool::ThreadLocalScope m_dbConnectionScope;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just m_dbConnection else, it looks like that it is the scope of the connection.

Comment thread src/util/db/dbconnectionpool.cpp Outdated
}

DbConnectionPool::ThreadLocalScope::~ThreadLocalScope() {
if (m_pDbConnectionPool) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if m_dbConnection ?

Comment thread src/test/schemamanager_test.cpp Outdated
m_dbConnectionScope(m_mixxxDb.connectionPool()) {
}

const mixxx::DbConnectionPool::ThreadLocalScope& dbConnectionScope() const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dbConnection()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks also misleading that we return a reference to a scoped connection. Can't we return the connection itself?

Comment thread src/test/queryutiltest.cpp Outdated
}

MixxxDb m_mixxxDb;
const mixxx::DbConnectionPool::ThreadLocalScope m_dbConnectionScope;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just m_dbConnection

Comment thread src/preferences/upgrade.cpp Outdated
bool successful = false;
{
MixxxDb mixxxDb(config);
const mixxx::DbConnectionPool::ThreadLocalScope dbConnectionScope(
Copy link
Copy Markdown
Member

@daschuer daschuer Jun 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have here a helper for
const mixxx::DbConnectionPool::ThreadLocalScope dbConnection = mixxxDb.connectionPool().getScopedConnection();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just dbConnection

Comment thread src/preferences/upgrade.cpp Outdated
const mixxx::DbConnectionPool::ThreadLocalScope dbConnectionScope(
mixxxDb.connectionPool());
if (dbConnectionScope) {
QSqlDatabase connection(dbConnectionScope);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QSqlDatabase db(dbConnectionScope);?

@daschuer
Copy link
Copy Markdown
Member

It looks like we have a mixture of concepts with a local scope connection and a thread scope connection.
Is it required to have both? We probably should consider to spit these into two classes or get rid of one.
Or just two namespaces for the connection name?

Comment thread src/util/db/dbconnectionpool.cpp Outdated
}

DbConnectionPool::ThreadLocalScope::ThreadLocalScope(
DbConnectionPool::ThreadLocalScoped::ThreadLocalScoped(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DbConnectionPool::ThreadLocalScoped is IMHO still not self explaining. Isn't it a kind of Scoped pointer for a DbConnection? Normally we would call it DBConnectionPtr without the smart pointer type Hint. It is OK for me to add all the additional features to the name, if if we can efford the extra characters so it is DBConnectionThreadLocalScopedPtr.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in that sense it effectively is a thread-local database connection. I will think about how the design could be aligned with that of the low-level DbConnection class. The low-level DbConnection class will only be used behind the scenes for resource management and should not appear application code. Maybe I can hide it entirely.

Good input, @daschuer. It's always helpful to look at things from a different angle! And it's often the "stupid" questions that open your eyes ;) But sadly not everyone understands those principles of a peer review, exhausting discussions in daily business :(

Comment thread src/util/db/dbconnectionpool.cpp Outdated
m_threadLocalConnections.setLocalData(nullptr);
}

QSqlDatabase DbConnectionPool::threadLocalConnection() const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your good explanations, I think I understand now.
The source of confusion has to do with this function because we have now two owner of the connection.
On one hand the scoped connection and on the other hand the connection pool.

I see two possible solutions:

  1. Remove this function here and move it into the scoped connection. Than the caller is instantly aware who actually owns the connection
  2. Keep this function and hand the connection out as a shared pointer. Remove the ThreadLocalScoped class.
    Than the connection pools owns the connection and actually just one for one thread.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared users of the connection within the same thread do not need to know were the lifetime of the thread-local connection is actually managed. The DbConnection::LocalThreadScoped reference should not be passed around.

Instead users needing a connection temporarily for executing queries may ask the pool for one. But they must be prepared to get none, if the lifetime of the connection has already ended.

If we wrap DbConnectionPool::ThreadLocalScoped into a smart_ptr I see the danger that those references get passed between threads which is strictly forbidden and would lead into lifetime management hell. DbConnectionPool::ThreadLocalScoped should not be allocated dynamically, but as a local variable in the run() function of a thread.

@daschuer
Copy link
Copy Markdown
Member

Yes you are right. I was thinking about a kind of handle that can be passed, with a data member there always fetches a fresh pointer from the thread pool. But this probably overcomplikates the thing even more.

@uklotzde
Copy link
Copy Markdown
Contributor Author

No, that's even a pretty nice idea! But I'm still unsure how to name those 2 classes???

DbConnectionPooled is currently the owner of the connection and defines the lifetime. But we need an additional class (= "handle") that dynamically hands out existing connections from the pool for the current thread by just calling DbConnectionPool::threadLocalConnection().

Those 2 facades will be the only way to access the pool, like 2 side of the same coin. The pool itself should not have any public member functions in the end.

@uklotzde
Copy link
Copy Markdown
Contributor Author

DbConnectionPooled = Owner (thread-affine with implicit creation/destruction)
DbConnectionShared = Handle (thread-independent with dynamic lookup)

Both will carry a DbConnectionPoolPtr inside.

@uklotzde
Copy link
Copy Markdown
Contributor Author

uklotzde commented Jun 15, 2017

I think I got it:

DbConnectionPooler: The owner that creates and destroys a pooled, thread-local connection (formerly know as DbConnectionPooled)

DbConnectionPooled: Dynamically accesses pooled, thread-local connections

@daschuer
Copy link
Copy Markdown
Member

Ok son thoughts: so we have the pool, it offers a public getDbConnection() this returns a shared pointer like class: DbConnectionPtr which manages closing the connection if it falls out of as scope. It's data() fetches always direct from the pool and has no local copy. If data() is called from a wrong thread, it schould probably return null instead of an other pointer which might be available as well. Problem, fetching the data from the pool might be expensive.
....
Ok, so we keep a local copy along with the pointer to the current thread. Each data() call verifies that the thread has not changed if it happens, return null.

@daschuer
Copy link
Copy Markdown
Member

I have read you Pooler comment after my post.

: m_pDbConnectionPool(pDbConnectionPooler.m_pDbConnectionPool) {
}

explicit operator bool() const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this requires a comment. I now understand that checking for m_pDbConnectionPool is sufficiant because "operator QSqlDatabase()" never fails. But it was not easy to discover.

@daschuer
Copy link
Copy Markdown
Member

Yes! This works quite good for me. The names are that un-self explaining, that you do not get confused and are able to catch the concept from the usage ;-)

I have just added a comment request and than it LGTM. Thank you.

@daschuer
Copy link
Copy Markdown
Member

Thank you. Give me a hint if you are done.

@uklotzde
Copy link
Copy Markdown
Contributor Author

@daschuer Great input, thank you! I like the design that has finally appeared now, very simple and effective. Tiny classes with a clear responsibilities that fit together seamlessly.

Ready, if the CI builds finish successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants