timeout option for the pool:get() #28

Merged: 1 commit, Nov 24, 2019

Conversation


@mkv mkv commented Sep 19, 2019

This patch adds a timeout option for the pool:get() operation.
It may be necessary to know when there are no connections left in the pool,
e.g. when we have lost the server.
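
For illustration, here is a rough sketch of the intended usage. The connection parameters are made up; the option name follows this patch, and nil is assumed to be returned when no connection becomes free within the timeout:

   local mysql = require('mysql')
   local log = require('log')

   local pool = mysql.pool_create({host = '127.0.0.1', port = 3306,
                                   user = 'user', password = 'pass',
                                   db = 'test', size = 5})

   -- Wait at most 0.5 seconds for a free connection.
   local conn = pool:get({timeout = 0.5})
   if conn == nil then
      log.error('No free connections in the pool')
      return
   end
   pool:put(conn)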

@mkv

mkv commented Oct 25, 2019

Please, we need a version with these changes.

1 similar comment
@mkv

mkv commented Oct 29, 2019

Please, we need a version with these changes.

@mkv mkv closed this Oct 29, 2019
@mkv mkv reopened this Oct 29, 2019
@Totktonada

Okay, I’ll look into it this week.


@Totktonada Totktonada left a comment


It would be good to add a test. Other than the one comment below, I have no objections. Please fix it and then we can push it.

mysql/init.lua (review comment, resolved)
@mkv

mkv commented Nov 12, 2019

I tried to write a test that emulates connection errors, but it is a complicated task for the current test infrastructure.

@Totktonada

@mkv We should at least verify the behaviour of the pool in the usual cases: when the pool is drained and we attempt to receive a connection with and without a timeout. I wrote a test case and found that your implementation would not give nil when there are no more connections in the pool: it would add a new connection to the pool and return it. Since a pool has an upper size boundary, this may later lead to an inability to put a connection back into the pool. This is incorrect behaviour, and I changed it.

Please look into the new version of the patch and let me know whether you are okay with it. If you have some external tests (say, within your project), it would be great if you could verify that everything works as expected for your cases.
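
Roughly, the drained-pool case looks like this (just a sketch; the pool size, timeout value and variable names are illustrative):

   -- With a pool of size 1 (illustrative), take the only connection...
   local conn = pool:get()
   -- ...then a second get with a timeout should return nil instead of
   -- creating an extra connection beyond the pool's upper size boundary.
   local conn2 = pool:get({timeout = 0.1})
   assert(conn2 == nil)
   pool:put(conn)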

@Totktonada

@Gerold103 I would be grateful if you could find time to look into the patch.

@Totktonada

Waiting for @mkv's verification.

@mkv

mkv commented Nov 19, 2019

How can we restart broken connections in this version of the code?
We do it as shown in the code below.

   local conn
   local ok, err = pcall(function() conn = ctx.pool:get{timeout = 0} end)
   if not conn then
      log.error('Failed to get connection: %s', err)
      return
   end

   local tuples, status
   local ok, err = pcall(function() tuples, status = conn:execute(query) end)
   if not ok then
      conn.usable = false
      ctx.pool:put(conn)
      log.error('Failed to execute query: %s', err)
      return
   end
   ctx.pool:put(conn)

Is this the right way?

@Totktonada

It seems it should work, but okay, I'll add a test case to make sure.

@mkv

mkv commented Nov 19, 2019

Something goes wrong.
When we use the module as described above and a disconnection happens, after about 80 request/response pairs we get stuck on the second "pool:put".

@mkv

mkv commented Nov 19, 2019

It seems that "conn.__gc_hook" adds an extra connection into the queue.

@Totktonada

@mkv While I'm working on that, can you please share the reason why you need to mark a connection as broken manually? Does automatic reestablishing not work in your case? If so, what is the case?

@mkv

mkv commented Nov 20, 2019

Yes, broken connections are not reestablished automatically.
We just use the API as shown in the code above.
And our system limits queries outside of the component that works with tarantool.mysql.

Currently we are testing this patch:

diff --git a/mysql/init.lua b/mysql/init.lua
index ec6c2e9..90089c8 100644
--- a/mysql/init.lua
+++ b/mysql/init.lua
@@ -213,6 +213,8 @@ local function pool_put(self, conn)
     if conn.usable then
         self.queue:put(conn_put(conn))
     else
+        conn.conn:close()
+        ffi.gc(conn.__gc_hook, nil)
         self.queue:put(POOL_EMPTY_SLOT)
     end
 end

The fiber no longer gets suspended on channel:put.

@Totktonada

@mkv It seems you've stepped into a problem that existed before this PR. I'll prepare a test case and a fix. As far as I can see, your fix is at least in the right direction.

I see that in case of any failed request a connection is marked as broken (either with conn.queue:put(false) or with conn.usable = false). Both cases should lead to the connection being reestablished. Is manually setting conn.usable still needed after your fix?

BTW, I just observed a typo I made in my changes:

diff --git a/mysql/init.lua b/mysql/init.lua
index ec6c2e9..235620f 100644
--- a/mysql/init.lua
+++ b/mysql/init.lua
@@ -60,7 +60,7 @@ local function conn_put(conn)
     ffi.gc(conn.__gc_hook, nil)
     if not conn.queue:get() then
         conn.usable = false
-        return nil
+        return POOL_EMPTY_SLOT
     end
     conn.usable = false
     return mysqlconn

Hope it'll help with testing on your side.

@Totktonada

My initial assumption was not true. There are broken connections (a connection is marked as broken after any failed request) and unusable connections (after an explicit :close() or putting it back into a pool). If you don't close a connection from a pool (it seems you shouldn't), don't touch private fields like conn.usable and conn.queue, and apply the fix above and the one below, then everything should be fine: you don't need to (and should not) manually set the conn.usable field.

diff --git a/mysql/init.lua b/mysql/init.lua
index 235620f..7e56ed3 100644
--- a/mysql/init.lua
+++ b/mysql/init.lua
@@ -185,7 +185,7 @@ local function pool_close(self)
     self.usable = false
     for i = 1, self.size do
         local mysql_conn = self.queue:get()
-        if mysql_conn ~= nil then
+        if mysql_conn ~= POOL_EMPTY_SLOT then
             mysql_conn:close()
         end
     end
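
With both fixes applied, the client-side pattern from your snippet above should reduce to something like this (just a sketch; no private fields are touched):

   local conn = ctx.pool:get{timeout = 0}
   if conn == nil then
      log.error('No free connections in the pool')
      return
   end

   local tuples
   local ok, err = pcall(function() tuples = conn:execute(query) end)
   -- A failed request marks the connection as broken internally and it will
   -- be reestablished later, so the connection is always put back as is.
   ctx.pool:put(conn)
   if not ok then
      log.error('Failed to execute query: %s', err)
      return
   end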

I'm not sure about leaks at the moment. I'll continue looking into that and will at least cover the pool behaviour with appropriate tests in separate PRs. Then I'll update this PR with proper fixes and test cases.

Thanks for testing!

@Totktonada

Updated the PR with the proposed fixes for convenience.

@Totktonada

Now I have more or less organized things for myself. I filed issues about things that look like subjects for change: #30, #31, #32, #33, #34. I wrote basic test cases for the connection pool: PR #35. I verified that this PR does not break those test cases.

The plan is the following: do a self-review and push the test cases to master (PR #35), rebase this PR on top of the new master (and squash the commits), and wait for @mkv's verification on his cases. If everything is okay, we can finally push this feature.

@mkv

mkv commented Nov 22, 2019

Currently we can say that our test is OK.

@Totktonada

Great, thanks!

@Totktonada

My tests about broken connections do not actually break the connections (#37), so I verified them manually before and after this PR. I added a parse error to the list of non-transitive errors:

mysql/mysql/driver.c

Lines 149 to 157 in 2d22046

int err = mysql_errno(raw_conn);
switch (err) {
case CR_SERVER_LOST:
case CR_SERVER_GONE_ERROR:
        lua_pushnumber(L, -1);
        break;
default:
        lua_pushnumber(L, 1);
}

Everything looks okay with this PR. I'm going to merge it.

@Totktonada Totktonada merged commit 7a63383 into tarantool:master Nov 24, 2019
@mkv

mkv commented Nov 25, 2019

Thank you very much!
We are going to continue testing when the PR is tagged as part of an official release.

@Totktonada

We have to deliver our packages continuously. This PR was pushed as 2.0-42-g7a63383 and all packages are already in the repository:

https://packagecloud.io/app/tarantool/2_3/search?q=2.0.42&filter=all&dist=

(The same for 1.10, 2x and 2.2 repos.)

However, if you're using tarantoolctl rocks / luarocks, then, yep, an explicit release is necessary to pin a specific version. It seems it's worth releasing 2.1.

@mkv

mkv commented Nov 25, 2019

@Totktonada Thank you for the link, that's enough.
