Prepare and write txn rows in parallel #802
Conversation
Codecov Report
@@            Coverage Diff             @@
##           develop     #802      +/-   ##
===========================================
+ Coverage    58.83%   58.98%   +0.15%
===========================================
  Files           29       30       +1
  Lines         4059     4084      +25
===========================================
+ Hits          2388     2409      +21
- Misses        1374     1378       +4
  Partials       297      297
Continue to review full report at Codecov.
ctx, cancelFunc := context.WithCancel(context.Background())
defer cancelFunc()

ch := make(chan []interface{}, 64)
By definition, we're doing this because serialization is slower than the network. Therefore we shouldn't need a buffered channel.
If we do keep the buffered channel, you'll need to make sure we remove any unread items from the channel.
No, serialization is faster than postgres.
Unread items in case of an error? The channel will be garbage collected.
Since serialization is faster, doesn't that mean a buffer is not required? As long as we're able to have 1 pending object to load there would never be any downtime.
Re: garbage collection. I didn't realize the unread items would be garbage collected, but it makes sense that this would be the case.
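To make the two points above concrete, here is a tiny self-contained sketch (not code from this PR) showing a producer that selects on ctx.Done() so it never blocks forever on a full buffer, and relying on the fact that a channel abandoned with unread items is simply garbage collected:

package main

import (
	"context"
	"fmt"
)

func main() {
	ctx, cancelFunc := context.WithCancel(context.Background())
	ch := make(chan []interface{}, 64)

	done := make(chan struct{})
	go func() {
		defer close(done)
		for i := 0; ; i++ {
			select {
			case ch <- []interface{}{i}: // may fill the 64-slot buffer
			case <-ctx.Done(): // consumer gave up; stop instead of blocking forever
				return
			}
		}
	}()

	<-ch         // consume a single prepared row
	cancelFunc() // simulate the database writer hitting an error
	<-done       // the producer exits promptly despite the full buffer
	fmt.Println("producer stopped; rows left in the buffer are collected with the channel")
}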
I'm not sure if there is a more authoritative source for this, but here is a reference to something I've heard before:
https://github.com/uber-go/guide/blob/master/style.md#channel-size-is-one-or-none
If one is much faster than the other, this PR wouldn't improve anything. However, here postgres and serialization are somewhat comparable in speed.
Our postgres driver creates buffers of 65KB and writes them to the network. Depending on whether the "network" has a large enough buffer, this call may block and we would waste CPU time that we could use for serialization. So actually, I'm thinking we should increase the buffer to 1024 to be safe. What do you think?
I've seen this Uber style guide. I think they mainly worry about blocking that can occur when writing to a channel, but with a large buffer size, blocking would only rarely occur in production and wouldn't show up in testing.
I found the 65KB number you mentioned, but it doesn't seem to be working the way you described. It doesn't accumulate 65KB of data, it reads up to 65KB of data then sends it immediately. Since it's TCP it needs to be received correctly (the wasted CPU time you mentioned), but there doesn't seem to be a way to speed that up, so I don't think we need a large buffer of transactions.
https://github.com/jackc/pgconn/blob/master/pgconn.go#L1199
go func() {
	buf := make([]byte, 0, 65536)
	buf = append(buf, 'd')
	sp := len(buf)
	for {
		// Here, r is the read side of an io.Pipe. When you call CopyTo, data is written into the pipe.
		// 65KB isn't a target, it's a temporary buffer. There is no guarantee that it will be filled.
		// To be more precise, from the io.Pipe documentation:
		// Reads and Writes on the pipe are matched one to one except when multiple Reads are
		// needed to consume a single Write. That is, each Write to the PipeWriter blocks until it
		// has satisfied one or more Reads from the PipeReader that fully consume the written data.
		n, readErr := r.Read(buf[5:cap(buf)])
		if n > 0 {
			buf = buf[0 : n+5]
			pgio.SetInt32(buf[sp:], int32(n+4))
			// If any data was read, whether 1 byte or the full 65KB, it's written to the network.
			// This means that the writer is blocked here regardless of how large the buffered
			// channel is. After the read, our code will queue the next row in the pipe, and
			// queue the one after in the channel.
			_, writeErr := pgConn.conn.Write(buf)
			if writeErr != nil {
				// Write errors are always fatal, but we can't use asyncClose because we are in a different goroutine.
				pgConn.conn.Close()
				copyErrChan <- writeErr
				return
			}
		}
		if readErr != nil {
			copyErrChan <- readErr
			return
		}
		select {
		case <-abortCopyChan:
			return
		default:
		}
	}
}()
It seems like having a large buffered channel wouldn't help anything. Did you see that performance was affected in your testing?
If you agree that we would waste CPU time with an unbuffered channel, then as soon as w.Write() returns, where w is the write side of the pipe, we will need to spend time serializing transactions until we can call w.Write() again. I tried tracing down the code, and I think it boils down to how fast the write syscall returns. That seems to be platform dependent, so I would rather introduce a big enough buffer in our code. In any case, it's small enough not to worry about it.
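For readers following along, a small self-contained demo (unrelated to the PR's or pgconn's code) of the io.Pipe behaviour under discussion: Write does not return until readers have fully consumed the written data, so the writing goroutine stalls for as long as the reading side takes.

package main

import (
	"fmt"
	"io"
	"time"
)

func main() {
	r, w := io.Pipe()

	go func() {
		buf := make([]byte, 4) // read the write back in small pieces
		for {
			n, err := r.Read(buf)
			if n > 0 {
				time.Sleep(100 * time.Millisecond) // pretend each piece goes out over a slow network
			}
			if err != nil {
				return
			}
		}
	}()

	start := time.Now()
	w.Write([]byte("0123456789")) // blocks until the reader has consumed all 10 bytes
	fmt.Println("Write returned after", time.Since(start))
	w.Close()
}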
"Waste" is the wrong term since the goal isn't to use as much CPU as possible. From what I see we need one pending round to continuously load the DB. Thanks to the pipe we already have 2 queued up with an unbuffered channel.
How did you get the 8% metric? Could you try with an unbuffered channel to see if it has the same performance?
The problem with that pipe is that the writer writes slightly more than 2^16 bytes and will unblock only when all of it is read. However, the reader reads slightly less than 2^16 bytes and writes it to the network before reading the other couple of bytes.
I'm running tests.
chan size | TPS
0 | 6798.43
64 | 6891.80
1024 | 6946.75
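These TPS numbers come from real indexer runs, not from the toy model below. Still, a rough simulation helps show why the buffer size barely matters when the two stages run at comparable speeds; the per-row costs here are invented assumptions.

package main

import (
	"fmt"
	"time"
)

// run pipes `rows` items through a channel of the given size, with made-up
// costs for preparing and writing a row, and returns the total elapsed time.
func run(chanSize, rows int) time.Duration {
	const prepare = 50 * time.Microsecond // pretend cost of serializing one row
	const write = 55 * time.Microsecond   // pretend cost of writing one row

	ch := make(chan int, chanSize)
	start := time.Now()

	go func() {
		defer close(ch)
		for i := 0; i < rows; i++ {
			time.Sleep(prepare)
			ch <- i
		}
	}()

	for range ch {
		time.Sleep(write)
	}
	return time.Since(start)
}

func main() {
	for _, size := range []int{0, 1, 64, 1024} {
		fmt.Printf("chan size %4d: %v\n", size, run(size, 1000))
	}
}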
Looks great. Based on your comment I think we just need to decide whether the channel should be unbuffered, buffered with a size of one, or convince ourselves that a buffer > 1 is helpful.
Thanks for running the tests, there's clearly a marginal improvement after all!
Summary
At the moment, we first prepare all database rows for the txn table, and only then write them to the database. Preparing database rows is expensive, mainly because of json encoding. This PR makes preparation and writing to the database concurrent, thus improving performance. Sample tests showed an 8% improvement in TPS.
Closes #794.
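For readers skimming this page, here is a minimal sketch of the overall shape of the change, assuming the writer side streams rows with pgx's CopyFrom; the package name, prepareRow helper, column list, and input type are hypothetical placeholders, not the PR's actual code.

package example

import (
	"context"

	"github.com/jackc/pgx/v4"
)

// chanCopySource adapts a channel of prepared rows to pgx.CopyFromSource,
// so COPY can consume rows as soon as the producer has prepared them.
type chanCopySource struct {
	ch  <-chan []interface{}
	cur []interface{}
}

func (s *chanCopySource) Next() bool {
	row, ok := <-s.ch
	s.cur = row
	return ok
}

func (s *chanCopySource) Values() ([]interface{}, error) { return s.cur, nil }

// A real implementation would also surface producer-side errors here.
func (s *chanCopySource) Err() error { return nil }

// addTxnRows prepares rows in one goroutine while CopyFrom writes them in the
// calling goroutine. txns, prepareRow, and the column names are placeholders.
func addTxnRows(ctx context.Context, conn *pgx.Conn, txns [][]byte) error {
	ctx, cancelFunc := context.WithCancel(ctx)
	defer cancelFunc() // unblocks the producer if CopyFrom fails early

	ch := make(chan []interface{}, 64)

	go func() {
		defer close(ch)
		for _, txn := range txns {
			row := prepareRow(txn) // the expensive part: json encoding etc.
			select {
			case ch <- row:
			case <-ctx.Done(): // CopyFrom failed; stop producing
				return
			}
		}
	}()

	_, err := conn.CopyFrom(ctx, pgx.Identifier{"txn"},
		[]string{"round", "intra", "txid", "txn"}, &chanCopySource{ch: ch})
	return err
}

func prepareRow(txn []byte) []interface{} {
	// Placeholder: the real code builds each column value, including json-encoded ones.
	return []interface{}{0, 0, "", string(txn)}
}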