Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statements in context cancelled transaction still getting committed #1010

Closed
dadkins-at-dropbox opened this issue Oct 8, 2019 · 3 comments
Closed

Comments

@dadkins-at-dropbox
Copy link

Issue description

It is possible to execute statements in a transaction which has been aborted via cancelled context, and have those statements commit.

Basically, in response to context cancellation, the library calls
Tx.Rollback
Stmt.ExecContext
Conn.Close

It should not be possible to execute a statement after the transaction has been rolled back, and in fact because of mysql's autocommit default, the statement will surprisingly commit.

See golang/go#34775.

While it looks like a database/sql bug, I wonder if there's something that can be done to mitigate on the driver side.

Example code

https://gist.github.com/dadkins-at-dropbox/5b051b9ccb30f115c332c6037d03d63d

package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"os"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	if len(os.Args) != 2 {
		log.Fatalf("usage: %s <dbname>", os.Args[0])
	}
	dbName := os.Args[1]
	db, err := sql.Open("mysql", "root@/"+dbName)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	db.SetMaxOpenConns(100)
	db.SetMaxIdleConns(100)

	if _, err := db.Exec("drop table if exists t"); err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec("create table t (id int not null auto_increment primary key)"); err != nil {
		log.Fatal(err)
	}

	ins, err := db.Prepare("insert into t (id) values (null)")
	if err != nil {
		log.Fatal(err)
	}

	sel, err := db.Prepare("select count(*) from t")
	if err != nil {
		log.Fatal(err)
	}

	for i := 0; i < 10; i++ {
		go func() {
			for i := 0; ; i++ {
				runTx(db, ins)
			}
		}()
	}
	for range time.Tick(time.Second) {
		var n int
		r := sel.QueryRow()
		if err := r.Scan(&n); err != nil {
			fmt.Println("select", err)
			continue
		}
		fmt.Printf("%d rows (should be 0)\n", n)
	}
}

func runTx(db *sql.DB, stmt *sql.Stmt) error {
	ctx, cancel := context.WithCancel(context.Background())
	quit := make(chan struct{})
	done := make(chan struct{})
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		fmt.Println("begin", err)
		return err
	}
	defer tx.Rollback()

	stmt = tx.Stmt(stmt)

	go func() {
		timer := time.NewTimer(time.Microsecond)
		defer timer.Stop()
		select {
		case <-quit:
		case <-timer.C:
		}
		cancel()
		close(done)
	}()
	defer func() {
		close(quit)
		<-done
	}()

	for i := 0; i < 10; i++ {
		if _, err := stmt.Exec(); err != nil {
			return err
		}
	}
	_ = tx.Rollback()
	return nil
}

Error log

1 rows (should be 0)
3 rows (should be 0)
4 rows (should be 0)
5 rows (should be 0)

Configuration

Driver version (or git SHA):
14bb9c0

Go version: run go version in your console
go version go1.13.1 darwin/amd64

Server version: E.g. MySQL 5.6, MariaDB 10.0.20
Server version: 8.0.17 Homebrew

Server OS: E.g. Debian 8.1 (Jessie), Windows 10

$ uname -a
Darwin dadkins-mbp.corp.dropbox.com 18.7.0 Darwin Kernel Version 18.7.0: Tue Aug 20 16:57:14 PDT 2019; root:xnu-4903.271.2~2/RELEASE_X86_64 x86_64

(I also see the problem on Linux against mysql 5.7)

@methane
Copy link
Member

methane commented Oct 9, 2019

I think there is nothing we can do here.

@dadkins-at-dropbox
Copy link
Author

Is there anything we can do to detect and mitigate cases where the library violates its contract and makes simultaneous calls to the driver? Here's a real trace from my system which ends in disaster, when tx.Rollback is called (by the database/sql library) in the middle of stmt.ExecContext.

The rollback fails with a busy buffer error, but not before resetting the sequence number and fatally confusing the effort to read the result set.

Ultimately, the standard library will need to be fixed, but it would make the impact less severe if the mysql driver were able to protect itself.

  • can we not reset the sequence number if fail to take the buffer?
  • can we add an 'in use' flag and fail operations which try to use the driver incorrectly?
  • why did we keep trying to read packets after the presumed ErrPktSyncMul? Isn't the connection tainted at that point?
[mysql] 2019/10/10 23:25:59 connection.go:577: [0xc000348fc0]stmt.ExecContext

*** COM_STMT_EXECUTE
[mysql] 2019/10/10 23:25:59 packets.go:155: [0xc000348fc0] :write len=10 seq=0
[mysql] 2019/10/10 23:25:59 packets.go:156: [0xc000348fc0]
00000000  17 01 00 00 00 00 01 00  00 00                    |..........|

*** column count
[mysql] 2019/10/10 23:25:59 packets.go:45: [0xc000348fc0] :read len=1 seq=1
[mysql] 2019/10/10 23:25:59 packets.go:80: [0xc000348fc0]
00000000  01                                                |.|

*** column def
[mysql] 2019/10/10 23:25:59 packets.go:45: [0xc000348fc0] :read len=37 seq=2
[mysql] 2019/10/10 23:25:59 packets.go:80: [0xc000348fc0]
00000000  03 64 65 66 00 00 00 0f  43 4f 4e 4e 45 43 54 49  |.def....CONNECTI|
00000010  4f 4e 5f 49 44 28 29 00  0c 3f 00 15 00 00 00 08  |ON_ID()..?......|
00000020  a1 00 00 00 00                                    |.....|

*** EOF
[mysql] 2019/10/10 23:25:59 packets.go:45: [0xc000348fc0] :read len=5 seq=3
[mysql] 2019/10/10 23:25:59 packets.go:80: [0xc000348fc0]
00000000  fe 00 00 03 00                                    |.....|

*** RACE: stmt.ExecContext is still active, shouldn't be calling tx.Rollback ***
[mysql] 2019/10/10 23:25:59 transaction.go:30: [0xc000348fc0]tx.Rollback
[mysql] 2019/10/10 23:25:59 packets.go:454: busy buffer
*** resets mc.sequence = 0 ***

*** packet header for row...
[mysql] 2019/10/10 23:25:59 packets.go:45: [0xc000348fc0] :read len=10 seq=4
*** ErrPktSyncMul? since mc.sequence got reset ***

*** disaster, we've treating the packet body as a new packet header, and the sequence number matches! we're going to lunch now while we wait to read 10MB that's never coming.
[mysql] 2019/10/10 23:25:59 packets.go:45: [0xc000348fc0] :read len=10027008 

@methane
Copy link
Member

methane commented May 2, 2023

golang/go#34775 has been fixed.

@methane methane closed this as not planned Won't fix, can't repro, duplicate, stale May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants