
Can't run two queries in a transaction? #81

Open
sbowman opened this issue Feb 25, 2013 · 44 comments · May be fixed by #117

Comments

@sbowman

sbowman commented Feb 25, 2013

I have some data I've loaded into a table from a CSV file that I want to massage into a master table. When I try to do this in a transaction, pq just hangs on the second query I perform for no reason I can ascertain.

Here's my code. I chopped out everything after the second query; that query checks to see if the data is already migrated, and if not, runs all the create code and such. If you comment out the "select * from community_gnis" line, the loop runs to completion, printing every city in the table. If I leave the query in, it just hangs after the first city name.

if db, err := sql.Open("postgres", "user=postgres dbname=geo sslmode=disable"); err == nil {
    defer db.Close()
    if tx, err := db.Begin(); err == nil {
        if cities, err := tx.Query("select name, state, id, lon, lat from noaa.cities where id is not null"); err == nil {

            var name, state, gnis string
            var longitude, latitude float32

            for cities.Next() {

                if err = cities.Scan(&name, &state, &gnis, &longitude, &latitude); err == nil {

                    fmt.Println("Got city", name, "(", gnis, ")")

                    tx.Query("select * from community_gnis where gnis = $1", gnis)
                }
            }
        }
    }
}

I'm new to Go, so maybe I'm doing things incorrectly here. However, if I move the "select * from community_gnis" up above the "select name, state..." statement, the community_gnis statement runs fine and the "select name, state..." query will then be the one that hangs.

If I remove the transaction code and run everything against "db", e.g. "db.Query" instead of "tx.Query", etc., everything runs completely fine (though it doesn't run in a transaction, which is what I want).

I've tried various combinations, using prepared statements, adding Close calls everywhere, etc. Nothing works. I'm 99% sure it's not the database, because I'm migrating this code from a Node JS app to test whether we should use Go or Node for our project, and the Node version runs completely in a transaction.

@sbowman
Author

sbowman commented Feb 27, 2013

I did some additional digging. Turns out this isn't a pq problem, it's a problem with database/sql in Go. Here's the ticket: https://code.google.com/p/go/issues/detail?id=3857

@sbowman sbowman closed this as completed Feb 27, 2013
@sbowman
Author

sbowman commented Jul 16, 2013

With the latest version of Go (1.1), they sort of fixed this issue, so now I think it's a problem with pq.

If I have two queries running simultaneously in a transaction, pq gets confused by the responses and returns weird error messages. Again, this is perfectly reasonable functionality, and works in every other language but Go.

We don't get deadlocks anymore. Instead, here's what I'm seeing now:

var err error

// Where lookupExisting, updateExisting, and insertNew are prepared statements, a la tx.Prepare(...)
rows, _ := lookupExisting.Query(id)
defer rows.Close()

if rows.Next() {
    _, err = updateExisting.Exec(newValue, id)
} else {
    _, err = insertNew.Exec(id, newValue)
}

fmt.Printf("Errors? %s\n", err)

If you create something like this, when lookupExisting doesn't find anything, an insert occurs and everything is ok. If a record already exists and you try to update it, you'll get the error message "Errors? pq: unexpected bind response: 'C'". My guess is the two prepared statements are "crossing streams" in the pq driver and the driver is confused. This means that my old example is still problematic too, since if you try to query and update in the loop, you'll run into this same overlapping of information.

If you make this change, everything will work (there's no fix for my original example though; that just won't work in Go with pq):

if rows.Next() {
    rows.Close()
    _, err = updateExisting.Exec(newValue, id)
} else {
    rows.Close()
    _, err = insertNew.Exec(id, newValue)
}

@sbowman sbowman reopened this Jul 16, 2013
@RangelReale

I am having the exact same problem, but with a different error message: pq: unexpected describe rows response: 'D'

Looking at the PostgreSQL protocol documentation, 'D' is "DataRow", which pq doesn't handle at that point; without transactions, this doesn't happen.

Here is a complete program that fails with this error.

package main

import (
    "database/sql"
    "fmt"
    _ "github.com/lib/pq"
)

func main() {

    db, err := sql.Open("postgres", fmt.Sprintf("host=%s user=%s dbname=%s password=%s sslmode=disable",
        "localhost", "postgres", "db_test", "password"))
    if err != nil {
        panic(err)
    }

    tx, err := db.Begin()
    if err != nil {
        panic(err)
    }

    rows, err := tx.Query("select tid from table1")
    if err != nil {
        panic(err)
    }

    for rows.Next() {
        var tid int32

        if err = rows.Scan(&tid); err != nil {
            panic(err)
        }

        sirows, err := tx.Query("select xid from table2")
        if err != nil {
            panic(err)
        }

        for sirows.Next() {
            var xid int32

            if err := sirows.Scan(&xid); err != nil {
                panic(err)
            }
        }

        if err := sirows.Err(); err != nil {
            panic(err)
        }

        sirows.Close()
    }

    if err = rows.Err(); err != nil {
        panic(err)
    }

    if err = tx.Commit(); err != nil {
        panic(err)
    }
}

Outputs:

panic: pq: unexpected describe rows response: 'D'

on this line:

sirows, err := tx.Query("select xid from table2")

@RangelReale

This may be the way the PostgreSQL protocol works; according to this link, there may be no solution for this. This is not good.
http://www.postgresql.org/message-id/[email protected]

@sbowman
Author

sbowman commented Jul 18, 2013

Hmm, it's an old post, so I don't know if it's still an issue. If you follow my link to the Google bug about this, I uploaded some sample code where I ran this scenario through JavaScript, Java, and Ruby (maybe Python too), and every other database library in those languages had no problems running through these types of scenarios. Now maybe they were doing something sneaky behind the scenes, but I don't think so.

Also, every language but Java uses libpq to interact with the database, so maybe libpq is the one being sneaky. I wish there was a libpq wrapper for Go.


@RangelReale

I am investigating this in depth; one of the things I discovered is that libpq is in fact sneaky. It always loads entire result sets into memory. If you want to return rows one by one, you can't issue a new query before you close the previous one:

http://www.postgresql.org/docs/9.2/static/libpq-single-row-mode.html

Ordinarily, libpq collects a SQL command's entire result and returns it to the application as a single PGresult.

@ghost

ghost commented Jul 18, 2013

On Thu, Jul 18, 2013 at 1:32 PM, Sean Bowman [email protected] wrote:

I wish there was a libpq wrapper for Go.

Not sure if this meets your needs or not, but at least one cgo-based
libpq wrapper exists.

https://github.com/jgallagher/go-libpq

@RangelReale

Found another user with this problem; although the email is from 2005, my testing shows that this is still the case:

http://www.mail-archive.com/[email protected]/msg71802.html

On the same thread, there is a suggestion of using a cursor:

PQexec(conn, "BEGIN");
PQexec(conn, "DECLARE \"test\" CURSOR FOR SELECT * FROM \"TEST\"");
for (...)
{
        PQexec(conn, "FETCH FORWARD 100 from \"test\"");
        /* Use just retrieved result. */

        /* Don't forget to clear used PGresult. */
}
PQexec(conn, "...");

I did a quick test in Go, and it seems to work, but we would need to FETCH rows one by one, each FETCH being a new round trip, so it could get expensive.

  _, err = tx.Exec("declare TESTC1 CURSOR for select tid from table1")
  if err != nil {
    panic(err)
  }

  rows, err := tx.Query("FETCH ALL in TESTC1")
  if err != nil {
    panic(err)
  }

  for rows.Next() {
  }

  _, err = tx.Exec("close TESTC1")
  if err != nil {
    panic(err)
  }

I think the way to go is to emulate libpq, that is, load all rows on Query(). This looks like the "official" way.

@kisielk
Contributor

kisielk commented Jul 18, 2013

Does it always load all rows on query, or only if another query is issued while one is still pending?

@RangelReale

libpq always fetches all records on query, unless "PQsetSingleRowMode" was called.
But that is a very new function; it was introduced in 9.2.

@sbowman
Author

sbowman commented Jul 18, 2013

I think I would rather it stay with pulling data as it does, and just document the library to suggest opening a second connection if you need to do this kind of interwoven processing. I don't like the idea of loading all the results at once; that defeats the purpose of going row by row, which exists precisely because a result set may be too large to hold in memory.
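
For what it's worth, a rough sketch of that second-connection approach against my original example (names are illustrative and error handling is trimmed; the key point is that db.Query and db.Begin each claim their own connection from the pool, so the streaming rows and the transaction never share a wire):

package example

import "database/sql"

func migrateCities(db *sql.DB) error {
    rows, err := db.Query("select name, state, id from noaa.cities where id is not null")
    if err != nil {
        return err
    }
    defer rows.Close()

    tx, err := db.Begin() // claims a second connection from the pool
    if err != nil {
        return err
    }
    defer tx.Rollback()

    for rows.Next() {
        var name, state, gnis string
        if err := rows.Scan(&name, &state, &gnis); err != nil {
            return err
        }
        // Safe: this runs on the transaction's own connection, not the
        // one still streaming the SELECT above.
        if _, err := tx.Exec("insert into community_gnis (gnis) values ($1)", gnis); err != nil {
            return err
        }
    }
    if err := rows.Err(); err != nil {
        return err
    }
    return tx.Commit()
}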


@RangelReale RangelReale linked a pull request Jul 18, 2013 that will close this issue
@RangelReale

I implemented it to work like libpq, with a fallback to the current mode, using the same name as libpq (singlerowmode).
This fixes my problem, and all my applications continue to work normally.

msakrejda pushed a commit that referenced this issue Jul 19, 2013
@msakrejda
Contributor

Thanks for the patch, @RangelReale. I think your exact approach (i.e., a flag for the old behavior) would be problematic, but after reading through the various linked messages, bugs, and docs (thanks to you and @sbowman for linking these), and considering my own experience with the protocol, a lot of this comes down to the underlying behavior in the Postgres wire protocol: it doesn't support multiple concurrent standard statements because it's fundamentally synchronous (ignoring things like cancel).

So to resolve this issue, we could either:

  1. Block this behavior outright
  2. Fake it by buffering results in the client where necessary (e.g., when one needs to issue a second, concurrent query in the same transaction)
  3. Approximate it by having everything use the Extended Query Protocol, and always forcing Execute messages to retrieve one row at a time.

I think (3) is right out because it would be disastrous to performance, requiring a round-trip per row. (1) is simple but shifts the burden onto the pq user. (2) may be reasonable, but I think we'd need to approach it differently: rather than just offering an option for buffering, we could use the current row-by-row processing in the common case, but immediately buffer the remainder of the previous query if the user issues a second query in the same transaction. This might be a little too magical, but it seems fairly reasonable.

It might also be nice to have some combination of (2) and (3), fetching k rows at a time via Execute to allow interleaving of queries, but I'm not sure exactly what that would look like.

Thoughts?

@RangelReale

To help our decision, I made a study of how other platforms' native PostgreSQL drivers behave in this case.
All except the C# one fetch all rows on query by default, but the C# one has some caveats.

PHP (libpq)

Default method: fetch all on query

Python (http://python.projects.pgfoundry.org/)

Default method: fetch all on query
http://python.projects.pgfoundry.org/docs/1.1/driver.html#statement-interface-points

The default execution method, call, produces the entire result set. It is the simplest form of statement execution. Statement objects can be executed in different ways to accommodate for the larger results or random access(scrollable cursors).

node.js (https://github.com/brianc/node-postgres)

Default method: fetch all on query - queues queries so no two queries run at the same time
https://github.com/brianc/node-postgres/wiki/Query#example
Doesn't let the user iterate manually; also, since queries are queued, it is impossible to run one query inside another

Java (http://jdbc.postgresql.org/)

Default method: fetch all on query
http://jdbc.postgresql.org/documentation/head/query.html

Once you have a Statement or PreparedStatement, you can use it to issue a query. This will return a ResultSet instance, which contains the entire result

C# (http://npgsql.projects.pgfoundry.org/)

Default method: fetches row by row, but warns that only one query can be made at a time, with option to preload all rows
http://npgsql.projects.pgfoundry.org/docs/manual/UserManual.html
(P.S.: according to this page, ADO.NET explicitly says that only one query can be run at a time on a connection. The Go database/sql documentation doesn't say this.)

According to the ADO.NET documentation, while an IDataReader is open the IDbConnection used to obtain it is "busy" and cannot be used for any other operations
Note that you can 'daisy chain' select statements in a command object's commandtext to retrieve more than one record set: "select * from tablea; select * from tableb"

Ruby (libpq - https://github.com/ged/ruby-pg)

Default method: fetch all on query

@kisielk
Contributor

kisielk commented Jul 19, 2013

The Python driver is a little more complicated than that. While __call__ just returns all the rows at once, it also has the option of stream iterating them, or fetching them in chunks: https://github.com/python-postgres/fe/blob/master/postgresql/driver/pq3.py#L1596

Of course this doesn't deal with multiple simultaneous queries in any way; presumably you'd need to handle that at a higher level.

Similarly the JDBC driver supports fetching rows with a cursor: http://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor

@RangelReale

Yes, I was mostly looking for the "default" way of doing things, what most people would use.

In the Java case, "setFetchSize" is part of the standard Java API, so the API has some notion of cursors, which the Go one doesn't.

@kisielk
Contributor

kisielk commented Jul 19, 2013

Right, but in any case pq needs to provide a means of working with streaming results, whether it is the default or not. Otherwise it makes working with large datasets unmanageable. It would be good to maintain the current behaviour for existing code that just performs a single query on a connection so that we don't cause any surprises. Since multiple simultaneous queries don't work right now anyway, we have a bit more leeway in how to handle that.

@RangelReale

Also, is this library thread-safe, or does it need to be? If it needs locks, I imagine it would be difficult to return streaming results in a thread-safe manner.

@kisielk
Contributor

kisielk commented Jul 20, 2013

In typical usage database/sql will use a new sql.Conn for each query so it's not typically a concern. Transactions are the exception, since a transaction context is tied to a particular Conn for obvious reasons. The library doesn't specify that a transaction is safe for concurrent use, which typically means it is not. I don't think we really need to worry about concurrency in that case.

@msakrejda
Contributor

Right, but in any case pq needs to provide a means of working with streaming results, whether it is the default or not. Otherwise it makes working with large datasets unmanageable. It would be good to maintain the current behaviour for existing code that just performs a single query on a connection so that we don't cause any surprises. Since multiple simultaneous queries don't work right now anyway, we have a bit more leeway in how to handle that.

I tend to agree. I think we should either outright block this as unsupported, or implement the just-in-time buffering I suggested above (i.e., only buffer if we're in the middle of processing a Rows object and another query is being issued--in that case, buffer the outstanding rows so we can issue the next query).
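
Until something like that lands, application code can do the buffering itself before issuing the next statement. A minimal sketch of such a drain helper (the helper is ours, not part of pq):

package example

import "database/sql"

// drainRows reads every remaining row into memory and closes the result
// set, freeing the underlying connection for the next statement.
func drainRows(rows *sql.Rows) ([][]interface{}, error) {
    defer rows.Close()
    cols, err := rows.Columns()
    if err != nil {
        return nil, err
    }
    var all [][]interface{}
    for rows.Next() {
        vals := make([]interface{}, len(cols))
        ptrs := make([]interface{}, len(cols))
        for i := range vals {
            ptrs[i] = &vals[i]
        }
        if err := rows.Scan(ptrs...); err != nil {
            return nil, err
        }
        all = append(all, vals)
    }
    return all, rows.Err()
}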

@RangelReale

I can implement it like that, only buffering if a new query arrives, but for that I would need to cache the last rows object in the connection.

In the golang documentation it says:

Conn is a connection to a database. It is not used concurrently by multiple goroutines.
Conn is assumed to be stateful.

So can I safely assume that a connection will never be used by two goroutines, and just store the last rows object on the connection, to fetch all of its data in case a new query arrives?

@msakrejda
Contributor

@RangelReale yeah, I think so. Keep in mind that you'll need to do this dynamically: e.g., if multiple queries are started-but-not-yet-finished, you'll need to buffer the one currently in progress and push that onto the stack of buffered queries, then unravel the stack as necessary once things complete.
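
To sketch the shape of that stack idea (purely illustrative; these types and names are hypothetical, not lib/pq internals):

package pqsketch

import "database/sql/driver"

// bufferedRows is a started-but-unfinished result set.
type bufferedRows struct {
    rows [][]driver.Value // rows read off the wire ahead of the consumer
    done bool
}

// conn keeps a stack of in-progress result sets.
type conn struct {
    pending []*bufferedRows
}

// beforeNewQuery drains the topmost unfinished result set, if any, so
// the wire is free for the next statement.
func (c *conn) beforeNewQuery() {
    if n := len(c.pending); n > 0 && !c.pending[n-1].done {
        c.drain(c.pending[n-1])
    }
}

// drain would read the remaining DataRow messages off the socket into
// r.rows; stubbed here.
func (c *conn) drain(r *bufferedRows) {
    r.done = true
}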

@sbowman
Author

sbowman commented Jul 24, 2013

I don't like the idea that the library is going to "magically" do something behind the scenes for me, so I vote for not supporting this functionality (yes, with all this new information, I'm switching my original vote). Seems contrary to the Tao of Go to start helping me when I didn't ask for it.


@RangelReale

Maybe we could ask on golang-nuts what other people think about this?

@kisielk
Contributor

kisielk commented Jul 24, 2013

If we don't go with the buffering solution, we should at least raise a panic if there is an outstanding query in a transaction and another one is attempted. Currently it just locks up and makes for hard-to-track-down problems; at least a panic would indicate that there's been a programming error. @sbowman may be right that automatic buffering would be a little too magical.

@sbowman
Author

sbowman commented Jul 24, 2013

@kisielk Are you using Go 1.0 or 1.1? There was an issue with database/sql in 1.0 that is fixed in 1.1. In 1.1 you'll get a weird error message, but it shouldn't lock up anymore.

@sbowman
Author

sbowman commented Nov 21, 2014

I don’t think this is the same problem. Looks like you’re creating a prepared statement but trying to put multiple SQL commands into it (“BEGIN”, “UPDATE”, “UPDATE”, “COMMIT”). I don’t think PostgreSQL supports that (though I’ll admit I’ve never tried).

On Nov 20, 2014, at 6:40 PM, sorenhoyer [email protected] wrote:

I just ran into the problem with transactions in Go+pq

BEGIN;
UPDATE tbl1 SET col1 = 'col1'
WHERE id = 1;
UPDATE tbl2 SET col1 = 'col1', col2 = 'col2'
WHERE id = 2;
COMMIT;

The error message I get is this:
pq cannot insert multiple commands into a prepared statement

What am I missing? Is there another way around this?



@elvizlai

pq: unexpected Parse response 'D' happened!

    for rows.Next() {
        if err := rows.Scan(&id); err != nil {
            fmt.Println("scan err:", err)
        }
        fmt.Println(id)

        // this works 
        fmt.Println(tx.Exec(`UPDATE configs set update_at=now()`))
        // this will return pq: unexpected Parse response 'D'
        fmt.Println(tx.Exec(`UPDATE configs set update_at=$1`, time.Now()))
    }

@w20089527

Before we do the next query or update, we must close the previous rows.
For example, if we do not close the rows, the update below will fail:

tx, err := utils.GetDB().Begin()
if err != nil {
    log.Fatal(err)
}
rows, err := tx.Query("select id from t_client where data=$1", "test")
if err != nil {
    tx.Rollback()
    log.Fatal(err)
}
var id int32
if rows.Next() {
    rows.Scan(&id)
    rows.Close()   //Very very very important!!!
}
// Here if the rows do not close, the update will fail.
tx.Exec("update t_client set last_upload_time=NOW() where id=$1", id)
err = tx.Commit()

@solarfly73

This is still a problem under Go 1.8 with the latest pq driver. We see all of the errors posted above in unit tests that hold a tx open for a long time while some parallel goroutine kicks off another query after Close has been called. It's a race between rows.Close() and the start of the next query.

@c4milo

c4milo commented Jun 18, 2017

This one was tricky to debug and fix; @whrool's #81 (comment) worked for me. I'm OK with the current behavior as long as lib/pq gives better hints about how to fix it.

@timbunce

timbunce commented Oct 5, 2017

An option to fetch and buffer all the rows would be simple to implement, non-magical, consistent with the majority of other Postgres drivers, and would give people a simple way forward when they hit this problem.

It could also improve performance on both the client and server sides for some workloads.

It would also let this issue be closed after more than 4 years.

@unlucio

unlucio commented Jul 25, 2018

Hello,
I too am trying to run two queries on the same transaction.
It's effectively an upsert, but since AWS Redshift does not complain about duplicates (it happily appends), I have to run a select first in order to decide what to do.
my entry point:

func UpsertRS(transaction *db.Tx, queries RSUpsertQueries, args ...interface{}) (db.Result, error) {

(RSUpsertQueries is just a dumb struct with a bunch of query strings)

Here's what I'm doing in a nutshell:

  • get a transaction going:
redshiftTransaction, rsTransactionError := db.Begin()
  • check if the data is already there:
func checkCount(rows *db.Rows) (count int) {
	if rows == nil {
		return 0
	}

	for rows.Next() {
		err := rows.Scan(&count)
		rows.Close()
		log.Debug(fmt.Sprintf("Error while counting rows: %+v", err), "SYSTEM")
	}

	return count
}

func checkDataExistance(transaction *db.Tx, query string, args ...interface{}) bool {
	result, err := transaction.Query(query, args...)

	if err != nil {
		log.Error(fmt.Errorf("Redshift Upsert: error while checing existance: %+v", err), "SYSTEM")
	}

	return (checkCount(result) > 0)
}
  • switch to an update query if data exists:
query := queries.Insert

if checkDataExistance(transaction, queries.Check, args...) {
	fmt.Println("--> Data exists, switching to update query")
	query = queries.Update
} else {
	fmt.Println("--> Data does not exists, we can keep the insert")
}
  • and finally run my Exec:
result, err := transaction.Exec(query, args...)

and here is where everything goes BOOM and I get:

sql: Transaction has already been committed or rolled back

where the commit is actually outside this scope, so I'm really puzzled about who committed it 🤔

Googling took me to this issue, which seems relevant, but honestly, after going through it all, I fail to understand the outcome, what was chosen out of the whole "buffering/non-buffering" debate, and most of all:

what's the solution to this problem?

@ghost

ghost commented Sep 21, 2018

Has this moved at all in favor of @timbunce's solution?

@ghost

ghost commented Sep 22, 2018

(or any solution, rather?)

D3luxee pushed a commit to hbo/soma that referenced this issue Mar 25, 2019
…ctTX()

We have to call rows.Close() before we issue another query in the same transaction.
See lib/pq#81
tsenart pushed a commit to sourcegraph/sourcegraph-public-snapshot that referenced this issue Nov 14, 2019
This PR introduces dbutil.DBTx which implements the dbutil.DB interface
backed by either a sql.DB or a sql.Tx, with additional support for
nested transactions using Postgres savepoints.

This allows us to run a suite of integration tests in the context of
single transaction (that will be rolled back at the end), even if those
tests start transactions themselves.

It also allows us to explicitly rollback to a previous savepoint in
tests when we know a given test will result in a SQL level exception
/ error and proceed safely from there within the context of a parent
transaction.

One situation where using a single transaction for integration tests
isn't currently supported is when concurrency is involved, due to this
limitation: lib/pq#81

That's why we didn't change the integration tests of the Bitbucket
Server provider store, which do lots of concurrency and fail when
operating with a DBTx.
bartekn added a commit to stellar/go that referenced this issue Dec 9, 2019
This commit adds a `Synchronized` flag to `support/db.Session`. When set
to `true` and `Session` runs a transaction, all `Exec*` and `Select*`
methods are protected by a mutex, which allows running them in multiple
goroutines.

This is an experimental feature (see below) and not a breaking change
(default is `false`).

The Postgres protocol does not allow sending Exec query results if a
previously sent Query's results haven't been fully read. This issue
manifested itself when a PR that sends read and write queries in
multiple goroutines was merged.

More info: lib/pq#81 lib/pq#635

Known limitations:

* It's possible that it will not be needed if we decide to remove
concurrency from ingestion pipeline (see #1983). We are adding this to
unblock development of new ingestion processors with a more readable
code (currently all database processors are merged into one big
processor that is hard to read and test). Splitting `DatabaseProcessor`
will be done in a separate PR/PRs.
* During Horizon Team meeting we agreed to create a new
`SynchronizedSession` struct embedding `db.Session` inside that would
not be placed in a shared `support` package. The problem is that
`history.Q` embeds `db.Session` inside. Changing this to
`db.SessionInterface` requires updating a lot of files that create
`history.Q` objects, tests and mocks. Given that we may want to revert
this change in the future, the change in this commit seems to be
simpler.
@tommie

tommie commented Apr 30, 2022

I'm a bit late, but I just hit the unexpected bind response 'C'. At first I thought "doesn't PostgreSQL support interleaved responses?", but the answer is no.

The only official (AFAIK) client is libpq, and the interface it provides is strict ordering of responses. You can use the singlerow flag to avoid it doing query result buffering, but it doesn't change the semantics; a previous command must complete before another can return results. The default in libpq is safe: by buffering the query results, there's no chance that the connection will be left in a state where you attempt to read the result of the next command before you're done with the first. Using singlerow could put you in the same position as with lib/pq, i.e. misinterpreting the response to a previous query as if it's for the new command. So with this added power, you have to be careful, just like this issue is stating.

PostgreSQL 14 added pipeline support, but I think that just means PQgetResult properly makes a "boundary" between two commands that are producing results. It's said to be a client-only change.

This left me believing the only way to support large query responses in PostgreSQL is to implement chunking (or "custom cursors"):

  1. Query for a chunk of rows, either limited by LIMIT or memory footprint. Close the sql.Rows.
  2. Remove/update as needed.
  3. Goto 1, but use OFFSET or lower bounds in WHERE to start off from after the last row.

OFFSET is clearly bad if you're removing the row you just queried, so I think WHERE is where it's at.
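
For illustration, the WHERE-based chunking (keyset pagination) might look like this; table and column names are made up:

package example

import "database/sql"

// processInChunks fetches a bounded chunk, finishes and closes the
// Rows, does the writes, then continues from just past the last key.
func processInChunks(tx *sql.Tx) error {
    lastID := 0
    for {
        rows, err := tx.Query(
            "select id from items where id > $1 order by id limit 100", lastID)
        if err != nil {
            return err
        }
        var ids []int
        for rows.Next() {
            var id int
            if err := rows.Scan(&id); err != nil {
                rows.Close()
                return err
            }
            ids = append(ids, id)
        }
        if err := rows.Err(); err != nil {
            rows.Close()
            return err
        }
        rows.Close() // the connection is free again before the next statement

        for _, id := range ids {
            if _, err := tx.Exec("update items set processed = true where id = $1", id); err != nil {
                return err
            }
            lastID = id
        }
        if len(ids) < 100 {
            return nil // final, short chunk: done
        }
    }
}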

Any other solutions people have been employing?

As for lib/pq, I think the right approach is to make the driver warn when a transaction with open sql.Rows is being used for something else. This is not a supported use-case. It seems like pure chance that it works. Whether that warning is a direct error or something else is an open question. Perhaps it should be an error returned by Rows.Close?

@iamSuva

iamSuva commented Sep 19, 2024

func (o *OrderService) UpdateOvershootProduct(products model.UpdateOvershootRequest, sellerID *uuid.UUID) error {
tx, err := o.db.BeginTx(context.Background(), nil)
if err != nil {
return errors.New("failed to begin transaction")
}
defer tx.Rollback()

for _, product := range products.Products {
    fmt.Println(product.Quantity, product.ProductVersionID)
    quantity := product.Quantity
    productVersionID := product.ProductVersionID

    sqlQuery := `
        SELECT "subOrderID", "productOvershootedQuantity"
        FROM suborderproducts AS sp
        JOIN suborders AS so ON so."ID" = sp."subOrderID"
        WHERE "productVersionID" = $1
        AND so."sellerID" = $2
        ORDER BY sp."createdAtUTC";
    `
    rows, err := tx.Query(sqlQuery, productVersionID, sellerID)
    if err != nil {
        log.Printf("failed to fetch suborder products: %v", err)
        return ErrFailedToUpdateOverShootProduct
    }
    defer rows.Close()
    
    remainingQuantity := quantity
    for rows.Next() {
        var suborderID *uuid.UUID
        var overShootedQuantity int

        if err := rows.Scan(&suborderID, &overShootedQuantity); err != nil {
            log.Printf("failed to scan suborder product: %v", err)
            return ErrFailedToUpdateOverShootProduct
        }
        
        if overShootedQuantity < remainingQuantity {
            updateQuery := `
                UPDATE suborderproducts
                SET "productOvershootedQuantity" = 0
                WHERE "subOrderID" = $1
                AND "productVersionID" = $2
            `
            log.Println("pvid: ", productVersionID, suborderID, overShootedQuantity)
            _, err := tx.Exec(updateQuery, suborderID, productVersionID)
            fmt.Println(updateQuery)
            if err != nil {
                log.Printf("failed to update suborder product: %v", err)
                return ErrFailedToUpdateOverShootProduct
            }
            remainingQuantity -= overShootedQuantity
        } else {
            updateQuery := `
                UPDATE suborderproducts
                SET "productOvershootedQuantity" = "productOvershootedQuantity" - $1
                WHERE "subOrderID" = $2
                AND "productVersionID" = $3
            `
            _, err := tx.Exec(updateQuery, remainingQuantity, suborderID, productVersionID)
            if err != nil {
                log.Printf("failed to update suborder product: %v", err)
                return ErrFailedToUpdateOverShootProduct
            }
            remainingQuantity = 0
            break
        }
    }
    if remainingQuantity > 0 {
        tx.Rollback()
        log.Printf("insufficient overshoot quantity for productVersionID %v", productVersionID)
        return ErrFailedToUpdateOverShootProduct
    }
}
if err := tx.Commit(); err != nil {
    log.Printf("failed to commit transaction: %v", err)
    return ErrFailedToUpdateOverShootProduct
}

return nil

}

For this code, in my update query I am getting this error:

"pq: unexpected Parse response 'D'"

Can someone tell me what is wrong?

@sbowman
Author

sbowman commented Sep 19, 2024

You can't issue a new SQL command while you're in the middle of reading the results of another.

In the middle of looping over the results of your query, you're trying to call UPDATE. You have to loop over all the results of your SELECT query before you can make another SQL call on the same connection. Basically, call SELECT, read all the results into a slice with the changes you'd like to make, then loop over the slice and call the UPDATE statement.
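
A sketch of that pattern against the code above, with the uuid column simplified to a string and error handling trimmed:

package example

import "database/sql"

type overshootRow struct {
    subOrderID string
    qty        int
}

func updateOvershoot(tx *sql.Tx, productVersionID string) error {
    rows, err := tx.Query(
        `select "subOrderID", "productOvershootedQuantity"
         from suborderproducts where "productVersionID" = $1`, productVersionID)
    if err != nil {
        return err
    }
    var pending []overshootRow
    for rows.Next() {
        var r overshootRow
        if err := rows.Scan(&r.subOrderID, &r.qty); err != nil {
            rows.Close()
            return err
        }
        pending = append(pending, r)
    }
    if err := rows.Err(); err != nil {
        rows.Close()
        return err
    }
    rows.Close() // all results consumed; the connection is free for the UPDATEs

    for _, r := range pending {
        if _, err := tx.Exec(
            `update suborderproducts set "productOvershootedQuantity" = 0
             where "subOrderID" = $1`, r.subOrderID); err != nil {
            return err
        }
    }
    return nil
}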

@iamSuva

iamSuva commented Sep 19, 2024

thanks
