Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent requests to the cardano-wallet will cause sqlite to error out and enpoints return non-json error messages #2280

Closed
jbax opened this issue Oct 29, 2020 · 10 comments
Labels
Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work.

Comments

@jbax
Copy link

jbax commented Oct 29, 2020

Context

When invoking multiple endpoints of the cardano-wallet API via HTTP in parallel (i.e. from multiple external processes) the endpoint will return a plain string with the message "something went wrong" which is NOT a valid JSON string and does not conform to the documented behavior.

The cardano-wallet will also log this message:

SQLite3 returned ErrorBusy while attempting to perform step: database is locked

Expected behavior

Ideally, the cardano-wallet should synchronize database access as sqlite3 does not handle concurrency and prevent errors altogether.

If that's too tricky, then when such error happens, the failing request should at least return an error message in a parseable JSON message like:

{
    "message": "something went wrong",
    "code": "an_error_code"
}
Information -
Version 2020.10.13 (git revision: 77e04fc)
Platform Ubuntu Linux 20.02
Installation From Github Release

Steps to Reproduce

I have this happen on unit tests of a client software I'm writing. Tests run in parallel where possible. Hopefully it's not too much effort for you to try this out:

1 - Clone this project with git clone https://github.com/uniVocity/cardano-wallet.git
2 - Open project with IntelliJ as a "Maven" Project (File -> New -> Project from existing sources)
3 - Start the cardano-wallet somewhere
4 - Execute unit tests in class TestRemoteServer (I highlighted the line where the cardano-wallet server and port are set)

As this is a concurrency issue, you may not see the error in the first run, but a second on third run will make it show up.

@KtorZ
Copy link
Member

KtorZ commented Oct 29, 2020

@jbax that's interesting. Especially because each wallet has its own database and calls run in atomic transactions. So in principle, there's no parallelism on the database, only concurrency.

May I ask if any of these calls involved deleting wallets? There are some known issue with this particular call which could yield to the behavior your describe above (especially on Windows).

@jbax
Copy link
Author

jbax commented Oct 29, 2020

Yes it seems to happen when there is deletion of wallets. I disabled the test that deletes wallets and the error doesn't seem to occur anymore.

Just to highlight, I'm on Linux.

@KtorZ
Copy link
Member

KtorZ commented Oct 29, 2020

@jbax Okay thanks so this confirms my doubts and is "somewhat expected". We've been struggling getting a reliable approach to wallet deletion that works transparently between all OS. Linux is definitely the easy one on the matter but we still have some work to do here (it is known for a while, but pretty benign to deserve more attention in front of other matters at this stage I am afraid :/)

@KtorZ KtorZ added Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work. labels Oct 29, 2020
@jbax
Copy link
Author

jbax commented Oct 29, 2020 via email

@Anviking
Copy link
Member

I've seen it specifically in GET /wallets / GET /byron-wallets and should have a fix for it.

But I'm curious, did you see it occurring with other endpoints (in conjunction with deleting wallets)?

@Anviking Anviking self-assigned this Oct 30, 2020
@Anviking
Copy link
Member

Anviking commented Nov 12, 2020

In the commit 9f7d8fc (on master but not in a release yet) a race condition matching this description was fixed for GET /wallets and GET /byron-wallets.

Though I presume the problem remains for other endpoints.

Last week I tried writing a monadic property test for get+deleteWallet and listAddress+deleteWallet race conditions to reproduce this. I saw json 500 errors (wallet_not_responding, which is another problem), but never anything non-json.

@Anviking Anviking removed their assignment Nov 12, 2020
@hasufell
Copy link
Contributor

hasufell commented Dec 9, 2020

@Anviking so can this and ADP-536 be closed for now?

@Anviking
Copy link
Member

Anviking commented Dec 9, 2020

I don't think so.

@KtorZ
Copy link
Member

KtorZ commented Dec 9, 2020

@hasufell unfortunately no. Quite easy to reproduce locally if you create a scenario that create and delete wallets concurrently. At some point, everything turns red and thread starts interleaving fatal errors from SQLite in the least readable way possible.

@runeksvendsen
Copy link
Contributor

Quite easy to reproduce locally if you create a scenario that create and delete wallets concurrently.

Does this only cover creating a deleting the same wallet? Or can it also problematic to delete an existing wallet while trying to create a new (unrelated) wallet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work.
Projects
None yet
Development

No branches or pull requests

6 participants