lotus-miner / markets subsystem crash - keywords: 'panic: must not use a read-write blockstore after finalizing' #6968
Comments
I have the full log file for the complete miner run, from start to crash (if needed by the devs).
@benjaminh83 thanks for the bug report. Please do post the full log file here.
I think what's happening is
Step 4 could happen because, e.g., a block arrives while the cancel message is being processed, and graphsync does a read on the blockstore to check whether the block is a duplicate. Currently there's no way to know when the event queue has completely drained after cancelling the graphsync context, so at the moment we already have hacks in place to wait 100ms after cancelling the graphsync context when restarting a data transfer. I suggest we add a
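To make the race above concrete, here is a minimal Go sketch of one way to avoid relying on a fixed 100ms wait. It is not the go-data-transfer, graphsync, or go-car code, and not necessarily what the comment goes on to propose. Every name in it (`guardedStore`, `startEventLoop`, `shutdown`, `errFinalized`) is hypothetical. The idea is that the event loop signals when it has actually stopped, the store is finalized only after that signal, and a late read returns an error instead of panicking.

```go
package main

import (
	"errors"
	"sync"
)

// errFinalized is a hypothetical error returned (instead of a panic)
// when the store is used after Finalize.
var errFinalized = errors.New("blockstore finalized")

// guardedStore is a hypothetical stand-in for a read-write blockstore.
type guardedStore struct {
	mu        sync.RWMutex
	finalized bool
	blocks    map[string][]byte
}

func newGuardedStore() *guardedStore {
	return &guardedStore{blocks: make(map[string][]byte)}
}

// Has reports whether a block is present, or errFinalized after Finalize.
func (s *guardedStore) Has(key string) (bool, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.finalized {
		return false, errFinalized
	}
	_, ok := s.blocks[key]
	return ok, nil
}

// Put stores a block, or returns errFinalized after Finalize.
func (s *guardedStore) Put(key string, data []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.finalized {
		return errFinalized
	}
	s.blocks[key] = data
	return nil
}

// Finalize marks the store as closed. Taking the write lock means it
// waits for any Has or Put call that is already in progress.
func (s *guardedStore) Finalize() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.finalized = true
}

// startEventLoop is a hypothetical stand-in for the transfer's event queue.
// It performs a duplicate check for each incoming key until stop is closed,
// then closes the returned channel to signal that it will issue no more reads.
func startEventLoop(s *guardedStore, events <-chan string, stop <-chan struct{}) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for {
			select {
			case <-stop:
				return
			case key, ok := <-events:
				if !ok {
					return
				}
				_, _ = s.Has(key) // duplicate check against the blockstore
			}
		}
	}()
	return done
}

// shutdown cancels the loop, waits for it to exit, and only then finalizes.
func shutdown(s *guardedStore, stop chan struct{}, done <-chan struct{}) {
	close(stop)
	<-done
	s.Finalize()
}
```

Calling `shutdown` replaces the "cancel, sleep, finalize" sequence with "cancel, wait for the loop to exit, finalize", so the wait lasts exactly as long as the loop needs. Any read that still slips through after finalization gets `errFinalized` rather than taking the whole process down.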
I had the same error message occur overnight on my split miner lotus-miner process (the Miner/Storage/Sealing miner stayed up and operational.)
I think this is something deal-related and external to our miners. If you look at the timestamps, both Benjamin and I crashed at the same time. I'm on EDT, so it occurred at 2:24 AM UTC; I think Benjamin is an hour or two ahead of UTC, and his crashed at 4:24 local time. I also verified that TippyFlits crashed at 2:24 AM UTC.
@Meatball13 please provide your full logs, at least for a couple of hours before the crash.
Unfortunately I was half asleep when I noticed it was down and restarted them without keeping the logs. Seems like Stuberman also had the same issue, at the same time, and it might have been related to a big batch of deals that came out to all of us at that time. Discussion ongoing here: https://filecoinproject.slack.com/archives/C029ETPJ6BB/p1627877131340100
Thanks for the report. Resolved in m1.3.4 after merging filecoin-project/go-data-transfer#229 and ipld/go-car#195.
Checklist
Lotus component
lotus miner - mining and block production
Lotus Tag and Version
Describe the Bug
Seems like I experienced some instability with M1.3.3
Basically, the lotus-miner process (I only have one, as I am currently running a monolith) crashed at 4:30 AM this morning, only 1.5 hours before my first deadline.
It seems to have something to do with this, but it is a pretty big problem that the failure leads to a total crash of the miner. I think this is the key event:
Logging Information
Repo Steps
Run M1.3.3 in a monolith setup.
Multiple incoming (more or less faulty) deals from Estuary.