
fix: panic when update and iterate simultaneously #232

Merged
merged 4 commits into allegro:master on Jul 1, 2020

Conversation

@WideLee (Contributor) commented Jun 25, 2020

  1. Fix the panic when an update and an iteration happen simultaneously. Related to #222 (Iterator readEntry crashes).
  2. Add panic recovery in cleanUp to prevent the program from exiting. Related to #226 (panic: runtime error: index out of range [7] with length 1) and #148 (panic out of range in bytes queue); this only protects the main program from exiting.
  3. Set the byte queue's full flag to false after allocating additional memory (a sketch of this idea follows below). Also added a test case that reproduces this problem.
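For item 3, here is a minimal sketch of the idea, assuming a simple ring-buffer layout; the field and method names (array, capacity, full, allocateAdditionalMemory) are illustrative assumptions, not the exact bigcache internals:

	package queue

	// Minimal sketch of a byte ring buffer; the names here are illustrative,
	// not bigcache's real BytesQueue fields.
	type BytesQueue struct {
		array    []byte
		capacity int
		full     bool
	}

	// allocateAdditionalMemory grows the backing array. The point of the third
	// fix: once the queue has grown there is room again, so the full flag must
	// be cleared, otherwise the next push still takes the "queue is full" path.
	func (q *BytesQueue) allocateAdditionalMemory(minimum int) {
		if q.capacity < minimum {
			q.capacity += minimum
		}
		q.capacity *= 2

		newArray := make([]byte, q.capacity)
		copy(newArray, q.array)
		q.array = newArray

		q.full = false // the fix: reset the flag after growing
	}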

WideLee added 3 commits June 25, 2020 19:04
1. Copy the keys' hashed values instead of the keys' indexes in ByteQueue.
2. Skip ErrNotFound during iteration when the key has been evicted (sketched below).
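A hedged sketch of the second commit's idea, using placeholder types rather than bigcache's real iterator and shard: an entry can be evicted between the moment the iterator copies the hashed keys and the moment it reads them, so a not-found error is skipped instead of aborting the iteration.

	package main

	import (
		"errors"
		"fmt"
	)

	var errEntryNotFound = errors.New("entry not found")

	// fakeShard stands in for a cache shard; it is not bigcache code.
	type fakeShard struct {
		entries map[uint64][]byte
	}

	func (s *fakeShard) getEntry(hashedKey uint64) ([]byte, error) {
		e, ok := s.entries[hashedKey]
		if !ok {
			return nil, errEntryNotFound
		}
		return e, nil
	}

	func main() {
		shard := &fakeShard{entries: map[uint64][]byte{1: []byte("a"), 3: []byte("c")}}
		// Hashed keys copied up front, as in the first commit; key 2 has
		// since been evicted.
		hashedKeys := []uint64{1, 2, 3}

		for _, hk := range hashedKeys {
			entry, err := shard.getEntry(hk)
			if errors.Is(err, errEntryNotFound) {
				// Second commit's idea: skip evicted keys instead of failing.
				continue
			}
			fmt.Printf("%d => %s\n", hk, entry)
		}
	}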
shard.go (outdated diff excerpt)

@@ -253,6 +254,15 @@ func (s *cacheShard) onEvict(oldestEntry []byte, currentTimestamp uint64, evict
}

func (s *cacheShard) cleanUp(currentTimestamp uint64) {
	defer func() {
		// panic recover
A Collaborator commented:
Why do we need this?

@WideLee (Contributor, Author) replied Jun 25, 2020:

The panic in #226 happens in cleanUp.

If CleanWindow is not zero, bigcache starts a goroutine; if cleanUp panics, the whole program exits.

bigcache/bigcache.go, lines 85 to 99 at bbbffd3:

	if config.CleanWindow > 0 {
		go func() {
			ticker := time.NewTicker(config.CleanWindow)
			defer ticker.Stop()
			for {
				select {
				case t := <-ticker.C:
					cache.cleanUp(uint64(t.Unix()))
				case <-cache.close:
					return
				}
			}
		}()
	}

This is a temporary solution to keep the whole program from crashing; the important part is still to find out why the panic happens in the first place.
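For reference, a minimal sketch of the kind of recovery the commit added; the cacheShard type here is a placeholder, not bigcache's real struct.

	package main

	import (
		"log"
		"sync"
	)

	// Placeholder shard type; not bigcache's real cacheShard.
	type cacheShard struct {
		lock sync.RWMutex
	}

	// cleanUp with the kind of recover discussed above: the ticker goroutine
	// started in NewBigCache calls cleanUp, so an unrecovered panic here
	// would terminate the whole process.
	func (s *cacheShard) cleanUp(currentTimestamp uint64) {
		defer func() {
			if r := recover(); r != nil {
				log.Printf("recovered from panic in cleanUp: %v", r)
			}
		}()

		s.lock.Lock()
		defer s.lock.Unlock()
		// ... evict entries whose timestamp is older than currentTimestamp ...
	}

	func main() {
		s := &cacheShard{}
		s.cleanUp(0)
	}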

@siennathesane (Collaborator) left a comment:

I think it looks good to me; it could just use some adjustments based on @janisz's comments.

@WideLee (Contributor, Author) commented Jun 29, 2020:

> I think it looks good to me; it could just use some adjustments based on @janisz's comments.

Do you mean that I should roll back the commit that adds panic recovery in cleanUp?

@WideLee (Contributor, Author) commented Jun 30, 2020:

@janisz @mxplusb I just reverted the panic-recovery commit; please review it again 😄

@siennathesane (Collaborator) left a comment:

It looks okay to me. I wanted to take the panic recovery out because I want to see how often it happens across the board and fix the root of the problem when it becomes a much larger issue.

@siennathesane siennathesane merged commit 48f0a54 into allegro:master Jul 1, 2020
@exobin commented Aug 20, 2020:

Is there any progress on solving this underlying issue? We are still getting these crashes 🤦

@siennathesane (Collaborator) commented:

@alexi you're still having this same issue?

@exobin commented Sep 9, 2020:

@mxplusb now getting this error:

runtime error: slice bounds out of range [:46697172] with capacity 585000
goroutine [running]:
       runtime/panic.go:969 +0x166
github.com/allegro/bigcache/queue.(*BytesQueue).peek(0xc06f6d0008, 0x22eec, 0x145676292821921e, 0xc06f6cb148, 0xc0be8f2600, 0xc0be8f2720, 0x2, 0x0)
        github.com/allegro/bigcache/queue/bytes_queue.go:240 +0x17c
github.com/allegro/bigcache/queue.(*BytesQueue).Get(...)
        github.com/allegro/bigcache/queue/bytes_queue.go:191
github.com/allegro/bigcache.(*cacheShard).getWrappedEntry(0xc06f6d0000, 0x145676292821921e, 0x0, 0xc127e1cbf8, 0x407a09, 0x0, 0xc0be8f25d0)
       github.com/allegro/bigcache/shard.go:110 +0x6e
github.com/allegro/bigcache.(*cacheShard).get(0xc06f6d0000, 0xc1280310c8, 0x8, 0x145676292821921e, 0xc167cc8840, 0x6bd0dc28, 0xc0c0b19078, 0x149c2f0, 0xc1492bdbd0)
       github.com/allegro/bigcache/shard.go:64 +0x64
github.com/allegro/bigcache.(*BigCache).Get(0xc000933110, 0xc1280310c8, 0x8, 0xf, 0x1, 0xc0001de3a0, 0xf, 0xf)
       github.com/allegro/bigcache/bigcache.go:117 +0x8b

@siennathesane (Collaborator) commented:

@alexi can you please verify you are using this version or the latest?

@exobin commented Sep 28, 2020:

Yes, using commit 7bf29c0 (most recent).

The issues section has multiple examples of this bug: #226, and #148 has an open pull request fixing a uint32 overflow, which looks like the likely culprit.
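As a standalone illustration (not bigcache code, and the exact arithmetic behind #148 may differ), this is how a silent uint32 wraparound can turn an offset into a value that no longer corresponds to any position in the backing slice, producing a "slice bounds out of range" panic like the trace above:

	package main

	import (
		"fmt"
		"math"
	)

	// Illustration only: an offset accumulated in a uint32 wraps without any
	// error in Go, so the value used later to slice the queue's backing array
	// can be arbitrarily wrong.
	func main() {
		var offset uint32 = math.MaxUint32 - 10
		offset += 100 // wraps silently; no overflow error in Go

		fmt.Println("wrapped offset:", offset) // prints 89, not MaxUint32+90

		data := make([]byte, 585000)
		corrupted := 46697172 // the kind of index the panic above reports
		fmt.Printf("slicing data[:%d] with capacity %d would panic\n",
			corrupted, len(data))
	}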

@siennathesane (Collaborator) commented:

@alexi yeah, I think #148 could fix this.

@exobin commented Oct 6, 2020:

@mxplusb Any idea when it might be merged?

@siennathesane (Collaborator) commented:

No ETA; we're prepping for v3, which should include a fix for this. I'll try to get some time spent on this soon.

@fpessolano commented:

Any news? We now have the same issue on our production servers (which we let crash and restart for the moment).

@exobin commented Nov 3, 2020:

Hey @mxplusb, any update here? We are in the same boat as @fpessolano. If it's possible to ship this as a smaller patch, that would be very valuable.

@siennathesane (Collaborator) commented:

We're prepping for v3 in the versions/v3 branch, which should include the fix for this, but there is no current ETA on the release. I might be able to spend some time on it this month.
