Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: concurrent reconfiguration causes data race #204

Open
JosteinLindhom opened this issue Jan 21, 2025 · 1 comment
Open

bug: concurrent reconfiguration causes data race #204

JosteinLindhom opened this issue Jan 21, 2025 · 1 comment

Comments

@JosteinLindhom
Copy link

JosteinLindhom commented Jan 21, 2025

A data race may occur in cases where a reconfiguration is triggered in separate goroutines.

Cause:

// mgr.go
func (m *RawManager) AddNode(node *RawNode) error {
	...
	m.mu.Lock()
	defer m.mu.Unlock()
	m.lookup[node.id] = node
	m.nodes = append(m.nodes, node) // write to node
	return nil
}
// config_opts.go
func (o nodeList) newConfig(mgr *RawManager) (nodes RawConfiguration, err error) {
	...
	OrderedBy(ID).Sort(mgr.nodes) // not using mutex before mutating
	OrderedBy(ID).Sort(nodes)
	return nodes, nil
}

If AddNode and newConfig run at the same time, a race condition will occur.

Run this test with -race:

func TestConfigConcurrentReconfig(t *testing.T) {
	addrs, teardown := gorums.TestSetup(t, 1, func(_ int) gorums.ServerIface {
		return initServer()
	})
	defer teardown()

	mgr := gorumsTestMgr()

	errCh := make(chan error, 5)
	var wg sync.WaitGroup
	for j := 0; j < 5; j++ {
		wg.Add(1)
		go func() {
			_, err := mgr.NewConfiguration(gorums.WithNodeList(addrs))
			if err != nil {
				errCh <- err
			}
			wg.Done()
		}()
	}
	wg.Wait()
	close(errCh)
	for err := range errCh {
		t.Error(err)
	}
}
=== RUN   TestConfigConcurrentReconfig
==================
WARNING: DATA RACE
Write at 0x00c000151ad8 by goroutine 13:
  github.com/relab/gorums.(*RawManager).AddNode()
      /home/jostein/code/gorums/mgr.go:130 +0x72a
  github.com/relab/gorums.nodeList.newConfig()
      /home/jostein/code/gorums/config_opts.go:65 +0x190
  github.com/relab/gorums.(*nodeList).newConfig()
      <autogenerated>:1 +0x5e
  github.com/relab/gorums.NewRawConfiguration()
      /home/jostein/code/gorums/config.go:21 +0x44
  github.com/relab/gorums/tests/dummy.(*Manager).NewConfiguration()
      /home/jostein/code/gorums/tests/dummy/dummy_gorums.pb.go:110 +0x244
  github.com/relab/gorums_test.TestConfigConcurrentReconfig.func2()
      /home/jostein/code/gorums/config_test.go:256 +0x106

Previous read at 0x00c000151ad8 by goroutine 11:
  github.com/relab/gorums.nodeList.newConfig()
      /home/jostein/code/gorums/config_opts.go:74 +0x3e4
  github.com/relab/gorums.(*nodeList).newConfig()
      <autogenerated>:1 +0x5e
  github.com/relab/gorums.NewRawConfiguration()
      /home/jostein/code/gorums/config.go:21 +0x44
  github.com/relab/gorums/tests/dummy.(*Manager).NewConfiguration()
      /home/jostein/code/gorums/tests/dummy/dummy_gorums.pb.go:110 +0x244
  github.com/relab/gorums_test.TestConfigConcurrentReconfig.func2()
      /home/jostein/code/gorums/config_test.go:256 +0x106

Goroutine 13 (running) created at:
  github.com/relab/gorums_test.TestConfigConcurrentReconfig()
      /home/jostein/code/gorums/config_test.go:255 +0xec
  testing.tRunner()
      /usr/local/go1.23.1.linux-amd64/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go1.23.1.linux-amd64/src/testing/testing.go:1743 +0x44

Goroutine 11 (finished) created at:
  github.com/relab/gorums_test.TestConfigConcurrentReconfig()
      /home/jostein/code/gorums/config_test.go:255 +0xec
  testing.tRunner()
      /usr/local/go1.23.1.linux-amd64/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go1.23.1.linux-amd64/src/testing/testing.go:1743 +0x44
==================
    testing.go:1399: race detected during execution of test
--- FAIL: TestConfigConcurrentReconfig (0.01s)
FAIL
FAIL    github.com/relab/gorums 0.029s
FAIL
@JosteinLindhom
Copy link
Author

One way to fix this is to use the manager mutex, e.g.:

// config_opts.go
func (o nodeList) newConfig(mgr *RawManager) (nodes RawConfiguration, err error) {
	...
        mgr.mu.Lock()
	OrderedBy(ID).Sort(mgr.nodes) // not using mutex before mutating
        mgr.mu.Unlock()
	OrderedBy(ID).Sort(nodes)
	return nodes, nil
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant