author | Josh Bleecher Snyder <josh@tailscale.com> | 2021-01-19 09:02:16 -0800 |
---|---|---|
committer | Josh Bleecher Snyder <josh@tailscale.com> | 2021-02-08 10:32:07 -0800 |
commit | 0bcb822e5b4ee6408c5bcb5ad4d4e61b394a834e (patch) | |
tree | a7fc1d8ff7806e58104d06aee4859fbe89c8c25e /device/peer.go | |
parent | da956772030b8b1fcbd37f82f08863070c93aa0f (diff) |
device: overhaul device state management
This commit simplifies device state management.
It creates a single unified state variable and documents its semantics.
It also makes state changes more atomic.
As an example of the sort of bug that occurred due to non-atomic state changes,
the following sequence of events used to occur approximately every 2.5 million test runs:
* RoutineTUNEventReader received an EventDown event.
* It called device.Down, which called device.setUpDown.
* That set device.state.changing, but did not yet attempt to lock device.state.Mutex.
* Test completion called device.Close.
* device.Close locked device.state.Mutex.
* device.Close blocked on a call to device.state.stopping.Wait.
* device.setUpDown then attempted to lock device.state.Mutex and blocked.
Deadlock results. setUpDown cannot progress because device.state.Mutex is locked.
Until setUpDown returns, RoutineTUNEventReader cannot call device.state.stopping.Done.
Until device.state.stopping.Done gets called, device.state.stopping.Wait is blocked.
As long as device.state.stopping.Wait is blocked, device.state.Mutex cannot be unlocked.
This commit fixes that deadlock by holding device.state.mu
when checking that the device is not closed.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Diffstat (limited to 'device/peer.go')
-rw-r--r-- | device/peer.go | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/device/peer.go b/device/peer.go
index 0bf19fd..abe8a08 100644
--- a/device/peer.go
+++ b/device/peer.go
@@ -62,7 +62,7 @@ type Peer struct {
 }
 
 func (device *Device) NewPeer(pk NoisePublicKey) (*Peer, error) {
-	if device.isClosed.Get() {
+	if device.isClosed() {
 		return nil, errors.New("device closed")
 	}
 
@@ -107,7 +107,7 @@ func (device *Device) NewPeer(pk NoisePublicKey) (*Peer, error) {
 	device.peers.empty.Set(false)
 
 	// start peer
-	if peer.device.isUp.Get() {
+	if peer.device.isUp() {
 		peer.Start()
 	}
 
@@ -121,7 +121,7 @@ func (peer *Peer) SendBuffer(buffer []byte) error {
 	if peer.device.net.bind == nil {
 		// Packets can leak through to SendBuffer while the device is closing.
 		// When that happens, drop them silently to avoid spurious errors.
-		if peer.device.isClosed.Get() {
+		if peer.device.isClosed() {
 			return nil
 		}
 		return errors.New("no bind")
@@ -152,7 +152,7 @@ func (peer *Peer) String() string {
 
 func (peer *Peer) Start() {
 	// should never start a peer on a closed device
-	if peer.device.isClosed.Get() {
+	if peer.device.isClosed() {
 		return
 	}
 