msg deserialization error causes peer thread to loop continuously #2793

antiochp · 2019-05-02T09:43:34Z

We fail to deserialize a p2p msg if the msg_len exceeds a defined max_len (per msg type).

If this happens we have read the 11 bytes of the msg header off the stream but we have not read the msg body itself.

We require every msg to begin with a pair of "magic bytes" that indicate the start of a msg header.

1. We attempt to deserialize the msg header
   - 11 bytes (fixed size) read from stream
   - deserialization error due to msg_len exceeding max_len
2. try_break! loops again
3. We then attempt to deserialize the next msg header
   - 11 bytes (fixed size) read from stream
   - the magic bytes will not match as these 11 bytes are not actually a msg header
4. try_break! is now effectively looping, looking for a valid 11 byte chunk starting with magic bytes

If the next bytes are not the magic bytes then we are going to keep reading chunks of 11 bytes at a time. If we are misaligned by even one byte we will not identify the next msg header correctly.
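To make the alignment problem concrete, here is a minimal sketch of the chunked read over an in-memory byte slice. The `MAGIC` value, the helper names and the stream layout are assumptions for illustration only, not the real constants or code in grin; only the 11 byte header size comes from the description above.

```rust
// Hypothetical values for illustration; not grin's actual magic bytes.
const MAGIC: [u8; 2] = [0x1e, 0xc5];
const HEADER_LEN: usize = 11;

/// Scan `stream` in fixed HEADER_LEN chunks starting at `offset` and return
/// the offset of the first chunk that begins with MAGIC, if any.
fn find_header_chunked(stream: &[u8], offset: usize) -> Option<usize> {
    let mut pos = offset;
    while pos + HEADER_LEN <= stream.len() {
        if stream[pos..pos + 2] == MAGIC {
            return Some(pos);
        }
        // Nothing found: skip a whole header-sized chunk, as the loop would.
        pos += HEADER_LEN;
    }
    None
}

/// Build a stream: one header (MAGIC plus 9 zero bytes), `body_len` body
/// bytes that were never read, then a second header.
fn sample_stream(body_len: usize) -> Vec<u8> {
    let mut header = MAGIC.to_vec();
    header.extend_from_slice(&[0u8; HEADER_LEN - 2]);
    let mut stream = header.clone();
    stream.extend(std::iter::repeat(0xaa).take(body_len));
    stream.extend_from_slice(&header);
    stream
}
```

Starting the scan at offset 11 (just after the first header), a 22 byte body happens to keep the chunks aligned and the second header is found. A 23 byte body shifts the second header by one byte and the chunked scan walks straight past it.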

In fact this is not limited to the msg_len check. Any error occurring during deserialization of the msg header or the msg itself will cause this.

Is my understanding here correct?

Exceeding max_len is treated as a serialization error
We do not drop the connection on serialization errors
Our msg polling logic will loop again on a serialization error
Once we are "out of alignment" we are effectively stuck reading garbage off the stream

A couple of alternative approaches -

If we don't read a full msg successfully we need to drop the connection. We cannot rely on the next bytes read being the next pair of magic bytes.
Alternatively - consider adding some logic to read (and discard) bytes until we encounter the next pair of magic bytes? I'm not sure if this is an approach often used in practice?
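The second approach could look roughly like the sketch below: read one byte at a time and discard until the magic pair has just been consumed. `resync_to_magic` and the `MAGIC` value are hypothetical names and values, not anything that exists in grin.

```rust
use std::io::{self, Read};

// Hypothetical value for illustration; not grin's actual magic bytes.
const MAGIC: [u8; 2] = [0x1e, 0xc5];

/// Read and discard bytes until the two MAGIC bytes have just been consumed.
/// Returns the number of bytes discarded before the magic pair.
/// Fails with the underlying I/O error (e.g. UnexpectedEof) if the stream
/// ends before a magic pair is seen.
fn resync_to_magic<R: Read>(reader: &mut R) -> io::Result<usize> {
    let mut prev: Option<u8> = None;
    let mut consumed = 0usize;
    let mut byte = [0u8; 1];
    loop {
        reader.read_exact(&mut byte)?;
        consumed += 1;
        if prev == Some(MAGIC[0]) && byte[0] == MAGIC[1] {
            return Ok(consumed - 2);
        }
        prev = Some(byte[0]);
    }
}
```

One caveat with this approach: a two byte pattern can also occur by chance inside a msg body, so a match does not guarantee a real header follows. Dropping the connection avoids that ambiguity.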

Related #2791.


antiochp · 2019-05-02T09:43:46Z

cc @hashmap

antiochp · 2019-05-02T09:55:44Z

False alarm.
The internal logic in try_break! does not suppress ser::Error and will close the connection -

grin/p2p/src/conn.rs

Lines 47 to 64 in b9db129

    // Macro to simplify the boilerplate around async I/O error handling,
    // especially with WouldBlock kind of errors.
    macro_rules! try_break {
        ($chan:ident, $inner:expr) => {
            match $inner {
                Ok(v) => Some(v),
                Err(Error::Connection(ref e)) if e.kind() == io::ErrorKind::WouldBlock => None,
                Err(Error::Store(_))
                | Err(Error::Chain(_))
                | Err(Error::Internal)
                | Err(Error::NoDandelionRelay) => None,
                Err(e) => {
                    let _ = $chan.send(e);
                    break;
                }
            }
        };
    }
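A reduced model of the macro's control flow may help here. It assumes a cut-down `Error` enum and a scripted list of read results in place of a real stream; `poll` is a hypothetical helper, not a function in grin.

```rust
use std::io;
use std::sync::mpsc;

// Simplified stand-in for the p2p Error type; only the variants needed here.
#[derive(Debug)]
enum Error {
    Connection(io::Error),
    Serialization(String),
    Internal,
}

// Same shape as the macro above, with the suppressed variants trimmed down.
macro_rules! try_break {
    ($chan:ident, $inner:expr) => {
        match $inner {
            Ok(v) => Some(v),
            Err(Error::Connection(ref e)) if e.kind() == io::ErrorKind::WouldBlock => None,
            Err(Error::Internal) => None,
            // Everything else, including Serialization, ends the loop.
            Err(e) => {
                let _ = $chan.send(e);
                break;
            }
        }
    };
}

/// Feed a scripted sequence of read results through a polling loop and
/// return (iterations run, error forwarded to the channel if any).
fn poll(results: Vec<Result<u8, Error>>) -> (usize, Option<Error>) {
    let (tx, rx) = mpsc::channel();
    let mut iterations = 0;
    for r in results {
        iterations += 1;
        if let Some(_msg) = try_break!(tx, r) {
            // A real loop would dispatch the message here.
        }
    }
    (iterations, rx.try_recv().ok())
}
```

In this model a suppressed error (`Internal`, or `WouldBlock`) lets the loop continue, while a `Serialization` error is forwarded on the channel and the loop stops, so later items in the script are never read.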

antiochp · 2019-05-02T10:02:30Z

This stuff is really confusing - we put an error on the error channel in the above case (the connection close logic is asynchronous).

This is then picked up in peer.check_connection() that reads off the error channel.

But the logic here does subtly different things for a serialization error vs. any other kind of error -

grin/p2p/src/peer.rs

Lines 416 to 463 in b9db129

        fn check_connection(&self) -> bool {
            let connection = match self.connection.as_ref() {
                Some(conn) => conn.lock(),
                None => return false,
            };
            match connection.error_channel.try_recv() {
                Ok(Error::Serialization(e)) => {
                    let need_stop = {
                        let mut state = self.state.write();
                        if State::Banned != *state {
                            *state = State::Disconnected;
                            true
                        } else {
                            false
                        }
                    };
                    if need_stop {
                        debug!(
                            "Client {} corrupted, will disconnect ({:?}).",
                            self.info.addr, e
                        );
                        stop_with_connection(&connection);
                    }
                    false
                }
                Ok(e) => {
                    let need_stop = {
                        let mut state = self.state.write();
                        if State::Disconnected != *state {
                            *state = State::Disconnected;
                            true
                        } else {
                            false
                        }
                    };
                    if need_stop {
                        debug!("Client {} connection lost: {:?}", self.info.addr, e);
                        stop_with_connection(&connection);
                    }
                    false
                }
                Err(_) => {
                    let state = self.state.read();
                    State::Connected == *state
                }
            }
        }
    }

Specifically - we will close a connection for a serialization error unless the peer is banned.
Whereas we will close a connection for any other kind of error unless the peer is disconnected.

~~We will not close a connection for a non-serialization error if the peer is banned.~~
Edit:

If the error is a serialization error and the peer is not currently banned then we will update the state to disconnected and close the connection.
If the error is a non-serialization error and the peer is not currently disconnected (it could be banned) then set the state to disconnected and close the connection.

i.e. We will unban a currently banned peer as part of disconnecting it if we encounter a non-serialization error here.

~~This does not immediately seem correct to me.~~
This seems clearly wrong.
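The state transition in question can be isolated in a small sketch. `on_error_current` mirrors the non-serialization branch above; `on_error_preserving_ban` is one possible alternative and purely an assumption on my part (both function names are hypothetical, and this is not necessarily how it ended up being fixed). Whether a banned peer's connection should still be stopped here is a design choice this sketch does not settle; it assumes yes.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Connected,
    Disconnected,
    Banned,
}

/// Mirrors the non-serialization branch: returns need_stop.
fn on_error_current(state: &mut State) -> bool {
    if *state != State::Disconnected {
        // Also overwrites Banned, which effectively unbans the peer.
        *state = State::Disconnected;
        true
    } else {
        false
    }
}

/// Hypothetical alternative: never overwrite Banned.
fn on_error_preserving_ban(state: &mut State) -> bool {
    match *state {
        State::Connected => {
            *state = State::Disconnected;
            true
        }
        // Assumption: still stop the connection, but keep the ban in place.
        State::Banned => true,
        State::Disconnected => false,
    }
}
```

With a peer in `State::Banned`, the first function leaves it `Disconnected` while the second leaves it `Banned`. For `Connected` and `Disconnected` peers the two behave the same.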

antiochp · 2019-05-16T09:38:03Z

This is resolved now I think (went down various related rabbit holes).

antiochp added the bug label May 2, 2019

This was referenced May 2, 2019

cleanup check_connections #2794

Closed

Wrap MsgHeader in MsgHeaderWrapper for Known/Unknown msg type support #2791

Merged

antiochp closed this as completed May 16, 2019
