Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PVF: more filesystem sandboxing #1373

Merged
merged 22 commits into from
Sep 28, 2023
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
99a9efd
rsync `mrcnski/pvf-sandbox-whole-process` branch from Polkadot
mrcnski Aug 28, 2023
c8f2962
Do the rest
mrcnski Sep 3, 2023
6f7d3fe
Merge branch 'master' into mrcnski/pvf-worker-sandbox-with-pivot-root
mrcnski Sep 3, 2023
9d2ce42
Some fixes
mrcnski Sep 3, 2023
15897c3
Merge branch 'master' into mrcnski/pvf-worker-sandbox-with-pivot-root
mrcnski Sep 4, 2023
f926505
cargo fmt
mrcnski Sep 4, 2023
32cfbcb
Fix clippy error
mrcnski Sep 5, 2023
eacb956
Address review comments; refactor/simplify Landlock code
mrcnski Sep 10, 2023
45b99a9
Merge branch 'master' into mrcnski/pvf-worker-sandbox-with-pivot-root
mrcnski Sep 10, 2023
dc6fe04
cargo fmt
mrcnski Sep 10, 2023
ed344ab
Address most review comments (will do the last one after lunch)
mrcnski Sep 11, 2023
8cedd7b
Use cache as location for worker dirs
mrcnski Sep 11, 2023
396a7b6
Merge branch 'master' into mrcnski/pvf-worker-sandbox-with-pivot-root
mrcnski Sep 12, 2023
ccc329e
Clear env vars when spawning process
mrcnski Sep 12, 2023
2e6bb65
Fix compiler error, add comment
mrcnski Sep 14, 2023
a5efc37
Rearrange worker startup
mrcnski Sep 14, 2023
b413c27
Fix runtime crash (tokio socket not created in context of a runtime)
mrcnski Sep 14, 2023
70e62d8
Fix compiler error
mrcnski Sep 14, 2023
ea1b48d
Fix pivot_root and add assertions
mrcnski Sep 17, 2023
a696d40
Add a couple of tests to de-befuddle myself
mrcnski Sep 17, 2023
ad0ed8a
Move socket to beginning of worker startup
mrcnski Sep 17, 2023
279db25
Use cfg_if macro
mrcnski Sep 25, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 38 additions & 3 deletions polkadot/node/core/pvf/common/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,17 @@ pub enum PrepareError {
/// The response from the worker is received, but the file cannot be renamed (moved) to the
/// final destination location. This state is reported by the validation host (not by the
/// worker).
RenameTmpFileErr(String),
RenameTmpFileErr {
err: String,
// Unfortunately `PathBuf` doesn't implement `Encode`/`Decode`, so we do a fallible
// conversion to `Option<String>`.
src: Option<String>,
dest: Option<String>,
},
/// The response from the worker is received, but the worker cache could not be cleared. The
/// worker has to be killed to avoid jobs having access to data from other jobs. This state is
/// reported by the validation host (not by the worker).
ClearWorkerDir(String),
}

impl PrepareError {
Expand All @@ -58,7 +68,11 @@ impl PrepareError {
use PrepareError::*;
match self {
Prevalidation(_) | Preparation(_) | Panic(_) => true,
TimedOut | IoErr(_) | CreateTmpFileErr(_) | RenameTmpFileErr(_) => false,
TimedOut |
IoErr(_) |
CreateTmpFileErr(_) |
RenameTmpFileErr { .. } |
ClearWorkerDir(_) => false,
// Can occur due to issues with the PVF, but also due to local errors.
RuntimeConstruction(_) => false,
}
Expand All @@ -76,7 +90,9 @@ impl fmt::Display for PrepareError {
TimedOut => write!(f, "prepare: timeout"),
IoErr(err) => write!(f, "prepare: io error while receiving response: {}", err),
CreateTmpFileErr(err) => write!(f, "prepare: error creating tmp file: {}", err),
RenameTmpFileErr(err) => write!(f, "prepare: error renaming tmp file: {}", err),
RenameTmpFileErr { err, src, dest } =>
write!(f, "prepare: error renaming tmp file ({:?} -> {:?}): {}", src, dest, err),
ClearWorkerDir(err) => write!(f, "prepare: error clearing worker cache: {}", err),
}
}
}
Expand All @@ -89,8 +105,17 @@ impl fmt::Display for PrepareError {
pub enum InternalValidationError {
/// Some communication error occurred with the host.
HostCommunication(String),
/// Host could not create a hard link to the artifact path.
CouldNotCreateLink(String),
/// Could not find or open compiled artifact file.
CouldNotOpenFile(String),
/// Host could not clear the worker cache after a job.
CouldNotClearWorkerDir {
err: String,
// Unfortunately `PathBuf` doesn't implement `Encode`/`Decode`, so we do a fallible
// conversion to `Option<String>`.
path: Option<String>,
},
/// An error occurred in the CPU time monitor thread. Should be totally unrelated to
/// validation.
CpuTimeMonitorThread(String),
Expand All @@ -104,8 +129,18 @@ impl fmt::Display for InternalValidationError {
match self {
HostCommunication(err) =>
write!(f, "validation: some communication error occurred with the host: {}", err),
CouldNotCreateLink(err) => write!(
f,
"validation: host could not create a hard link to the artifact path: {}",
err
),
CouldNotOpenFile(err) =>
write!(f, "validation: could not find or open compiled artifact file: {}", err),
CouldNotClearWorkerDir { err, path } => write!(
f,
"validation: host could not clear the worker cache ({:?}) after a job: {}",
path, err
),
CpuTimeMonitorThread(err) =>
write!(f, "validation: an error occurred in the CPU time monitor thread: {}", err),
NonDeterministicPrepareError(err) => write!(f, "validation: prepare: {}", err),
Expand Down
2 changes: 1 addition & 1 deletion polkadot/node/core/pvf/common/src/execute.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ use polkadot_primitives::ExecutorParams;
use std::time::Duration;

/// The payload of the one-time handshake that is done when a worker process is created. Carries
/// data from the host to the worker.
/// data from the host to the worker that would be too large for CLI parameters..
alindima marked this conversation as resolved.
Show resolved Hide resolved
#[derive(Encode, Decode)]
pub struct Handshake {
/// The executor parameters.
Expand Down
10 changes: 10 additions & 0 deletions polkadot/node/core/pvf/common/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ pub mod executor_intf;
pub mod prepare;
pub mod pvf;
pub mod worker;
pub mod worker_dir;

pub use cpu_time::ProcessTime;

Expand All @@ -41,6 +42,15 @@ pub mod tests {
pub const TEST_PREPARATION_TIMEOUT: Duration = Duration::from_secs(30);
}

/// Status of security features on the current system.
#[derive(Debug, Clone, Default)]
pub struct SecurityStatus {
/// Whether the landlock features we use are fully available on this system.
pub can_enable_landlock: bool,
// Whether we are able to unshare the user namespace and change the filesystem root.
pub can_unshare_user_namespace_and_change_root: bool,
}

/// Write some data prefixed by its length into `w`.
pub async fn framed_send(w: &mut (impl AsyncWrite + Unpin), buf: &[u8]) -> io::Result<()> {
let len_buf = buf.len().to_le_bytes();
Expand Down
159 changes: 97 additions & 62 deletions polkadot/node/core/pvf/common/src/worker/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@

pub mod security;

use crate::LOG_TARGET;
use crate::{worker_dir, SecurityStatus, LOG_TARGET};
use cpu_time::ProcessTime;
use futures::never::Never;
use std::{
Expand All @@ -41,6 +41,9 @@ macro_rules! decl_worker_main {
}

fn main() {
#[cfg(target_os = "linux")]
use $crate::worker::security;

// TODO: Remove this dependency, and `pub use sp_tracing` in `lib.rs`.
// See <https://github.com/paritytech/polkadot/issues/7117>.
$crate::sp_tracing::try_init_simple();
Expand All @@ -60,6 +63,34 @@ macro_rules! decl_worker_main {
println!("{}", $worker_version);
return
},

"--check-can-enable-landlock" => {
#[cfg(target_os = "linux")]
let status = if security::landlock::status_is_fully_enabled(
&security::landlock::get_status(),
) {
0
} else {
-1
};
#[cfg(not(target_os = "linux"))]
let status = -1;
std::process::exit(status)
},
"--check-can-unshare-user-namespace-and-change-root" => {
#[cfg(target_os = "linux")]
let status = if security::unshare_user_namespace_and_change_root(&std::env::temp_dir())
.is_ok()
{
0
} else {
-1
};
#[cfg(not(target_os = "linux"))]
let status = -1;
std::process::exit(status)
},

subcommand => {
// Must be passed for compatibility with the single-binary test workers.
if subcommand != $expected_command {
Expand All @@ -71,18 +102,39 @@ macro_rules! decl_worker_main {
},
}

let mut worker_dir_path = None;
let mut node_version = None;
let mut socket_path: &str = "";
let mut can_enable_landlock = false;
let mut can_unshare_user_namespace_and_change_root = false;

for i in (2..args.len()).step_by(2) {
let mut i = 2;
alindima marked this conversation as resolved.
Show resolved Hide resolved
while i < args.len() {
match args[i].as_ref() {
"--socket-path" => socket_path = args[i + 1].as_str(),
"--node-impl-version" => node_version = Some(args[i + 1].as_str()),
"--worker-dir-path" => {
worker_dir_path = Some(args[i + 1].as_str());
i += 1
},
"--node-impl-version" => {
node_version = Some(args[i + 1].as_str());
i += 1
},
"--can-enable-landlock" => can_enable_landlock = true,
"--can-unshare-user-namespace-and-change-root" =>
can_unshare_user_namespace_and_change_root = true,
arg => panic!("Unexpected argument found: {}", arg),
}
i += 1;
}
let worker_dir_path =
worker_dir_path.expect("the --worker-dir-path argument is required");

let worker_dir_path = std::path::Path::new(worker_dir_path).to_owned();
let security_status = $crate::SecurityStatus {
can_enable_landlock,
can_unshare_user_namespace_and_change_root,
};

$entrypoint(&socket_path, node_version, Some($worker_version));
$entrypoint(worker_dir_path, node_version, Some($worker_version), security_status);
}
};
}
Expand All @@ -91,32 +143,28 @@ macro_rules! decl_worker_main {
/// child process.
pub const JOB_TIMEOUT_OVERHEAD: Duration = Duration::from_millis(50);

/// Interprets the given bytes as a path. Returns `None` if the given bytes do not constitute a
/// a proper utf-8 string.
pub fn bytes_to_path(bytes: &[u8]) -> Option<PathBuf> {
std::str::from_utf8(bytes).ok().map(PathBuf::from)
}

// The worker version must be passed in so that we accurately get the version of the worker, and not
// the version that this crate was compiled with.
pub fn worker_event_loop<F, Fut>(
debug_id: &'static str,
socket_path: &str,
#[cfg_attr(not(target_os = "linux"), allow(unused_mut))] mut worker_dir_path: PathBuf,
node_version: Option<&str>,
worker_version: Option<&str>,
#[cfg_attr(not(target_os = "linux"), allow(unused_variables))] security_status: &SecurityStatus,
mut event_loop: F,
) where
F: FnMut(UnixStream) -> Fut,
F: FnMut(UnixStream, PathBuf) -> Fut,
Fut: futures::Future<Output = io::Result<Never>>,
{
let worker_pid = std::process::id();
gum::debug!(target: LOG_TARGET, %worker_pid, "starting pvf worker ({})", debug_id);
gum::debug!(target: LOG_TARGET, %worker_pid, ?worker_dir_path, "starting pvf worker ({})", debug_id);

// Check for a mismatch between the node and worker versions.
if let (Some(node_version), Some(worker_version)) = (node_version, worker_version) {
if node_version != worker_version {
gum::error!(
target: LOG_TARGET,
%debug_id,
%worker_pid,
%node_version,
%worker_version,
Expand All @@ -129,16 +177,45 @@ pub fn worker_event_loop<F, Fut>(
}
}

remove_env_vars(debug_id);
// Enable some security features.
//
// Landlock is enabled in the prepare- or execute-worker-specific code since we restrict the
// access rights based on whether we are preparing or executing. We also need to remove the
// socket before applying Landlock restrictions.
{
// Call based on whether we can change root. Error out if it should work but fails.
#[cfg(target_os = "linux")]
if security_status.can_unshare_user_namespace_and_change_root {
if let Err(err_ctx) = security::unshare_user_namespace_and_change_root(&worker_dir_path)
{
let err = io::Error::last_os_error();
gum::error!(
target: LOG_TARGET,
%debug_id,
%worker_pid,
%err_ctx,
?worker_dir_path,
"Could not change root to be the worker cache path: {}",
err
);
worker_shutdown_message(debug_id, worker_pid, err);
return
}
worker_dir_path = std::path::Path::new("/").to_owned();
}

security::remove_env_vars(debug_id);
}

// Run the main worker loop.
let rt = Runtime::new().expect("Creates tokio runtime. If this panics the worker will die and the host will detect that and deal with it.");
let err = rt
.block_on(async move {
let stream = UnixStream::connect(socket_path).await?;
let _ = tokio::fs::remove_file(socket_path).await;
let socket_path = worker_dir::socket(&worker_dir_path);
let stream = UnixStream::connect(&socket_path).await?;
let _ = tokio::fs::remove_file(&socket_path).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the above connect fails, we will never remove the socket? Isn't there some RAII primitive available?

Copy link
Contributor Author

@mrcnski mrcnski Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or defer like in Go. :P (One of the few nice things about that language.) There are some crates for defer in Rust, but I don't want to introduce a dep or extra code just for this one case.

Anyway, good point, but I think here it's fine because 1. the socket is always created in the worker dir, which we remove on worker death as well as host startup, and has a random name so it can't be reused, and 2. if we fail to connect, the worker dies and we try to re-create it from scratch, with a new worker dir and socket and everything.


let result = event_loop(stream).await;
let result = event_loop(stream, worker_dir_path).await;

result
})
Expand All @@ -153,48 +230,6 @@ pub fn worker_event_loop<F, Fut>(
rt.shutdown_background();
}

/// Delete all env vars to prevent malicious code from accessing them.
fn remove_env_vars(debug_id: &'static str) {
for (key, value) in std::env::vars_os() {
// TODO: *theoretically* the value (or mere presence) of `RUST_LOG` can be a source of
// randomness for malicious code. In the future we can remove it also and log in the host;
// see <https://github.com/paritytech/polkadot/issues/7117>.
if key == "RUST_LOG" {
continue
}

// In case of a key or value that would cause [`env::remove_var` to
// panic](https://doc.rust-lang.org/std/env/fn.remove_var.html#panics), we first log a
// warning and then proceed to attempt to remove the env var.
let mut err_reasons = vec![];
let (key_str, value_str) = (key.to_str(), value.to_str());
if key.is_empty() {
err_reasons.push("key is empty");
}
if key_str.is_some_and(|s| s.contains('=')) {
err_reasons.push("key contains '='");
}
if key_str.is_some_and(|s| s.contains('\0')) {
err_reasons.push("key contains null character");
}
if value_str.is_some_and(|s| s.contains('\0')) {
err_reasons.push("value contains null character");
}
if !err_reasons.is_empty() {
gum::warn!(
target: LOG_TARGET,
%debug_id,
?key,
?value,
"Attempting to remove badly-formatted env var, this may cause the PVF worker to crash. Please remove it yourself. Reasons: {:?}",
err_reasons
);
}

std::env::remove_var(key);
}
}

/// Provide a consistent message on worker shutdown.
fn worker_shutdown_message(debug_id: &'static str, worker_pid: u32, err: io::Error) {
gum::debug!(target: LOG_TARGET, %worker_pid, "quitting pvf worker ({}): {:?}", debug_id, err);
Expand Down Expand Up @@ -301,7 +336,7 @@ pub mod thread {
Arc::new((Mutex::new(WaitOutcome::Pending), Condvar::new()))
}

/// Runs a worker thread. Will first enable security features, and afterwards notify the threads
/// Runs a worker thread. Will run the requested function, and afterwards notify the threads
/// waiting on the condvar. Catches panics during execution and resumes the panics after
/// triggering the condvar, so that the waiting thread is notified on panics.
///
Expand Down
Loading