-
Notifications
You must be signed in to change notification settings - Fork 799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PVF: Throw error if worker process cannot spawn necessary threads #659
Comments
Hey @mrcnski, I'm going to start working on this one now. I'm probably gonna take a little longer just to get the context on this one haha. Let me know if you have any suggestions. |
Hey @jpserrat! As there might be a refactor of relevant code coming soon with this issue, I would suggest waiting on this one right now. If you want, you can try to think about how to approach this in the meantime - I would suggest familiarizing yourself with the current code that deals with threads and learning about Rust's synchronization primitives, that might give you some ideas ;) Otherwise I can help you select a different issue. Up to you! |
I'm going to follow your advice regarding the current code and Rust threads. Thank you very much for your guidance. It would also be great if you could recommend a different one. |
@jpserrat Awesome! This one would be pretty quick and easy: #695. It may not end up being needed, but I think it probably will, and definitely wouldn't hurt to have. You can even add a sanity-check before each execution to make sure that the current node version matches the version recorded in the filename. This is another easy one: #622 I found a couple more but they are somewhat non-trivial, would be a big help if you dove into them though: |
…h#637) * WIP: attempt at new data structures * Specify path to config trait * Check AccountId's traits for EnqueuedMessage * Reorder MessageBundle type params * Add trait bounds to AccountId * Don't store AccountId in MessageBundle messages The AccountId is already in the MessageBundle, so no need to add it to every message. * Ensure that AccountId implements MaxEncodedLen * Finish rename to Nonces * Whitespace * Remove use of principal * Fix struct access in average_payload_size * WIP Collect MessageBundles partitioned by AccountId * Convert message queue to map of message bundles * Fix account id used for nonces in test * Fix warnings * Convert message bundles to Vec * Remove last uses of principal in basic channel * Found a better way to handle this Just use BoundedVec * Insert spaces to help with Markdown folding * Fix BTreeMap import * Revert "Remove last uses of principal in basic channel" This reverts commit 2f5d6e07a56bf74baeaf6f0421b68beb77b17644. * Fix some benchmarking types * Remove last uses of principal in basic channel Leave the set_principal call in place as a no-op to avoid removing the config for Call. We'll re-add it when we add the leaf node proof anyway. * First stab at Merklizing all message bundles I mean, if it compiles... 🤷 * Make keccak256 public to try resolve import issues * Use configured Hash type * rustfmt * Fix tests in merkle_proof * Fix nonce update * Remove not_authorized test * Add TODO * Ignore large tests * rustfmt * Store all message bundles in the event * Clean up commit() a bit * Extract function make_message_bundles * Eth-encode message bundles before merklizing them * Update comments * Remove unused function * Persist ABI-encoded message bundles * Use SCALE encoding for off-chain message bundles * Fix debug import * Switch to Token::FixedBytes * s/map/for/ * Remove loop implementation * Remove generic output type * rustfmt * Add StoredLeaves struct to communicate intent * WIP Custom RPC and runtime API * fixes (paritytech#647) * Update snow{blink,bridge} with fixes * Move merkle_proof to a separate crate * Add RPC call stub * Add runtime API implementation * Remove basic channel's RPC runtime API * WIP SCALE-encode Merkle proof from RPC call * Move StoredLeaves struct into primitives crate * Disable default features in basic channel crates * Add import for function * Use u64 instead of usize for proofs * Fix missed issues from removing the runtime API * Fix merkle-proof tests * Update basic outbound channel's tests for commit() * rustfmt * Clean up RPC call * Fix typo & remove unused import & comment * Replace encode with as_ref for ethabi encoding * Remove AsRef implementation for MessageBundle * Fix up RPC crate * Remove unused dependencies in basic channel * Remove redundant package names * Use import alias * Reorder rpc pallet * rustfmt * Mention outbound in Merkle proof RPC * Remove unused error * Remove set_principal from basic outbound channel * Update docstring for commit() * Ignore unused result * Shorten type * Remove unused dependency * Add prefix to incentivized channel in runtimes * Remove unused import * Remove primitives crate * Add V2 of basic inbound channel * Fix typos * Re-order message types * Fix use of nonces * Generate commitment from message bundle & proof * Fix contract name * Generate basic inbound channel v2 go bindings * Fix RPC name * Fix rpc handler (paritytech#659) * Detect leaf index out of bounds in RPC * Add tests for RPC handler * Overwrite BasicInboundChannel instead of using V2 * Link MerkleProof library for testing * Typos * Add account id filter to parachain relayer * Add MerkleProof library to channel deployment * Add MerkleProof RPC to parachain relayer * Extract account id decoding to getter * Simplify search for matching bundle * Forward Merkle proof to inbound channel contract * Fix mapstructure tag * Move decoded account id to listener struct * Tweak error messages * Extract finding bundle * Extract fetchBundleProof * Use existing parachain connection * Fix account id decoding * Fetch basic channel nonce by accountID * Reuse accountID stored on startup * Fix listener decoding & writer bounds check * Add account id to message bundle log * Remove TODO This is only relevant if we expect to have around 2**63 different accounts with messages in the same commitment. * Add RPC name to error messages * Use the Alice account id as default for ACCOUNT_ID * Mention build-essential for setting up lodestar * Fix log file names * Log leafProof and hashSides in relayer * Fix hash side value Should be hashed on the left when the side is true, which is when the index is even. * Use same names and param order as contract * Update contract fixture data * Revert "Fix hash side value" This reverts commit ec0b2e9aa7ee2a8605f68152392983e344f0057f. * Add reminder to use a port of a relay chain node * Fix basic inbound channel contract tests * Mention port forwarding in port reminder * Replace pointer with byte array * Improve error logs * Return proof value instead of mutating pointer * Add second message bundle to basic channel test * WIP Add second account to basic channel e2e test * Start accountID with lowercase * Update new dependencies to 0.9.25 * Fix RPC handler tests * Update parachain metadata * Allow multiple accountIDs in parachain relayer * Fix scanning logic * Update basic nonces log * Fix initial len of nonces slice * Fix jq for multiple account ids * Add bob key for E2E test * Check correct colletion's length to stop scanning * Fix param description * Whitespace * Ignore local asdf versions * Replace base 2 log with simpler bit calculation * Remove resolved TODOs * Typo * Add comment explaining ACCOUNT_IDS * s/accountIDs/accounts/g * Add issue key to TODO * Typo * Add geth version to README * Fix templating * Update parachain lockfile * Fix recipient address * Fix dotapp tests after rename Co-authored-by: Vincent Geddes <[email protected]> Co-authored-by: Vincent Geddes <[email protected]>
* fixed messages count check * explicit check of `messages_count` in the receive_messages_proof * change messages_count to be u32 * Update modules/message-lane/src/lib.rs Co-authored-by: Hernando Castano <[email protected]> Co-authored-by: Hernando Castano <[email protected]>
ISSUE
Overview
This issue is about the following quote from
tokio::task::spawn_blocking
(used by us to create threads in the worker process):Why is this an issue? Because for deterministic results we require that all worker threads are spawned at the same time, and not staggered. For example, if the CPU time monitor thread is late to start, we may timeout later than other validators, and accept candidates that others do not. (Also, we may at some point start using the measurements from the memory stats thread to reject PVFs, though right now this is preparation-only so inaccurate measurements cannot lead to disputes.)
Proposal
If we cannot spawn all the threads immediately, we should return an internal error (i.e. not dispute). This would lead to a retry (since paritytech/polkadot#7011).
That said, as the default limit in
tokio
is very high, it is not clear how it can ever be reached in practice. Each execution job uses 2 threads, prepare jobs use 3. So I would say this is low priority, but something to keep in mind.The text was updated successfully, but these errors were encountered: