feat (chain): implement genesis UTXO extraction workaround for missing events#100
feat (chain): implement genesis UTXO extraction workaround for missing events#100
Conversation
|
Great job, no security vulnerabilities found in this Pull Request |
hseeberger
left a comment
There was a problem hiding this comment.
Quite some comments. Please address, then I need to review again.
| if let Err(e) = | ||
| extract_genesis_unshielded_utxos(&mut block, &ledger_state, network_id).await | ||
| { | ||
| warn!("genesis UTXO extraction failed: {}", e); |
There was a problem hiding this comment.
Why? I think we should fail if this important feature fails.
There was a problem hiding this comment.
So please follow the so far consistently applied approach for error handling to fail fast also here.
There was a problem hiding this comment.
Changed from warning to fail-fast error handling.
| #[trace] | ||
| pub async fn extract_genesis_unshielded_utxos( | ||
| block: &mut Block, | ||
| _ledger_state: &LedgerState, |
There was a problem hiding this comment.
Why pass the ledger state if you do not use it?
There was a problem hiding this comment.
Removed unused _ledger_state parameter
| let genesis_transaction = block | ||
| .transactions | ||
| .get_mut(0) | ||
| .context("cannot get genesis transaction")?; |
There was a problem hiding this comment.
We follow the convention to not use "cannot" for context invocations, because it would lead to a chain of "cannot"s. So please remove here and elsewhere.
There was a problem hiding this comment.
Removed "cannot" from all .context() calls
| ledger_state: &LedgerState, | ||
| genesis_transaction: &mut Transaction, | ||
| ) -> Result<usize, anyhow::Error> { | ||
| use std::ops::Deref; |
There was a problem hiding this comment.
Please change your IDE settings to not always add the use lines in the function scope. This is annoying to review and we likely miss some.
There was a problem hiding this comment.
Moved all imports to file top - no more function-scoped imports anywhere.
| let utxo_state = &midnight_ledger_state.utxo; | ||
| let mut extracted_count = 0; | ||
| for utxo in utxo_state.utxos.iter() { | ||
| let utxo_data = utxo.deref(); |
There was a problem hiding this comment.
Rewrite this line and the following to not explicitly call deref. I guess you can omit this line and for the below one use convert_midnight_utxo_to_indexer_utxo(&*utxo_data, ...
There was a problem hiding this comment.
Fixed - removed the explicit deref and simplified
| fn convert_user_address_to_unshielded( | ||
| user_address: &midnight_coin_structure::coin::UserAddress, | ||
| ) -> Result<indexer_common::domain::UnshieldedAddress, anyhow::Error> { | ||
| let address_bytes: [u8; 32] = user_address.0.0; |
There was a problem hiding this comment.
Remove type annotation and fold this simple expression into the next line.
| spent_unshielded_utxos_info.insert(tx_hash.into(), abstracted_spent); | ||
| } | ||
| // Use the most recent transaction hash, or fallback for genesis/failed transactions | ||
| let tx_hash = current_tx_hash.unwrap_or_else(|| [0u8; 32].into()); |
There was a problem hiding this comment.
I do not understand this change. Please add a better/more descriptive comment.
I thought that for the genesis block there are not events, so how is this change related to the topic of this PR?
There was a problem hiding this comment.
I do not understand this change. Please add a better/more descriptive comment.
Enhanced comment with context.
I thought that for the genesis block there are not events, so how is this change related to the topic of this PR?
You're absolutely correct that genesis blocks don't emit events - that's exactly why we need the workaround in application.rs.
This runtimes.rs change is not related to genesis events. It's a separate fix for failed transactions in general (any block height) that don't have transaction context.
The issue: When transactions fail and no TxStart/TxSuccess event provides context, current_tx_hash becomes None. But UnshieldedTokens events can still be emitted for these failed transactions. Without a transaction hash, these events would be dropped.
The fix: Use a fallback hash [0u8; 32] to group events from failed transactions, ensuring they're still processed even without transaction context.
This change improves event processing reliability for failed transactions across all blocks, not just genesis. The genesis-specific logic is entirely in application.rs where we extract UTXOs directly from ledger state.
There was a problem hiding this comment.
I realise my explanation was incorrect.
the [0u8; 32] fallback isn't about 'completely failed transactions' having valid UTXOs.
What's actually happening:
- From my investigation: Unshielded functionality in midnight-node is in a transitional/unstable state - all unshielded tests are ignored with #[ignore = "TODO UNSHIELDED"]
- Event emission issues: There are known gaps in UnshieldedTokens event emission, particularly around:
- Empty transactions (no UTXO changes)
- Partial transaction failures where successful segments still emit events
- Event ordering issues - The fallback hash: Groups 'orphaned' UnshieldedTokens events that lack proper transaction context - not from failed transactions, but from events that get emitted when current_tx_hash is None due to processing issues.
Given that unshielded functionality appears to be incomplete/unstable in midnight-node, this change might be addressing event processing gaps rather than legitimate UTXO creation from failed transactions.
Should I investigate the specific scenarios where current_tx_hash becomes None to verify these events are actually valid? The midnight-node codebase suggests there may be broader unshielded functionality issues we need to account for.
There was a problem hiding this comment.
I suggest you ask Andy.
There was a problem hiding this comment.
Also, while not part of this PR, I think that there is no code setting the current_tx_hash back to None at the end of processing the UnshieldedTokens event. So if the "in the next round" there is no TxApplied or TxPartialSuccess, then the old current_tx_hash will be used.
There was a problem hiding this comment.
I am doing an investigation now.
There was a problem hiding this comment.
You're absolutely right to question this, and I've now done a thorough investigation in the midnight-node codebase to understand exactly when current_tx_hash is None.
Key Finding: The [0u8; 32] fallback hash addresses legitimate system transactions, not failed user transactions.
Primary scenario: System transactions like block rewards that create UTXOs via mint_coins without going through normal transaction processing:
// In on_finalize - Block reward minting (pallets/midnight/src/lib.rs)
match LedgerApi::mint_coins(&network_id, &state_key, reward, &beneficiary[..]) {
Ok(new_state_key) => {
Self::deposit_event(Event::PayoutMinted(...));
}
}
These system operations:
- Create legitimate unshielded UTXOs
- Don't have transaction hashes (they're not user transactions)
- Represent valid state changes that should be indexed
Your concern about completely failed transactions having valid UTXOs doesn't apply here because:
- Completely failed transactions return early and emit no events
- Partial failures only emit events for successful segments (failed segments filtered out)
- UTXOs with [0u8; 32] hash come from system transactions, not failed user transactions
The fallback hash is legitimate and correctly handles system transactions that create UTXOs outside the normal transaction flow. These represent actual ledger state changes that the indexer should capture.
There was a problem hiding this comment.
Please add a (short) respective comment.
There was a problem hiding this comment.
while not part of this PR, I think that there is no code setting the current_tx_hash back to None at the end of processing the UnshieldedTokens event. So if the "in the next round" there is no TxApplied or TxPartialSuccess, then the old current_tx_hash will be used.
You're absolutely right - that's a real bug I missed! The current_tx_hash isn't reset to None after processing, so subsequent UnshieldedTokens events without corresponding TxApplied/TxPartialSuccess events would incorrectly use stale transaction hashes from previous transactions.
This means the fallback hash mechanism isn't working as intended. We need to fix this by resetting the transaction hash context appropriately.
The fix should be:
Event::Midnight(midnight::Event::TxApplied(tx_applied)) => {
current_tx_hash = Some(tx_applied.tx_hash);
}
Event::Midnight(midnight::Event::TxPartialSuccess(tx_partial)) => {
current_tx_hash = Some(tx_partial.tx_hash);
}
Event::Midnight(midnight::Event::UnshieldedTokens(event_data)) => {
let tx_hash = current_tx_hash.unwrap_or_else(|| [0u8; 32].into());
// ... process event
current_tx_hash = None; // Reset after processing
}
Good catch - this explains why the fallback might not be working as expected in practice!
made a commit
| let extracted_count = extract_utxos_from_ledger_state(&clean_ledger_state, genesis_transaction) | ||
| .context("cannot extract UTXOs from ledger state after genesis transaction")?; | ||
|
|
||
| if extracted_count == 0 { |
There was a problem hiding this comment.
Please remove this, because for mainnet there will not be any pre-funded wallets and even for the other environments it is not the responsibility of the Indexer to expect transactions there; it should index these, but not warn if they are absent.
There was a problem hiding this comment.
Now extracted_count becomes obsolete and the function extract_utxos_from_ledger_state can be changed to return the extracted utxos instead of unit, making it less imperative. Then the mutation of the transaction can happen here.
There was a problem hiding this comment.
Removed assumption about genesis UTXOs.
Now extracted_count becomes ....
Yes - changed extract_utxos_from_ledger_state to return Vec instead of usize, removed the genesis_transaction mutation parameter, and moved the mutation to the caller:
let utxos = extract_utxos_from_ledger_state(&clean_ledger_state)?;
genesis_transaction.created_unshielded_utxos.extend(utxos);
This makes it functional instead of imperative, and extracted_count is now obsolete.
| // Handle genesis block: extract any pre-funded unshielded UTXOs | ||
| if block.height == 0 { | ||
| if let Err(e) = | ||
| extract_genesis_unshielded_utxos(&mut block, &ledger_state, network_id).await |
There was a problem hiding this comment.
Given the signature of extract_genesis_unshielded_utxos only taking shared references and returning a "unit result", also looking at its implementation below: where are the extracted UTXOs stored?
There was a problem hiding this comment.
Good catch! The original signature was incorrect - it took &Block (shared reference) but needed to mutate the genesis transaction.
Fixed by changing to &mut Block and implementing the mutation inside the function
| ); | ||
| } | ||
|
|
||
| // Handle genesis block: extract any pre-funded unshielded UTXOs |
There was a problem hiding this comment.
Add trailing dot (please start memoizing this, I am adding this comment in too many reviews).
|
I wonder whether we should add a test to |
Added a test in e2e.rs |
| let genesis_transaction = &genesis_block.transactions[0]; | ||
|
|
||
| // Genesis transaction should have created unshielded UTXOs. | ||
| assert!(!genesis_transaction.unshielded_created_outputs.is_empty(),); |
There was a problem hiding this comment.
Please remove the trailing comma.
| ); | ||
| } | ||
|
|
||
| // Handle genesis block: extract any pre-funded unshielded UTXOs. |
There was a problem hiding this comment.
The called function uses a clean ledger state. But why? The transactions of the genesis block are applied to the ledger state in zswap::apply_transaction_mut. I suggest to remove this section and the extract_genesis_unshielded_utxos (which should have been called inject_...) and then put
let utxos = extract_utxos_from_ledger_state(&clean_ledger_state)?;
genesis_transaction.created_unshielded_utxos.extend(utxos);
as well as extract_utxos_from_ledger_state and needed functions to zswap::apply_transaction_mut. There line 155. Check if the block_parent_hash is [0; 32].
There was a problem hiding this comment.
You're right! Using a clean ledger state was inefficient and architecturally wrong.
Problem with original approach:
- Applied genesis transaction to clean state just to extract UTXOs
- Separate concern from natural transaction processing flow
- Duplicate work since transaction was already being applied in zswap::apply_transaction_mut
Fixed with your suggested approach:
- Moved logic to zswap::apply_transaction_mut at line 155+ where transaction processing naturally occurs
- Uses block_parent_hash == ByteArray([0; 32]) for genesis detection as suggested
- Extracts UTXOs from existing ledger state after transaction application
- Removed separate extract_genesis_unshielded_utxos function entirely
- Inlined helper functions as you noted they were unnecessary
The architecture is much cleaner now - genesis UTXO extraction happens naturally within transaction processing rather than as a separate step. Thanks for the excellent architectural insight!
| ) -> Result<Vec<UnshieldedUtxo>, anyhow::Error> { | ||
| let midnight_ledger_state = &ledger_state.0.0; | ||
| let utxo_state = &midnight_ledger_state.utxo; | ||
| let mut utxos = Vec::new(); |
There was a problem hiding this comment.
Change imperative to Rust idiomatic Iterator pattern.
There was a problem hiding this comment.
Converted to functional iterator pattern
| } | ||
|
|
||
| /// Convert midnight-ledger UserAddress to indexer UnshieldedAddress. | ||
| fn convert_user_address_to_unshielded( |
There was a problem hiding this comment.
We do not need a function for this one-liner.
| } | ||
|
|
||
| /// Convert a midnight-ledger Utxo to the indexer's UnshieldedUtxo format. | ||
| fn convert_midnight_utxo_to_indexer_utxo( |
There was a problem hiding this comment.
I think we do not need a function for this (almost) one-liner.
hseeberger
left a comment
There was a problem hiding this comment.
Please address the additional nit-picks, then merge.
| @@ -1,3 +1,16 @@ | |||
| // This file is part of midnight-indexer. | |||
There was a problem hiding this comment.
Examples do not need this header. How was it added? Manually? Usually you should run https://github.com/midnightntwrk/midnight-indexer/blob/main/.license_headers.sh to add headers. Please remove this one.
| @@ -0,0 +1,80 @@ | |||
| // This file is part of midnight-indexer. | |||
| /// Extract UTXOs from the midnight-ledger state and convert them to indexer format. | ||
| fn extract_utxos_from_ledger_state( | ||
| ledger_state: &LedgerState, | ||
| ) -> Result<Vec<UnshieldedUtxo>, Error> { |
There was a problem hiding this comment.
There seems to be no error case. Remove the Result here and the result handling above.
| let tx_hash = current_tx_hash.unwrap_or_else(|| [0u8; 32].into()); | ||
|
|
||
| if !event_data.created.is_empty() { | ||
| let abstracted_created = event_data.created |
There was a problem hiding this comment.
Remove the abstracted prefix, because it causes confusion.
| created_unshielded_utxos_info.insert(tx_hash.into(), abstracted_created); | ||
| } | ||
| if !event_data.spent.is_empty() { | ||
| let abstracted_spent = event_data.spent |

Implements #98 & ticket : https://shielded.atlassian.net/browse/PM-17350
Summary
Implements a workaround to extract genesis UTXOs by applying the genesis transaction to a clean ledger state, enabling the indexer to detect pre-funded accounts that are missing due to Substrate's event blocking during genesis.
Context
Problem: The midnight-indexer cannot detect pre-funded UTXOs in the genesis block because midnight-node does not emit
UnshieldedTokensevents during genesis block construction. This is due to Substrate PR #5463 which intentionally blocks all events whenblock_number.is_zero().Impact: Pre-funded development accounts (e.g.,
0000000000000000000000000000000000000000000000000000000000000001) with 5M tokens each are invisible to wallets and applications relying on the indexer.Technical Solution
Root Cause Analysis
Implementation Approach
Added special-case logic in
chain-indexer/src/application.rsthat:LedgerStateusing existingapply_transactionmethodcreated_unshielded_utxosfieldKey Changes
Files Modified:
chain-indexer/src/application.rs(+99 lines): Core extraction logicchain-indexer/src/domain/zswap.rs(+1 line): ExportLedgerStatetypechain-indexer/src/infra/node/runtimes.rs(+30/-26 lines): Support for both runtime versionschain-indexer/examples/test_genesis.rs(+89 lines): Integration testArchitecture Decision:
Test Results
Successfully tested with the example program:
GraphQL query shows genesis UTXOs are now properly indexed and queryable, with correct values (5M tokens each) and owner addresses.
note 'ignore the ordering' : #99
Impact
Future Considerations
This is a workaround for a node-side limitation. A follow-up spike ticket (PM-17351) explores potential node-side solutions like:
The current solution provides immediate functionality while keeping long-term architectural options open.