Skip to content

Support receiving verified transactions from the vortexor#5321

Merged
lijunwangs merged 9 commits intoanza-xyz:masterfrom
lijunwangs:vortexor_verified_packets_receiver
Apr 21, 2025
Merged

Support receiving verified transactions from the vortexor#5321
lijunwangs merged 9 commits intoanza-xyz:masterfrom
lijunwangs:vortexor_verified_packets_receiver

Conversation

@lijunwangs
Copy link
Copy Markdown

Problem

This support receiving verified transactions from the vortexor.

Summary of Changes

The additional argument in the validator can start service receiving the verified transactions:

e.g
:
--tpu-vortexor-receiver-address 10.138.0.136:8100

And the validator can use existing TPU customize address to point its TPU addresses to the vortexor:

e.g.
--public-tpu-address 10.138.0.136:9194 --public-tpu-forwards-address 10.138.0.136:9195

Fixes #

@lijunwangs lijunwangs force-pushed the vortexor_verified_packets_receiver branch from 70dcb08 to bf932aa Compare April 9, 2025 19:22
Comment thread vortexor-receiver/Readme.md Outdated
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 9, 2025

Codecov Report

❌ Patch coverage is 32.11382% with 167 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.9%. Comparing base (36b3586) to head (59b8dca).
⚠️ Report is 3608 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #5321     +/-   ##
=========================================
- Coverage    83.0%    82.9%   -0.1%     
=========================================
  Files         828      830      +2     
  Lines      375858   376142    +284     
=========================================
+ Hits       312113   312182     +69     
- Misses      63745    63960    +215     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lijunwangs lijunwangs force-pushed the vortexor_verified_packets_receiver branch from 461cfc7 to ab02dd8 Compare April 10, 2025 04:14
@lijunwangs lijunwangs requested a review from jstarry April 10, 2025 04:31
Comment thread core/src/tpu.rs

let (forward_stage_sender, forward_stage_receiver) = bounded(1024);
let sigverify_stage = {
let sig_verifier = if let Some(vortexor_receivers) = vortexor_receivers {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should optionally spawn vortexor receiver, but we should still spawn the local sigverify stage.

AFAICT the quic and fetch stages still exist, and even if the ports are not advertised on gossip it's possible to still send to them.

This also makes upgrades tied together. I can't restart my vortexor instance because my validator is relying on it.
We should let the operator switch their advertised tpu port(s) at runtime, so they can switch back to local mode to upgrade vortexor without screwing up the validator.
Or switch to another vortexor instance.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. We plan to support the dynamic management of subscriptions via Admin RPC on the validator. See https://github.com/anza-xyz/agave/blob/master/vortexor/Readme.md

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - but we shouldn't just let the quic & fetch stages send to a SV that doesn't exist. It's likely we will cause a panic in that case soon, and then it's a vulnerability if I can guess the ports your tpu is on.

Even if we can't switch off of it in this PR, we shouldn't remove SV imo

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Will address

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disable TPU streamers when vortexor receiver is configured.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, but this seems the wrong direction to me.
If the vortexor goes down, the operator is forced to restart their node to go back to normal TPU. This seems like a necessary feature to me, not something that we should do as follow-up.

I'm happy to hear other's opinions, as maybe I'm being overly cautious about this.

Copy link
Copy Markdown
Author

@lijunwangs lijunwangs Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is only enabled when tpu-vortexor-receiver-address enabled. Auto fullback and heartbeat feature will be delivered in follow-on PRs.

Comment thread core/src/vortexor_receiver_adapter.rs Outdated
@mergify
Copy link
Copy Markdown

mergify Bot commented Apr 12, 2025

⚠️ The sha of the head commit of this PR conflicts with #5775. Mergify cannot evaluate rules on this PR. ⚠️

@lijunwangs lijunwangs requested a review from apfitzge April 14, 2025 17:07
@lijunwangs lijunwangs force-pushed the vortexor_verified_packets_receiver branch from d956aae to e86c96f Compare April 15, 2025 18:01
@apfitzge apfitzge requested review from apfitzge and removed request for apfitzge April 15, 2025 19:27
Comment thread core/src/vortexor_receiver_adapter.rs Outdated
($sender:expr, $batch:expr, $count:expr) => {
match $sender.send($batch) {
Ok(_) => {
trace!("Sent batch: {} received from vortexor successfully", $count);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we ever turn this on? this seems excessive.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was only used for debugging. It will be enhanced to metrics.

Comment thread core/src/vortexor_receiver_adapter.rs Outdated
Comment thread core/src/vortexor_receiver_adapter.rs Outdated
Self::recv_send(
batch_receiver,
recv_timeout,
8,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract this to a constant. can you comment where it came from?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do. It is rough estimate to limit the memory usage to max PACKETS_PER_BATCH*8 = 512 PACKETS in the total batches.

Comment thread core/src/tpu.rs
))
};

let vote_sigverify_stage = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Votes do not go through vortexor? will they in the future? if I want to offload the sigverify task it seems it should be fully offloaded.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think we can offload votes to vortexor as well. To be done in future PRs.

@lijunwangs lijunwangs force-pushed the vortexor_verified_packets_receiver branch from fbe8b06 to 59b8dca Compare April 17, 2025 09:45
@lijunwangs lijunwangs requested a review from apfitzge April 17, 2025 16:59
Copy link
Copy Markdown

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach seems fine as an intermediate step. We can spawn quic &fetch & sigverifier threads on detecting shutdown or admin rpc change in future PRs.

LGTM - @lijunwangs please get approval from a second relevant person. These are larger arch changes.

@lijunwangs
Copy link
Copy Markdown
Author

Approach seems fine as an intermediate step. We can spawn quic &fetch & sigverifier threads on detecting shutdown or admin rpc change in future PRs.

LGTM - @lijunwangs please get approval from a second relevant person. These are larger arch changes.

@sakridge @bw-solana can you help review these as well? Thanks!

Copy link
Copy Markdown

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@lijunwangs lijunwangs merged commit e44c17d into anza-xyz:master Apr 21, 2025
41 checks passed
steviez added a commit that referenced this pull request Jun 12, 2025
)" (#6525)

The reverted commit introduced the solana-vortexor-receiver crate which
was used by solana-core. solana-vortexor-receiver was not set to
publish which means solana-core cannot be published either. So, back
out solana-vortexor in order to unblock crate publishing

This reverts commit e44c17d
mergify Bot pushed a commit that referenced this pull request Jun 12, 2025
)" (#6525)

The reverted commit introduced the solana-vortexor-receiver crate which
was used by solana-core. solana-vortexor-receiver was not set to
publish which means solana-core cannot be published either. So, back
out solana-vortexor in order to unblock crate publishing

This reverts commit e44c17d

(cherry picked from commit 63cf093)

# Conflicts:
#	Cargo.lock
#	Cargo.toml
#	programs/sbf/Cargo.lock
#	svm/examples/Cargo.lock
#	vortexor-receiver/Cargo.toml
steviez added a commit that referenced this pull request Jun 14, 2025
)" (#6525)

The reverted commit introduced the solana-vortexor-receiver crate which
was used by solana-core. solana-vortexor-receiver was not set to
publish which means solana-core cannot be published either. So, back
out solana-vortexor in order to unblock crate publishing

This reverts commit e44c17d

(cherry picked from commit 63cf093)
steviez added a commit that referenced this pull request Jun 17, 2025
…or (#5321)" (backport of #6525) (#6529)

The reverted commit introduced the solana-vortexor-receiver crate which
was used by solana-core. solana-vortexor-receiver was not set to
publish which means solana-core cannot be published either. So, back
out solana-vortexor in order to unblock crate publishing

This reverts commit e44c17d

(cherry picked from commit 63cf093)

---------

Co-authored-by: steviez <steven@anza.xyz>
lijunwangs added a commit that referenced this pull request Jun 25, 2025
#6542)

* Revert "Revert "Support receiving verified transactions from the vortexor (#5321)" (#6525)"

This reverts commit 63cf093.

* publish vortexor-receiver as it is used by core

* Rename solana-vortexor-receiver to agave-verified-packet-receiver

* Missed Cargo.lock update
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants