add in metrics for detecting redundant pulls #113
gregcusack wants to merge 1 commit into anza-xyz:master
Conversation
So this requires a lot of offline processing. Basically, we can change
(also have to accordingly update a lot of other places which use
Then if you receive the same value again from
I would suggest we hold off on this PR for now, but first implement the simpler approach above in a separate PR and see how that looks.
My only concern here is that in a simple test there appeared to be a lot of redundant pulls, so it is possible this would be very heavy on the metrics server. But I can give it a shot, run some tests with your suggested changes, and see what we see. If it's too much, we can then sample.
Closing in favor of PR: #139
Originally written by Andrew Fitzgerald <apfitzge@gmail.com> on Wed Aug
9 14:57:55 2023 -0700.
Previous version:
commit 86a2b8f8aa19e606bd6396dfdbc6f35950b23ee9
Author: Andrew Fitzgerald <apfitzge@gmail.com>
Date: Wed Aug 9 14:57:55 2023 -0700
Spawn adversarial and normal banking stages (anza-xyz#113)
Rewritten to match the upstream scheduler code as of anza-xyz#5467 by Illia
Bobyr <illia.bobyr@anza.xyz>.
This change includes all of the following changes:
---
Author: Illia Bobyr <illia.bobyr@solana.com>
Date: Mon Oct 2 14:03:46 2023 -0700
adversary: test_scheduler => attack_scheduler (anza-xyz#175)
We mostly talk about attacks when we discuss the functionality this
code supports. Considering that we have a lot of other kinds of tests,
it seems a bit clearer to use "attack" in this part of the code.
Author: Illia Bobyr <illia.bobyr@solana.com>
Date: Tue Oct 3 15:21:33 2023 -0700
adversary: test_generators => transaction_generators (anza-xyz#178)
We mostly talk about "attacks" rather than "tests" in this part of the
code. And even the main type in the `test_generators` module is called
`TransactionGenerator`.
Author: kirill lykov <kirill.lykov@solana.com>
Date: Thu Feb 8 10:52:48 2024 +0100
replay: atomicbool instead of singleton for dropping packets (anza-xyz#224)
* use atomicbool instead of singleton to drop packets
* add use for Ordering
Co-authored-by: Illia Bobyr <ilya.bobyr@gmail.com>
Signed-off-by: kirill lykov <lykov.kirill@gmail.com>
* rename drop_packets
---------
Signed-off-by: kirill lykov <lykov.kirill@gmail.com>
Co-authored-by: Illia Bobyr <ilya.bobyr@gmail.com>
Author: Brennan <brennan.watt@anza.xyz>
Date: Fri Mar 22 06:45:29 2024 -0700
remove dead code (anza-xyz#298)
Author: Andrew Fitzgerald <apfitzge@gmail.com>
Date: Tue Jul 16 14:49:59 2024 -0500
AdversarialBankingStage: Remove warning (anza-xyz#370)
Remove warning. Adjust names
Problem
We previously added a metric for tracking gossip push messages through the network in PR: #32725. However, this metric does not account for redundant pull requests.
Redundant Pull: A node receives a message via `PullResponse` and then receives the same message via `Push`.
Redundant Pulls prevent us from accurately calculating how well messages are propagating via `Push`.
Summary of Changes
Add a metric to report when a node receives a NEW message via `PullResponse` (`gossip_crds_sample_pull`).
Add a metric to report when a node receives a message via `Push` but fails to insert it (`gossip_crds_sample_fail`).
Identifying redundant Pulls:
`gossip_crds_sample_pull`
`gossip_crds_sample_fail`
Simulation Results
In a 100 node simulation, I saw Redundant Pulls occur somewhat frequently. This indicates Redundant Pulls may be the reason for the discrepancy between the simulated `Push` coverage and the measured `Push` coverage.
Possible Issues
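For reference, the matching step listed under "Identifying redundant Pulls" could be done offline roughly as shown below. This is a minimal sketch and not code from this PR. The `Sample` struct, its fields (`node`, `signature`, `timestamp_ms`), and the function `find_redundant_pulls` are hypothetical names. It assumes each `gossip_crds_sample_pull` and `gossip_crds_sample_fail` datapoint can be exported with the reporting node, an identifier for the value, and a timestamp. It also assumes that a failed `Push` insert following a pull of the same value on the same node counts as a Redundant Pull, per the definition above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of one exported metric datapoint.
/// All three fields are assumptions made for illustration.
#[derive(Debug, Clone)]
struct Sample {
    node: String,      // node that reported the datapoint
    signature: String, // identifier of the gossip value
    timestamp_ms: u64, // time the datapoint was reported
}

/// Returns the sorted, de-duplicated (node, signature) pairs for which a
/// node first received a value via PullResponse and later failed to insert
/// the same value when it arrived via Push.
fn find_redundant_pulls(pulls: &[Sample], push_fails: &[Sample]) -> Vec<(String, String)> {
    // Earliest pull time seen for each (node, signature) pair.
    let mut first_pull: HashMap<(&str, &str), u64> = HashMap::new();
    for s in pulls {
        let t = first_pull
            .entry((s.node.as_str(), s.signature.as_str()))
            .or_insert(s.timestamp_ms);
        *t = (*t).min(s.timestamp_ms);
    }

    // Keep only failed push inserts that happened at or after a pull of
    // the same value on the same node.
    let mut redundant: Vec<(String, String)> = push_fails
        .iter()
        .filter(|f| {
            first_pull
                .get(&(f.node.as_str(), f.signature.as_str()))
                .map_or(false, |&t| t <= f.timestamp_ms)
        })
        .map(|f| (f.node.clone(), f.signature.clone()))
        .collect();
    redundant.sort();
    redundant.dedup();
    redundant
}
```

A failed push insert can have other causes (for example, the same value pushed twice), so only failures that match an earlier pull sample on the same node are counted here.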