SIMD-0061: Fair congestion control with intra-block exponential local base fee #61
ryoqun wants to merge 165 commits into
| to *deterministically* define active thread count (`TC_a`), additionally record | ||
| transaction termination events into poh stream. |
There was a problem hiding this comment.
How does this give us thread count? PoH recording is single-threaded.
There was a problem hiding this comment.
hey, thanks for jumping in.
like this?
serialized new ledger entries:
- SanitizedTransactionA
- TerminationMarkerForTransactionA // maybe just holds Signature or transaction hash.
- SanitizedTransactionB
- SanitizedTransactionC
- TerminationMarkerForTransactionB
- TerminationMarkerForTransactionC
..1: TC_a: 0
1..2: TC_a: 1
2..3: TC_a: 0
3..4: TC_a: 1
4..5: TC_a: 2
5..6: TC_a: 1
6.. : TC_a: 0
this proposal assumes a single-threaded banking scheduler, not thread-local-MI.
so, banking threads newly start to record transaction terminations via the scheduler thread. single-threaded poh can then linearize these events without ambiguity, and replay stage can reconstruct TC_a given just the list of these ledger entries.
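A minimal sketch of how replay could rebuild `TC_a` from the linearized entries in the example above. The `LedgerEntry` type, its variant names, and the use of a `u64` in place of a signature or transaction hash are assumptions made for illustration, not part of the proposal.

```rust
use std::collections::HashSet;

/// Hypothetical ledger entry kinds mirroring the example above.
/// The `u64` stands in for a signature or transaction hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LedgerEntry {
    SanitizedTransaction(u64),
    TerminationMarker(u64),
}

/// Walks the linearized entries and returns TC_a as observed right after
/// each entry. Returns `None` if a transaction starts twice or a
/// termination marker has no matching running transaction.
pub fn active_thread_counts(entries: &[LedgerEntry]) -> Option<Vec<usize>> {
    let mut running: HashSet<u64> = HashSet::new();
    let mut counts = Vec::with_capacity(entries.len());
    for entry in entries {
        match *entry {
            LedgerEntry::SanitizedTransaction(id) => {
                if !running.insert(id) {
                    return None; // duplicate start
                }
            }
            LedgerEntry::TerminationMarker(id) => {
                if !running.remove(&id) {
                    return None; // termination without a running transaction
                }
            }
        }
        counts.push(running.len());
    }
    Some(counts)
}
```

Feeding it the six entries from the example (A, end of A, B, C, end of B, end of C) yields the same sequence as the `1..2` through `6..` rows listed above.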
There was a problem hiding this comment.
It seems as a greedy leader I can lie about the termination in order to maximize parallelism, in order to raise fees. Seems like it'd be easy to just keep a cache of the terminations and just record them right at the end of the slot.
Even with conflicts, could just record & (remove from cache) cached tx-terminations that are in conflict.
There was a problem hiding this comment.
yeah, so, this proposal strongly disincentivizes parallelism manipulation. as you pointed out, the base fee will indeed be raised, but throughput will be forcibly limited in return. the rate limit is adjusted so that the total transaction fee after manipulation would never be greater than when scheduled normally.
in this proposal, cu & active thread count are both available for deterministic replaying.
| ## Security Considerations | ||
| What security implications/considerations come with implementing this feature? | ||
| Are there any implementation-specific guidance or pitfalls? |
There was a problem hiding this comment.
this creates opportunity for a new type of MEV where an actor purposefully schedules transactions in a way that creates larger base fees. mango's validator could try to increase the fees for zeta to gain monetary benefit through attracting traders with lower fees.
There was a problem hiding this comment.
hey, thanks for jumping in.
the actual base fee calculation will be defined so that such arbitrary reordering of txes won't result in higher base fees or in larger total (base + priority) fees.
so, given 10 honest zeta transactions, if naive scheduling results in 1000 lamports being burnt, any reordering also results in 1000 lamports.
the base fee in this proposal is intended to throttle particularly hot address's tx throughput as the first objective.
also, tx specifies fixed base fee and fixed priority fee and it's 100% paid.
There was a problem hiding this comment.
you can still re-order across slots, e.g stuff everything into the last slot
There was a problem hiding this comment.
no worries. i haven't had time to write up the details, but i think this proposal can address this just like intra-block malicious reorderings.
as a teaser, the general idea is that the base fee is reset across block boundaries, unlike eip-1559. and raising base fees like that won't help because these hot addresses are forced to throttle executions according to the new consensus logic. so, the best strategy for leaders is to spread out hot txes as much as possible across ledger ticks.
| also requested fee is basis for fee calc, block fullness calc, not the actual | ||
| cu. | ||
| - to prevent bad behavior, rebate 50% of (requested CU - actual CU)? |
There was a problem hiding this comment.
This conflicts with the move toward asynchronous execution
There was a problem hiding this comment.
a quick counterargument against async exec is unpredictable latency for simple payments amid nft mints / volatile markets. i need to grok the details, but i think a merchant can't exclude the possibility of a classic double-spend when some delay occurs in async exec, maybe?
There was a problem hiding this comment.
also, market makers don't like async exec as far as i had chat with subject-matter experts at #16 , right? :)
btw, if not, i can strongly push my scramble tx thing: solana-labs/solana#23837
There was a problem hiding this comment.
I don't see the discussion of async in #16, can you link? There's several hundred comments now, and after expanding all and searching I'm not seeing any "aync"
There was a problem hiding this comment.
i remember we discussed the following issue with async (but in other SIMD):
estimating CU is terribly hard for HFT, they often need to over-request because failing TX are way worse than slightly more expensive gas fees, unused CU accounting for the purpose of block packing requires a sync leader
There was a problem hiding this comment.
> also, market makers don't like async exec as far as i had chat with subject-matter experts at #16 , right? :)
> I don't see the discussion of async in #16, can you link? There's several hundred comments now, and after expanding all and searching I'm not seeing any "aync"
there's no explicit mention of async. but i generally sensed that praw is wanted so badly because market makers want to refresh their orders periodically and reliably to begin with. and arb bots and async exec are enemies in that regard, i think.
There was a problem hiding this comment.
> estimating CU is terribly hard for HFT, they often need to over-request because failing TX are way worse than slightly more expensive gas fees
yeah, i want to improve the situation as well.... however, this is very difficult. used-cu based fee calc creates no incentive for users to estimate their cu correctly. likewise, not even a small portion of the fees for unused cu can be rebated to users. That's because the rest of it would still partially go to the leaders, meaning leaders are incentivized to reorder txes so that they fail.
> unused CU accounting for the purpose of block packing requires a sync leader
... also, unused cu can't be utilized for banking stage scheduling (block packing). this would also create an incentive for leaders to make txes fail by reordering in order to pack blocks more densely, even if there's no direct incentive from manipulating fee collection by reordering. (note that the intrinsic value of a block is the sum of burnt fees, i.e. the tx count in a block; such blocks will be preferred by the cluster's fork choice.) however, unused CU accounting will be fully exploited for the fastest replay stage scheduling (block verifying). a new ledger format allows leaders to embed the actual used cu after tx execution completes (still a half-baked idea)
btw, i prefer a sync leader for a different reason. :)
|
@ryoqun why is it necessary? Digital systems aren't heat engines, they can safely run at 100% capacity. The problem with automatic increases is that 1 single use case will increase the price of using the chain for everything. If there is spare capacity, I would expect the capacity to be filled up by opportunistic users that use the chain up to price $X per tx. That doesn't mean they want to price out anyone else that arrives first at $X, they simply want to use up spare capacity. There is nothing wrong with that kind of user, and because of defi, there is basically always demand for any slack capacity. If the floor price is forced to automatically increase until that user is priced out, that means that any chain that has defi is going to be more expensive than a dedicated chain for payments, or some other non-defi use case, as defi will always consume slack capacity. The challenge for us is to construct a fee market for multiple use cases where no use case would have a better price in a dedicated chain. By aggregating resources and sharing the cost of the validation, they should all actually enjoy a cheaper price assuming there is equal capacity. That means the state of 1 use case can't impact global pricing.
| when not full, maximize throughput of each of any single threaded transaction | ||
| executions. note that, this mode exponentially cools down any hot addresses if | ||
| any. |
hey, thanks for jumping in while this is still very much a draft.. as i noted here: https://github.com/solana-foundation/solana-improvement-documents/pull/61/files#r1263813462, slack capacity will be fully utilized and local base fee won't be increased by this proposal:
so, the automatic increase comes into effect only if there are more concurrently executable series of transactions than the cluster's super-majority can replay in time with their multi-core hardware. other than that, i strongly agree with your comment. i'll do my best to meet that challenge under the nuances mentioned here.
yeah, these kinds of transactions will be categorized as idling prioritized transactions in this proposal. These transactions will only start to compete with each other when there are more of them than the aforementioned cluster hw capacity. so, this proposal tries to evenly (read: fairly) distribute block space across more-than-cluster-core-count on-chain programs which want to utilize the idling capacity. Otherwise, only the well-capitalized top cluster-core-count on-chain programs will consume the idling block space. poor, indie block-chain games. :)
|
fyi, i just pushed some simulation code in rust. i need to write down proper wordz from it... it has minor shortcomings, but overall i think it should be good enough to remove block/write cu limits altogether.
|
I’ll try to summarize what I understood, I must admit I’m not sure I can fully follow:
Would appreciate clarification |
|
@mschneider yeah, that's almost correct! from now on, i'll try to fill this proposal with proper words. others' high-level perception helps to guide the upcoming arrangement of the detailed design writings.. thanks!
at the initial stage, this will be some hard-coded value like 4 (most conservative, being aligned with the current default non-vote thread count) or 10 (should be safe with my replay scheduler), given current leaders' hw. dunno what you mean by vote. but thread count will be automatically derived/updated at epoch boundaries. no human interaction is needed.
| (still in half-finished...) | ||
| - Write lock cu limit is bad (bot can lock out at the very first of block for | ||
| the entire duration of whole blocktime (400ms) |
There was a problem hiding this comment.
In order to lock out an account for an entire slot, wouldn't a bot need to time its 10M cu txs to be at the top of the queue when a block starts, by paying the highest priority fee for that account? If SIMD #50 is approved, the bot would be economically disincentivized, wouldn't it?
There was a problem hiding this comment.
(thanks for peeking into this half-baked proposal)
> In order to lock out an account for an entire slot, wouldn't a bot need to time its 10M cu txs to be at the top of the queue when a block starts, by paying the highest priority fee for that account?
yes. i think this is a problem in itself. this creates uneven value of time only around block boundaries. ideally, there should be some measure to unlock the limit if economically desired by users.
> If SIMD #50 is approved, the bot would be economically disincentivized, wouldn't it?
i guess that's limited. with 10M cus, it's just an auction in increments of 10_000 lamports? I think it's still cheap, considering arb's profits could be unbounded.
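The 10_000 lamport figure can be reproduced with simple arithmetic. The sketch below assumes a minimum price step of 1,000 micro-lamports per CU, a value chosen only because it reproduces the number in the comment above; the function name, the step parameter, and the round-up behaviour are all hypothetical.

```rust
/// Lamports added to a transaction's fee by one step of the per-CU price.
/// `price_step_micro_lamports` is a hypothetical minimum increment per CU.
pub fn fee_increment_lamports(requested_cu: u64, price_step_micro_lamports: u64) -> u64 {
    const MICRO_LAMPORTS_PER_LAMPORT: u128 = 1_000_000;
    // Widen to u128 so the multiplication cannot overflow.
    let micro = requested_cu as u128 * price_step_micro_lamports as u128;
    // Assumption: round up, so a nonzero step never rounds down to zero lamports.
    ((micro + MICRO_LAMPORTS_PER_LAMPORT - 1) / MICRO_LAMPORTS_PER_LAMPORT) as u64
}
```

With `requested_cu = 10_000_000` and a step of 1,000 micro-lamports, each bid increment costs 10,000 lamports, which is the auction granularity the comment refers to.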
There was a problem hiding this comment.
> it's still cheap.
it really depends on who you're asking 😉
| is newly introduced by this proposal. This increase will be calculated | ||
| exponentially, measured by the CU consumed by each addresses at the moment. | ||
| This means a transaction must cost the sum of `requested_cu * base_cu_price` | ||
| for all of its write-locked addresses at least. This results in selectively |
There was a problem hiding this comment.
is base_cu_price per write-locked account, will/should there be a base-cu-price for the block itself (in case the congestion is for block)?
There was a problem hiding this comment.
> is `base_cu_price` per write-locked account
yes.
> will/should there be a `base-cu-price` for the block itself (in case the congestion is for block)?
no, there shouldn't be any such thing, as written below in the proposal.
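As a rough sketch of the pricing rule quoted from the proposal (a transaction must cost at least the sum of `requested_cu * base_cu_price` over its write-locked addresses, with no block-wide price), consider the following. The function name, the string addresses, the lamports-per-CU unit, and the `default_price` fallback for addresses with no recorded price are assumptions for illustration only.

```rust
use std::collections::HashMap;

/// Minimum fee for a transaction under per-address pricing.
/// `base_cu_price` maps a write-locked address to its current price per CU;
/// addresses without an entry fall back to `default_price` (hypothetical).
pub fn min_base_fee(
    requested_cu: u64,
    write_locked: &[&str],
    base_cu_price: &HashMap<&str, u64>,
    default_price: u64,
) -> u64 {
    // Sum the per-address prices first, then scale by the requested CU.
    // This equals the sum of requested_cu * base_cu_price over all addresses.
    let price_sum: u64 = write_locked
        .iter()
        .map(|addr| base_cu_price.get(*addr).copied().unwrap_or(default_price))
        .sum();
    requested_cu * price_sum
}
```

Under this sketch, a transaction that write-locks one hot address and one cold address pays for both, while a transaction that touches only cold addresses is unaffected by the hot address's price.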
| This means a transaction must cost the sum of `requested_cu * base_cu_price` | ||
| for all of its write-locked addresses at least. This results in selectively | ||
| pricing out crowded subset of transactions waiting for block inclusion, while | ||
| allowing other transactions to be processed for block inclusion. |
There was a problem hiding this comment.
It's not perfectly clear to me which particular scenario this is trying to solve, but my intuition would be to make the base fee dynamic while leaving priority as is.
There was a problem hiding this comment.
assume the block is saturated like this: there are L highly lucrative trading pairs like sol/usdc, btc/usdc, eth/usdc and N banking threads, with L >= N and 100 market makers quoting 10/s, each paying 1 cent in priority fee.
so, normal spl transfers will cost at least 1 cent (assuming quoting and transfer consume around the same amount of cu).
also, when the market gets volatile on some economic news, normal spl transfers will need to pay a higher priority fee during that period than on normal days.
I think these are bad.
There was a problem hiding this comment.
I see. If all defi transactions take up block space (after taking up all of the write-lock account limit), one might be able to say to a transfer transaction that "block limit is reached", therefore it is indeed time to use priority fee. There's probably room in the scheduler to optimize it.
| On top of the direct appreciation of aforementioned fairness, this proposal | ||
| also obsoletes both the existing block-wide CU limit and the account-write CU | ||
| limit to overcome their inherent unfairness and problems. Also, no global |
There was a problem hiding this comment.
Will the dynamic base-cu-price fully safeguard replay timeliness (e.g. <=400ms)? Is it still somewhat desirable to have a hard limit to be safe?
There was a problem hiding this comment.
hmm no? not being bankless, leaders are responsible for and incentivised to generate blocks that can be replayed in around 400ms. otherwise, they just risk missing block rewards.
likewise, replay should discard long-running blocks, assuming others won't vote on them either.
| Less if self.is_congested => (false, self.reset_counter + 1), | ||
| Equal | Greater if !self.is_congested => (true, self.reset_counter), | ||
| _ => (self.is_congested, self.reset_counter), | ||
| }; |
There was a problem hiding this comment.
Wondering why congestion is determined by the number of txs instead of accumulated CUs?
There was a problem hiding this comment.
nice question. and really thanks for even taking a look into the sim code. :)
accumulated cu could be a possibility. however, i think the number of actively running txes is more direct. accumulated cu is hard to use to define congestion with a very short feedback delay, and it could be skewed by overpaid txes so far.
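A self-contained sketch of the congestion toggle from the quoted sim code, driven by the active transaction count as described above. Only the three match arms come from the quoted fragment; the struct name, the `update` method, and the choice of comparing `active_tx_count` against `thread_count` are assumptions about the surrounding code.

```rust
use std::cmp::Ordering;

/// Hypothetical container for the two fields used in the quoted match.
#[derive(Debug, Default)]
pub struct CongestionState {
    pub is_congested: bool,
    pub reset_counter: u64,
}

impl CongestionState {
    /// Assumed operands: the number of currently running transactions
    /// compared against the number of available execution threads.
    pub fn update(&mut self, active_tx_count: usize, thread_count: usize) {
        let (is_congested, reset_counter) = match active_tx_count.cmp(&thread_count) {
            // Dropping below capacity ends congestion and counts one reset.
            Ordering::Less if self.is_congested => (false, self.reset_counter + 1),
            // Reaching or exceeding capacity starts congestion.
            Ordering::Equal | Ordering::Greater if !self.is_congested => {
                (true, self.reset_counter)
            }
            // Otherwise the state is unchanged.
            _ => (self.is_congested, self.reset_counter),
        };
        self.is_congested = is_congested;
        self.reset_counter = reset_counter;
    }
}
```

Because the state flips as soon as the running count crosses the thread count, the feedback delay is a single scheduling event, which is the property argued for above.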
There was a problem hiding this comment.
this is definitely an open question. Appreciate you putting forward your thoughts.
|
Well, I noticed this idea is broken. closing and won't revisit this.... a sybil attack is too easy. anyone can submit cu-heavy fake txes which lock unrelated addresses, so that their real tx's write address gets cooled down...
just starting. :)
this is my pet idea that's been sitting in my mind for a while. considering the recent talk at discord's #ecnomics, it seems it's prime time to engage in this battle field. :)
this is a much more serious rambling than this (https://discord.com/channels/428295358100013066/987445907031199844/1080717702672429107):
related (read: competing) proposals.
dynamic base fees
#4
program rebatable account write fees:
#16
asynchronous program execution:
#45
increase prioritization fee:
#50
bankless
#5
dynamic base fee2?
#19