improve(BundleDataClient,SpokePoolClient): Log about duplicate events and getLatestProposedBundleData should rarely load data from scratch
#884
Conversation
We want to know when these events happen, but we currently believe this is possible with the indexed SpokePoolClient, so we shouldn't throw.
src/clients/SpokePoolClient.ts
Outdated
      return e.transactionHash === deposit.transactionHash && e.logIndex === deposit.logIndex;
    })
  ) {
    this.logger.warn({
If this is ever logged, this basically means that there is a bug somewhere. I think this should be an error.
this might take down the fast relayer though, no?
I think the relayers are the only bots which do not stop on errors, since they error a lot when txns revert in simulation. @pxrl could probably confirm that I'm not mistaken, though.
I'm not sure we should crash here. I'd rather just ignore the event and try our best to keep the events stored in memory correct by filtering out duplicates.
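For illustration, a minimal sketch of that "filter out duplicates" approach; the event shape and helper name here are assumptions, not the actual SpokePoolClient API:

```ts
// Hypothetical helper: append an event only if an identical (transactionHash, logIndex)
// pair isn't already stored, so the in-memory state stays free of duplicates.
interface SpokeEvent {
  transactionHash: string;
  logIndex: number;
}

function appendIfNew(events: SpokeEvent[], incoming: SpokeEvent): boolean {
  const isDuplicate = events.some(
    (e) => e.transactionHash === incoming.transactionHash && e.logIndex === incoming.logIndex
  );
  if (!isDuplicate) {
    events.push(incoming);
  }
  // The caller can emit a warn/error log whenever this returns false.
  return !isDuplicate;
}
```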
Ideally, this should never trigger, though. I'd argue that if this is logged, then we introduced a bug somewhere and should look into recent commits. Right now, we know that this would occasionally log in the relayer, but if this happened somewhere else in the future, we'd probably want to look into it. Warn logs just seem to be drowned out by the many other warn logs we emit elsewhere.
Hm, so if we crashed here, this would bring down the fast relayer, which seems to handle duplicate events OK today. Are we OK with that? Yes, it'd help us debug, but it would also be disruptive to a production service. WDYT?
If the fast relayer crashes on errors, then I agree, we should not log an error and go with what you currently have. I don't think the fast relayer crashes on logged errors, though. Am I wrong?
Errors result in a page being emitted and they're very visible in Slack, but the bot should continue to run.
Warnings are also reasonably visible in Slack but don't result in anyone getting paged.
Depending on how frequently this gets emitted, it could be a blessing or a curse. It'd be nice to avoid behaviour where it triggered a lot of rapid alerts. Given that it's in the SpokePoolClient update() function, I think it'd probably alert once and then disappear, so I think error should be OK.
getLatestProposedBundleData
should never load data from scratch
    (d) => d.transactionHash === deposit.transactionHash && d.logIndex === deposit.logIndex
  )
) {
  throw new Error("Duplicate deposit in bundleDeposits");
Can we add a log message here to identify the relevant deposits? That will speed up debugging time significantly if we're able to isolate them.
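Something along these lines, perhaps; the exact logger fields are just an illustration, not the current BundleDataClient code:

```ts
// Sketch: surface the conflicting deposits before throwing, so the log identifies
// exactly which two events collided.
const duplicate = bundleDeposits.find(
  (d) => d.transactionHash === deposit.transactionHash && d.logIndex === deposit.logIndex
);
if (duplicate !== undefined) {
  this.logger.error({
    at: "BundleDataClient",
    message: "Duplicate deposit in bundleDeposits",
    incomingDeposit: deposit,
    existingDeposit: duplicate,
  });
  throw new Error("Duplicate deposit in bundleDeposits");
}
```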
let n = hubPoolClient.hasPendingProposal() ? 1 : 2;

// eslint-disable-next-line no-constant-condition
while (true) {
Should we maybe make this conditional on n < something? It'd suck to somehow end up in an infinite loop here.
- while (true) {
+ while (n++ < 4) {
WDYT of faebdb8, where we fall back to just loading from scratch?
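Roughly the shape being discussed, as a hedged sketch; the callback names are stand-ins for the real BundleDataClient methods, not its actual API:

```ts
// Bounded lookback over recently validated bundles on Arweave; if nothing is found
// within the window, fall back to reconstructing the bundle data from scratch.
async function getRecentBundleData<T>(
  startIndex: number, // 1 if a proposal is pending, otherwise 2
  maxLookback: number, // e.g. 4, so the loop can never run forever
  fetchFromArweave: (nthLatestBundle: number) => Promise<T | undefined>,
  loadFromScratch: () => Promise<T>
): Promise<T> {
  for (let n = startIndex; n <= maxLookback; n++) {
    const data = await fetchFromArweave(n);
    if (data !== undefined) {
      return data;
    }
  }
  return loadFromScratch();
}
```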
src/clients/SpokePoolClient.ts
Outdated
if (
  this.fills[fill.originChainId] !== undefined &&
  this.fills[fill.originChainId].some(
    (f) => f.transactionHash === fill.transactionHash && f.logIndex === fill.logIndex
  )
) {
Prettier probably complains about this formatting, but it should be possible to drop the first check and simplify the statements.
- if (
-   this.fills[fill.originChainId] !== undefined &&
-   this.fills[fill.originChainId].some(
-     (f) => f.transactionHash === fill.transactionHash && f.logIndex === fill.logIndex
-   )
- ) {
+ const duplicate = this.fills[fill.originChainId]?.find(
+   (f) => f.transactionHash === fill.transactionHash && f.logIndex === fill.logIndex
+ );
+ if (duplicate) {
- getLatestProposedBundleData should never load data from scratch
+ getLatestProposedBundleData should rarely load data from scratch
// Check if bundle data exists on arweave, otherwise fallback to last published bundle data. If the
// first bundle block range we are trying is the pending proposal, then we'll grab the most recently
// validated bundle, otherwise we'll grab the second most recently validated bundle.
let n = hubPoolClient.hasPendingProposal() ? 1 : 2;
Why is it OK to do this? This function now won't necessarily return the latest proposed bundle data, and will instead likely return an older bundle's Arweave data. Wouldn't this mean that calculating the "latest" pool rebalance root isn't actually going to calculate the latest root, but some older one (unless the latest data has been published)?
Maybe we should rename this function to getLatestProposedRootWithArweaveData,
because if you look at how this function is used, it's used by the InventoryClient here to get the best approximation of the latest running balances in order to compute which chains are "over-allocated" here.
So, when a bundle is first proposed but doesn't have its bundle data published to Arweave, there is really little negative consequence for the InventoryClient. The InventoryClient will use an outdated running balance until the Arweave bundle data is published. If we assume the Arweave data gets published once per challenge period, then at worst the "latest" pool rebalance root will be one bundle behind. In reality, it only lags by ~15 mins until the bundle data is very likely published to Arweave.
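To make the trade-off concrete, here is an illustrative sketch (not the actual InventoryClient code, and the data shape is an assumption) of why a one-bundle lag is tolerable: running balances feed a relative comparison against targets, so stale data only delays a rebalancing decision rather than corrupting it.

```ts
// Hypothetical consumer: compute over-allocated chains from whatever running balances
// the latest Arweave-backed bundle provides.
async function getOverAllocatedChains(
  getLatestProposedBundleData: () => Promise<{ runningBalances: Record<number, bigint> }>,
  targetBalances: Record<number, bigint>
): Promise<number[]> {
  const { runningBalances } = await getLatestProposedBundleData();
  return Object.keys(targetBalances)
    .map(Number)
    .filter((chainId) => (runningBalances[chainId] ?? 0n) > targetBalances[chainId]);
}
```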
Improves relayer and monitor performance by loading more data from Arweave optimistically. Depends on across-protocol/sdk#884
This PR will make the relayer faster when the current bundle proposal is pending but not published to Arweave and more resilient to errors thrown during bundle reconstruction from scratch.
loadDataFromScratch is called by the proposer and disputer.