
eth_getLogs block range limitations on BSC #113

Closed · dvcrn opened this issue Mar 15, 2021 · 16 comments

dvcrn commented Mar 15, 2021

I started getting "block range exceeded" error messages when querying events, and on googling I found two threads on the Binance community forums.

So it looks like there was a change to BSC nodes so that they no longer return more than 5,000 blocks' worth of event/log data.

I'm opening this issue here because BSC claims (also in the readme):

EVM-compatible: Supports all the existing Ethereum tooling along with faster finality and cheaper transaction fees.

However, with this change to eth_getLogs, this statement is no longer true, because a lot of tooling around events no longer works, such as the calls mentioned in the threads above:

    const events = await contract.getPastEvents('eventName', {
      fromBlock: 0,
      toBlock: 'latest',
    });

or

    await contract.queryFilter(contract.filters.EventName());

This is problematic because getting all (indexed) events/logs from a contract now requires querying 5,000 blocks at a time, even when those blocks don't contain a single event/log from the specified contract. And with the number of blocks being created every hour, the number of requests that has to be sent to the nodes is absurd.

Currently https://docs.binance.org/smart-chain/developer/rpc.html suggests:

You can make eth_getLogs requests with up to a 5K block range. If you need to pull logs frequently, we recommend using WebSockets to push new logs to you when they are available.

But this now also requires off-chain solutions, like building a separate server/node just to listen for events and store them. The documentation also recommends using WebSockets, but no official BSC WebSocket endpoint is available besides the community-provided one that has

no quality promised
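
For reference, the push-based approach the docs recommend looks roughly like this. A minimal sketch using web3.js 1.x; the wss:// endpoint URL and contract address are placeholders, since (as noted above) no official endpoint exists:

// Subscribe to new logs over WebSockets instead of polling eth_getLogs.
// The endpoint and contract address below are hypothetical placeholders.
const Web3 = require("web3")

const web3 = new Web3("wss://your-bsc-websocket-endpoint")

const subscription = web3.eth.subscribe(
  "logs",
  {
    address: "0xYourContractAddress", // hypothetical
    topics: [] // optionally filter on the event signature hash
  },
  (error) => {
    if (error) console.error(error)
  }
)

// Persist each pushed log off-chain so historical queries never hit eth_getLogs.
subscription.on("data", (log) => console.log(log))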

Expected behaviour

  • Return all events/logs

Actual behaviour

  • Errors when range is over 5000 blocks
yutianwu (Contributor) commented Mar 15, 2021

Running your own node is always recommended if you want all of the features. The limit is meant to improve the experience of all of the community users, since requests like yours cost too many resources.

dvcrn (Author) commented Mar 15, 2021

Even if I ran my own node, users of BSC Wallet or MetaMask would be connected to the main nodes and not my own, and BSC Wallet doesn't even allow changing networks. So using the provider that the user brings to interact with my app will still result in the errors above.

yutianwu (Contributor)

Can you improve the request so you do not need to start from the first block?

dvcrn (Author) commented Mar 15, 2021

In my specific case, the information emitted by the contract as events is crucial to the operation of the dApp, which is why the events were defined as indexed. But making the request small enough to fit into 5,000 blocks (about 4 hours) is not doable here (and shouldn't be needed, because Ethereum's getLogs supports filters like 'latest' and 'earliest' for exactly this). I wouldn't even know which blocks to check for data related to my contracts.

As a workaround I could store all of that in the smart contract itself, but that would result in increasingly expensive transactions over time, so it isn't ideal either.

So the only viable way to do this on BSC currently is to move it off-chain into my own infrastructure, which is a bit sad because it had been working great on-chain on both ETH and BSC without extra tooling.

What prompted this sudden change? Is it related to BSC's smaller network size relative to its load? I can't imagine that sending numerous requests with a 5,000-block range to get all the data is any better in terms of load.

yutianwu (Contributor)

It's kind of like the tragedy of the commons: not everyone needs to query from the first block (there are 5,697,392 blocks so far), but these requests do affect the other users' experience. So if you need the data from the first block, it would be better to store the data you need in your own service.

dvcrn (Author) commented Mar 15, 2021

Yeah, I completely understand, and I will be implementing an off-chain solution for my specific use case to keep the data from contract creation to the latest block in sync with my server.

But to the core of the issue: I think 5,000 blocks (roughly 4 hours) is not a lot to work with because, like you said, there are a lot of blocks on this chain. Querying one day's worth of data (about 28,800 blocks at 3-second block times) already takes 6 getLogs requests per contract, per user, if the data isn't cached on a server somewhere in the middle. That's a lot of requests.

And because of the nature of this API, we just don't have a way to know which blocks include events related to the contract and which don't. This makes indexed events much less useful, because if the most recent event on a small contract is from last week, I'd have to keep making requests in 5,000-block intervals until I find it, which by the math above is about 42 requests. That's almost a brute-force search through all the blocks, and it doesn't use the index at all.

So this still breaks standardized Ethereum tooling. I think changes like this need to be communicated a bit more prominently, ideally with a recommended workaround available.
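
To make the cost of that scan concrete, here is a minimal sketch of the brute-force search described above, using web3.js; the contract instance and event name are whatever the dApp defines:

// Walk backwards from the latest block in 5,000-block windows until the
// event turns up -- a brute-force search that cannot use the log index.
async function findMostRecentEvent(web3, contract, eventName, windowSize = 5000) {
  let toBlock = await web3.eth.getBlockNumber()
  while (toBlock >= 0) {
    const fromBlock = Math.max(toBlock - windowSize + 1, 0)
    const events = await contract.getPastEvents(eventName, { fromBlock, toBlock })
    if (events.length > 0) return events[events.length - 1] // most recent match
    toBlock = fromBlock - 1 // step one window further into the past
  }
  return null // the contract never emitted this event
}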

saurik commented Apr 2, 2021

(I wrote this as a comment on #101 before realizing an issue had been filed about it.)

This limit essentially breaks the ability to write sane GUIs using your chain, at least if the user is going to use your public endpoints :(. I do understand the interest in limiting the amount of I/O from these calls, but I will strongly argue that the correct way to approach this problem is to limit the number of results or the complexity of the query, not to put a hard cap on the range of requests.

The way this works on Alchemy's Ethereum endpoints, for example, is that they have a hybrid strategy: they have an even tighter range if you expect to get a lot (and it is a lot...) of logs, but if you aren't getting many logs (which is going to, by far, be the common case: these queries should be "easy") you have no range limit (allowing you to build a way to quickly scour history for rare events, or do "paging" to find all events).

You can make eth_getLogs requests with up to a 2K block range and no limit on the response size. You can also request any block range with a cap of 10K logs in the response.

To put it strongly: if my contract has logged five events ever for some specific log index, the whole point of this feature is that I am paying a non-trivial amount of gas (aka, money ;P: these logs cost 375 gas per index and 256 gas per word) for nodes to index this information and let me find it later. If Binance is going to run public endpoints for clients but not let me find that information, those endpoints just aren't really correctly functional.

Think about what this means for dapps that are trying to show you the history of multisig transactions on your contract: BSC's 5,000-block "range limit", given your very fast three-second block time, means a single query can only cover just over four hours of data... finding a confirmation event that happened even just in the past month (roughly 864,000 blocks, so about 173 queries) would require almost two hundred requests!
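
Under a result-capped scheme like the one described here, a client can query the whole range and subdivide only on failure, so rare events resolve in a handful of requests. A minimal sketch, assuming web3.js and that the node signals an over-limit query with an error:

// Try the full range first; if the node rejects it, bisect and recurse.
async function getLogsBisect(web3, filter, fromBlock, toBlock) {
  try {
    return await web3.eth.getPastLogs({ ...filter, fromBlock, toBlock })
  } catch (error) {
    if (fromBlock >= toBlock) throw error // a single block cannot be split
    const mid = Math.floor((fromBlock + toBlock) / 2)
    return [
      ...(await getLogsBisect(web3, filter, fromBlock, mid)),
      ...(await getLogsBisect(web3, filter, mid + 1, toBlock))
    ]
  }
}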

dvcrn (Author) commented Apr 2, 2021

You can make eth_getLogs requests with up to a 2K block range and no limit on the response size. You can also request any block range with a cap of 10K logs in the response.

It's worth noting that BscScan applies a 10k cap on the results, not a 5k block limit. So if you query from earliest to latest, you'll get at most 10k logs, and to get the next batch you have to add paging with a start block. So in a way, BscScan is the only way to get out-of-the-box, event-related standard functionality without running your own event cache.
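
A sketch of that paging flow, assuming the Etherscan-style logs endpoint BscScan exposes and the 10k cap described above; the resume-from-last-block strategy is an assumption, and logs cut off mid-block may need deduplication:

// Page through the BscScan logs API (uses the global fetch, Node 18+ or browser).
// When a response hits the assumed 10k cap, resume from the block after the
// last one returned (blockNumber comes back hex-encoded).
async function fetchAllLogs(address, apiKey) {
  const logs = []
  let fromBlock = 0
  for (;;) {
    const url =
      "https://api.bscscan.com/api?module=logs&action=getLogs" +
      `&fromBlock=${fromBlock}&toBlock=latest&address=${address}&apikey=${apiKey}`
    const batch = (await (await fetch(url)).json()).result
    if (!Array.isArray(batch) || batch.length === 0) break
    logs.push(...batch)
    if (batch.length < 10000) break // under the cap: nothing left to page
    fromBlock = parseInt(batch[batch.length - 1].blockNumber, 16) + 1
  }
  return logs
}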

BennyTheDev commented Apr 21, 2021

Ethereum is processing around 1.5 million transactions per day at the moment, and you can easily use "earliest" to "latest" with the nodes that are baked into MetaMask without fearing any rate limits. Telling people to run their own node is not the right strategy, especially if you want your dApp to be fully decentralized (hence the "d"); if that's not possible and you need to run certain things on a server, it becomes centralized and is no longer a dApp. I think we can agree that this wasn't the plan.

Besides that, it is almost impossible to index contract event data properly without massive block delays, which wouldn't add any value to the user experience anyway.

However, an interesting aspect is that things start to grow exponentially from around block 5.5 million.

If you perform a getPastEvents using web3 on a contract with fully enabled features from block 0, you will see what I mean at roughly that block height. Things get looked up pretty fast and then suddenly slow down dramatically.

When you plot that, you will see exponential growth, which is bad and needs to be addressed. Just throwing stronger hardware at it will only help for so long before the entire thing comes to a halt. So going the optimization route is, imho, the way to go (we are already using some really powerful metal and still run into tons of issues).
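
A rough sketch of that measurement, assuming a web3.js contract instance: time getPastEvents over fixed-size chunks and plot the duration against block height:

// Time each fixed-size getPastEvents chunk; plotting elapsed ms against the
// starting block shows where lookups begin to slow down.
async function profileEventLookups(web3, contract, chunkSize = 5000) {
  const latest = await web3.eth.getBlockNumber()
  for (let from = 0; from <= latest; from += chunkSize) {
    const to = Math.min(from + chunkSize - 1, latest)
    const started = Date.now()
    await contract.getPastEvents("allEvents", { fromBlock: from, toBlock: to })
    console.log(from, Date.now() - started) // x = start block, y = ms elapsed
  }
}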

I posted the issue below today, and I strongly feel it needs to be cross-checked. If confirmed and fixable, a good chunk of the issues with lagging blocks could be solved:

#160

Besides that, it would help not to use LevelDB but something more robust and faster. I am pretty sure LevelDB can be swapped out, but I am not enough of a Go or geth pro to do it myself yet.

shivam0x

@dvcrn Can you elaborate a bit on your BscScan comment above?

unclezoro (Collaborator)

Closing this. A lot of RPC providers have such limitations for security reasons.

saurik commented Nov 29, 2021

@guagualvcha FWIW, it isn't at all common to have the kind of restriction BSC implemented. As noted earlier in this thread, other providers tend to have a multi-modal limit that allows for wide searches when the number of results is small (I would presume under the belief that these can and should be served by a trivial index). I know MATIC had issues, but they claim those were a bug they were working on, and RSK's public endpoint is simply missing eth_getLogs; but after having tested on numerous other chains, I've never seen the kind of restriction implemented by BSC (though if you have some examples, I'm totally willing to believe we are looking at entirely disjoint sets of chains or something).

Garito commented Dec 9, 2021

Given that Ethereum has no limitation of this kind, this looks like a bad excuse on this network's part.
When I query Ethereum it takes milliseconds, while querying BSC takes tens of seconds, making this network useless (I have to look up the contract's first and last transaction, divide the range into 4,999-block parts, and loop over those ranges calling getPastEvents).
How pathetic is this?

niZmosis commented Jan 28, 2022

const getPastEvents = async ({
  chainId,
  provider,
  contractAddress,
  contractAbi,
  event,
  fromBlock,
  toBlock = "latest",
  chunkLimit = 0
}) => {
  try {
    if (!provider && !!chainId) {
      provider = getProviderForChainId(chainId)
    } else if (!provider) {
      return { events: [], errors: ["Assign chainId or provider"], lastBlock: null }
    }

    const contract = new provider.eth.Contract(contractAbi, contractAddress)

    const fromBlockNumber = +fromBlock
    const toBlockNumber =
      toBlock === "latest" ? +(await provider.eth.getBlockNumber()) : +toBlock
    const totalBlocks = toBlockNumber - fromBlockNumber
    const chunks = []

    if (chunkLimit > 0 && totalBlocks > chunkLimit) {
      const count = Math.ceil(totalBlocks / chunkLimit)
      let startingBlock = fromBlockNumber

      for (let index = 0; index < count; index++) {
        const fromRangeBlock = startingBlock
        const toRangeBlock =
          index === count - 1 ? toBlockNumber : startingBlock + chunkLimit
        startingBlock = toRangeBlock + 1

        chunks.push({ fromBlock: fromRangeBlock, toBlock: toRangeBlock })
      }
    } else {
      chunks.push({ fromBlock: fromBlockNumber, toBlock: toBlockNumber })
    }

    const events = []
    const errors = []
    for (const chunk of chunks) {
      await contract.getPastEvents(
        event,
        {
          fromBlock: chunk.fromBlock,
          toBlock: chunk.toBlock
        },
        async function (error, chunkEvents) {
          if (chunkEvents?.length > 0) {
            events.push(...chunkEvents)
          }

          if (error) errors.push(error)
        }
      )
    }

    return { events, errors, lastBlock: toBlockNumber }
  } catch (error) {
    return { events: [], errors: [error], lastBlock: null }
  }
}

ScottWallace

Cleaning up the correct answer by @LarryRyan0824 by putting it within a complete code block.

const getPastEvents = async ({
    chainId,
    provider,
    contractAddress,
    contractAbi,
    event,
    fromBlock,
    toBlock = "latest",
    chunkLimit = 0
  }) => {
    try {
      if (!provider && !!chainId) {
        provider = getProviderForChainId(chainId)
      } else if (!provider) {
        return { events: [], errors: ["Assign chainId or provider"], lastBlock: null }
      }

      const contract = new provider.eth.Contract(contractAbi, contractAddress)

      const fromBlockNumber = +fromBlock
      const toBlockNumber =
        toBlock === "latest" ? +(await provider.eth.getBlockNumber()) : +toBlock
      const totalBlocks = toBlockNumber - fromBlockNumber
      const chunks = []

      if (chunkLimit > 0 && totalBlocks > chunkLimit) {
        const count = Math.ceil(totalBlocks / chunkLimit)
        let startingBlock = fromBlockNumber

        for (let index = 0; index < count; index++) {
          const fromRangeBlock = startingBlock
          const toRangeBlock =
            index === count - 1 ? toBlockNumber : startingBlock + chunkLimit
          startingBlock = toRangeBlock + 1

          chunks.push({ fromBlock: fromRangeBlock, toBlock: toRangeBlock })
        }
      } else {
        chunks.push({ fromBlock: fromBlockNumber, toBlock: toBlockNumber })
      }

      const events = []
      const errors = []
      for (const chunk of chunks) {
        await contract.getPastEvents(
          event,
          {
            fromBlock: chunk.fromBlock,
            toBlock: chunk.toBlock
          },
          async function (error, chunkEvents) {
            if (chunkEvents?.length > 0) {
              events.push(...chunkEvents)
            }

            if (error) errors.push(error)
          }
        )
      }

      return { events, errors, lastBlock: toBlockNumber }
    } catch (error) {
      return { events: [], errors: [error], lastBlock: null }
    }
  }

aalmada commented Feb 17, 2023

I'm using TypeChain and Ethers, so I adapted the code by @niZmosis to the following:

import { BaseContract, providers } from "ethers";
import { TypedEvent, TypedEventFilter } from "../typechain-types/common";

// Some node providers have a limit on the block range that can be queried.
// This version of queryFilter splits the query into multiple calls but
// only if the single call fails.

const singleCall = async <TEvent extends TypedEvent>(
	contract: BaseContract,
	event: TypedEventFilter<TEvent>,
	fromBlockOrBlockhash?: string | number | undefined,
	toBlock?: string | number | undefined
): Promise<Array<TEvent>> =>
	(await contract.queryFilter(
		event,
		fromBlockOrBlockhash,
		toBlock
	)) as Array<TEvent>;

const multipleCalls = async <TEvent extends TypedEvent>(
	provider: providers.BaseProvider,
	contract: BaseContract,
	event: TypedEventFilter<TEvent>,
	chunkLimit: number,
	fromBlockOrBlockhash?: string | number | undefined,
	toBlock?: string | number | undefined
): Promise<Array<TEvent>> => {
	const fromBlockNumber = Number(fromBlockOrBlockhash);
	const toBlockNumber = Number(
		!toBlock || toBlock === "latest"
			? await provider.getBlockNumber()
			: toBlock
	);
	const totalBlocks = toBlockNumber - fromBlockNumber;

	let startingBlock = fromBlockNumber;
	const count = Math.ceil(totalBlocks / chunkLimit);

	let events: Array<TEvent> = [];
	for (let index = 0; index < count; index++) {
		const fromRangeBlock = startingBlock;
		const toRangeBlock =
			index === count - 1 ? toBlockNumber : startingBlock + chunkLimit;
		startingBlock = toRangeBlock + 1;

		events = events.concat(
			await singleCall(contract, event, fromRangeBlock, toRangeBlock)
		);
	}
	return events;
};

const queryFilter = async <TEvent extends TypedEvent>(
	provider: providers.BaseProvider,
	contract: BaseContract,
	event: TypedEventFilter<TEvent>,
	fromBlockOrBlockhash?: string | number | undefined,
	toBlock?: string | number | undefined,
	chunkLimit?: number
): Promise<Array<TEvent>> => {
	try {
		return await singleCall(contract, event, fromBlockOrBlockhash, toBlock);
	} catch {}

	if (chunkLimit && chunkLimit > 0) {
		return await multipleCalls(
			provider,
			contract,
			event,
			chunkLimit,
			fromBlockOrBlockhash,
			toBlock
		);
	}

	return [];
};

export default queryFilter;
