Specify Client Versions on Engine API #517
While this can be faked, it's a strict improvement over the status quo with no downsides, so I support it.
This would be easy to implement and would address a clear and present problem on the network.
Generally support this. I don't think we should expose it under the provided name though. I would rather expose it under engine_*. For one, the web3_* namespace isn't defined anywhere in this repo. Second, if we account for the possibility of specialized engine servers (thinking of something like a client multiplexer), then this response is extremely engine oriented.
If that sounds reasonable, can you update this PR with the name and add the schema for the method in both the openrpc spec and the engine common spec? It should look similar to engine_exchangeCapabilities, I think.
Adding this to the engine api is definitely an improvement. In order to fit within the graffiti, we should specify a field size limit or a strategy to encode the version info. Otherwise we might end up with responses like: |
I'm personally more in favor of standardizing |
In light of comments received so far I've pushed an alternative specification called
Personally I lean towards taking The definitions of
If desired, the space could be further reduced so that the bytes of the commit hash are embedded directly into the graffiti bytes (allowing the full consensus and execution client versions to be specified in 12 bytes). Standardizing the version specifications this way makes graffiti analysis easy regardless of what client pairs are used. Based on my testing, Side note: by design, none of the proposed client codes are valid hex so they won't be confused with the commit hash. |
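For illustration, here is a minimal Python sketch of the compact 12-byte layout described above. It assumes each client contributes a 2-character ASCII code followed by the first 4 raw bytes of its commit hash (2 + 4 = 6 bytes per client); that split is inferred from the 12-byte figure and is not confirmed by the spec. The codes `XE` and `XC` and the helper names are hypothetical.

```python
import string


def is_valid_hex(text: str) -> bool:
    """Return True if every character of a non-empty string is a hex digit."""
    return len(text) > 0 and all(c in string.hexdigits for c in text)


def pack_client(code: str, commit_hex: str) -> bytes:
    """Pack one client as 2 ASCII code bytes + 4 raw commit bytes (assumed layout)."""
    if len(code) != 2 or not code.isascii():
        raise ValueError("client code must be exactly 2 ASCII characters")
    if is_valid_hex(code):
        # A code such as "be" could be mistaken for part of a commit hash.
        raise ValueError("client code must not be valid hex")
    commit = bytes.fromhex(commit_hex)[:4]
    if len(commit) != 4:
        raise ValueError("need at least 4 bytes (8 hex characters) of commit hash")
    return code.encode("ascii") + commit


def pack_pair(el: tuple[str, str], cl: tuple[str, str]) -> bytes:
    """Pack execution and consensus client info into 12 bytes."""
    return pack_client(*el) + pack_client(*cl)


# Hypothetical codes "XE" (execution) and "XC" (consensus).
packed = pack_pair(("XE", "a1b2c3d4"), ("XC", "0123abcd"))
print(len(packed), packed.hex())
```

Raw commit bytes halve the space needed compared with hex text, at the cost of the graffiti no longer being readable as plain ASCII.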
Standardizing the returned version format is a great improvement. I would switch the name to "Client Version" engine_clientVersion instead of using the term identification. Same meaning but better memetics.
I like the new dedicated method, and agree with Lion that it should be called engine_clientVersion.
I'm in favour of making it mandatory after an appropriate adoption period.
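As a rough sketch of what a consensus client would send over the Engine API, the Python below builds a JSON-RPC 2.0 request using the name reviewers suggested at this point, engine_clientVersion. The empty `params` list and the `code`/`commit` fields in the sample response are assumptions made for illustration only; the merged spec may differ.

```python
import json


def client_version_request(request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body for the (suggested) method name."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_clientVersion",  # name suggested in review, not final
        "params": [],  # assumption: no parameters
    }
    return json.dumps(payload)


def parse_client_version(response_text: str):
    """Return the `result` member, or raise if the server reported an error."""
    response = json.loads(response_text)
    if "error" in response:
        raise RuntimeError(f"engine API error: {response['error']}")
    return response["result"]


# Hypothetical response shape, for illustration only.
sample = '{"jsonrpc": "2.0", "id": 1, "result": {"code": "XE", "commit": "a1b2c3d4"}}'
print(client_version_request())
print(parse_client_version(sample))
```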
Also specify Grandine abbreviation and accommodate other versioning systems.
Very supportive of this change. Agree with previous comments around naming. Unsure if we should introduce versioning for identification. We already have unversioned |
I believe This doesn't necessarily mean that we couldn't also agree to never allow the |
Supportive of this change. Maybe the method could be renamed to |
I support this proposal.
Not as a part of this one, but when it comes to versions, I'd also like to have a standardized version format for clients. Something like what browsers have for their user agent string defined in RFC 9110. That would be helpful for the network stats handling as well. Currently, we mostly have name/version/platform/lang, but it varies slightly from client to client.
Is there an advantage of being prescriptive about the format the client returns its version in? I was imagining something much closer to I guess there isn't much downside as it is mainly just upfront cost of spec'ing things out. |
Having a conformed format will only help in identification, and in 'economy of graffiti'. Ensuring there is a predictable portion of graffiti consumed by client identification makes it more palatable IMO. The primary downside is just the gatekeeping required to maintain the list. I like the human readable and more verbose bits, but I think that might only be useful in CL logs, since it is behind JWT secured endpoint. |
It seems like the other missing part that would be nice in graffiti would be the builder and version if used (mostly used, but not always) |
@rolfyone We already have data on the builders because they fill the execution payload's The only thing the local BN could read would be a list of relays from |
Okay I just want to take the temperature of the room. Please react to this to vote: ❤️ - vote for reusing |
Prysm supports this proposal. I filed this issue to start recording our user-agent info in graffiti by default: prysmaticlabs/prysm#13558 |
That's a nice option if we want to always provide version information while taking up minimal space. But it does limit us to only 16 execution / consensus clients, and based on my conversations a lot of people seem to prefer readability. In practice, the versioning information is really just nice to have for some debugging cases. But it is not strictly necessary. It is much less important than knowing the implementation itself for the purposes of measuring EL client diversity. That's why I like the flexible standard because it allows users |
I've renamed the method to
|
Thank you 🙏
LAST CALL FOR CONCERNS I believe we've addressed all concerns that have been raised at this point. The core idea of this PR has widespread support and most disagreement is minor & related to small implementation details. There's no reason to spend weeks bikeshedding about this optional feature. CURRENT PLAN
If you agree with this plan, please give a 👍, otherwise comment your objection |
Please add spell check errors to |
How does everyone plan to represent this in flags? Are we going to artificially limit the size of user-specified graffiti flags? Otherwise, what would users expect the behavior to be if they specify 20-32 bytes of graffiti? For instance, do we truncate the version string from right to left? Doing that would assume precedence in importance of the information: CL impl > CL git hash > EL impl > EL git hash. It would also be hard for software parsing this field to differentiate user-specified graffiti that flows into this section and looks like an ident/hash from the real thing. If this is required then IMO we should limit the size of user-provided graffiti to 20 bytes and be very explicit about the fact that 1) this will be a breaking config change for users and 2) there is no opting out by "overriding" the default with a flag. |
One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical. |
Yeah the EL is most important (hardest to determine through other means), then CL, then versions. So if the plan is to truncate, I think we should just order it that way: (EL|CL|el-hash|cl-hash). You could get fancy and interleave the EL/CL hash bytes one-by-one but maybe that's overkill :) |
Exactly. This is what I've been referring to as a flexible standard. The version information is nice to have but not critical. |
I'm not sure I'd interleave, but shortening does make sense. If the version of a client is present, it being in the same logical chunk will be easier to search for. eg. searching for
I think this is a sensible approach, and if the user graffiti is beyond 28 characters just not having the data... Do we know what percentage of blocks have more than 28 bytes of graffiti? seems like we could get a fairly good estimation of how useful this would be... |
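The truncation idea discussed in the last few comments can be sketched in Python as follows. This is one possible reading, not the agreed standard: it keeps each client's code next to its own hash (the "same logical chunk" preference), shortens both hashes in steps as space runs out, and falls back to the EL code alone. The step sizes (8, 4, 2, 0 hex characters), placing the tag before the user's text, the space separator, and the codes `XE`/`XC` are all assumptions.

```python
GRAFFITI_LEN = 32  # graffiti is a fixed 32-byte field


def version_tag(el_code: str, el_commit: str, cl_code: str, cl_commit: str,
                available: int) -> str:
    """Return the most detailed tag that fits in `available` bytes."""
    for n in (8, 4, 2, 0):  # hex characters kept per commit hash (assumed steps)
        tag = f"{el_code}{el_commit[:n]}{cl_code}{cl_commit[:n]}"
        if len(tag) <= available:
            return tag
    # Last resort: the EL code is the hardest to learn by other means.
    return el_code if len(el_code) <= available else ""


def build_graffiti(user_text: str, el_code: str, el_commit: str,
                   cl_code: str, cl_commit: str) -> bytes:
    """Combine the version tag with user graffiti, zero-padded to 32 bytes."""
    user = user_text.encode("utf-8")
    if len(user) > GRAFFITI_LEN:
        raise ValueError("user graffiti exceeds 32 bytes")
    room = GRAFFITI_LEN - len(user)
    if user:
        room -= 1  # reserve one byte for the separating space
    tag = version_tag(el_code, el_commit, cl_code, cl_commit, max(room, 0))
    tag_bytes = tag.encode("ascii")
    if tag_bytes and user:
        body = tag_bytes + b" " + user
    else:
        body = tag_bytes or user
    return body.ljust(GRAFFITI_LEN, b"\x00")


# No user graffiti: the full 20-character tag fits.
print(build_graffiti("", "XE", "a1b2c3d4", "XC", "0123abcd"))
# 28 bytes of user graffiti: only the EL code still fits.
print(build_graffiti("x" * 28, "XE", "a1b2c3d4", "XC", "0123abcd"))
```

With this layout, user graffiti of up to 27 bytes still leaves room for both client codes, and anything longer progressively drops the CL code and then the tag entirely.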
Co-authored-by: lightclient <[email protected]>
I've compiled the discussion around choosing a graffiti standard into a single document: https://hackmd.io/@wmoBhF17RAOH2NZ5bNXJVg/BJX2c9gja I welcome any comments. Also, there haven't been any requests for changes in days. Seems like we can merge this? |
@lightclient it's been about a week since people agreed we should merge this and there haven't been any objections. Is that enough time to merge? |
By analyzing the structure of beacon blocks on the network, we are able to obtain fairly accurate data on consensus layer client diversity. Unfortunately, due to the fact that the overwhelming majority of validators use mev-boost, their execution clients do not leave any fingerprint behind in block proposals. Thus we are forced to rely on limited self-reporting data from staking pools. Many pools do not participate, and we often have outdated statistics for the pools that do. Worse yet, we have no data on client diversity for home stakers.
This PR can change that by allowing consensus clients to learn which execution client they are connected with.
Consensus clients can then embed this in their graffiti field by default when the user doesn't bother to set it. A quick survey of recent proposal graffiti reveals that: already embed their client and version by default. It would be great to add the execution client to this. Perhaps prysm could be convinced to join as well.
An analysis of ~2000 recent blocks indicated that nearly half of all validators don't bother to change their graffiti from the default so the potential to gather data here is huge.