-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use simpler dissection model #5
Open
immanuelhume
wants to merge
78
commits into
ghpr-asia:main
Choose a base branch
from
immanuelhume:proper-impls
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Some methods truly only concern numeric and bytes types. We'll house them in a separate trait.
This will be used to mark a Vec<u8>, [u8; _], &[u8] as proper bytes types, rather than lists of u8s.
We also check with wireshark's proto_registrar
Many things are marked with todo!() for now.
Just to make clippy happy
@oswal-dheeraj how do you feel about the general approach? |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
First, sorry for the big PR. But this is a major improvement (and simplification) to the code which I simply cannot pass up 😬. This PR does not introduce any breaking changes - it adds new APIs but the old ones remain usable.
Motivation
As mentioned in #4, the existing code deeply parses the syntax tree of the struct/enum and tries to form a full picture of each field. This results in a) "stringy" and brittle parsing, and b) an unwieldly data model. This patch flips the approach completely. We no longer care what the type of each field is, and assume it is dissectable via the new
Dissect
trait. If the field's type does not implementDissect
, then we get a compile error as expected.The new code, using well-defined traits, is much easier to reason about. We also shed ~2500 lines of old data modelling code which was rather hard to grok as well as probably ~1000 lines of other helper code. Although this PR adds ~3000 lines, I expect the total code size to fall slightly after we remove the old stuff.
New traits
Four traits are introduced in the
wsdf
crate.Dissect
- a type which can be registered and dissectedPrimitive
- an extension ofDissect
for basic types (integers, bytes)Subdissect
- a type which can be given to subdissectors (bytes)SubdissectKey
- a type which can be used to search for a subdissector and invoke itThe
wsdf
crate also provides impls of these traits for some types (integers, bytes). In the future if we really wanted to be very generic, we can also implement them for stuff likeBox<T>
since from a protocol description perspective, aBox<T>
is justT
.For a sense of how this works, see this (simplified) example.
will expand to (with many details omitted)
This is the main idea behind this update. Everything else is just stuff to make it happen.
Other notable changes
get_variant
attribute to replacedispatch_field
for decoding enums, following a tap-like interface and returning a static string instead of an integer. See the DNS dissector for an example. (closes Feature requests: a (more) dynamic enum variant decoder #7)StructInnards
andEnum
types to handle codegen for structs and enums. These ~1000 lines can replace ~2500 lines of old data model logic.protocol!()
macro, and multiple protocols can be placed in one dylib (e.g.protocol!(Udp, UdpLite, Tcp)
would register three protocols in one dylib.Tradeoffs
A consequence of not having a "full picture of each field" is that for vector types, we can no longer validate that users have provided a length field. The following struct would not have compiled previously as the size of the bytes is unknown even at runtime. With the new trait-based approach, we lose the ability to check this at compile time, since the type
Vec<u8>
is unknown to us.Under the new API, this would be a runtime panic (can be refined in the future to not panic but raise a flag in Wireshark). However, given that this new code is vastly simpler than the old one, this tradeoff okay in my view.
Next steps
Once this is merged, we can proceed to mark the old API as deprecated and make a release (v0.2.0?). If there are no major issues, we can proceed with deleting the old code so that everything is cleaner and easier to maintain.