Proposal: Refactor to a plugin/middleware architecture #167
Comments
This seems reasonable; in particular, it seems that this would improve testability and maintainability. Extensibility remains an open question for me, though. In most contexts, extensibility, especially runtime extensibility, allows for adaptability by users. However, in this case, Verified Fetch aims to adhere as closely as possible to the gateway specs, so I wonder if this is really a bundling question (gorilla banana problem) rather than extensibility per se. In other words, it seems that it's modularity we seek rather than extensibility. Since this came up in the context of directory listing, do you see HTML rendering of directories as part of Verified Fetch's scope? I ask because the specs are pretty vague:
The spec isn't strict about what you do in the case of no explicit Accept header and a directory terminal element without an index.html. Might this in part be a problem with the specs not being specific enough? On one hand, it may make sense to return a structured response with the directory listing (when there's no index.html), allowing consumers to process and render directory listings as HTML. However, this deviates from how UnixFS is handled in the gateway spec (where mostly bytes are streamed to the user). On the other hand, if you don't return something useful, you waste useful work you've done and require another roundtrip by consumers.
I don't. The whole goal of this change isn't just to satisfy the dir-index-html listing.. but implementing that is difficult, which speaks to the lack of flexibility of verified-fetch for any changes we want to make. Improving testability, maintainability, and ensuring the library can be extended -- to add dir-index-html listing, or to extend for any unforeseen future use-cases -- are all important and enabled by the proposed changes here.
Shmemantics.. I think the library also needs modularity, but being able to extend the library, even with core functionality, is important. Verified-fetch is in a unique position that's a little different than a typical library.. we want devs to be able to benefit from it, but we also depend on it very directly with two projects that may conflict with typical verified-fetch programmatic use-cases. With that said, it sounds like you feel that adding an |
I was mostly curious to better understand how plugins would be concretely used. Based on the proposed interface, |
yep, pretty much a "state machine" except instead of hardcoding a simple map where all things are known, we can flexibly call a method for each handler and let them decide if it's appropriate for them to handle the request. we could go more of the express-middleware route where we just call "handle" on all plugins and maybe
I don't have a full vision for all the potential benefits, but I know the flexibility discussed in this issue would allow a variety of future pivots. A few things off the top of my head for helia-http-gateway specifically:
One thing I would like to resolve before starting this work is whether benchmarking is needed or if these changes are desirable enough, regardless of the cost. What types of performance or benchmarks can we use to establish a basic understanding of how long it takes verified fetch to do its work (outside of networking delays) so we can ensure there is no considerable degradation with these changes? If we do want to do some benchmarking, I can request some of the fixtures (in the interop package) over 1000+ runs and see where the average processing time is. What fixtures should we check? I think the most time-consuming requests are:
The rest of the handlers are fairly lightweight, but some of the tar/car stuff could probably use benchmarking. I don't imagine many of the handlers' logic will be changing with these proposed changes, though. It would encapsulate the same logic elsewhere, but anything could go wrong. We will be switching from a simple if expression to multiple function executions, which will be slightly, but likely not noticeably, slower. Honestly, I think these changes will make verified-fetch much better, and spending time on benchmarking would be unnecessary given all the potential benefits.
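If benchmarking were done, the "average processing time over 1000+ runs" idea could be sketched roughly as below. This is only an illustration: `averageMs` and `fakeHandler` are hypothetical names, and `fakeHandler` stands in for a verified-fetch call against a local fixture so that networking delays are excluded.

```typescript
// Average wall-clock time per call of an async function,
// measured over `runs` sequential invocations.
async function averageMs (fn: () => Promise<unknown>, runs: number): Promise<number> {
  const start = Date.now()
  for (let i = 0; i < runs; i++) {
    await fn()
  }
  // Total elapsed time divided by the number of runs.
  return (Date.now() - start) / runs
}

// Hypothetical stand-in for a verified-fetch request served from a local fixture.
async function fakeHandler (): Promise<string> {
  return 'ok'
}

averageMs(fakeHandler, 1000).then((ms) => {
  console.log(`average: ${ms.toFixed(3)} ms per request`)
})
```

Running the same harness before and after the refactor, on the same fixtures, would show whether the extra function calls cause a measurable slowdown.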
My hunch is that networking delays account for most of the delay. So I'm tempted to say, in the spirit of shipping, let's go through with this and, if needed, introduce benchmarks later. I don't think we're at the performance optimisation stage yet, for which benchmarking would be helpful. The tracing work that's mostly done serves us better right now in terms of understanding the time distribution in the request lifecycle and allows us to later be more targeted in our benchmarking, if needed.
i've got car, tar, raw, and ipns record
I've already mentioned this a few times in the Helia working group and to @2color and @achingbrain directly, but I wanted to formalize a plan.
Summary
The current VerifiedFetch class (in verified-fetch.ts) handles various content types (dag-cbor, dag-pb, etc.) but has grown large and somewhat monolithic. It uses a procedural structure (if/switch) that makes adding new codecs and behaviors cumbersome. This proposal introduces a plugin/pipeline approach to make VerifiedFetch more extensible and maintainable.
Current issues
Monolithic VerifiedFetch class
All logic for handling IPNS records, CAR files, DAG-CBOR, raw blocks, etc., is in a single class. This makes it difficult to add new handlers (e.g., new codecs, new output formats) without modifying the core class.
Hardcoded “if/else” logic
We have many if (accept === '...') branches, or switch statements keyed by CID codec. The logic is scattered throughout large private methods, reducing clarity and making changes or additions risky.
Limited Extensibility
Users and future contributors cannot easily plug in their own custom handlers. Hard-coded references to existing handlers make it impossible to replace them without forking the repo.
This is very relevant for #86. If we implement the below changes, it would be trivial to add dir-index-listing support.
Proposed changes
Define a plugin interface
example:
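The original code for this example did not survive in this copy of the issue. A minimal sketch of what such an interface might look like is given below. Only `canHandle` and `handle` are named in the discussion; `FetchHandlerPlugin`, `PluginContext`, `HandlerResult`, and `isFetchHandlerPlugin` are assumptions made for illustration, and `HandlerResult` is a dependency-free stand-in for the `Response` that verified-fetch returns.

```typescript
// Hypothetical stand-in for the Fetch API Response returned by verified-fetch.
interface HandlerResult {
  status: number
  contentType: string
  body: string | Uint8Array
}

// Hypothetical per-request state passed to every plugin.
interface PluginContext {
  // Accept header value resolved for this request, if any.
  accept?: string
  // Multicodec code of the terminal element (e.g. 0x70 for dag-pb).
  codec: number
  // Path requested below the root CID.
  path: string
}

// Hypothetical plugin contract built around the canHandle/handle pair.
interface FetchHandlerPlugin {
  // Identifier used for logging and registration.
  readonly name: string
  // Cheap check: can this plugin serve the request described by the context?
  canHandle (context: PluginContext): boolean
  // Produce the result; intended to be called only when canHandle returned true.
  handle (context: PluginContext): Promise<HandlerResult>
}

// Runtime guard that could be used to validate externally supplied plugins.
function isFetchHandlerPlugin (value: unknown): value is FetchHandlerPlugin {
  if (typeof value !== 'object' || value === null) return false
  const candidate = value as Record<string, unknown>
  return typeof candidate.name === 'string' &&
    typeof candidate.canHandle === 'function' &&
    typeof candidate.handle === 'function'
}
```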
Break out each handler into its own plugin
Convert handleIPNSRecord, handleCar, handleTar, handleJson, handleDagCbor, handleDagPb, and handleRaw into separate plugin/middleware classes implementing the above interface.
Example:
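The original example is missing from this copy of the issue. As a rough sketch, a raw-block handler extracted into its own class might look like the following. The class name `RawPlugin`, the context fields, and the `HandlerResult` stand-in for `Response` are assumptions; the selection rule (raw codec or an explicit `application/vnd.ipld.raw` Accept value) is a simplification, not the library's actual logic.

```typescript
// Multicodec code for raw blocks.
const RAW_CODEC = 0x55

// Hypothetical stand-in for the Fetch API Response returned by verified-fetch.
interface HandlerResult {
  status: number
  contentType: string
  body: Uint8Array
}

// Hypothetical per-request state; real code would carry more (CID, path, options).
interface PluginContext {
  accept?: string
  codec: number
  // Bytes of the already-fetched and verified block.
  bytes: Uint8Array
}

interface FetchHandlerPlugin {
  readonly name: string
  canHandle (context: PluginContext): boolean
  handle (context: PluginContext): Promise<HandlerResult>
}

// What used to be a private handleRaw method becomes a standalone unit
// that can be tested without constructing the whole VerifiedFetch class.
class RawPlugin implements FetchHandlerPlugin {
  readonly name = 'raw'

  canHandle ({ accept, codec }: PluginContext): boolean {
    // Assumed rule: an explicit raw Accept value wins, otherwise match on codec.
    return accept === 'application/vnd.ipld.raw' || codec === RAW_CODEC
  }

  async handle ({ bytes }: PluginContext): Promise<HandlerResult> {
    return { status: 200, contentType: 'application/vnd.ipld.raw', body: bytes }
  }
}
```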
Create a pipeline in VerifiedFetch's fetch method
Share utilities with a “context”
Allow external/custom plugins
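The pipeline and custom-plugin steps above could be sketched as follows. This is a simplified illustration under stated assumptions: `selectPlugin`, `runPipeline`, the plugin objects, and the `HandlerResult` stand-in for `Response` are hypothetical, the 501 fallback status is an arbitrary choice, and ordering is plain registration order.

```typescript
// Hypothetical stand-in for the Fetch API Response returned by verified-fetch.
interface HandlerResult { status: number, contentType: string, body: string }

// Hypothetical shared context handed to every plugin.
interface PluginContext { accept?: string, codec: number }

interface FetchHandlerPlugin {
  readonly name: string
  canHandle (context: PluginContext): boolean
  handle (context: PluginContext): Promise<HandlerResult>
}

// Return the first plugin that claims the request, in registration order.
function selectPlugin (plugins: FetchHandlerPlugin[], context: PluginContext): FetchHandlerPlugin | undefined {
  for (const plugin of plugins) {
    if (plugin.canHandle(context)) return plugin
  }
  return undefined
}

// Replaces the if/switch chain: ask each plugin, then delegate to the match.
async function runPipeline (plugins: FetchHandlerPlugin[], context: PluginContext): Promise<HandlerResult> {
  const plugin = selectPlugin(plugins, context)
  if (plugin === undefined) {
    return { status: 501, contentType: 'text/plain', body: 'no plugin can handle this request' }
  }
  return plugin.handle(context)
}

// Built-in style plugins keyed on multicodec codes (0x55 raw, 0x71 dag-cbor).
const rawPlugin: FetchHandlerPlugin = {
  name: 'raw',
  canHandle: (ctx) => ctx.codec === 0x55,
  handle: async () => ({ status: 200, contentType: 'application/vnd.ipld.raw', body: 'raw bytes' })
}
const dagCborPlugin: FetchHandlerPlugin = {
  name: 'dag-cbor',
  canHandle: (ctx) => ctx.codec === 0x71,
  handle: async () => ({ status: 200, contentType: 'application/json', body: '{}' })
}

// A user-supplied plugin; listing it first gives it priority over built-ins.
const customPlugin: FetchHandlerPlugin = {
  name: 'custom-text',
  canHandle: (ctx) => ctx.accept === 'text/x-custom',
  handle: async () => ({ status: 200, contentType: 'text/x-custom', body: 'custom output' })
}

const plugins: FetchHandlerPlugin[] = [customPlugin, rawPlugin, dagCborPlugin]
```

With this shape, replacing or adding a handler means changing the `plugins` array rather than editing the core class.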
Benefits
Better maintainability and clarity
Easily extensible
Improved Testability
Open Questions
Plugin registration
handleDagPb has a try/catch where dirIndexHtml was added in the catch method. How do we implement canHandle in a future handleDagPb plugin so that, if finding index.html fails, it can move on to a handleDirIndexHtml? I feel like for this to work we would need a context that can evolve and be modified as certain processing is done (i.e. handleDagPb.canHandle=true, but fails due to being a directory and missing an index.html).
Prioritization
FetchHandlerFunctionArg combined with the determined accept header should be sufficient.
canHandle prior to handle-ing a request?
Performance considerations
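One possible answer to the handleDagPb fallthrough question raised above is a mutable context plus a re-selection loop, sketched below. This is only one option, not a decided design: the `missingIndexHtml` flag, the `null` return convention, the plugin objects, and the `HandlerResult` stand-in for `Response` are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for the Fetch API Response returned by verified-fetch.
interface HandlerResult { status: number, contentType: string, body: string }

// Hypothetical context that plugins are allowed to mutate while processing.
interface PluginContext {
  codec: number
  // Entry names when the terminal element is a directory.
  directoryEntries?: string[]
  // Set by a plugin that discovered the directory has no index.html.
  missingIndexHtml?: boolean
}

interface FetchHandlerPlugin {
  readonly name: string
  canHandle (context: PluginContext): boolean
  // Returning null means: "I could not finish; re-run selection with the updated context".
  handle (context: PluginContext): Promise<HandlerResult | null>
}

const DAG_PB_CODEC = 0x70

const dagPbPlugin: FetchHandlerPlugin = {
  name: 'dag-pb',
  // Steps aside once the context records that index.html is missing.
  canHandle: (ctx) => ctx.codec === DAG_PB_CODEC && ctx.missingIndexHtml !== true,
  handle: async (ctx) => {
    if (ctx.directoryEntries != null && !ctx.directoryEntries.includes('index.html')) {
      ctx.missingIndexHtml = true
      return null
    }
    return { status: 200, contentType: 'text/html', body: 'file or index.html content' }
  }
}

const dirIndexHtmlPlugin: FetchHandlerPlugin = {
  name: 'dir-index-html',
  canHandle: (ctx) => ctx.missingIndexHtml === true,
  handle: async (ctx) => {
    // Real code must HTML-escape entry names; omitted to keep the sketch short.
    const items = (ctx.directoryEntries ?? []).map((name) => `<li>${name}</li>`).join('')
    return { status: 200, contentType: 'text/html', body: `<ul>${items}</ul>` }
  }
}

async function runPipeline (plugins: FetchHandlerPlugin[], context: PluginContext): Promise<HandlerResult> {
  // Bound the number of passes so a misbehaving plugin cannot loop forever.
  for (let pass = 0; pass < plugins.length; pass++) {
    const plugin = plugins.find((p) => p.canHandle(context))
    if (plugin === undefined) break
    const result = await plugin.handle(context)
    if (result !== null) return result
  }
  return { status: 501, contentType: 'text/plain', body: 'no plugin produced a response' }
}
```

Here `canHandle` stays cheap and synchronous, while the expensive discovery (is there an index.html?) happens inside `handle` and is recorded on the context for the next pass.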
Next Steps
Gather feedback
Please share your thoughts on the plugin interface design, naming conventions, and plugin priorities.
Implementation
When there is consensus, we’ll proceed to:
Feel free to comment or suggest changes. Once we finalize the approach, we can open a PR with incremental commits for easier review.