Device wasi_nn for AI inference #393
base: edge
Conversation
```c
 * format: The format string for the message.
 * ...: The variables to be printed in the format.
 */
void beamr_print(int print, const char* file, int line, const char* format, ...);
```
Not a blocker on this PR, but perhaps if we are using these debugging prints more widely now, they should be abstracted into utility HB_PRINT (etc) functions?
src/dev_wasi_nn.erl (outdated)
```erlang
read_model_by_ID(TxID) ->
    %% Start the HTTP server (required for gateway access)
    hb_http_server:start_node(#{}),
    %% Configure store with local caching for model files
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"model-cache">>
    },
    Opts = #{
        store => [
            %% Try local cache first
            LocalStore,
            #{
                <<"store-module">> => hb_store_gateway,
                %% Cache results here
                <<"local-store">> => LocalStore
            }
        ]
    },
    %% Attempt to read the model from cache or download from Arweave
    case hb_cache:read(TxID, Opts) of
```
Best practice is just to pass the Opts through to your helper function, so you can do hb_cache:read(TxID, Opts). I guess the idea here was to add the specific FS model store to the opts? In which case, you are better off making a function like this:

```erlang
opts(BaseOpts) ->
    ModelStore = ..., % Ideally checking the opts for a `model_store` param so that the user can configure it if they want.
    BaseOpts#{
        store => [ModelStore | hb_opts:get(store, [], BaseOpts)]
    }.
```

If the concern is just the size of the models though, it might make more sense to work with us on the relevance filter for the hb_store interface. We want to add the ability to have filters based on size or path prefix (e.g., data/) to hb_store:write, such that bigger items go to the FS, etc.
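To make that concrete, here is a slightly fuller sketch of that helper, assuming hb_opts:get/3 takes (Key, Default, Opts) as in the snippet above; the `model_store` option name and the default store shape are illustrative assumptions, not part of the existing API:

```erlang
%% Sketch only: prepend a model store to the node's configured stores.
%% `model_store` is a hypothetical option name; if the user sets it,
%% their store is used, otherwise we fall back to a local FS cache.
opts(BaseOpts) ->
    DefaultModelStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"model-cache">>
    },
    ModelStore = hb_opts:get(model_store, DefaultModelStore, BaseOpts),
    BaseOpts#{
        store => [ModelStore | hb_opts:get(store, [], BaseOpts)]
    }.
```

Call sites could then simply do hb_cache:read(TxID, opts(BaseOpts)).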
src/dev_wasi_nn.erl (outdated)
```erlang
%% Extract the data reference from the message
%% This could be either a link to existing cached data or binary data
DataLink = maps:get(<<"data">>, Message),
```
Can we just use hb_maps:get directly here instead? Destructuring and reorganizing links breaks the abstractions and is likely to cause lots of pain later when new stores are introduced, etc.
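For illustration, the suggested swap might look like the line below; this assumes hb_maps:get/2 mirrors the maps:get/2 signature (the real call may also need the node Opts):

```erlang
%% Store-aware lookup: lets the cache layer resolve links rather than
%% destructuring them by hand. (hb_maps:get/2 signature is assumed.)
DataLink = hb_maps:get(<<"data">>, Message),
```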
```erlang
        _ ->
            cache_owner_loop()
    after
        3600000 -> % Stay alive for a long time (1 hour), then check again
```
Out of interest, what is the motivation for this? It doesn't hurt, but I can't see intuitively how it helps either?
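For reference, the fragment above appears to sit in a receive loop shaped roughly like this (an assumed reconstruction, not the PR's exact code):

```erlang
%% Assumed shape: a process that holds ownership of the cached model,
%% looping on incoming messages and waking once an hour when idle.
cache_owner_loop() ->
    receive
        _ ->
            cache_owner_loop()
    after
        3600000 -> % Stay alive for a long time (1 hour), then check again
            cache_owner_loop()
    end.
```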
Awesome first PR @Alex-wuhu! Thank you for contributing! I left a bunch of notes throughout, but most should be relatively minor. Ping me a DM on Slack if you are up for working together on the remaining changes. One thing that is normally worth doing is writing EUnit tests that exercise the device. Major 🫡s. This is super cool!
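As a starting point, a minimal EUnit sketch might look like the following; the TxID is a placeholder, and both the {ok, _} return shape and the export of read_model_by_ID/1 are assumptions based on the diff above:

```erlang
-include_lib("eunit/include/eunit.hrl").

read_model_by_id_test() ->
    %% Placeholder TxID; a real test would use a known model transaction.
    TxID = <<"replace-with-a-real-model-txid">>,
    %% Assumes the helper surfaces hb_cache:read/2's {ok, Data} result.
    ?assertMatch({ok, _}, dev_wasi_nn:read_model_by_ID(TxID)).
```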
…ns support and cleaning up code in dev_wasi_nn.erl and dev_wasi_nn_nif.erl
Summary
This PR introduces the WASI-NN device (dev_wasi_nn), which adds AI inference capabilities to HyperBEAM. The device loads models from Arweave transactions and performs inference with session management for optimal performance. It exposes an Erlang interface for model inference, delegating the actual inference logic to a NIF backend, and caches models automatically to avoid repeated downloads.
Key features:
- Loads AI models from Arweave transactions by transaction ID
- Caches models locally to avoid repeated downloads
- Manages inference sessions for optimal performance
- Delegates inference to a NIF backend, using CUDA where available
API Endpoints
GET /~wasi-nn@1.0/infer

Setup Environment
File Change Description
Models
Unit test
When this device is called, the logs will show that CUDA is used for inference:
