Emscripten build (demo, quick and dirty) #12

Draft: ggerganov wants to merge 1 commit into master
Conversation

@ggerganov

This is a very ugly hack to demonstrate building with Emscripten.

Not intended to be merged, as it would just ruin the beauty of the implementation, and most likely there is a much better way to do this. Mainly for educational purposes.

emcc -O3 run.c \
  -o web/llama2.js \
  -s EXPORTED_FUNCTIONS='["_main", "_main_loop", "_malloc", "_free"]' \
  -s EXPORTED_RUNTIME_METHODS='["ccall"]' \
  -s ALLOW_MEMORY_GROWTH=1 \
  --preload-file model.bin \
  --preload-file vocab.bin

The vocab.bin is generated in a way similar to that explained in #9.

The build artifacts are generated in the web subfolder.

Example: https://ggerganov.com/llama2.c
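
For context on the exported _main_loop: instead of blocking inside main()'s generation for-loop, the idea is to advance one token per call and let the browser drive the loop from JS (e.g. Module.ccall("main_loop", null, [], []) once per frame). A minimal sketch of that shape, not the PR's actual code; transformer_step is a hypothetical stand-in for the forward pass and sampling:

#include <stdio.h>

static int pos = 0;
static const int steps = 256;

/* hypothetical stand-in for run.c's forward pass + sampling */
static int transformer_step(int p) { return p % 32000; }

/* exported via -s EXPORTED_FUNCTIONS='["_main_loop", ...]'; called from JS
   once per frame so the page stays responsive while tokens stream out */
void main_loop(void) {
    if (pos >= steps) return;          /* generation finished */
    int next = transformer_step(pos);
    printf("%d ", next);               /* Emscripten routes stdout to the page */
    pos++;
}

int main(void) {
    /* under Emscripten, main() would load the preloaded model.bin and
       vocab.bin here, then return and let JS call main_loop() repeatedly */
    return 0;
}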

@alexeykudinkin left a comment

Worth merging IMO: if not for Emscripten, then at the very least for cleaning up the main sequence.

checkpoint = "model.bin";
FILE *file = fopen(checkpoint, "rb");
if (!file) {
    printf("Unable to open file!");


Suggested change
-    printf("Unable to open file!");
+    printf("Unable to open file!");
+    return 1;

{
    FILE *file = fopen("vocab.bin", "r");
    if (!file) {
        printf("Unable to open file!");


Suggested change
-        printf("Unable to open file!");
+        printf("Unable to open file!");
+        return 1;

@karpathy (Owner)

Very cool, ty! I'm not as familiar with Emscripten; I'll take some time to look at this and see whether there is a way to support it gracefully.

@python273 mentioned this pull request Jul 23, 2023
@karpathy (Owner)

@ggerganov one thing I really like is the tokenizer inside C here, removing the need for run_wrap.py and sentencepiece. But it's not clear how you produced vocab.bin; presumably it's an export script?

@karpathy (Owner)

nvm got it working in 3bfa566

@ggerganov (Author)

Yes, for this PR I exported the vocab from llama.cpp, where it was generated with a script similar to yours.
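
For reference, the file layout appears to be just vocab_size length-prefixed strings (an assumption here, based on #9; double-check against the export script). A sketch of a matching reader, not the PR's actual code:

#include <stdio.h>
#include <stdlib.h>

/* assumed layout: for each of vocab_size tokens, an int32 byte length
   followed by the raw token bytes (no NUL terminator in the file) */
static char **read_vocab(const char *path, int vocab_size) {
    FILE *file = fopen(path, "rb");
    if (!file) return NULL;
    char **vocab = malloc(vocab_size * sizeof(char *));
    for (int i = 0; i < vocab_size; i++) {
        int len = 0;
        fread(&len, sizeof(int), 1, file);
        vocab[i] = malloc(len + 1);
        fread(vocab[i], len, 1, file);
        vocab[i][len] = '\0';          /* terminate for printf("%s", ...) */
    }
    fclose(file);
    return vocab;
}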

I'm not sure about the cause of the leading space; we have it in the llama.cpp output as well and haven't dug in to understand how to fix it yet.

@kroggen (Contributor) commented Jul 26, 2023

The tokenizer stores a leading space on some tokens.

In Python we use the decode function, which removes the leading space on the first token.

But this repo uses the id_to_piece() function to export the tokenizer:

python3
>>> import sentencepiece as spm
>>> sp = spm.SentencePieceProcessor(model_file='tokenizer.model')
>>> sp.id_to_piece([9038])
['▁Once']

Then it is simply printed, without checking whether it is the first token:

run.c, line 471 (at f565089):

printf("%s", vocab[next]);

PR #89 fixes it.
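
For illustration, the shape of such a fix (a sketch; the actual change in #89 may differ) is to skip the space on the piece that immediately follows BOS (id 1), reusing run.c's variables:

/* sketch: 'token' holds the previous token id; 1 is BOS */
char *piece = vocab[next];
if (token == 1 && piece[0] == ' ') { piece++; }
printf("%s", piece);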

@gohai (Contributor) commented Jul 28, 2023

I recreated @ggerganov's changes against the current tree (with prompt support): https://github.com/gohai/llama2.c/commits/emscripten

@rahuldshetty

I've built a JS framework around this idea to run language models on the web. It also leverages Emscripten to compile llama2.c into WASM/JS.

llama2.c on Web Demo: https://rahuldshetty.github.io/ggml.js-examples/llama2_tinystories.html
ggml.js Framework: https://github.com/rahuldshetty/ggml.js
