# Upgrade to llama.cpp git sha 615212 (#7)

* Update to llama.cpp sha 615212 main_.cpp
* build-native: files compile
* format
* mockic.exe builds!!
* register CPU backend
* Update memory management for model into Orthogonal Persistence
* Format
* Default behavior: -no-cnv
* Upgrade to icpp-pro 5.0.2
* wasm now builds
* free model only once
* Scripts to build & deploy
* tinystories is working in canister!
* Some small updates
* Do not call load_model from upload.py; it is better to separate this out into another step
* Update comment
* For clarity, dfx.json uses the .did file in the 'build' folder
* remove_log_file: logging changed in this version, and we need to provide a mechanism to remove log files
* Update native & pytests
* CI/CD: use a different branch while working on the upgrade
* format include
* Update READMEs
* Update table of llama.cpp upgrades
* Running LLMs on-chain solves your cybersecurity problem
* Upgrade to llama.cpp sha 615212. All done...

Showing 35 changed files with 1,840 additions and 771 deletions.

The `.gitignore` hunk:

```
@@ -1,5 +1,5 @@
 # Misc
-llama_cpp_onicai_fork
+llama_cpp_onicai_fork*
 *.code-workspace
 x
 y
```

The main addition is a new 110-line file of upgrade notes:

# DETAILS FOR UPGRADE from llama.cpp sha `615212` to `b841d0`

### cpp_paths

#### main_.cpp
`meld main_.cpp llama_cpp_onicai_fork/examples/main/main.cpp`
- use `main_` instead of `main`
- a few items related to the console & Ctrl+C need to be commented out (see the sketch below)
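
A minimal sketch of what the renamed entry point ends up looking like; the body is illustrative, not the actual diff (the commented-out calls, `console::init` and `sigint_handler`, are upstream `main.cpp` identifiers):

```
// Illustrative sketch only: the entry point is renamed from main to main_
// so the canister code can link it in and call it as a regular function.
int main_(int argc, char ** argv) {
    (void)argc; (void)argv; // unused in this sketch

    // ICPP-PATCH: console setup and the Ctrl+C handler are commented out;
    // a canister has no interactive console and no POSIX signals.
    // console::init(params.simple_io, params.use_color);
    // signal(SIGINT, sigint_handler);

    // ... generation loop unchanged from upstream main.cpp ...
    return 0;
}
```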

#### llama_cpp_onicai_fork/src/llama.cpp
- add `#include "ic_api.h"`
- replace `throw std::runtime_error(format` with `IC_API::trap(std::string("RUNTIME ERROR: ") + format` (see the sketch after this list)
- replace `throw` with `IC_API::trap`
- comment out `try - catch`; the program will abort if an exception is thrown
- comment out threading-related items:
  - `#include <future>`
  - `#include <mutex>`
  - `#include <thread>`
- comment out these functions completely:
  - `llama_tensor_quantize_internal`
  - `llama_model_quantize_internal`
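
As referenced above, a before/after sketch of the `throw` → `IC_API::trap` rewrite; the function `verify_arch` is hypothetical, only the replacement pattern is from this upgrade:

```
// Sketch of the rewrite pattern (hypothetical function, real pattern).
// IC_API::trap, from icpp-pro's ic_api.h, aborts the canister call and
// reports the message to the caller; Wasm canisters cannot unwind a throw.
#include "ic_api.h"
#include <string>

static void verify_arch(const std::string & arch, bool known) {
    if (!known) {
        // before: throw std::runtime_error(format("unknown model arch: %s", arch.c_str()));
        IC_API::trap(std::string("RUNTIME ERROR: ") + "unknown model arch: " + arch);
    }
}
```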

#### llama_cpp_onicai_fork/src/llama-vocab.cpp
- add `#include "ic_api.h"`
- replace `throw std::runtime_error(format` with `IC_API::trap(std::string("RUNTIME ERROR: ") + format`
- comment out `try - catch`; the program will abort if an exception is thrown
- add a check on `llama_token_bos(model)`, else the llama2.c models never stop generating:
```
bool llama_token_is_eog_impl(const struct llama_vocab & vocab, llama_token token) {
    return token != -1 && (
        token == llama_token_eos_impl(vocab) ||
        token == llama_token_eot_impl(vocab) ||
        token == llama_token_bos_impl(vocab) // ICPP-PATCH: the llama2.c model predicts bos without first predicting an eos
    );
}
```

#### llama_cpp_onicai_fork/src/llama-grammar.cpp
No changes needed.

#### llama_cpp_onicai_fork/src/llama-sampling.cpp
No changes needed.

#### llama_cpp_onicai_fork/src/unicode-data.cpp
- no modifications needed for the IC

#### llama_cpp_onicai_fork/src/unicode.cpp
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`

#### llama_cpp_onicai_fork/common/json-schema-to-grammar.cpp
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- comment out `try - catch`; the program will abort if an exception is thrown (see the sketch below)
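
A sketch of the `try - catch` comment-out pattern; `parse_schema` is a hypothetical function invented for illustration, assuming nlohmann json (which this file uses upstream):

```
// Hypothetical example of the try/catch removal (not the actual diff).
// Canister Wasm has no exception unwinding, so the handler could never
// run; the body is kept and the handler is commented out. Any throw
// inside json::parse now traps via the throw -> IC_API::trap rewrite.
#include <nlohmann/json.hpp>
#include <string>
using json = nlohmann::ordered_json;

static json parse_schema(const std::string & text) {
    // ICPP-PATCH: try - catch commented out
    // try {
        return json::parse(text);
    // } catch (const std::exception & e) {
    //     fprintf(stderr, "invalid schema: %s\n", e.what());
    //     return json();
    // }
}
```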

#### llama_cpp_onicai_fork/common/build-info.cpp
- run this command to create it:
```
make build-info-cpp-wasm
```
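
For reference, the generated file follows llama.cpp's usual `build-info.cpp` layout; the values below are illustrative placeholders:

```
// Shape of the generated common/build-info.cpp (values are placeholders).
int LLAMA_BUILD_NUMBER = 0;
char const * LLAMA_COMMIT = "615212";
char const * LLAMA_COMPILER = "clang (wasm32 toolchain)";
char const * LLAMA_BUILD_TARGET = "wasm32";
```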

#### llama_cpp_onicai_fork/common/grammar-parser.cpp
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- comment out `try - catch`; the program will abort if an exception is thrown

#### llama_cpp_onicai_fork/common/sampling.cpp
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`

#### llama_cpp_onicai_fork/common/common.cpp
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- comment out all code related to `<pthread.h>`
- comment out `try - catch`; the program will abort if an exception is thrown
- comment out `std::getenv` (see the sketch below)
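
A sketch of the `std::getenv` comment-out; `cache_directory` and the `LLAMA_CACHE` lookup are illustrative stand-ins, not the actual diff:

```
// Hypothetical example of the std::getenv removal (not the actual diff).
// A canister has no process environment, so the lookup is commented out
// and the hard-coded default is kept.
#include <string>

static std::string cache_directory() {
    std::string dir = "models/"; // default used on the IC
    // ICPP-PATCH: no environment variables in a canister
    // if (const char * env = std::getenv("LLAMA_CACHE")) {
    //     dir = env;
    // }
    return dir;
}
```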

---
### c_paths

#### llama_cpp_onicai_fork/ggml/src/ggml.c
- comment out all code related to signals
  - `#include <signal.h>`
- many threading comment-outs (see the sketch below)
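
A sketch of what the threading comment-outs amount to; `compute_graph` and the worker variables are invented for illustration:

```
// Hypothetical example of the threading removal (not the actual diff).
// Canister Wasm is single-threaded, so worker-thread creation is commented
// out and all tasks run sequentially on the single calling thread.
static void compute_graph(int n_tasks) {
    // ICPP-PATCH: no threads in a canister
    // for (int i = 1; i < n_threads; ++i) {
    //     pthread_create(&workers[i], NULL, worker_main, &args[i]);
    // }
    for (int i = 0; i < n_tasks; ++i) {
        // each task that would have gone to a worker runs inline here
    }
}
```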

#### llama_cpp_onicai_fork/ggml/src/ggml-alloc.c
No updates needed for icpp-pro.

#### llama_cpp_onicai_fork/ggml/src/ggml-backend.c
No updates needed for icpp-pro.

#### llama_cpp_onicai_fork/ggml/src/ggml-quants.c
No updates needed for icpp-pro.

#### llama_cpp_onicai_fork/ggml/src/ggml-aarch64.c
No updates needed for icpp-pro.

---
### headers to modify

#### llama_cpp_onicai_fork/common/log.h
- comment out `#include <thread>`
- comment out some other threading code

#### llama_cpp_onicai_fork/common/common.h
- comment out `#include <thread>`