Showing 69 changed files with 112 additions and 115 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,36 +1,57 @@ | ||
[icpp_llm](https://github.com/icppWorld/icpp_llm/actions/workflows/cicd.yml) | ||
|
||
# LLMs for the Internet Computer | ||
|
||
<img src="./assets/icpp-llm-logo.png" alt="icpp-llm logo" width="200"> | ||
|
||
Try it out in [ICGPT](https://icgpt.icpp.world) ! | ||
|
||
*The LLMs of this repo run in its back-end canisters.* | ||
_The LLMs of this repo run in its back-end canisters._ | ||
|
||
# Getting Started | ||
|
||
A step-by-step guide to deploy your first LLM to the Internet Computer is provided in [icpp_llama2/README.md](https://github.com/icppWorld/icpp_llm/blob/main/icpp_llama2/README.md). | ||
A step-by-step guide to deploy your first LLM to the Internet Computer is provided in [llama2_c/README.md](https://github.com/icppWorld/icpp_llm/blob/main/llama2_c/README.md). | ||
|
||
# The Benefits of Running LLMs On-Chain | ||
|
||
The canisters within the Internet Computer have certain constraints. They come with memory restrictions, and there's a cap on the number of instructions one can execute per message, as discussed [here](https://forum.dfinity.org/t/instruction-limit-is-crushing-me/22070/10?u=icpp). | ||
|
||
This might lead one to question the rationale behind operating an LLM within an Internet Computer's canister. | ||
|
||
For me, the primary incentive is the unparalleled simplicity of using the IC in comparison to conventional cloud platforms. You develop, deploy & test using a local replica of the cloud, and when everything is ready, you deploy it to the IC with just one command. Everything becomes instantly and securely accessible online. You can very easily restrict access to the endpoints in case you don't want to make it fully public yet and want to share it with a smaller group instead. | ||
|
||
Thanks to the Internet Computer's foundational cryptographic and blockchain technologies, concerns related to IT and security vanish. It's truly remarkable. | ||
|
||
With such user-friendliness, the IC canister runtime serves as an ideal environment for my research pursuits. It complements the type of research presented in this paper that offers a dataset designed to boost the creation, examination, and study of Language Models for areas with scarce resources or specific niches: | ||
|
||
> [TinyStories: How Small Can Language Models Be and Still Speak | ||
Coherent English?](https://arxiv.org/pdf/2305.07759.pdf) | ||
> [TinyStories: How Small Can Language Models Be and Still Speak | ||
> Coherent English?](https://arxiv.org/pdf/2305.07759.pdf) | ||
Besides the ease of use and the enhanced security, running LLMs directly on-chain also facilitates a seamless integration of tokenomics, eliminating the need to juggle between a complex blend of web3 and web2 components, and I believe it will lead to a new category of Generative AI based dApps. | ||
|
||
## QA | ||
|
||
We use MiniConda and run the QA locally like this: | ||
|
||
- Create a conda environment, and install icpp-pro and other python dependencies: | ||
|
||
```bash | ||
conda create --name icpp_llm python=3.11 | ||
conda activate icpp_llm | ||
pip install -r requirements.txt | ||
``` | ||
|
||
- This installs icpp-pro. Next install wasi-sdk, dfx & clang++ as explained in [icpp-pro Installation](https://docs.icpp.world/installation.html) | ||
|
||
- Run the full QA via the Makefile: | ||
```bash | ||
make all-tests | ||
``` | ||
|
||
You can also peek in `.github/workflows/cicd.yml` to see how the QA is run as part of a GitHub Actions workflow. | ||
|
||
More details are provided in the README of the sub-folders, which are standalone icpp-pro projects. | ||
|
||
## Support | ||
|
||
For support, kindly create a GitHub Issue as outlined in the [Support](https://docs.icpp.world/support.html) documentation page. | ||
|
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# Canister resource requirements for llama2_c. | ||
|
||
Do not edit this file. It is created with the command: | ||
|
||
```bash | ||
python -m scripts.icpp_llama2_sizer | ||
``` | ||
|
||
### Tokenizer Memory (per model) | ||
|
||
| Memory Type | 260K<br>(MB) | 15M<br>(MB) | 42M<br>(MB) | 110M<br>(MB) | | ||
| ------------------- | ------------ | ----------- | ----------- | ------------ | | ||
| vocab_memory | 0.00 | 0.12 | 0.12 | 0.12 | | ||
| vocab_scores_memory | 0.00 | 0.12 | 0.12 | 0.12 | | ||
| Total | 0.00 | 0.24 | 0.24 | 0.24 | | ||
|
||
### TransformerWeights Memory (per model) | ||
|
||
| Memory Type | 260K<br>(MB) | 15M<br>(MB) | 42M<br>(MB) | 110M<br>(MB) | | ||
| --------------------- | ------------ | ----------- | ----------- | ------------ | | ||
| token_embedding_table | 0.12 | 35.16 | 62.50 | 93.75 | | ||
| rms_att_weight | 0.00 | 0.01 | 0.02 | 0.04 | | ||
| wq | 0.08 | 1.90 | 8.00 | 27.00 | | ||
| wk | 0.04 | 1.90 | 8.00 | 27.00 | | ||
| wv | 0.04 | 1.90 | 8.00 | 27.00 | | ||
| wo | 0.08 | 1.90 | 8.00 | 27.00 | | ||
| rms_ffn_weight | 0.00 | 0.01 | 0.02 | 0.04 | | ||
| w1 | 0.21 | 5.06 | 21.50 | 72.00 | | ||
| w2 | 0.21 | 5.06 | 21.50 | 72.00 | | ||
| w3 | 0.21 | 5.06 | 21.50 | 72.00 | | ||
| rms_final_weight | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| wcls | 0.12 | 35.16 | 62.50 | 93.75 | | ||
| Total | 1.12 | 93.11 | 221.53 | 511.57 | | ||
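The per-tensor figures in this table are consistent with float32 weights sized directly from the model hyperparameters. The following is a rough sketch of that arithmetic, not the `scripts.icpp_llama2_sizer` implementation. The `Config` class and `weights_mib` function are hypothetical names, and the 15M hyperparameters (`dim=288`, `hidden_dim=768`, `n_layers=6`, `n_heads=6`, `vocab_size=32000`, `seq_len=256`) are assumed from the public llama2.c TinyStories checkpoint; they are not stated in this file.

```python
from dataclasses import dataclass

MIB = 1024 * 1024
FLOAT_BYTES = 4  # assumption: all weights are stored as float32


@dataclass
class Config:
    """Hypothetical container for the model hyperparameters."""
    dim: int
    hidden_dim: int
    n_layers: int
    n_heads: int
    n_kv_heads: int
    vocab_size: int
    seq_len: int


def weights_mib(c: Config) -> dict:
    """Return the size of each weight tensor in MiB."""
    kv_dim = c.dim * c.n_kv_heads // c.n_heads
    elems = {
        "token_embedding_table": c.vocab_size * c.dim,
        "rms_att_weight": c.n_layers * c.dim,
        "wq": c.n_layers * c.dim * c.dim,
        "wk": c.n_layers * c.dim * kv_dim,
        "wv": c.n_layers * c.dim * kv_dim,
        "wo": c.n_layers * c.dim * c.dim,
        "rms_ffn_weight": c.n_layers * c.dim,
        "w1": c.n_layers * c.dim * c.hidden_dim,
        "w2": c.n_layers * c.hidden_dim * c.dim,
        "w3": c.n_layers * c.dim * c.hidden_dim,
        "rms_final_weight": c.dim,
        "wcls": c.vocab_size * c.dim,
    }
    return {name: n * FLOAT_BYTES / MIB for name, n in elems.items()}


# Assumed hyperparameters for the 15M model (not taken from this file).
cfg_15m = Config(dim=288, hidden_dim=768, n_layers=6, n_heads=6,
                 n_kv_heads=6, vocab_size=32000, seq_len=256)
sizes = weights_mib(cfg_15m)
for name, mib in sizes.items():
    print(f"{name:22s} {mib:8.2f}")
print(f"{'Total':22s} {sum(sizes.values()):8.2f}")
```

Under these assumptions the printed values line up with the 15M column, which suggests the table's "MB" unit is 1024 * 1024 bytes.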
|
||
### RunState Memory (per user) | ||
|
||
| Memory Type | 260K<br>(MB) | 15M<br>(MB) | 42M<br>(MB) | 110M<br>(MB) | | ||
| ----------- | ------------ | ----------- | ----------- | ------------ | | ||
| x | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| xb | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| xb2 | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| hb | 0.00 | 0.00 | 0.01 | 0.01 | | ||
| hb2 | 0.00 | 0.00 | 0.01 | 0.01 | | ||
| q | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| k | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| v | 0.00 | 0.00 | 0.00 | 0.00 | | ||
| att | 0.02 | 0.01 | 0.03 | 0.05 | | ||
| logits | 0.00 | 0.12 | 0.12 | 0.12 | | ||
| key_cache | 0.31 | 1.69 | 16.00 | 36.00 | | ||
| value_cache | 0.31 | 1.69 | 16.00 | 36.00 | | ||
| Total | 0.65 | 3.52 | 32.18 | 72.20 | | ||
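The RunState total is dominated by the key and value caches, which grow with `n_layers * seq_len * kv_dim`. A minimal sketch of the per-user arithmetic follows, assuming float32 buffers; `run_state_mib` is a hypothetical helper, and the 42M hyperparameters (`dim=512`, `hidden_dim=1376`, `n_layers=8`, `n_heads=8`, `vocab_size=32000`, `seq_len=1024`) are assumed from the public llama2.c TinyStories checkpoint rather than taken from this file.

```python
MIB = 1024 * 1024
FLOAT_BYTES = 4  # assumption: float32 activations and caches


def run_state_mib(dim, hidden_dim, n_layers, n_heads, n_kv_heads,
                  vocab_size, seq_len):
    """Return the size of each per-user RunState buffer in MiB."""
    kv_dim = dim * n_kv_heads // n_heads
    elems = {
        "x": dim,
        "xb": dim,
        "xb2": dim,
        "hb": hidden_dim,
        "hb2": hidden_dim,
        "q": dim,
        "k": kv_dim,
        "v": kv_dim,
        "att": n_heads * seq_len,
        "logits": vocab_size,
        "key_cache": n_layers * seq_len * kv_dim,
        "value_cache": n_layers * seq_len * kv_dim,
    }
    return {name: n * FLOAT_BYTES / MIB for name, n in elems.items()}


# Assumed hyperparameters for the 42M model (not taken from this file).
state_42m = run_state_mib(dim=512, hidden_dim=1376, n_layers=8, n_heads=8,
                          n_kv_heads=8, vocab_size=32000, seq_len=1024)
for name, mib in state_42m.items():
    print(f"{name:12s} {mib:8.2f}")
print(f"{'Total':12s} {sum(state_42m.values()):8.2f}")
```

Because the caches scale with `seq_len`, the 42M and 110M models (assumed sequence length 1024) need roughly ten to twenty times more memory per user than the 15M model.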
|
||
### Total Memory | ||
|
||
| Memory Type | 260K<br>(MB) | 15M<br>(MB) | 42M<br>(MB) | 110M<br>(MB) | | ||
| ------------------------------------------- | ------------ | ----------- | ----------- | ------------ | | ||
| Total Tokenizer Memory (per model) | 0.00 | 0.24 | 0.24 | 0.24 | | ||
| Total TransformerWeights Memory (per model) | 1.12 | 93.11 | 221.53 | 511.57 | | ||
| Total RunState Memory (per user) | 0.65 | 3.52 | 32.18 | 72.20 | | ||
| Overall Total Memory | 1.76 | 96.62 | 253.71 | 583.78 | | ||
|
||
### Canister Metrics | ||
|
||
| Canister Metrics | 260K<br>(MB) | 15M<br>(MB) | 42M<br>(MB) | 110M<br>(MB) | | ||
| ------------------------------ | ------------ | ----------- | ----------- | ------------ | | ||
| Max number of concurrent users | 6347 | 1138 | 120 | 49 | |
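The user counts appear consistent with dividing the memory left after loading the model by the per-user RunState memory. The sketch below illustrates that reading; it is not the sizer script's code. The 4 GiB limit (`CANISTER_LIMIT_MIB`) is an assumption that is not stated in this file, `max_concurrent_users` is a hypothetical helper, and whether the tokenizer memory is subtracted is also a guess.

```python
import math

CANISTER_LIMIT_MIB = 4 * 1024  # assumption: 4 GiB of usable canister memory


def max_concurrent_users(weights_mib, tokenizer_mib, run_state_mib,
                         limit_mib=CANISTER_LIMIT_MIB):
    """Estimate how many per-user RunStates fit next to one loaded model."""
    free_mib = limit_mib - weights_mib - tokenizer_mib
    return math.floor(free_mib / run_state_mib)


# Rounded (weights, tokenizer, RunState) values copied from the tables above.
models = {
    "42M": (221.53, 0.24, 32.18),
    "110M": (511.57, 0.24, 72.20),
}
for name, (weights, tokenizer, run_state) in models.items():
    print(name, max_concurrent_users(weights, tokenizer, run_state))
```

With these rounded inputs the sketch gives 120 and 49, matching the 42M and 110M columns. The smaller models are more sensitive to rounding of the inputs, so the rounded table values may not reproduce their counts exactly.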
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
-r icpp_llama2/requirements.txt | ||
-r llama2_c/requirements.txt | ||
|
||
# to lint python scripts | ||
black | ||
|