Run Code Llama with a 32k-token context using Flash Attention and BetterTransformer
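As a rough sketch of what enabling Flash Attention looks like with Hugging Face `transformers` (the model id and argument values below are illustrative assumptions, not necessarily what the notebook uses):

```python
# Illustrative only: keyword arguments one would pass to `from_pretrained`
# to enable Flash Attention 2. The model id is an assumption.
MODEL_ID = "codellama/CodeLlama-7b-Instruct-hf"

load_kwargs = {
    "device_map": "auto",                         # place layers on the GPU
    "attn_implementation": "flash_attention_2",   # memory-efficient attention
}

# On a machine with a GPU and `transformers` + `flash-attn` installed:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **load_kwargs)
```

The `from_pretrained` call itself is left commented out because it requires a GPU and a multi-gigabyte model download.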
Option 1 - Google Colab:
- Download the .ipynb notebook
- Select a GPU runtime
- An A100 with 40 GB allows for roughly a 25k-token context length
Option 2 - Run on a server (e.g. AWS or RunPod (affiliate link))
- Spin up an A100 80 GB server
- Run the notebook and select a 50,000-token context length
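The 25k vs. 50k context limits on 40 GB vs. 80 GB GPUs can be sanity-checked with back-of-envelope KV-cache arithmetic. The sketch below assumes the 7B Code Llama model in 16-bit precision with 32 layers and a 4096 hidden size; real usage also includes activations and framework overhead, so treat these as rough lower bounds:

```python
# Back-of-envelope memory estimate: model weights + KV cache.
# Assumptions: 7B parameters, 2 bytes per value (16-bit), 32 layers,
# 4096 hidden size. Activations and overhead are ignored.
BYTES_PER_VALUE = 2
N_LAYERS = 32
HIDDEN = 4096
WEIGHTS_GB = 7e9 * BYTES_PER_VALUE / 1e9   # ~14 GB of model weights

def kv_cache_gb(context_tokens: int) -> float:
    # One K and one V vector per layer, per token
    return 2 * N_LAYERS * HIDDEN * BYTES_PER_VALUE * context_tokens / 1e9

for tokens, gpu_gb in [(25_000, 40), (50_000, 80)]:
    total = WEIGHTS_GB + kv_cache_gb(tokens)
    print(f"{tokens:>6} tokens: ~{total:.0f} GB needed, {gpu_gb} GB GPU")
```

Under these assumptions, 25k tokens needs roughly 27 GB (fits in 40 GB with headroom for activations) and 50k tokens roughly 40 GB (comfortable on an 80 GB card), consistent with the limits above.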
- Allows saving and reloading conversations
- Allows uploading and analyzing documents
- Works on Google Colab or on a server (e.g. AWS, Azure, RunPod)
- Purchase here