
Commit a849f1b: Update README.md (#656)
Parent: 793cc20


README.md: 5 additions & 4 deletions
@@ -17,16 +17,17 @@ Everything runs locally with no server support and accelerated with local GPUs
 * NVIDIA GPUs via CUDA on Windows and Linux;
 * WebGPU on browsers (through companion project [WebLLM](https://github.com/mlc-ai/web-llm/tree/main)).
 
-**[Click here to join our Discord server!][discord-url]**
-
-**[News] MLC LLM now supports 7B/13B/70B Llama-2 !!**
-
 <ins>**[Check out our instruction page to try out!](https://mlc.ai/mlc-llm/docs/get_started/try_out.html)**</ins>
 
 <p align="center">
   <img src="site/gif/ios-demo.gif" height="700">
 </p>
 
+## News
+
+* [08/02/2023] [Dockerfile](https://github.com/junrushao/llm-perf-bench/) released for CUDA performance benchmarking
+* [07/19/2023] Supports 7B/13B/70B Llama-2
+
 ## What is MLC LLM?
 
 In recent years, there has been remarkable progress in generative artificial intelligence (AI) and large language models (LLMs), which are becoming increasingly prevalent. Thanks to open-source initiatives, it is now possible to develop personal AI assistants using open-sourced models. However, LLMs tend to be resource-intensive and computationally demanding. To create a scalable service, developers may need to rely on powerful clusters and expensive hardware to run model inference. Additionally, deploying LLMs presents several challenges, such as their ever-evolving model innovation, memory constraints, and the need for potential optimization techniques.
