Note
redis embedded language model, available for the stand-alone Redis deployment only
-
use Rust/C/C++ to implement the redisxlm modules
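
As a rough illustration of what a C implementation could look like, here is a minimal module skeleton against the standard Redis modules API (`redismodule.h`). The command name `xlm.load` and the handler body are assumptions for illustration, not the module's actual interface.

```c
/* Minimal sketch of a redisxlm entry point, assuming the standard Redis
 * modules API; the xlm.load command and its behavior are illustrative only. */
#include "redismodule.h"

static int XlmLoad_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc < 2) return RedisModule_WrongArity(ctx);
    /* A real handler would pass argv[1] (the model path) to a backend such as
     * llama.cpp or gemma.cpp and keep the loaded model in module-level state. */
    (void)argv;
    return RedisModule_ReplyWithSimpleString(ctx, "OK");
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "redisxlm", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    if (RedisModule_CreateCommand(ctx, "xlm.load", XlmLoad_RedisCommand,
                                  "write deny-oom", 0, 0, 0) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    return REDISMODULE_OK;
}
```

Built as a shared library, the flow would then be `MODULE LOAD ./redisxlm.so` followed by a hypothetical `XLM.LOAD /path/to/model` call.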
-
redis x language model: loads pre-trained and instruction-tuned models; model sizes are tiny (t), small (s), medium (m), and large (l), with quantization
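
One way the size and quantization tags could translate into concrete model files is sketched below; the file-naming scheme and the GGUF suffix are assumptions, not something the note specifies.

```c
/* Hypothetical mapping from size/quantization tags to model files; the naming
 * scheme (e.g. "models/small-q4_0.gguf") is an assumption for illustration. */
#include <stdio.h>

typedef enum { XLM_TINY, XLM_SMALL, XLM_MEDIUM, XLM_LARGE } xlm_size;

static void xlm_model_path(xlm_size size, const char *quant,
                           char *out, size_t out_cap) {
    static const char *names[] = { "tiny", "small", "medium", "large" };
    snprintf(out, out_cap, "models/%s-%s.gguf", names[size], quant);
}
```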
-
Loadable model types (text; see the sketch after this list):
- embedding model
- generation (inference) model
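
To make the distinction concrete, here is a hedged sketch of how the two model types might surface as Redis commands and replies, again using the standard modules API; the command names (`xlm.embed`, `xlm.generate`), arities, and stub outputs are assumptions for illustration.

```c
/* Sketch only: an embedding model replies with an array of doubles, a
 * generation model replies with text; the fixed dimension and placeholder
 * output stand in for whatever backend actually runs the model. */
#include <string.h>
#include "redismodule.h"

/* xlm.embed <text> -> array of doubles */
static int XlmEmbed_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    (void)argv;
    const int dim = 4;                         /* placeholder embedding dimension */
    RedisModule_ReplyWithArray(ctx, dim);
    for (int i = 0; i < dim; i++)
        RedisModule_ReplyWithDouble(ctx, 0.0); /* backend would fill real values */
    return REDISMODULE_OK;
}

/* xlm.generate <prompt> -> generated text */
static int XlmGenerate_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    (void)argv;
    const char *out = "<generated text>";      /* backend output goes here */
    return RedisModule_ReplyWithStringBuffer(ctx, out, strlen(out));
}
```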
-
Third-party open-source libraries used (a common backend-interface sketch follows this list):
- [done] https://github.com/karpathy/llama2.c (simple; Llama 2 inference in a single file of pure C)
- [done] https://github.com/ggerganov/llama.cpp (supports nearly all open-source LLMs, including the models targeted by the libraries below)
- [done] https://github.com/google/gemma.cpp (lightweight C++ inference engine for Google's open Gemma models)
- https://github.com/li-plus/chatglm.cpp (C++ implementation of ChatGLM, the LLM open-sourced by the Tsinghua University community)
- https://github.com/QwenLM/qwen.cpp (similar to chatglm.cpp; Qwen is the LLM open-sourced by Alibaba)
- [todo] https://github.com/microsoft/onnxruntime/tree/main/rust (use ONNX Runtime to load small models for inference at the edge)
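
Since these engines expose very different APIs, one plausible way to integrate them is behind a single backend interface inside the module. The struct and function-pointer names below are assumptions sketched for illustration; they are not the actual integration code of any of the libraries above.

```c
/* Sketch of a pluggable backend interface; each third-party engine would
 * provide one xlm_backend instance, selected when a model is loaded. */
#include <stddef.h>

typedef struct xlm_backend {
    const char *name;                               /* e.g. "llama.cpp", "gemma.cpp" */
    void *(*load)(const char *model_path);          /* returns an engine-specific handle */
    int   (*embed)(void *handle, const char *text,
                   float *out, size_t out_dim);     /* for embedding models */
    int   (*generate)(void *handle, const char *prompt,
                      char *out, size_t out_cap);   /* for generation models */
    void  (*unload)(void *handle);
} xlm_backend;
```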