Releases
Version 0.0.4 (Current)
Major Features
Ollama API Compatibility: Added support for the Ollama API interface, allowing RKLLAMA to work with Ollama clients and tools.
Enhanced Streaming Responses: Improved reliability of streaming responses with better handling of completion signals.
Optional Debug Mode: Added detailed debugging tools that can be enabled with the --debug flag.
CPU Model Auto-detection: Automatic detection of the RK3588 or RK3576 platform, with fallback to interactive selection.
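As a rough sketch of what Ollama compatibility means in practice, an Ollama-style client builds a JSON body and POSTs it to /api/chat. The model name and message below are placeholders, not RKLLAMA defaults:

```python
import json

def build_chat_request(model, messages, stream=True):
    """Return the JSON body an Ollama-style client POSTs to /api/chat."""
    return json.dumps({"model": model, "messages": messages, "stream": stream})

# Placeholder model name and prompt for illustration only.
body = build_chat_request("my-model", [{"role": "user", "content": "Hello"}])
```

Because the body matches what Ollama clients already produce, existing tools can point at an RKLLAMA server without changes.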
New API Endpoints
/api/tags - List all available models (Ollama-compatible)
/api/show - Show model information
/api/create - Create a new model from a Modelfile
/api/pull - Pull a model from Hugging Face
/api/delete - Delete a model
/api/generate - Generate a completion for a prompt
/api/chat - Generate a chat completion
/api/embeddings - Generate embeddings (placeholder)
/api/debug - Diagnostic endpoint (available only in debug mode)
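A streaming /api/generate or /api/chat response arrives as newline-delimited JSON chunks, the last of which carries "done": true. A minimal sketch of a client-side reader for that format (the field names follow the Ollama chat schema; this is not RKLLAMA's own code):

```python
import json

def collect_stream(lines):
    """Join content chunks from an Ollama-style NDJSON chat stream,
    stopping once a chunk reports "done": true."""
    parts = []
    for line in lines:
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Two hypothetical chunks as they would appear on the wire.
sample = [
    '{"message": {"content": "Hel"}, "done": false}',
    '{"message": {"content": "lo"}, "done": true}',
]
```

The reliable "done" flag noted under Improvements below is what lets a loop like this terminate cleanly instead of waiting on a hung connection.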
Improvements
More reliable "done" signaling for streaming responses
Auto-detection of CPU model (RK3588 or RK3576) with fallback to user selection
Better error handling and error messages
Fixed threading issues in request processing
Automatic content formatting for various response types
Improved stream handling with token tracking
Optional debugging mode with detailed logs
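One common way to auto-detect the platform on Rockchip boards is to read the device-tree "compatible" string; the sketch below illustrates that approach with a fallback to interactive selection. The path and matching strategy are assumptions for illustration, not necessarily how RKLLAMA does it:

```python
def detect_cpu_model(compat_path="/proc/device-tree/compatible"):
    """Guess the Rockchip platform from the device-tree "compatible"
    string; return None so the caller can fall back to asking the user."""
    try:
        with open(compat_path, "rb") as f:
            compat = f.read().decode(errors="ignore").lower()
    except OSError:
        return None  # no device tree: trigger interactive selection
    for model in ("rk3588", "rk3576"):
        if model in compat:
            return model
    return None
```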
Technical Changes
Added new utility modules for debugging and API handling
Improved thread management for streaming responses
Added CPU model detection and selection
Updated server configuration options
Made debugging tools optional through an environment variable and a command-line flag
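The flag-or-environment-variable pattern for optional debugging can be sketched as follows. The variable name RKLLAMA_DEBUG is an assumption for illustration; only the --debug flag is confirmed by these notes:

```python
import argparse
import logging
import os

def configure_debug(argv=None):
    """Enable verbose logging when --debug is passed or when the
    (assumed) RKLLAMA_DEBUG environment variable is set to 1."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--debug", action="store_true")
    args, _ = parser.parse_known_args(argv)
    enabled = args.debug or os.environ.get("RKLLAMA_DEBUG") == "1"
    logging.basicConfig(level=logging.DEBUG if enabled else logging.INFO)
    return enabled
```

Checking both sources means containerized deployments can flip the environment variable while interactive users keep the flag.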