VeloraAI is a modern and flexible C# library designed to simplify local LLM integration. It allows developers to interact with quantized AI models directly from .NET Standard 2 / .NET 8.0 applications with a single line of code. Whether you're building chatbots, creative tools, or AI companions, VeloraAI is optimized for speed, reliability, and customization.
- Quick-Start Model Loading: choose from pre-integrated models or load your own via `TestingMode`.
- Support for Multiple Models: CrystalThink, Qwen, Mistral, DeepSeek, Llama, and more.
- Event-driven Response System: react to `TextGenerated`, `ResponseStarted`, and `ResponseEnded` in real time.
- Customizable System Prompts: use friendly or aggressive instruction styles (e.g., `NoBSMode`).
- Model Downloader: automatically fetches models from Hugging Face if they are not already available.
- Experimental Vision Mode: send an image plus a prompt for visual reasoning (WIP).
Exception thrown: 'System.TypeLoadException' in LLamaSharp.dll
Authentication failed: Could not load type 'LLama.Native.NativeApi' from assembly 'LLamaSharp, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' because the method 'llama_backend_free' has no implementation (no RVA).
A fix for this issue is not currently known. Resolving it may require integrating LLamaSharp's source code into the VeloraAI project and modifying it to work properly with .NET Framework, but this is not guaranteed. You are welcome to fork the project, investigate a fix, and submit it.
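If you want to surface this failure gracefully at runtime (for example, to show a clearer message to .NET Framework users), one possible approach is a defensive wrapper like the sketch below. The `AuthenticateAsync` call comes from the usage examples later in this README; the catch clause is only illustrative, not part of the library.

```csharp
using System;
using System.Threading.Tasks;

try
{
    var result = await VeloraAI.AuthenticateAsync(VeloraAI.Models.Crystal_Think_V2_Q4);
    Console.WriteLine($"Authentication result: {result}");
}
catch (TypeLoadException ex)
{
    // LLamaSharp's native bindings failed to load (the known .NET Framework issue above).
    Console.Error.WriteLine($"Model backend unavailable: {ex.Message}");
}
```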
- LLamaSharp: backbone inference engine.
- .NET 8.0: modern C# support.
- WinForms & Console: sample UI and CLI clients included.
Model | Size | Strengths |
---|---|---|
Crystal_Think_V2_Q4 | 2.32 GB | Fast, tiny, math-heavy reasoning, Chain-of-Thought format |
Qwen_V3_4B_Chat | 2.70 GB | Fast general model with good code and reasoning |
Mistral_7B_Chat | 2.87 GB | Informative and precise longer-form chat |
Llama_7B_Chat | 3.07 GB | Reliable general conversations |
DeepSeek_6B_Coder | 3.07 GB | Code generation, math-only |
DeepSeek_7B_Chat | 5.28 GB | Slower general chat, strong context retention |
var result = await VeloraAI.AuthenticateAsync(VeloraAI.Models.Crystal_Think_V2_Q4);
if (result == VeloraAI.AuthState.Authenticated)
{
await VeloraAI.AskAsync("What is the capital of France?");
}
VeloraAI.TextGenerated += (_, text) => Console.Write(text);
VeloraAI.ResponseStarted += (_, __) => Console.WriteLine("\n[VELORA is typing...]");
VeloraAI.ResponseEnded += (_, __) => Console.WriteLine("\n\n[Done]");
VeloraAI.TestingMode = true;
VeloraAI.TestingModelPath = @"C:\path\to\your_model.gguf";
await VeloraAI.AuthenticateAsync(VeloraAI.Models.TestingModel);
- Default (friendly): follows a natural, conversational tone with emojis and personality.
- `NoBSMode`: blunt, hyper-logical response style with no emotional overhead or filler.
await VeloraAI.AuthenticateAsync(VeloraAI.Models.Crystal_Think_V2_Q4, NoBSMode: true);
Models are downloaded on first use to:
%APPDATA%/VeloraAI
Progress can be tracked using:
VeloraAI.CurrentDownloadProgress;
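As a rough sketch of how that property could be tracked (assuming `CurrentDownloadProgress` exposes a percentage value; its exact type is not shown here), you could poll it while authentication runs:

```csharp
using System;
using System.Threading.Tasks;

// Illustrative only: poll download progress while the model is fetched on first use.
var authTask = VeloraAI.AuthenticateAsync(VeloraAI.Models.Crystal_Think_V2_Q4);
while (!authTask.IsCompleted)
{
    Console.Write($"\rDownloading: {VeloraAI.CurrentDownloadProgress}%");
    await Task.Delay(500);
}
Console.WriteLine();
var state = await authTask;
```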
VeloraAI.ResetHistory(); // or use custom system prompt
You can fine-tune Velora's behavior using the following optional parameters in `AskAsync`:
Parameter | Description | Recommended for Speed |
---|---|---|
`Temperature` | Controls randomness (lower = more deterministic) | 0.2 - 0.3 |
`TopP` | Nucleus sampling threshold | 0.0 - 0.3 |
`TopK` | Limits token pool to top-K options | 0 for fastest |
`RepeatPenalty` | Penalizes repetition | 1.05 - 1.2 |
`MaxTokens` | Maximum tokens to generate | 80 - 128 |
await VeloraAI.AskAsync(
userInput: "Summarize this paragraph.",
temperature: 0.25f,
TopP: 0.2f,
TopK: 0,
RepeatPenalty: 1.1f,
maxTokens: 80
);
Pull requests are welcome! Please submit improvements, optimizations, or new model integrations.
Licensed under the MIT License.
Authenticating model...
Authentication result: Authenticated
> What is 21 * 2?
[VELORA is typing...]
42
[Done]
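For reference, the transcript above can be produced by wiring the pieces from the earlier snippets into a small console program. The read loop and exit condition below are illustrative scaffolding, not part of the library:

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Stream tokens and status markers to the console as they arrive.
        VeloraAI.TextGenerated += (_, text) => Console.Write(text);
        VeloraAI.ResponseStarted += (_, __) => Console.WriteLine("\n[VELORA is typing...]");
        VeloraAI.ResponseEnded += (_, __) => Console.WriteLine("\n\n[Done]");

        Console.WriteLine("Authenticating model...");
        var result = await VeloraAI.AuthenticateAsync(VeloraAI.Models.Crystal_Think_V2_Q4);
        Console.WriteLine($"Authentication result: {result}");

        // Minimal read loop: type a prompt, get a streamed answer; an empty line exits.
        while (true)
        {
            Console.Write("> ");
            var input = Console.ReadLine();
            if (string.IsNullOrWhiteSpace(input)) break;
            await VeloraAI.AskAsync(input);
        }
    }
}
```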
- Developed by voidZiAD
- Powered by LLamaSharp, GGUF models, and the C#/.NET 8.0 ecosystem
"I'm VELORA โ not just another chatbot. I'm here to help you code, reason, and think clearer. No nonsense, just clarity."