vnc-lm

Introduction

vnc-lm is a Discord bot that integrates leading large language model APIs.

Load and manage language models through local or hosted API endpoints. Configure parameters, split conversations, and refine prompts to improve responses.

[Demo: vision support]

Features

Model Management

Load models using the /model command. Configure model behavior by adjusting the num_ctx (context length), system_prompt (base instructions), and temperature (response randomness) parameters. The bot sends a notification once a model loads successfully and creates a thread for the conversation. /model cannot be used inside threads.
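
A minimal sketch of how such a command could be registered with discord.js; the option names mirror the parameters above, while the descriptions and structure are illustrative rather than taken from the actual codebase:

import { SlashCommandBuilder } from "discord.js";

// Illustrative /model registration; option names come from the README.
const modelCommand = new SlashCommandBuilder()
  .setName("model")
  .setDescription("Load a model and open a conversation thread")
  .addStringOption((opt) =>
    opt.setName("model").setDescription("Model to load").setRequired(true))
  .addIntegerOption((opt) =>
    opt.setName("num_ctx").setDescription("Context length"))
  .addStringOption((opt) =>
    opt.setName("system_prompt").setDescription("Base instructions"))
  .addNumberOption((opt) =>
    opt.setName("temperature").setDescription("Response randomness"));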

The initial prompt in a thread is scraped for keywords, which are used to rename the thread. To change models inside a thread, send + followed by part of the name of the model you want to switch to. For example, to switch to Claude Sonnet 3.5, send + Claude, + Sonnet, or + 3.5.
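
A sketch of the kind of partial matching this implies, assuming a simple case-insensitive substring check (the function name is hypothetical):

// Hypothetical matcher: given "+ sonnet", pick the first available model
// whose name contains the fragment (case-insensitive).
function matchModel(input: string, models: string[]): string | undefined {
  const fragment = input.replace(/^\+\s*/, "").toLowerCase();
  return models.find((m) => m.toLowerCase().includes(fragment));
}

// matchModel("+ Sonnet", ["gpt-4o", "anthropic/claude-3.5-sonnet:beta"])
//   returns "anthropic/claude-3.5-sonnet:beta"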

When you switch models mid-conversation, your current conversation history and settings (system_prompt and temperature) will remain unchanged.

Resume any conversation just by sending a new message.

Edit any prompt to refine the subsequent model response. The bot generates a new response using your edited prompt, replacing the previous output.

When you edit or delete messages in Discord, these changes are immediately synchronized with the conversation cache and incorporated into the model's context for future responses.
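
In discord.js terms, this synchronization could hook the message update and delete events; the cache helpers below are hypothetical stand-ins for whatever the real cache manager exposes:

import { Client, Events, GatewayIntentBits } from "discord.js";

// Stand-ins for the bot's cache layer (hypothetical signatures).
declare function updateCachedMessage(id: string, content: string): void;
declare function removeCachedMessage(id: string): void;

const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.MessageContent,
  ],
});

// Keep the conversation cache in step with Discord-side edits and deletions.
client.on(Events.MessageUpdate, (_old, updated) => {
  updateCachedMessage(updated.id, updated.content ?? "");
});
client.on(Events.MessageDelete, (deleted) => {
  removeCachedMessage(deleted.id);
});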

Download new models by sending a model tag link in a channel. Models may not be downloaded inside threads.

https://ollama.com/library/llama3.2:1b-instruct-q8_0
https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/blob/main/Llama-3.2-1B-Instruct-Q8_0.gguf

Local models can be removed with the remove parameter of /model.

💡 Model downloading and removal are disabled by default and can be enabled in the .env.
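
For the ollama case, a download amounts to extracting the model tag from the link and calling ollama's /api/pull endpoint. A rough sketch, with the URL pattern taken from the example link above and ollamaUrl corresponding to the OLLAMAURL field described later:

import axios from "axios";

// Sketch: turn an ollama library link into a pull request against the
// local ollama server (POST /api/pull is ollama's model-download endpoint).
async function pullFromLink(link: string, ollamaUrl: string): Promise<void> {
  const match = link.match(/ollama\.com\/library\/(.+)$/);
  if (!match) return; // not an ollama library link
  await axios.post(`${ollamaUrl}/api/pull`, { name: match[1] });
}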

QoL Improvements

Messages longer than 1500 characters are automatically paginated during generation. Message streaming is available with ollama; other APIs return complete responses quickly, without streaming. The context window accepts text files, web links, and images. Deploy with Docker for a simplified setup.
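
The pagination itself can be as simple as slicing the accumulated response into page-sized chunks; a minimal sketch, with the default matching the CHARACTER_LIMIT field in the .env:

// Split a long response into page-sized chunks (1500 characters by default).
function paginate(text: string, limit = 1500): string[] {
  const pages: string[] = [];
  for (let i = 0; i < text.length; i += limit) {
    pages.push(text.slice(i, i + limit));
  }
  return pages;
}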

Messages are cached and organized in bot_cache.json. The entrypoint.sh script maintains conversation history across Docker container restarts.
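
An illustrative version of that persistence, assuming the cache is a plain JSON object written to disk:

import * as fs from "fs";

const CACHE_FILE = "bot_cache.json";

// Load prior conversation history on startup...
function loadCache(): Record<string, unknown> {
  return fs.existsSync(CACHE_FILE)
    ? JSON.parse(fs.readFileSync(CACHE_FILE, "utf8"))
    : {};
}

// ...and write it back out whenever it changes, so a container restart
// picks up where the last session left off.
function saveCache(cache: Record<string, unknown>): void {
  fs.writeFileSync(CACHE_FILE, JSON.stringify(cache, null, 2));
}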

While both hosted APIs and Ollama support vision functionality, not all models have vision capabilities.

💡 Send stop to end message generation early.

Requirements

Docker: a platform designed to help developers build, share, and run containerized applications. It handles the tedious setup, so you can focus on the code.

Supported APIs

ollama: Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
OpenRouter: A unified interface for LLMs. Find the best models and prices for your prompts, using the latest state-of-the-art models from OpenAI, Anthropic, Google, and Meta.
Mistral: A research lab building open-source models. La Plateforme enables developers and enterprises to build new products and applications, powered by Mistral's open-source and commercial LLMs.
Cohere: A platform that builds natural language processing and generation into your product with a few lines of code, covering classification, semantic search, paraphrasing, summarization, and content generation.
Groq: Groq technology can be accessed by anyone via GroqCloud™, while enterprises and partners can choose between cloud or on-prem AI compute center deployment.
GitHub Models: Find and experiment with AI models for free; once your application is ready for production, switch to a token from a paid Azure account.

💡 Each API offers a free tier.

Environment Configuration

# clone the repository
git clone https://github.com/jake83741/vnc-lm.git

# enter the directory
cd vnc-lm

# rename the env file
mv .env.example .env

Configure the following fields in the .env:

Discord configuration

TOKEN: Discord bot token from the Discord Developer Portal. Set the required bot permissions.
ADMIN: Discord user ID granted model-management permissions.
CHARACTER_LIMIT: Character limit for page embeds. Default: 1500
REQUIRE_MENTION: Whether the bot must be mentioned to respond. Default: false
MESSAGE_UPDATE_INTERVAL: How often the Discord message is updated during streaming (ollama only). Lower values may trigger rate limits. Default: 10
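
A sketch of what that throttling implies: buffer the streamed chunks and only edit the Discord message once every MESSAGE_UPDATE_INTERVAL chunks. Here editReply is a stand-in for whatever edit call the bot actually makes:

// Only push an edit to Discord every N streamed chunks, to stay under
// Discord's message-edit rate limits.
const UPDATE_INTERVAL = Number(process.env.MESSAGE_UPDATE_INTERVAL ?? 10);

let buffer = "";
let chunksSinceUpdate = 0;

async function onChunk(
  chunk: string,
  editReply: (text: string) => Promise<void>, // stand-in for the real edit call
): Promise<void> {
  buffer += chunk;
  if (++chunksSinceUpdate >= UPDATE_INTERVAL) {
    chunksSinceUpdate = 0;
    await editReply(buffer);
  }
}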

Service configuration

USE_VISION: Toggle vision support. When vision is off, OCR is used instead. Default: false
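
A sketch of what that toggle could look like, using tesseract.js (already in the dependency list) for the OCR path; askVisionModel is a hypothetical stand-in for the vision-capable model call:

import Tesseract from "tesseract.js";

// Hypothetical stand-in for sending the image to a vision-capable model.
declare function askVisionModel(imageUrl: string): Promise<string>;

async function handleImage(imageUrl: string): Promise<string> {
  if (process.env.USE_VISION === "true") {
    return askVisionModel(imageUrl);
  }
  // Vision off: OCR the image and feed the extracted text into the prompt.
  const { data } = await Tesseract.recognize(imageUrl, "eng");
  return data.text;
}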

API configuration

OLLAMAURL: ollama server URL. See API documentation. For Docker: http://host.docker.internal:11434
OPENROUTER_API_KEY: OpenRouter API key from the OpenRouter Dashboard
OPENROUTER_MODELS: Comma-separated OpenRouter model list
MISTRAL_API_KEY: Mistral API key from the Mistral Dashboard
MISTRAL_MODELS: Comma-separated Mistral model list
COHERE_API_KEY: Cohere API key from the Cohere Dashboard
COHERE_MODELS: Comma-separated Cohere model list
GROQ_API_KEY: Groq API key from the Groq Dashboard
GROQ_MODELS: Comma-separated Groq model list
GITHUB_API_KEY: GitHub API key from the GitHub Models Dashboard
GITHUB_MODELS: Comma-separated GitHub model list
Configure at least one API.

## Discord configuration

# Discord bot token
TOKEN=bKZ57JqLJq...
# Administrator Discord user ID
ADMIN=qddSPlT9MG...
# Character limit for page embeds (default: 1500)
CHARACTER_LIMIT=1500
# Require bot mention (default: false)
REQUIRE_MENTION=false
# API response chunk size before message update (default: 10)
MESSAGE_UPDATE_INTERVAL=10


## Generic model configuration

# Turn vision on or off. Turning vision off will turn OCR on. (default: false)
USE_VISION=true


## API configurations

# ollama server URL (default: http://localhost:11434)
# For Docker: http://host.docker.internal:11434
# Leave blank to not use ollama
OLLAMAURL=http://host.docker.internal:11434
# OpenRouter API Key
OPENROUTER_API_KEY=Zmda38qlVH...
# OpenRouter models (comma-separated)
OPENROUTER_MODELS="meta-llama/llama-3.1-405b-instruct:free, anthropic/claude-3.5-sonnet:beta, google/gemini-flash-1.5, "
# Mistral API Key
MISTRAL_API_KEY=0bbFJgRiPg...
# Mistral models (comma-separated)
MISTRAL_MODELS="mistral-large-latest, "
# Cohere API Key
COHERE_API_KEY=ijpyasLrFw...
# Cohere models (comma-separated)
COHERE_MODELS="command-r-plus-08-2024, "
# Groq API Key
GROQ_API_KEY=WRDSyLb11g...
# Groq models (comma-separated)
GROQ_MODELS="llama-3.1-70b-versatile, "
# GitHub API Key
GITHUB_API_KEY=0gHrfg6RZD...
# GitHub models (comma-separated)
GITHUB_MODELS="gpt-4o, "

Docker Installation (Preferred)

# build the container with Docker
docker compose up --build

💡 Send /help for instructions on how to use the bot.

Manual Installation


# install dependencies
npm install

# compile the TypeScript source
npm run build

# start the bot
npm start

Usage

Use /model to load and configure models and to create threads. Quickly adjust model behavior using the optional parameters num_ctx, system_prompt, and temperature. Note that num_ctx only works with local ollama models.

Switch between models mid-conversation in threads by sending + followed by part of the model name. For example: + gpt for GPT-4o, + mistral for Mistral-Large-Latest, or + google for Gemini-Flash-1.5.

Reply split to any message to fork the conversation into a new thread from that point. A diagram of the thread relationships and a summary of the conversation up to the split point are also sent in the new thread.

Hop between different threads while maintaining separate conversation histories, allowing you to explore different directions with the same or different models.
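
In discord.js terms, the fork could be as simple as starting a thread from the replied-to message; a rough sketch, where the thread name and message text are illustrative rather than the bot's actual output:

import { Message, ThreadAutoArchiveDuration } from "discord.js";

// Fork the conversation: start a new thread anchored at the chosen message.
async function splitConversation(replyTarget: Message): Promise<void> {
  const thread = await replyTarget.startThread({
    name: "split conversation", // illustrative; the bot derives real names
    autoArchiveDuration: ThreadAutoArchiveDuration.OneDay,
  });
  await thread.send("Conversation split here; earlier history carries over.");
}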



Edit any prompt to refine a model's response. Each edit automatically generates a new response that replaces the previous one. Your latest edits are saved and will be used for context in future responses.

Tree Diagram

.
├── api-connections
│   ├── base-client.ts
│   ├── factory.ts
│   └── provider
│       ├── hosted
│       │   └── client.ts
│       └── ollama
│           └── client.ts
├── bot.ts
├── commands
│   ├── command-registry.ts
│   ├── help-command.ts
│   ├── loading-comand.ts
│   ├── model-command.ts
│   ├── remove-command.ts
│   ├── services
│   │   ├── ocr.ts
│   │   └── scraper.ts
│   ├── stop-command.ts
│   └── thread-command.ts
├── managers
│   ├── cache
│   │   ├── entrypoint.sh
│   │   ├── manager.ts
│   │   └── store.ts
│   └── generation
│       ├── controller.ts
│       ├── messages.ts
│       ├── pages.ts
│       ├── processor.ts
│       └── stream.ts
└── utilities
    ├── index.ts
    ├── settings.ts
    └── types.ts

Dependencies


{
  "dependencies": {
    "@azure-rest/ai-inference": "latest",
    "@azure/core-auth": "latest",
    "@mozilla/readability": "^0.5.0",
    "@types/xlsx": "^0.0.35",
    "axios": "^1.7.2",
    "cohere-ai": "^7.14.0",
    "discord.js": "^14.15.3",
    "dotenv": "^16.4.5",
    "jsdom": "^24.1.3",
    "keyword-extractor": "^0.0.27",
    "puppeteer": "^22.14.0",
    "sharp": "^0.33.5",
    "tesseract.js": "^5.1.0"
  },
  "devDependencies": {
    "@types/jsdom": "^21.1.7",
    "@types/node": "^18.15.25",
    "typescript": "^5.1.3"
  }
}

Notes


  1. Set higher num_ctx values when using attachments with large amounts of text.
  2. Vision models may have difficulty with follow-up questions.

License

This project is licensed under the MIT License.
