This guide provides practical examples of using LocalLab in your projects. Each example includes code snippets and explanations.
- Basic Usage
- Text Generation
- Chat Completion
- Streaming Responses
- Batch Processing
- Model Management
- Error Handling
First, set up your LocalLab environment:
```python
import asyncio
from locallab.client import LocalLabClient

async def main():
    # Initialize client
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"

    try:
        # Check if server is healthy
        is_healthy = await client.health_check()
        print(f"Server status: {'Ready' if is_healthy else 'Not Ready'}")

        # Your code here...
    finally:
        # Always close the client when done
        await client.close()

# Run your async code
asyncio.run(main())
```
Generate text with default settings:
```python
async def generate_text():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        response = await client.generate(
            "Write a short story about a robot"
        )
        print(response)
    finally:
        await client.close()
```
Control the generation with parameters:
```python
response = await client.generate(
    prompt="Write a poem about coding",
    temperature=0.7,  # Control creativity (0.0 to 1.0)
    max_length=100,   # Maximum length of response
    top_p=0.9         # Nucleus sampling parameter
)
```
Have a simple conversation:
```python
async def chat_example():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        response = await client.chat([
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is Python?"}
        ])
        print(response.choices[0].message.content)
    finally:
        await client.close()
```
Maintain a conversation thread:
```python
messages = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "Can you help me with algebra?"},
    {"role": "assistant", "content": "Of course! What would you like to know?"},
    {"role": "user", "content": "Explain quadratic equations."}
]

response = await client.chat(messages)
```
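To keep the thread going, append the assistant's reply to `messages` before sending the next question. A minimal sketch reusing the response shape from the chat example above (the follow-up question is illustrative):

```python
# Record the assistant's reply, then continue the same thread
answer = response.choices[0].message.content
print(answer)

messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Can you walk through an example?"})

response = await client.chat(messages)
print(response.choices[0].message.content)
```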
Get responses token by token:
```python
async def stream_example():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        print("Generating story: ", end="", flush=True)
        async for token in client.stream_generate("Once upon a time"):
            print(token, end="", flush=True)
        print()  # New line at end
    finally:
        await client.close()
```
Stream chat responses:
```python
async def stream_chat():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        async for token in client.stream_chat([
            {"role": "user", "content": "Tell me a story"}
        ]):
            print(token, end="", flush=True)
    finally:
        await client.close()
```
The streaming generation now maintains context of the conversation for more coherent responses:
```python
async def stream_with_context():
    client = LocalLabClient("http://localhost:8000")
    try:
        # First response
        print("Q: Tell me a story about a robot")
        async for token in client.stream_generate("Tell me a story about a robot"):
            print(token, end="", flush=True)
        print("\n")

        # Follow-up question (will have context from previous response)
        print("Q: What happens next in the story?")
        async for token in client.stream_generate("What happens next in the story?"):
            print(token, end="", flush=True)
        print("\n")
    finally:
        await client.close()
```
The client maintains a context of recent exchanges, allowing for more coherent follow-up responses. The context is automatically managed and includes up to 5 previous exchanges.
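If you want full control over what the model sees, you can also keep the history yourself and send the whole thread through `stream_chat` on each request. A minimal sketch, assuming the same message format as the chat examples above:

```python
async def stream_with_manual_history(client):
    # Keep the conversation history yourself and send the
    # full thread on every request
    history = [{"role": "user", "content": "Tell me a story about a robot"}]

    reply = ""
    async for token in client.stream_chat(history):
        print(token, end="", flush=True)
        reply += token
    print()

    # Record the assistant's reply, then ask a follow-up with the full history
    history.append({"role": "assistant", "content": reply})
    history.append({"role": "user", "content": "What happens next?"})
    async for token in client.stream_chat(history):
        print(token, end="", flush=True)
    print()
```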
Generate responses for multiple prompts efficiently:
```python
async def batch_example():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        prompts = [
            "Write a haiku",
            "Tell a joke",
            "Give a fun fact"
        ]
        responses = await client.batch_generate(prompts)
        for prompt, response in zip(prompts, responses["responses"]):
            print(f"\nPrompt: {prompt}")
            print(f"Response: {response}")
    finally:
        await client.close()
```
Switch between different models:
```python
async def model_management():
    client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"
    try:
        # List available models
        models = await client.list_models()
        print("Available models:", models)

        # Load a specific model
        await client.load_model("microsoft/phi-2")

        # Get current model info
        model_info = await client.get_current_model()
        print("Current model:", model_info)

        # Generate with loaded model
        response = await client.generate("Hello!")
        print(response)
    finally:
        await client.close()
```
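Because `load_model` swaps the active model on the server, one client can be reused to compare how different models answer the same prompt. A small sketch built only from the calls shown above:

```python
async def compare_models(client, prompt, model_ids):
    # Load each model in turn and collect its answer to the same prompt
    results = {}
    for model_id in model_ids:
        await client.load_model(model_id)
        results[model_id] = await client.generate(prompt)
    return results
```

Loading a model can take some time, so keep the list short and check `list_models()` first.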
Properly handle potential errors:
```python
async def error_handling():
    client = None
    try:
        # Try to connect
        client = LocalLabClient("http://localhost:8000")  # or "https://your-ngrok-url.ngrok.app"

        # Check server health
        if not await client.health_check():
            print("Server is not responding")
            return

        # Try generation
        try:
            response = await client.generate("Hello!")
            print(response)
        except Exception as e:
            print(f"Generation failed: {str(e)}")

    except ConnectionError:
        print("Could not connect to server")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")
    finally:
        # Only close the client if it was actually created
        if client:
            await client.close()
```
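For transient failures, such as the server restarting or a model still loading, a small retry wrapper around `generate` can save a manual rerun. A sketch, with the retry count and delay as illustrative values:

```python
import asyncio

async def generate_with_retry(client, prompt, retries=3, delay=1.0):
    # Retry failed requests with simple exponential backoff
    for attempt in range(retries):
        try:
            return await client.generate(prompt)
        except Exception as e:
            if attempt == retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed ({e}); retrying...")
            await asyncio.sleep(delay * (2 ** attempt))
```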
- Always Close the Client

  ```python
  try:
      ...  # Your code here
  finally:
      await client.close()
  ```

- Check Server Health

  ```python
  if not await client.health_check():
      print("Server not ready")
      return
  ```

- Use Proper Error Handling

  ```python
  try:
      response = await client.generate(prompt)
  except Exception as e:
      print(f"Error: {str(e)}")
  ```

- Monitor System Resources

  ```python
  info = await client.get_system_info()
  print(f"Memory usage: {info.memory_usage}%")
  ```
- Check the API Reference for detailed parameter information
- Learn about Advanced Features
- See Performance Guide for optimization tips
- Visit Troubleshooting if you encounter issues
Need more examples? Check our Community Examples or ask in our Discussion Forum.