README.md (126 changes: 13 additions & 113 deletions)

@@ -218,12 +218,15 @@ bifrost/
│ ├── tests/ # Tests to make sure everything is in place
│ ├── bifrost.go # Main Bifrost implementation
├── docs/ # Documentation for Bifrost's configuration and contribution guides
│ └── ...
├── transports/ # Interface layers (HTTP, gRPC, etc.)
│ ├── bifrost-http/ # HTTP transport implementation
│ └── ...
└── plugins/ # Plugin Implementations
    ├── maxim/
└── ...
```

@@ -240,7 +243,7 @@ If you want to **set up the Bifrost API quickly**, [check the transports documen
Bifrost is divided into three Go packages: core, plugins, and transports.

1. **core**: This package contains the core implementation of Bifrost as a Go package.
2. **plugins**: This package serves as an extension to core. You can download individual packages using `go get github.com/maximhq/bifrost/plugins/{plugin-name}` and pass the plugins while initializing Bifrost.

```golang
plugin, err := plugins.NewMaximLoggerPlugin(os.Getenv("MAXIM_API_KEY"), os.Getenv("MAXIM_LOGGER_ID"))
@@ -259,116 +262,11 @@ client, err := bifrost.Init(schemas.BifrostConfig{

### Additional Configurations

1. InitialPoolSize and DropExcessRequests: You can customise the initial pool size of the structs and channels Bifrost creates on `bifrost.Init()`. A higher value means fewer runtime allocations and lower latency, at the cost of more memory usage. The defined default value is used if not provided.

```golang
client, err := bifrost.Init(schemas.BifrostConfig{
Account: &yourAccount,
InitialPoolSize: 500,
DropExcessRequests: true,
})
```

When `DropExcessRequests` is set to true, requests are dropped immediately when the queue is full instead of waiting for space to free up. By default it is set to false.

2. Logger: Like the account interface, Bifrost also allows you to pass a custom logger, as long as it follows [bifrost's logger interface](https://github.com/maximhq/bifrost/blob/main/core/schemas/logger.go). The [default logger](https://github.com/maximhq/bifrost/blob/main/core/logger.go) is used if not provided.

```golang
client, err := bifrost.Init(schemas.BifrostConfig{
Account: &yourAccount,
Logger: &yourLogger,
})
```

The default logger runs at the info level. If you wish to use it with a different log level, you can do so like this:

```golang
client, err := bifrost.Init(schemas.BifrostConfig{
Account: &yourAccount,
Logger: bifrost.NewDefaultLogger(schemas.LogLevelDebug),
})
```

3. Plugins: You can create and pass your custom pre-hook and post-hook plugins to bifrost as long as they follow [bifrost's plugin interface](https://github.com/maximhq/bifrost/blob/main/core/schemas/plugin.go).

```golang
client, err := bifrost.Init(schemas.BifrostConfig{
Account: &yourAccount,
Plugins: []schemas.Plugin{yourPlugin1, yourPlugin2, ...},
})
```

4. Customise your provider settings: You can customise proxy config, timeouts, retry settings, and concurrency buffer sizes for each of your providers in your account interface's GetConfigForProvider() method.

Example:

```golang
schemas.ProviderConfig{
NetworkConfig: schemas.NetworkConfig{
DefaultRequestTimeoutInSeconds: 30,
MaxRetries: 2,
RetryBackoffInitial: 100 * time.Millisecond,
RetryBackoffMax: 2 * time.Second,
},
MetaConfig: &meta.BedrockMetaConfig{
SecretAccessKey: os.Getenv("BEDROCK_ACCESS_KEY"),
Region: StrPtr("us-east-1"),
},
ConcurrencyAndBufferSize: schemas.ConcurrencyAndBufferSize{
Concurrency: 3,
BufferSize: 10,
},
ProxyConfig: &schemas.ProxyConfig{
Type: schemas.HttpProxy,
URL: yourProxyURL,
},
}
```

You can manage the buffer size (maximum number of requests held in the system) and concurrency (maximum number of requests made concurrently) for each provider. Providing these custom provider settings lets you manage user usage and provider limits. Default values are used for the network config, concurrency, and buffer sizes if not provided.

Bifrost also supports multiple API keys per provider, enabling both load balancing and redundancy. You can assign weights to each key to control how frequently they are selected for requests. By default, all keys are treated with equal weight unless specified otherwise.

```golang
[]schemas.Key{
{
Value: os.Getenv("OPEN_AI_API_KEY1"),
Models: []string{"gpt-4o-mini", "gpt-4-turbo"},
Weight: 0.6,
},
{
Value: os.Getenv("OPEN_AI_API_KEY2"),
Models: []string{"gpt-4-turbo"},
Weight: 0.3,
},
{
Value: os.Getenv("OPEN_AI_API_KEY3"),
Models: []string{"gpt-4o-mini"},
Weight: 0.1,
},
}
```

You can check [this](https://github.com/maximhq/bifrost/blob/main/core/tests/account.go) file for a reference of all the customisation settings.

5. Fallbacks: You can define fallback providers for each request, which will be used if all retry attempts with your primary provider fail. These fallback providers are attempted in the order you specify, provided they are configured in your account at runtime. Once a fallback is triggered, its own retry settings will apply, rather than those of the original provider.

```golang
result, err := bifrost.ChatCompletionRequest(
schemas.OpenAI, &schemas.BifrostRequest{
Model: "gpt-4o-mini",
Input: schemas.RequestInput{
ChatCompletionInput: &messages,
},
Fallbacks: []schemas.Fallback{
{
Provider: schemas.Anthropic,
Model: "claude-3-5-sonnet-20240620", // make sure you have configured this
},
},
}, context.Background()
)
```
- [Memory Management](https://github.com/maximhq/bifrost/blob/main/docs/memory-management.md)
- [Logger](https://github.com/maximhq/bifrost/blob/main/docs/logger.md)
- [Plugins](https://github.com/maximhq/bifrost/blob/main/docs/plugins.md)
- [Provider Configurations](https://github.com/maximhq/bifrost/blob/main/docs/providers.md)
- [Fallbacks](https://github.com/maximhq/bifrost/blob/main/docs/fallbacks.md)

---

@@ -457,7 +355,9 @@ This flexibility allows you to optimize Bifrost for your specific use case, whet

## 🤝 Contributing

We welcome contributions of all kinds: bug fixes, features, documentation improvements, and new ideas. Feel free to open an issue, and once it's assigned, submit a Pull Request.

Here's how to get started (after picking up an issue):

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
docs/fallbacks.md (205 changes: 205 additions & 0 deletions)

@@ -0,0 +1,205 @@
# Bifrost Fallback System

Bifrost provides a robust fallback mechanism that allows you to define alternative providers and models to use when the primary provider fails. This ensures high availability and reliability for your AI-powered applications.

## 1. How Fallbacks Work

1. When a request is made to a primary provider, Bifrost first attempts to complete the request using that provider
2. If the primary provider fails after all retry attempts, Bifrost automatically tries the fallback providers in the order specified
3. Each fallback provider uses its own retry settings and configuration set in your account implementation
4. The first successful fallback response is returned to the client

## 2. Configuring Fallbacks

### Basic Fallback Configuration

```golang
result, err := bifrost.ChatCompletionRequest(
context.Background(), &schemas.BifrostRequest{
Provider: schemas.OpenAI,
Model: "gpt-4",
Input: schemas.RequestInput{
ChatCompletionInput: &messages,
},
Fallbacks: []schemas.Fallback{
{
Provider: schemas.Anthropic,
Model: "claude-3-sonnet",
},
},
},
)
```

### Multiple Fallbacks

```golang
result, err := bifrost.ChatCompletionRequest(
context.Background(), &schemas.BifrostRequest{
Provider: schemas.OpenAI,
Model: "gpt-4",
Input: schemas.RequestInput{
ChatCompletionInput: &messages,
},
Fallbacks: []schemas.Fallback{
{
Provider: schemas.Anthropic,
Model: "claude-3-sonnet",
},
{
Provider: schemas.Bedrock,
Model: "anthropic.claude-3-sonnet",
},
{
Provider: schemas.Azure,
Model: "gpt-4",
},
},
},
)
```

## 3. Important Considerations

### Provider Configuration

- Each fallback provider must be properly configured in your account
- If a fallback provider is not configured, it will be skipped
- Each provider's configuration (retries, timeouts, etc.) is independent

### Model Compatibility

- Ensure that the fallback models support the same capabilities as your primary model
- Consider model-specific parameters and limitations
- Verify that the fallback models are available in your account

### Performance Impact

- Fallbacks add latency when the primary provider fails
- Consider the order of fallbacks based on:
- Provider reliability
- Model performance
- Cost considerations
- Geographic location

## 4. Best Practices

1. **Provider Selection**

- Choose fallback providers with different infrastructure
- Consider geographic distribution for high availability
- Balance cost and performance in fallback order

2. **Model Selection**

- Use models with similar capabilities
- Consider model-specific features (e.g., function calling, streaming)
- Account for different token limits and pricing

3. **Error Handling**

- Monitor fallback usage to identify provider issues
- Set up alerts for frequent fallback activations (can be done using bifrost's plugin interface)
- Regularly review and update fallback configurations

4. **Testing**
- Test fallback scenarios in development
- Verify all fallback providers are properly configured
- Simulate provider failures to ensure smooth fallback

## 5. HTTP Transport Examples

### Basic HTTP Fallback Request

```json
POST /v1/chat/completions
{
"provider": "openai",
"model": "gpt-4",
"input": {
"chat_completion_input": [
{
"role": "user",
"content": "Hello, how are you?"
}
]
},
"fallbacks": [
{
"provider": "anthropic",
"model": "claude-3-sonnet"
}
]
}
```

### HTTP Request with Multiple Fallbacks

```json
POST /v1/chat/completions
{
"provider": "openai",
"model": "gpt-4",
"input": {
"chat_completion_input": [
{
"role": "user",
"content": "Explain quantum computing"
}
]
},
"fallbacks": [
{
"provider": "anthropic",
"model": "claude-3-sonnet"
},
{
"provider": "bedrock",
"model": "anthropic.claude-3-sonnet"
},
{
"provider": "azure",
"model": "gpt-4"
}
],
"params": {
"temperature": 0.7,
"max_tokens": 1000
}
}
```

### HTTP Response Example

```json
{
"id": "chatcmpl-123",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Quantum computing is a type of computing..."
},
"finish_reason": "stop"
}
],
"model": "claude-3-sonnet",
"usage": {
"prompt_tokens": 10,
"completion_tokens": 100,
"total_tokens": 110
},
"extra_fields": {
"provider": "anthropic",
"latency": 1.234,
"billed_usage": {
"prompt_tokens": 10.0,
"completion_tokens": 100.0
}
}
}
```

Note: The response includes metadata about which provider was used (in this case, the fallback provider "anthropic") and its performance metrics.