106 changes: 42 additions & 64 deletions README.md
@@ -15,15 +15,17 @@ Bifrost is an open-source middleware that serves as a unified gateway to various
1. **Create `config.json`**: This file should contain your provider settings and API keys.

```json
[
"openai": {
"keys": [{
"value": "env.OPENAI_API_KEY",
"models": ["gpt-4o-mini"],
"weight": 1.0
}],
},
]
{
"openai": {
"keys": [
{
"value": "env.OPENAI_API_KEY",
"models": ["gpt-4o-mini"],
"weight": 1.0
}
]
}
}
```
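The `weight` field controls how often a key is picked when several keys serve the same model. As a rough illustration only (not Bifrost's actual implementation), weighted selection can be sketched like this:

```go
package main

import (
	"fmt"
	"math/rand"
)

// key mirrors one entry of the "keys" array in config.json.
type key struct {
	Value  string
	Weight float64
}

// pickKey selects a key with probability proportional to its weight.
func pickKey(keys []key) key {
	total := 0.0
	for _, k := range keys {
		total += k.Weight
	}
	r := rand.Float64() * total
	for _, k := range keys {
		r -= k.Weight
		if r <= 0 {
			return k
		}
	}
	return keys[len(keys)-1]
}

func main() {
	keys := []key{{"env.OPENAI_API_KEY_A", 3.0}, {"env.OPENAI_API_KEY_B", 1.0}}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickKey(keys).Value]++
	}
	// With weights 3.0 vs 1.0, the first key is chosen roughly 3x as often.
	fmt.Println(counts["env.OPENAI_API_KEY_A"] > counts["env.OPENAI_API_KEY_B"])
}
```

In practice this means you can shift traffic between keys (for quota or billing reasons) just by adjusting the weights in `config.json`.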

2. **Setup your Environment**: Add your environment variable to the session.
@@ -33,7 +35,7 @@ Bifrost is an open-source middleware that serves as a unified gateway to various
export ANTHROPIC_API_KEY=your_anthropic_api_key
```

Note: Make sure to add all the variables stated in your config.json file.
Note: Make sure to add all the variables stated in your `config.json` file.

3. **Start the Bifrost HTTP Server**:

@@ -42,23 +44,17 @@ Bifrost is an open-source middleware that serves as a unified gateway to various
#### i) Using Go Binary

- Install the transport package:

```bash
go install github.com/maximhq/bifrost/transports/bifrost-http@latest
```
- Run the server:

- If it's in your PATH:
- Run the server (make sure your Go bin directory is in your `PATH`):

```bash
bifrost-http -config config.json -port 8080 -pool-size 300
```

- Otherwise:

```bash
./bifrost-http -config config.json -port 8080 -pool-size 300
```

#### ii) OR Using Docker

- Download the Dockerfile:
@@ -71,7 +67,7 @@ Bifrost is an open-source middleware that serves as a unified gateway to various

```bash
docker build \
--build-arg CONFIG_PATH=./config.example.json \
--build-arg CONFIG_PATH=./config.json \
--build-arg PORT=8080 \
--build-arg POOL_SIZE=300 \
-t bifrost-transports .
@@ -83,7 +79,7 @@ Bifrost is an open-source middleware that serves as a unified gateway to various
docker run -p 8080:8080 -e OPENAI_API_KEY -e ANTHROPIC_API_KEY bifrost-transports
```

Note: Make sure to add all the variables stated in your config.json file.
Note: Make sure to add all the variables stated in your `config.json` file.

4. **Using the API**: Once the server is running, you can send requests to the HTTP endpoints.
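For illustration, a chat-style request body might look like the sketch below. The field names and payload shape here are assumptions for illustration only; see the transports README for the authoritative endpoint paths and schema.

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Hello, Bifrost!" }
  ]
}
```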

@@ -180,8 +176,6 @@ For additional configurations in HTTP server setup, please read [this](https://g
- [Additional Configurations](#additional-configurations)
- [📊 Benchmarks](#-benchmarks)
- [Test Environment](#test-environment)
- [t3.medium Instance](#t3medium-instance)
- [t3.xlarge Instance](#t3xlarge-instance)
- [Performance Metrics](#performance-metrics)
- [Key Performance Highlights](#key-performance-highlights)
- [🤝 Contributing](#-contributing)
@@ -210,16 +204,18 @@ With Bifrost, you can focus on building your AI-powered applications without wor
- **Dynamic Key Management**: Rotate and manage API keys efficiently
- **Connection Pooling**: Optimize network resources for better performance
- **Concurrency Control**: Manage rate limits and parallel requests effectively
- **HTTP Transport**: RESTful API interface for easy integration
- **Custom Configuration**: Flexible JSON-based configuration
- **Flexible Transports**: Multiple transports for easy integration into your infra
- **Plugin-First Architecture**: No callback hell; simple addition/creation of custom plugins
- **Custom Configuration**: Offers granular control over pool sizes, network retry settings, fallback providers, and network proxy configurations
- **Built-in Observability**: Native Prometheus metrics out of the box; no wrappers, no sidecars, just drop it in and scrape

---

## 🏗️ Repository Structure

Bifrost is built with a modular architecture:

```
```text
bifrost/
├── core/ # Core functionality and shared components
│ ├── providers/ # Provider-specific implementations
@@ -239,7 +235,7 @@ bifrost/
└── ...
```

The system uses a provider-agnostic approach with well-defined interfaces to easily extend to new AI providers. All interfaces are defined in `core/schemas/` and can be used as a reference for adding new plugins.
The system uses a provider-agnostic approach with well-defined interfaces to easily extend to new AI providers. All interfaces are defined in `core/schemas/` and can be used as a reference for contributions.
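As a hypothetical sketch of what a provider-agnostic contract could look like (the real interfaces live in `core/schemas/` and will differ in names and signatures):

```go
package main

import "fmt"

// Provider is a hypothetical sketch of a provider-agnostic contract;
// the actual interfaces are defined in core/schemas/.
type Provider interface {
	Name() string
	ChatCompletion(model string, messages []string) (string, error)
}

// echoProvider is a toy implementation used only to show the shape.
type echoProvider struct{}

func (echoProvider) Name() string { return "echo" }

func (echoProvider) ChatCompletion(model string, messages []string) (string, error) {
	// Echo the last message back, tagged with the model name.
	return fmt.Sprintf("[%s] %s", model, messages[len(messages)-1]), nil
}

func main() {
	var p Provider = echoProvider{}
	out, _ := p.ChatCompletion("gpt-4o-mini", []string{"Hello"})
	fmt.Println(out) // prints "[gpt-4o-mini] Hello"
}
```

Because callers only depend on the interface, adding a new provider means implementing the contract, not touching the routing code.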

---

@@ -283,29 +279,19 @@ client, err := bifrost.Init(schemas.BifrostConfig{

## 📊 Benchmarks

Bifrost has been tested under high load conditions to ensure optimal performance. The following results were obtained from benchmark tests running at 5000 requests per second (RPS) on different AWS EC2 instances, with Bifrost running inside Docker containers.
Bifrost has been tested under high load conditions to ensure optimal performance. The following results were obtained from benchmark tests running at 5000 requests per second (RPS) on different AWS EC2 instances.

### Test Environment

#### t3.medium Instance
**1. t3.medium (2 vCPUs, 4 GB RAM)**

- **Instance**: AWS EC2 t3.medium
- **vCPUs**: 2
- **Memory**: 4GB RAM
- **Container**: Docker container with resource limits matching instance specs
- **Bifrost Configurations**:
- Buffer Size: 15,000
- Initial Pool Size: 10,000
- Buffer Size: 15,000
- Initial Pool Size: 10,000

#### t3.xlarge Instance
**2. t3.xlarge (4 vCPUs, 16 GB RAM)**

- **Instance**: AWS EC2 t3.xlarge
- **vCPUs**: 4
- **Memory**: 16GB RAM
- **Container**: Docker container with resource limits matching instance specs
- **Bifrost Configurations**:
- Buffer Size: 20,000
- Initial Pool Size: 15,000
- Buffer Size: 20,000
- Initial Pool Size: 15,000

### Performance Metrics

@@ -326,41 +312,35 @@ Bifrost has been tested under high load conditions to ensure optimal performance
| HTTP Request | 1.56s | 1.50s |
| Error Handling | 189 ns | 162 ns |
| Response Parsing | 11.30 ms | 2.11 ms |
| **Bifrost's Overhead** | **`59 µs*`** | **`11 µs*`** |

_\*Bifrost's overhead is measured at 59 µs on t3.medium and 11 µs on t3.xlarge, excluding the time taken for JSON marshalling and the HTTP call to the LLM, both of which are required in any custom implementation._

**Note**: On the t3.xlarge, we tested with significantly larger response payloads (~10 KB average vs ~1 KB on t3.medium). Even so, response parsing time dropped dramatically thanks to better CPU throughput and Bifrost's optimized memory reuse.

### Key Performance Highlights

- **Perfect Success Rate**: 100% request success rate under high load on both instances
- **Total Overhead**: Less than _15 µs added per request_ on average
- **Efficient Queue Management**: Minimal queue wait time (1.67 µs on t3.xlarge)
- **Fast Key Selection**: Near-instantaneous key selection (10 ns on t3.xlarge)
- **Optimized Memory Usage**:
- t3.medium: ~1.3GB at 5000 RPS
- t3.xlarge: ~3.3GB at 5000 RPS
- **Efficient Request Processing**: Most operations complete in microseconds
- **Network Efficiency**:
- Consistent small request sizes (0.13 KB) across instances
- Larger response sizes on t3.xlarge (10.32 KB vs 1.37 KB) due to more detailed responses
- **Improved Performance on t3.xlarge**:
- 24% faster average latency
- 81% faster response parsing
- 58% faster JSON marshaling
- Significantly reduced queue wait times
- Higher buffer and pool sizes enabled by increased resources

One of Bifrost's key strengths is its flexibility in configuration. You can freely decide the tradeoff between memory usage and processing speed by adjusting Bifrost's configurations:
One of Bifrost's key strengths is its flexibility in configuration. You can freely decide the tradeoff between memory usage and processing speed by adjusting Bifrost's configurations. This flexibility allows you to optimize Bifrost for your specific use case, whether you prioritize speed, memory efficiency, or a balance between the two.

- **Memory vs Speed Tradeoff**:
- Higher buffer and pool sizes (like in t3.xlarge) improve speed but use more memory
- Lower configurations (like in t3.medium) use less memory but may have slightly higher latencies
- You can fine-tune these parameters based on your specific needs and available resources

- Higher buffer and pool sizes (like in t3.xlarge) improve speed but use more memory
- Lower configurations (like in t3.medium) use less memory but may have slightly higher latencies
- You can fine-tune these parameters based on your specific needs and available resources

- **Customizable Parameters**:
- Buffer Size: Controls the maximum number of concurrent requests
- Initial Pool Size: Determines the initial allocation of resources
- Concurrency Settings: Adjustable per provider
- Retry and Timeout Configurations: Customizable based on your requirements
- Buffer and Concurrency Settings: Control the queue size and the maximum number of concurrent requests (adjustable per provider).
- Retry and Timeout Configurations: Customizable based on your requirements for each provider.

This flexibility allows you to optimize Bifrost for your specific use case, whether you prioritize speed, memory efficiency, or a balance between the two.
Curious? Run your own benchmarks. The [Bifrost Benchmarking](https://github.com/maximhq/bifrost-benchmarking) repo has everything you need to test it in your own environment.

---

Expand All @@ -382,6 +362,4 @@ Here's how to get started (after picking up an issue):

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

Built with ❤️ by [Maxim](https://github.com/maximhq)
36 changes: 14 additions & 22 deletions transports/README.md
Original file line number Diff line number Diff line change
@@ -38,17 +38,19 @@ Bifrost uses a combination of a JSON configuration file and environment variable

```json
{
"keys": [
{
"value": "env.OPENAI_API_KEY",
"models": ["gpt-4o-mini", "gpt-4-turbo"],
"weight": 1.0
}
]
"openai": {
"keys": [
{
"value": "env.OPENAI_API_KEY",
"models": ["gpt-4o-mini"],
"weight": 1.0
}
]
}
}
```

In this example, `OPENAI_API_KEY` refers to a key set in your environment. At runtime, its value will be used to replace the placeholder.
In this example config file, `OPENAI_API_KEY` refers to a key set in your environment. At runtime, its value will be used to replace the placeholder.
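The `env.` prefix resolution can be pictured with a small sketch like the following (illustrative only, not Bifrost's actual code):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveKey returns the configured string as-is, or the environment
// variable's value when the string uses the "env." prefix.
func resolveKey(configured string, lookup func(string) string) string {
	if name, ok := strings.CutPrefix(configured, "env."); ok {
		return lookup(name)
	}
	return configured
}

func main() {
	os.Setenv("OPENAI_API_KEY", "sk-demo")
	fmt.Println(resolveKey("env.OPENAI_API_KEY", os.Getenv)) // prints "sk-demo"
	fmt.Println(resolveKey("literal-key", os.Getenv))        // prints "literal-key"
}
```

This is why the raw API key never needs to appear in `config.json`; only the variable name does.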

The same setup applies to keys in meta configs of all providers:

@@ -63,7 +65,7 @@ The same setup applies to keys in meta configs of all providers:

In this example, `BEDROCK_ACCESS_KEY` and `BEDROCK_REGION` refer to keys in the environment.

Please refer to `config.example.json` for examples.
**Please refer to `config.example.json` for examples.**

### Docker Setup

Expand All @@ -77,7 +79,7 @@ curl -L -o Dockerfile https://raw.githubusercontent.com/maximhq/bifrost/main/tra

```bash
docker build \
--build-arg CONFIG_PATH=./config.example.json \
--build-arg CONFIG_PATH=./config.json \
--build-arg PORT=8080 \
--build-arg POOL_SIZE=300 \
-t bifrost-transports .
@@ -116,20 +118,12 @@ If you wish to run Bifrost in your Go environment, follow these steps:
go install github.com/maximhq/bifrost/transports/bifrost-http@latest
```

2. Run your binary:

- If it's in your PATH:
2. Run your binary (make sure your Go bin directory is in your `PATH`):

```bash
bifrost-http -config config.json -port 8080 -pool-size 300
```

- Otherwise:

```bash
./bifrost-http -config config.json -port 8080 -pool-size 300
```

You can also add a flag for `-drop-excess-requests=false` in your command to drop excess requests when the buffer is full. Read more about `DROP_EXCESS_REQUESTS` and `POOL_SIZE` [here](https://github.com/maximhq/bifrost/tree/main?tab=README-ov-file#additional-configurations).

## 🧰 Usage
@@ -193,7 +187,7 @@ You can explore the available plugins [here](https://github.com/maximhq/bifrost/

For example: `-plugins maxim`

Note: Please check plugin specific documentations (github.com/maximhq/bifrost/tree/main/plugins/{plugin_name}) for more nuanced control.
Note: Please check the plugin-specific documentation (github.com/maximhq/bifrost/tree/main/plugins/{plugin_name}) for more nuanced control and any additional setup.

### Fallbacks

@@ -219,6 +213,4 @@ Configure fallback options in your requests:

Read more about fallbacks and other additional configurations [here](https://github.com/maximhq/bifrost/tree/main?tab=README-ov-file#additional-configurations).
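As a rough illustration, a request with fallbacks might carry a field along these lines. The field name and shape shown here are assumptions for illustration; see the linked configuration docs for the real schema.

```json
{
  "model": "gpt-4o-mini",
  "messages": [{ "role": "user", "content": "Hello!" }],
  "fallbacks": [
    { "provider": "anthropic", "model": "claude-3-5-sonnet-20240620" }
  ]
}
```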

---

Built with ❤️ by [Maxim](https://github.com/maximhq)