Commit de3e53a
feat: Add Grafana and Perses monitoring dashboards for vLLM (#23498)
1 parent 85e0df1 commit de3e53a

7 files changed: +3515 -0 lines changed
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
# Monitoring Dashboards

This directory contains monitoring dashboard configurations for vLLM, providing
comprehensive observability for your vLLM deployments.

## Dashboard Platforms

We provide dashboards for two popular observability platforms:

- **[Grafana](https://grafana.com)**
- **[Perses](https://perses.dev)**

## Dashboard Format Approach

All dashboards are provided in **native formats** that work across different
deployment methods:

### Grafana (JSON)

- ✅ Works with any Grafana instance (cloud, self-hosted, Docker)
- ✅ Direct import via Grafana UI or API
- ✅ Can be wrapped in Kubernetes operators when needed
- ✅ No vendor lock-in or deployment dependencies

### Perses (YAML)

- ✅ Works with standalone Perses instances
- ✅ Compatible with Perses API and CLI
- ✅ Supports Dashboard-as-Code workflows
- ✅ Can be wrapped in Kubernetes operators when needed

## Dashboard Contents

Both platforms provide equivalent monitoring capabilities:

| Dashboard | Description |
|-----------|-------------|
| **Performance Statistics** | Tracks latency, throughput, and performance metrics |
| **Query Statistics** | Monitors request volume, query performance, and KPIs |

## Quick Start

First, navigate to this example's directory:

```bash
cd examples/online_serving/dashboards
```

### Grafana

Import the JSON directly into the Grafana UI, or use the API:

```bash
curl -X POST http://grafana/api/dashboards/db \
  -H "Content-Type: application/json" \
  -d @grafana/performance_statistics.json
```

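Grafana's dashboard HTTP API documents a request body in which the dashboard model is nested under a `dashboard` key, and most instances also require authentication. The sketch below assumes the JSON file holds the bare dashboard model and wraps it before posting. `GRAFANA_URL` and `GRAFANA_TOKEN` are hypothetical environment variables introduced for illustration, not part of this repository.

```shell
# Wrap a dashboard model in the {"dashboard": ..., "overwrite": ...} envelope.
# Assumes the file contains a single valid JSON object.
build_payload() {
  local dashboard_file="$1"
  printf '{"dashboard": '
  cat "$dashboard_file"
  printf ', "overwrite": true}'
}

# Post the wrapped dashboard to Grafana.
# GRAFANA_URL and GRAFANA_TOKEN are hypothetical variables you must set yourself.
import_dashboard() {
  local dashboard_file="$1"
  build_payload "$dashboard_file" | curl -sS -X POST "${GRAFANA_URL}/api/dashboards/db" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${GRAFANA_TOKEN}" \
    --data-binary @-
}
```

With both variables exported, `import_dashboard grafana/performance_statistics.json` would send the wrapped payload. Depending on how a dashboard was exported, fields such as `id` may still need adjusting before a different Grafana instance accepts it.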
### Perses

Import via the Perses CLI:

```bash
percli apply -f perses/performance_statistics.yaml
```

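To apply every dashboard spec in the `perses` directory rather than a single file, a small loop is one option. This is a sketch under assumptions: it presumes `percli` is already authenticated against your Perses server, and `PERCLI` is a hypothetical override variable (not a `percli` feature) that lets you substitute another command for a dry run.

```shell
# Apply every Perses dashboard spec (*.yaml) found in a directory.
# PERCLI is a hypothetical override so the command can be swapped out or dry-run.
apply_perses_dashboards() {
  local dir="${1:-perses}"
  local percli_cmd="${PERCLI:-percli}"
  local spec
  for spec in "$dir"/*.yaml; do
    # Skip the unexpanded pattern when the directory holds no YAML files.
    [ -e "$spec" ] || continue
    echo "Applying $spec"
    "$percli_cmd" apply -f "$spec" || return 1
  done
}
```

Running `PERCLI=echo apply_perses_dashboards perses` prints the commands that would be executed without contacting a server.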
## Requirements

- **Prometheus** metrics from your vLLM deployment
- **Data source** configured in your monitoring platform
- **vLLM metrics** enabled and accessible

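One way to confirm that vLLM metrics are enabled and reachable is to fetch the server's `/metrics` endpoint and list the metric names carrying the `vllm:` prefix. The helper below is a minimal sketch; the function name is illustrative, and the host and port in the usage note are assumptions about your deployment.

```shell
# Read Prometheus exposition text on stdin and print the unique
# metric names that start with the "vllm:" prefix.
list_vllm_metrics() {
  grep -v '^#' | grep -o '^vllm:[A-Za-z0-9_:]*' | sort -u
}
```

For a server assumed to listen on `localhost:8000`, `curl -s http://localhost:8000/metrics | list_vllm_metrics` should print a non-empty list. An empty result suggests the endpoint is not exposing vLLM metrics.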
## Platform-Specific Documentation

For detailed deployment instructions and platform-specific options, see:

- **[Grafana Documentation](./grafana)** - JSON dashboards, operator usage, manual import
- **[Perses Documentation](./perses)** - YAML specs, CLI usage, operator wrapping

## Contributing

When adding new dashboards, please:

1. Provide native formats (JSON for Grafana, YAML specs for Perses)
2. Update platform-specific README files
3. Ensure dashboards work across deployment methods
4. Test with the latest platform versions
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Grafana Dashboards for vLLM Monitoring

This directory contains Grafana dashboard configurations (as JSON) designed to monitor
vLLM performance and metrics.

## Requirements

- Grafana 8.0+
- Prometheus data source configured in Grafana
- vLLM deployment with Prometheus metrics enabled

## Dashboard Descriptions

- **[performance_statistics.json](./performance_statistics.json)**: Tracks performance metrics including latency and
  throughput for your vLLM service.
- **[query_statistics.json](./query_statistics.json)**: Tracks query performance, request volume, and key
  performance indicators for your vLLM service.

## Deployment Options

### Manual Import (Recommended)

The easiest way to use these dashboards is to manually import the JSON configurations
directly into your Grafana instance:

1. Navigate to your Grafana instance
2. Click the '+' icon in the sidebar
3. Select 'Import'
4. Copy and paste the JSON content from the dashboard files, or upload the JSON files
   directly

### Grafana Operator

If you're using the [Grafana Operator](https://github.com/grafana-operator/grafana-operator)
in Kubernetes, you can wrap these JSON configurations in a `GrafanaDashboard` custom
resource:

```yaml
# Note: Adjust the instanceSelector to match your Grafana instance's labels
# You can check with: kubectl get grafana -o yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: vllm-performance-dashboard
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana  # Adjust to match your Grafana instance labels
  folder: "vLLM Monitoring"
  json: |
    # Replace this comment with the complete JSON content from
    # performance_statistics.json - The JSON should start with { and end with }
```

Then apply to your cluster:

```bash
kubectl apply -f your-dashboard.yaml -n <namespace>
```
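
Pasting a large JSON document into the manifest by hand is error-prone, mostly because every line must be indented to sit inside the `json: |` block. As a rough sketch, the helper below prints the same manifest shown above and appends the dashboard file indented by four spaces. The function name is illustrative, and the label and folder values are the placeholders from the example, so adjust them for your cluster.

```shell
# Print a GrafanaDashboard manifest with the given dashboard JSON embedded.
# Usage: render_dashboard_manifest <resource-name> <dashboard.json>
render_dashboard_manifest() {
  local name="$1"
  local dashboard_file="$2"
  cat <<EOF
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: ${name}
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana  # Adjust to match your Grafana instance labels
  folder: "vLLM Monitoring"
  json: |
EOF
  # Indent every JSON line by four spaces so it belongs to the block scalar.
  sed 's/^/    /' "$dashboard_file"
}
```

The output can be redirected to a file for review, or piped straight to the cluster with `render_dashboard_manifest vllm-performance-dashboard performance_statistics.json | kubectl apply -n <namespace> -f -`.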
