Sonata is an autoregressive transformer for MIDI music generation, implemented in PyTorch.
Here's a 30-minute demo of Sonata's musical capabilities:
https://youtu.be/Zs9ool4Ddmw?t=87
- RoPE (Rotary Positional Embeddings)
- Relative Position Embeddings
- KV Caching
- Top-P & Top-K Sampling
- Mixed precision training (bf16)
- Multi-GPU training (DDP)
- Experiment tracking with Weights & Biases
- Script to run an endless YouTube livestream of generated music
- Serverless API for generating music on RunPod
- Data is augmented with pitch, velocity, and duration offsets
- Early stopping
- Label smoothing of 0.1 to reduce overfitting
- Adam optimizer with learning rate of 3e-4, cosine annealing, and weight decay of 0.1
- Effective batch size of 32 (batches of 16 x 2 gradient accumulation steps)
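
As a rough illustration of the Top-P & Top-K sampling listed above, here is a minimal, dependency-free sketch. It is not Sonata's actual implementation: the function names `filter_top_k_top_p` and `sample_token`, the default values, and the choice to apply top-k before top-p are all assumptions made for this example.

```python
import math
import random


def filter_top_k_top_p(logits, top_k=0, top_p=1.0):
    """Return {token_id: probability} after top-k, then top-p filtering.

    Hypothetical helper for illustration; not taken from the Sonata code.
    """
    # Softmax with max-subtraction for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches
    # top_p. Assumption: the cumulative sum uses the original probabilities,
    # without renormalizing after the top-k step.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving probabilities so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, top_k=0, top_p=1.0, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = filter_top_k_top_p(logits, top_k, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

For example, `sample_token(logits, top_k=50, top_p=0.95)` would restrict sampling to at most 50 candidates and then to the most probable tokens covering 95% of the probability mass; those particular values are only placeholders.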
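
The pitch, velocity, and duration offsets used for data augmentation could look roughly like the sketch below. The `(pitch, velocity, duration)` tuple layout, the function names, and the offset ranges in `random_augment` are hypothetical and chosen only for illustration; the only fixed facts are the MIDI value limits (0 to 127).

```python
import random

# Hypothetical note layout: (pitch, velocity, duration_in_ticks).


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def augment(notes, pitch_offset=0, velocity_offset=0, duration_offset=0):
    """Shift every note by the given offsets, staying within valid MIDI ranges."""
    augmented = []
    for pitch, velocity, duration in notes:
        augmented.append((
            # MIDI pitches span 0..127. Clamping distorts intervals at the
            # extremes; a real pipeline might skip such samples instead.
            clamp(pitch + pitch_offset, 0, 127),
            # A note-on with velocity 0 acts as a note-off, so keep at least 1.
            clamp(velocity + velocity_offset, 1, 127),
            # Durations must stay positive.
            max(1, duration + duration_offset),
        ))
    return augmented


def random_augment(notes, rng=random):
    """Apply one random offset per dimension (ranges are assumptions)."""
    return augment(
        notes,
        pitch_offset=rng.randint(-6, 6),
        velocity_offset=rng.randint(-10, 10),
        duration_offset=rng.randint(-20, 20),
    )
```

Applying a single offset to a whole sequence, rather than a different one per note, keeps the melody and rhythm intact while changing key, loudness, and note length.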
The livestream script streams generated music to YouTube without a fixed end. The demo video above is a 30-minute recording of that livestream.
- Speculative decoding
- Quantization
- Deploy to web with ONNX Runtime