JinayJain/sonata

Sonata

Sonata is an autoregressive transformer for MIDI music generation, implemented in PyTorch.

Here's a 30-minute demo of Sonata's musical capabilities:

https://youtu.be/Zs9ool4Ddmw?t=87

[Image: Sonata Demo]

Technical Details

  • RoPE (Rotary Positional Embeddings)
  • Relative Position Embeddings
  • KV Caching
  • Top-P & Top-K Sampling
  • Mixed precision training (bf16)
  • Multi-GPU training (DDP)
  • Experiment tracking with Weights & Biases
  • Script to run an endless YouTube livestream of generated music
  • Serverless API for generating music on RunPod
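The Top-K and Top-P sampling listed above can be sketched as follows. This is a minimal illustration written in plain Python so that it runs without dependencies; it is not taken from the Sonata codebase, and the function names (`softmax`, `filter_top_k_top_p`, `sample_token`) and default values are assumptions made for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, with optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=0, top_p=1.0):
    """Return {token_id: prob} after top-k, then top-p filtering, renormalized."""
    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # keep only the k most likely tokens
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # smallest prefix whose mass reaches top_p
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_token(logits, top_k=0, top_p=1.0, temperature=1.0, rng=random):
    """Draw one token id from the filtered, renormalized distribution."""
    dist = filter_top_k_top_p(softmax(logits, temperature), top_k, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

Calling `sample_token(logits, top_k=50, top_p=0.95)` once per generated MIDI event would restrict sampling to the most likely tokens; those two values are illustrative, not the settings Sonata uses.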

Training details on MAESTRO-v3

  • Data is augmented with pitch, velocity, and duration offsets
  • Early stopping
  • Label smoothing of 0.1 to prevent overfitting
  • Adam optimizer with learning rate of 3e-4, cosine annealing, and weight decay of 0.1
  • Effective batch size of 32 (micro-batches of 16 x 2 gradient accumulation steps)
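Two of the settings above, label smoothing of 0.1 and cosine annealing from a learning rate of 3e-4, can be made concrete with a short sketch. It is written in plain Python for illustration and is not the repository's training code; the function names, the `min_lr` parameter, and the choice to spread the smoothing mass uniformly over all classes are assumptions.

```python
import math


def label_smoothed_nll(log_probs, target, smoothing=0.1):
    """Cross-entropy against a smoothed target distribution.

    Assumed convention: every class receives smoothing / num_classes,
    and the true class additionally receives 1 - smoothing.
    """
    n = len(log_probs)
    uniform = smoothing / n
    loss = 0.0
    for i, lp in enumerate(log_probs):
        weight = uniform + (1.0 - smoothing if i == target else 0.0)
        loss -= weight * lp
    return loss


def cosine_annealed_lr(step, total_steps, base_lr=3e-4, min_lr=0.0):
    """Learning rate that decays from base_lr to min_lr along a half cosine."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Effective batch size with gradient accumulation: 16 samples x 2 steps.
MICRO_BATCH_SIZE = 16
ACCUMULATION_STEPS = 2
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS
```

With smoothing enabled, the loss stays above zero even for a confident correct prediction, which discourages the model from pushing probabilities to the extremes. The schedule starts at `base_lr` on step 0 and reaches `min_lr` at `total_steps`.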

[Figure: MAESTRO-v3 train/val accuracy]

Realtime YouTube Livestream

This repository also includes a script to run an endless YouTube livestream of generated music. The demo video above is a recording of the livestream running for 30 minutes.

[Image: Sonata YouTube Livestream]

TODO

  • Speculative decoding
  • Quantization
  • Deploy to web with ONNX Runtime
