
ComMU

<iframe width="800" height="457" src="https://www.youtube.com/embed/ybKJGGWuX9U" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

ComMU contains 11,144 MIDI samples, each a short note sequence created by professional composers and annotated with 12 kinds of metadata. We propose combinatorial music generation, a new task that generates diverse, high-quality music from metadata alone using an auto-regressive language model. ComMU's 12 metadata are: BPM, genre, key, instrument, track-role, time signature, pitch range, number of measures, chord progression, min velocity, max velocity, and rhythm.

Examples of the dataset

  • #bpm: 100, #key: C major, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 36, #max_velocity: 40, #chord_progression: F → C → Am → G
  • #bpm: 120, #key: A minor, #time_signature: 3/4, #number_of_measures: 16, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid_high, #instrument: string, #min_velocity: 70, #max_velocity: 70, #chord_progression: Am → Em7 → FM7 → Em7 → Dm7 → CM7 → Bm7b5 → E7 → Am → Em7 → FM7 → Em7 → Dm7 → CM7 → Bm7b5 → Am
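The 12 metadata fields above can be represented as a simple record. The sketch below is illustrative only: the field names mirror the examples above, but the `CommuMetadata` class itself is hypothetical and not part of the dataset code.

```python
from dataclasses import dataclass

# Hypothetical container for ComMU's 12 metadata fields (names follow the
# examples above; this class is illustrative, not from the dataset tooling).
@dataclass
class CommuMetadata:
    bpm: int
    key: str
    time_signature: str
    number_of_measures: int
    genre: str
    rhythm: str
    track_role: str
    pitch_range: str
    instrument: str
    min_velocity: int
    max_velocity: int
    chord_progression: list[str]

# The first example from the list above as a record.
sample = CommuMetadata(
    bpm=100, key="C major", time_signature="4/4", number_of_measures=8,
    genre="cinematic", rhythm="standard", track_role="accompaniment",
    pitch_range="mid_low", instrument="piano", min_velocity=36,
    max_velocity=40, chord_progression=["F", "C", "Am", "G"],
)
print(sample.track_role)  # → accompaniment
```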

Pipeline of data collection

Combinatorial music generation

As shown above, combinatorial music generation proceeds in two stages. In Stage 1, a note sequence is generated from a set of metadata. In Stage 2, these note sequences are combined into a complete piece of music. ComMU is used to solve Stage 1.
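The two-stage process can be sketched as follows. This is a minimal stub, not the actual model: `generate_note_sequence` stands in for the auto-regressive language model, and `combine` stands in for the combination step (which the page describes as done by a human composer).

```python
import random

def generate_note_sequence(metadata: dict, seed: int = 0) -> list[int]:
    # Stage 1 (stub): a real auto-regressive model samples notes token by
    # token conditioned on the metadata; here we emit placeholder MIDI pitches.
    rng = random.Random(str(sorted(metadata.items())) + str(seed))
    n_notes = 4 * int(metadata["number_of_measures"])  # e.g. 4 notes per measure
    return [rng.randint(48, 72) for _ in range(n_notes)]

def combine(tracks: dict[str, list[int]]) -> dict[str, list[int]]:
    # Stage 2 (stub): layer the per-track note sequences into one arrangement.
    return dict(tracks)

# One sequence per track-role, all sharing the same common metadata.
common = {"bpm": "130", "key": "A minor", "number_of_measures": "8"}
tracks = {
    role: generate_note_sequence({**common, "track_role": role})
    for role in ("main_melody", "accompaniment", "pad", "riff")
}
piece = combine(tracks)
print(len(piece["main_melody"]))  # → 32 (8 measures x 4 notes)
```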

Stage 1

Audio samples are automatically generated from the described metadata alone.

  • Common metadata of the 5 clips are as follows:
    • #bpm: 130, #key: A minor, #time_signature: 4/4
    • #number_of_measures: 8, #genre: new age, #rhythm: standard
    • #chord_progression: Am → F → C → G → Am → F → C → G

  • #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 40, #max_velocity: 50
  • #track-role: main_melody, #pitch_range: mid, #instrument: piano, #min_velocity: 60, #max_velocity: 70
  • #track-role: pad, #pitch_range: mid_low, #instrument: piano, #min_velocity: 70, #max_velocity: 80
  • #track-role: pad, #pitch_range: mid_low, #instrument: string, #min_velocity: 2, #max_velocity: 127
  • #track-role: riff, #pitch_range: mid_high, #instrument: piano, #min_velocity: 70, #max_velocity: 80

Stage 2

A human composer combined the 5 audio samples above, spending only 3-4 minutes to create the full song below.

The following pieces randomly combine the 5 samples generated in Stage 1 while preserving only each sample's track-role. Although they differ from the piece combined by the composer above, they sound harmonious because the chord progression is consistent across the samples.
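The random combination described above can be sketched as: for each track-role, pick one clip at random from the pool generated for that role. The pool contents below are hypothetical placeholders; in practice each entry would be a generated note sequence sharing the same chord progression, which is why any pick remains harmonious.

```python
import random

# Hypothetical pools of generated clips, keyed by track-role.
pools = {
    "accompaniment": ["acc_v1", "acc_v2", "acc_v3"],
    "main_melody": ["mel_v1", "mel_v2"],
    "pad": ["pad_v1", "pad_v2"],
    "riff": ["riff_v1", "riff_v2"],
}

def random_combination(pools: dict, seed: int = 0) -> dict:
    # Keep exactly one clip per track-role, chosen at random from its pool,
    # so the combination preserves the track-role structure of the piece.
    rng = random.Random(seed)
    return {role: rng.choice(clips) for role, clips in pools.items()}

combo = random_combination(pools, seed=42)
```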

Stage 1

Below is another set of samples with a different genre. Audio samples are automatically generated only with described metadata.

  • Common metadata of the 5 clips are as follows:
    • #bpm: 80, #key: A minor, #time_signature: 4/4
    • #number_of_measures: 8, #genre: cinematic, #rhythm: standard

  • #track-role: main_melody, #pitch_range: mid_high, #instrument: violin, #min_velocity: 1, #max_velocity: 127, #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
  • #track-role: main_melody, #pitch_range: mid_high, #instrument: piano, #min_velocity: 25, #max_velocity: 60, #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
  • #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 25, #max_velocity: 55, #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
  • #track-role: main_melody, #pitch_range: mid_high, #instrument: string, #min_velocity: 1, #max_velocity: 127, #chord_progression: Dm7 → Em7 → Fmaj7 → Cmaj7 → Am7 → Em7 → Fmaj7 → Cmaj7 → F#m7b5 → Gsus4 → E7
  • #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 25, #max_velocity: 55, #chord_progression: Dm7 → Em7 → Fmaj7 → Cmaj7 → Am7 → Em7 → Fmaj7 → Cmaj7 → F#m7b5 → Gsus4 → E7

Stage 2

A human composer combined the 5 audio samples above, spending only 3-4 minutes to create the full song below.

Ground truth vs. Generated

Our system does not reconstruct the ground-truth samples; it generates original samples.

Ground truth Generated

Diversity of Generated Music

We can check the diversity of music generated from the same metadata. The corresponding metadata for each set of examples is listed here.

  • Piano Example: #bpm: 80, #key: A minor, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid, #instrument: piano, #min_velocity: 25, #max_velocity: 60, #chord_progression: Dm7 → Em7 → Asus4 → Am → Em7 → Dmaj7 → Asus4 → Am → Dm7 → Em7 → Asus4 → Em7 → F#m7b5 → Em7 → Asus4 → A
  • Violin Example: #bpm: 80, #key: A minor, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid, #instrument: violin, #min_velocity: 1, #max_velocity: 127, #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
Piano Example Violin Example

Multi-track with Track-role

The figure shows that track-role provides precise guidance for the generated music.

Piano with 4 track-role

All metadata are the same except for track-role
Main Melody
Sub Melody
Accompaniment
Riff

String with 4 track-role

All metadata are the same except for track-role
Main Melody
Sub Melody
Accompaniment
Riff

License

The ComMU dataset is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.