Happy Leap Day! I’ve just published some high-level details and sound reconstructions from a newly trained model that decomposes musical audio into an easy-to-manipulate format:

Sparse Interpretable Audio Model Architecture

  • Event times and amplitudes
  • Time- and amplitude-agnostic event vectors, each describing a single musical event
  • A global context vector that determines room reverb

Different “axes” of the sound can be manipulated independently, and individual events can be played in isolation.
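
To make the idea of independent “axes” concrete, here is a rough sketch of how such a decomposition could be held in memory and edited. This is not the model’s actual code or API: the names `Event`, `Decomposition`, `scale_amplitudes`, `shift_times`, `isolate` and `swap_context` are all hypothetical, the vector sizes and values are made up, and turning the result back into audio (the decoder) is left out entirely.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """One musical event (hypothetical layout, not the model's real format)."""
    time: float                 # onset, in seconds
    amplitude: float            # loudness scale factor
    vector: Tuple[float, ...]   # time- and amplitude-agnostic description


@dataclass(frozen=True)
class Decomposition:
    """A whole audio segment: sparse events plus one global context vector."""
    events: Tuple[Event, ...]
    context: Tuple[float, ...]  # global vector, assumed here to encode room reverb


def scale_amplitudes(d: Decomposition, gain: float) -> Decomposition:
    """Change loudness only; times, event vectors and context are untouched."""
    return replace(d, events=tuple(
        replace(e, amplitude=e.amplitude * gain) for e in d.events))


def shift_times(d: Decomposition, offset: float) -> Decomposition:
    """Move every event in time only."""
    return replace(d, events=tuple(
        replace(e, time=e.time + offset) for e in d.events))


def isolate(d: Decomposition, index: int) -> Decomposition:
    """Keep a single event so it can be rendered on its own."""
    return replace(d, events=(d.events[index],))


def swap_context(d: Decomposition, context: Tuple[float, ...]) -> Decomposition:
    """Replace the global context (the 'room') and keep the events as they are."""
    return replace(d, context=tuple(context))


# Toy data with made-up values, purely for illustration.
piece = Decomposition(
    events=(
        Event(time=0.0, amplitude=1.0, vector=(0.2, 0.7)),
        Event(time=0.5, amplitude=0.5, vector=(0.9, 0.1)),
    ),
    context=(0.3, 0.3),
)

quieter = scale_amplitudes(piece, 0.5)      # same events, half as loud
later = shift_times(piece, 0.25)            # same events, a quarter second later
solo = isolate(piece, 1)                    # only the second event
other_room = swap_context(piece, (0.8, 0.1))  # same events, different context
```

Because each edit touches exactly one field and returns a new object, changing the loudness, timing or context of a piece leaves everything else as it was, which is the kind of independence described above.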