Introducing MuseNet, a deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles. MuseNet uses unsupervised learning to discover patterns of harmony, rhythm, and style: it is trained to predict the next token in hundreds of thousands of MIDI files. With MuseNet, musicians and non-musicians alike can create new compositions in simple or advanced mode and explore the variety of musical styles the model can produce.
MuseNet Features
- MuseNet: A deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles.
- User Interaction: Users can interact with MuseNet in two modes - simple and advanced. In simple mode, users can hear random uncurated samples that the model has pre-generated. In advanced mode, users can interact with the model directly to create a new composition.
- Composition Control: Users can steer the samples MuseNet generates by prompting the model with a composer or style. Composer and instrumentation tokens were created for this purpose, giving more control over the kinds of samples produced.
- Long-term Structure: MuseNet uses the recompute and optimized kernels of Sparse Transformer to train a 72-layer network with 24 attention heads, with full attention over a context of 4096 tokens. This long context may be one reason why it can remember long-term structure in a piece. MuseNet can create melodic structures and imitate the styles of composers such as Chopin and Mozart.
- Dataset: Training data for MuseNet was collected from sources including ClassicalArchives, BitMidi, and the MAESTRO dataset. Several ways of encoding the MIDI files into tokens suitable for the task were tried.
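
To make the composer tokens, instrumentation tokens, and 4096-token context mentioned above more concrete, here is a minimal Python sketch of how a conditioning prompt might be assembled. It is an illustration only: the token spellings (`composer:`, `instrument:`, `start`, `wait:`), the `Note` class, and both helper functions are hypothetical and are not MuseNet's actual vocabulary or code. Only the context length of 4096 comes from the feature list.

```python
from dataclasses import dataclass
from typing import List

CONTEXT_LENGTH = 4096  # context size quoted in the feature list above


@dataclass
class Note:
    instrument: str  # e.g. "piano" (hypothetical instrument name)
    pitch: int       # MIDI note number, 0-127
    velocity: int    # MIDI velocity, 0-127
    wait: int        # ticks to wait before the next event (0 = simultaneous)


def build_prompt(composer: str, instruments: List[str], notes: List[Note]) -> List[str]:
    """Return control tokens followed by note tokens (hypothetical format)."""
    # Control tokens come first so a model could condition on them.
    tokens = [f"composer:{composer}"]
    tokens += [f"instrument:{name}" for name in instruments]
    tokens.append("start")
    for note in notes:
        if not (0 <= note.pitch <= 127 and 0 <= note.velocity <= 127):
            raise ValueError(f"pitch and velocity must be in 0-127: {note}")
        # One token per note, combining instrument, velocity, and pitch.
        tokens.append(f"{note.instrument}:v{note.velocity}:p{note.pitch}")
        if note.wait > 0:
            tokens.append(f"wait:{note.wait}")
    return tokens


def fit_to_context(tokens: List[str], n_control: int, limit: int = CONTEXT_LENGTH) -> List[str]:
    """Keep the control tokens plus the most recent tokens that fit in `limit`."""
    if len(tokens) <= limit:
        return list(tokens)
    room = limit - n_control
    if room <= 0:
        raise ValueError("limit must be larger than the number of control tokens")
    return tokens[:n_control] + tokens[-room:]


if __name__ == "__main__":
    melody = [Note("piano", 60, 72, 48), Note("piano", 64, 72, 0)]
    prompt = build_prompt("chopin", ["piano"], melody)
    print(fit_to_context(prompt, n_control=3))
```

Keeping the control tokens at the front when truncating is a design choice of this sketch, made so that the style prompt is never dropped once a piece grows past the context window. The source does not say how MuseNet itself handles this.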