School of Electronic Engineering and Computer Science

Ningzhi Wang



Project title: Generative Models For Music Audio Representation And Understanding

Industry partner: Spotify 

C4DM theme affiliation: Music Informatics, Music Cognition, Sound Synthesis 

Abstract: Interest in generative models has grown rapidly over the last decade. Because problems in music information retrieval (MIR) share many features with natural language processing (NLP), pre-training with generative models is now attracting increasing interest in MIR.

In contrast to models that learn only a conditional probability, generative models aim to capture a joint probability distribution, describing the overall distribution of the dataset and allowing realistic data points to be sampled. Generative models trained on unlabelled data can be easily fine-tuned for most downstream tasks and have achieved massive success in the field of transfer learning.

Transfer learning is beneficial for problems with small training sets (often fewer than a thousand labelled examples) when large datasets (on the order of a million examples) are available for related tasks: a model can be pre-trained on the large dataset to learn useful representations of the input, then fine-tuned on the small dataset for the specific discriminative task. Training generative models is therefore a natural candidate for pre-training, as it requires only unlabelled data, which can be collected cheaply from the internet, as demonstrated by GPT-3.
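The pre-train/fine-tune workflow described above can be sketched in a few lines. This is a minimal toy illustration, not the project's actual method: a linear autoencoder fitted via SVD stands in for a generative pre-training objective, the dataset sizes and dimensions are made up, and the labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Pre-training stage: unlabelled data only ---
# Hypothetical stand-in for a large unlabelled corpus: 1,000 feature vectors.
X_unlabelled = rng.normal(size=(1000, 16))

# A linear autoencoder (PCA via SVD) serves as a toy proxy for a generative
# pre-training objective: it learns a compact representation of the data.
X_centred = X_unlabelled - X_unlabelled.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centred, full_matrices=False)
encoder = Vt[:4].T  # maps 16-d inputs to a 4-d learned representation

def encode(x):
    return x @ encoder

# --- Fine-tuning stage: small labelled set for a downstream task ---
X_small = rng.normal(size=(50, 16))
Z = encode(X_small)                      # frozen pre-trained representation
y_small = (Z[:, 0] > 0).astype(int)      # synthetic binary labels for the toy task

# Train a logistic-regression head on top of the frozen encoder.
w = np.zeros(Z.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
    w -= 0.5 * (Z.T @ (p - y_small)) / len(y_small)
    b -= 0.5 * (p - y_small).mean()

pred = (1.0 / (1.0 + np.exp(-(Z @ w + b))) > 0.5).astype(int)
accuracy = (pred == y_small).mean()
```

The point of the sketch is the division of labour: the representation is learned without labels, and only the small classification head is trained on the scarce labelled data.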

