School of Electronic Engineering and Computer Science

Aditya Bhattacharjee



Project Title: Self-supervision in Audio Fingerprinting 

Affiliation Theme: Music Informatics, Machine Listening 

Abstract: Recent developments in deep learning have shown that supervised models can perform very well on the problems they are trained to solve, provided that massive amounts of carefully labelled data are available. In recent years, however, there has been a shift towards unsupervised and self-supervised representation learning, driven by the high cost and overall impracticality of creating large datasets with labels relevant to a particular task. This is especially pertinent to music information retrieval, where massive human-annotated datasets are difficult to procure or suffer from ambiguity caused by annotator bias. The proposed research project centres on the use of self-supervised representation learning for audio fingerprinting. The project will entail a systematic investigation of self-supervised learning frameworks proposed in NLP and computer vision that might be suitable for learning an embedding space from which a large audio fingerprint database can be built without loss of query-matching performance. Performance will be assessed in terms of robustness to distortions such as gain changes, background noise and impulse responses. The project will also establish non-traditional robustness criteria, such as robustness to pitch-shifting and time-stretching. The goal of these investigations is to develop a novel self-supervised framework suitable for robust audio fingerprinting at real-world scale.

