School of Electronic Engineering and Computer Science

Self-Supervised Invariant Learning of Bird Sound Representations

When: Friday, June 21, 2024, 2:00 PM - 3:00 PM
Where: TBC, Mile End

Ilyass Moummad is a PhD student (December 2021 - November 2024) at IMT Atlantique, Brest, France. He works under the supervision of Nicolas Farrugia and is co-supervised by Romain Serizel. Currently, Ilyass is a Visiting Researcher at C4DM, QMUL, working under the supervision of Emmanouil Benetos (April - June 2024). Ilyass’s PhD topic is Deep Learning for Bioacoustics, with an interest in representation learning (both self-supervised and supervised) of animal sounds, as well as few-shot learning (species sound classification and detection from very few annotated examples).

Self-supervised learning (SSL) learns data representations without any manual annotation. The model is trained to solve a pretext task chosen so that solving it yields informative representations, which can then be transferred to downstream tasks. Among the different learning paradigms, the most successful at learning discriminative features for classification tasks are “Invariant Learning” methods. They train the model to be insensitive to pre-defined transformations (e.g. if pitch shift is used as a transformation, the model is shown two versions of the same signal with different pitch shifts and is trained to output the same representation for both versions). The choice of data transformations for learning invariance is crucial: it depends on the data domain and on which transformations are relevant to the downstream tasks.
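As a rough illustration of the invariance objective described above, the following minimal Python sketch builds two transformed views of the same signal and measures how far apart their representations are. It is not the speaker's method: the `augment` function (random gain plus additive noise), the toy linear `encode` function, and the cosine-based `invariance_loss` are all hypothetical stand-ins chosen for brevity, and the 440 Hz test tone is only placeholder data.

```python
import numpy as np


def augment(signal, rng):
    """Hypothetical domain-agnostic transformation: random gain plus additive noise."""
    gain = rng.uniform(0.5, 1.5)
    noise = rng.normal(0.0, 0.01, size=signal.shape)
    return gain * signal + noise


def encode(signal, weights):
    """Toy linear encoder with L2 normalisation (a stand-in for a neural network)."""
    z = weights @ signal
    return z / np.linalg.norm(z)


def invariance_loss(z1, z2):
    """1 - cosine similarity: zero when both views map to the same representation."""
    return 1.0 - float(np.dot(z1, z2))


rng = np.random.default_rng(0)

# Placeholder input: 0.1 s of a 440 Hz tone sampled at 16 kHz.
signal = np.sin(2 * np.pi * 440 * np.arange(1600) / 16000)

# Randomly initialised encoder weights (assumed 8-dimensional embedding).
weights = rng.normal(size=(8, signal.size))

# Two independently transformed views of the same signal.
view_a = augment(signal, rng)
view_b = augment(signal, rng)

# Training would adjust the encoder to drive this value towards zero.
loss = invariance_loss(encode(view_a, weights), encode(view_b, weights))
print(f"invariance loss between the two views: {loss:.6f}")
```

Minimising this quantity alone would allow a trivial solution in which every input maps to the same vector, so practical invariant learning methods pair it with a mechanism that prevents such collapse, for example by contrasting against representations of other signals.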
In bioacoustics, it is not yet clear which data transformations the model should be robust to. In this work, we show that simple, domain-agnostic data augmentations (which do not use any prior knowledge of the nature of bioacoustic sounds) are sufficient to learn robust and informative features. We evaluate the learned representations through transfer learning to downstream tasks that pose different challenges, such as novel classes (downstream datasets can have classes never seen during pretraining), few-shot (very few)

This event will be hybrid. You can attend in person, or join online using this Zoom link:
