DERI Seminar with Antoine Yang from Google DeepMind

When: Thursday, March 21, 2024, 11:00 AM - 12:00 PM
Where: Zoom

Speaker: Antoine Yang, Research Scientist at Google DeepMind

Zoom link: https://qmul-ac-uk.zoom.us/j/81148100921

Title: Learning to describe multi-event videos from Web supervision 

Abstract: This talk introduces several novel contributions to long video description. Firstly, we present Vid2Seq, a multi-modal single-stage model for dense event captioning. Vid2Seq can be effectively pretrained on unlabeled narrated videos at scale, using transcribed speech as pseudo supervision. Secondly, we release VidChapters-7M, a dataset of 7M videos segmented into chapters by online users. We study the task of video chapter generation, and show that pretraining Vid2Seq for video chapter generation on VidChapters-7M significantly improves performance on dense event captioning. 

Bio: Antoine Yang is a Research Scientist at Google DeepMind, working on the multi-modal capabilities of Gemini. He completed his PhD at Inria Paris in 2023, and received a double MEng from Ecole Polytechnique and ENS Paris-Saclay in 2020. 
