Colloquium: Taking the Best of Physics and Machine Learning in Robot Audition

14 October 2019

In the era of deep learning, building autonomous systems that perform highly specific tasks in the harsh conditions of the physical world may seem incompatible with the ever-growing need for large volumes of cleanly annotated training data. How can these two worlds be reconciled? In this seminar, we will shed light on this question through the study of several concrete applications in the exciting field of robot audition. We will show how both physical modeling and machine learning can be leveraged, and even combined, to tackle some of the key challenges in this field.

Robot audition has received growing research interest over the past two decades, sparked by the need for robots that can naturally interact with humans via speech. This includes identifying who talks to whom and when, and recognizing speech in real-world conditions. While these high-level goals include many conventional audio signal processing tasks, robot audition also comes with unique challenges and opportunities: How can the noise and the possibly fast movements created by the robot itself be handled? How can motor control be leveraged? How can information from different modalities be fused? We will present a panoramic view of our contributions to some of these questions, using tools from physics, signal processing and machine learning. The presentation will be illustrated with recent research results on different robotic platforms, from social robots to rescue drones.

Antoine Deleforge has been a tenured research scientist at Inria since January 2016. He started in the PANAMA research group (Rennes, France) before moving to his current team, MULTISPEECH (Nancy, France), in April 2018. His research lies at the interface between statistical machine learning, acoustics and audio signal processing, with main applications to auditory scene analysis and robot audition. He received B.Sc. (2008) and M.Sc. (2010) engineering degrees in computer science and mathematics from Ensimag (Grenoble, France), as well as a specialized M.Sc. research degree in computer graphics, vision and robotics from the Université Joseph Fourier (Grenoble, France). In November 2013, he received a Ph.D. degree in computer science and applied mathematics from the University of Grenoble (France). His thesis was awarded the GRETSI-EEA-ISIS French PhD prize in signal, image and vision in 2014, and he received best paper awards in 2015 and 2016. He was a postdoctoral fellow (2014-2015) at the Chair of Multimedia Communications and Signal Processing of the Friedrich-Alexander-University (Erlangen, Germany). He has co-chaired and co-organized numerous special sessions, tutorials and competitions at the international conferences ICASSP (2016, 2018) and LVA/ICA (2015, 2017, 2018). He has served as an elected member of the IEEE Technical Committee on Audio and Acoustic Signal Processing since 2018.

Monday, 14 October 2019, 17:15, Lecture Hall B-201, Informatikum, Stellingen

Speaker: Antoine Deleforge, Inria Nancy