Mengyu Yang

Machine Learning PhD Student at Georgia Tech

About Me

Hi! I’m a second-year Machine Learning PhD student at Georgia Tech working with James Hays. I received my BASc in Engineering Science with Honours from the University of Toronto, where I specialized in Machine Intelligence. I interned at Google AI in Fall 2023, working on audio-visual sound source localization.

Here’s my CV.

Research Interests

My interests lie at the intersection of computer vision and machine learning. My goal is to build models capable of understanding the visual world through multi-modal data.

Topics include:

  • Multi-modal learning (most recently audio-visual learning)
  • Representation learning
  • Video understanding

Publications


(2023). The Un-Kidnappable Robot: Acoustic Localization of Sneaking People. ICRA 2024.

(2021). TriBERT: Human-centric Audio-visual Representation Learning. NeurIPS 2021.

(2020). Mask-Guided Discovery of Semantic Manifolds in Generative Models. NeurIPS 2020 Creativity Workshop.

(2020). Musical Speech: A Transformer-based Composition Tool. NeurIPS 2020 Demonstration Track.