Yu Bai

Email: yu.bai (at) salesforce (dot) com

Curriculum Vitae | Google Scholar Profile | GitHub

About Me

I am a Senior Research Scientist at Salesforce AI Research in Palo Alto, CA. My research interests lie broadly in machine learning, including deep learning, reinforcement learning, learning in games, and uncertainty quantification.

Before joining Salesforce, I completed my PhD in Statistics at Stanford University (specializing in machine learning) in September 2019, where I was fortunate to be advised by Prof. John Duchi and was a member of the Machine Learning Group. During my PhD I also spent time at the research labs of Google and Amazon. Prior to Stanford, I was an undergraduate in mathematics at Peking University.

My research interests lie broadly in machine learning, with a recent focus on

  • Theoretical foundations of deep learning (blog post);

  • Reinforcement learning theory (slides on partially observable RL);

  • Multi-agent reinforcement learning and games (blog post, slides on MARL, slides on Extensive-Form Games);

  • Uncertainty quantification (slides).


News

  • [May 2023] Invited talk at SIAM OP23, Seattle.

  • [Apr 2023] Three papers accepted at ICML 2023.

  • [Mar 2023] I will serve as an Area Chair for NeurIPS 2023.

  • [Jan 2023] Three papers accepted at ICLR 2023.

  • [Nov 2022] Excited to be giving an invited talk “Recent Progresses on the Theory of Multi-Agent Reinforcement Learning and Games” at Stanford CS332.

  • [Sep 2022] Four papers accepted at NeurIPS 2022.

  • [May 2022] Excited to be speaking at the RL theory seminar about our work on sample-efficient learning of general-sum Markov Games with a large number of players!

Recent Work

Research Focus and Selected Publications

Multi-Agent Reinforcement Learning Theory

We developed the first line of provably efficient algorithms for multi-agent reinforcement learning.

Deep Learning Theory

We developed optimization and generalization results for overparametrized neural networks beyond the Neural Tangent Kernel (NTK) regime, and identified provable advantages over the NTK regime.

Partially Observable Reinforcement Learning

We designed sharp sample-efficient algorithms and studied the fundamental limits for partially observable reinforcement learning.

Learning in Games

We designed near-optimal algorithms for learning equilibria in various multi-player games under bandit feedback.

Uncertainty Quantification in Machine Learning

We gave precise theoretical characterizations of the calibration and coverage of vanilla machine learning algorithms, and developed new uncertainty quantification algorithms with valid guarantees and improved efficiency.