Yu Bai

Curriculum Vitae | Google Scholar Profile | Email | Github

About Me

I am currently a Researcher at OpenAI.

Previously, I was a Senior Research Scientist at Salesforce AI Research in Palo Alto, CA. My research interests lay broadly in machine learning, including deep learning, large language models/foundation models, reinforcement learning, learning in games, and uncertainty quantification. Before joining Salesforce, I completed my PhD in Statistics at Stanford University (specializing in machine learning) in September 2019, where I was fortunate to be advised by Prof. John Duchi and was a member of the Machine Learning Group. Prior to Stanford, I was an undergraduate in mathematics at Peking University.

My research has focused on large language models; theoretical foundations of deep learning (blog post); reinforcement learning theory (slides on partially observable RL); multi-agent reinforcement learning and games (blog post, slides on MARL, slides on Extensive-Form Games); and uncertainty quantification (slides), among others.

Recent Work

Research Focus and Selected Publications

Foundation Models and Transformers

Our goal is to discover new capabilities and new understandings of transformers and large language models.

Multi-Agent Reinforcement Learning Theory

We developed the first line of provably efficient algorithms for multi-agent reinforcement learning.

Deep Learning Theory

We developed optimization and generalization results for overparametrized neural networks beyond the Neural Tangent Kernel (NTK) regime, and identified provable advantages over the NTK regime.

Partially Observable Reinforcement Learning

We designed sharp sample-efficient algorithms and studied the fundamental limits for partially observable reinforcement learning.

Learning in Games

We designed near-optimal algorithms for learning equilibria in various multi-player games under bandit feedback.

Uncertainty Quantification in Machine Learning

We gave precise theoretical characterizations of the calibration and coverage of vanilla machine learning algorithms, and developed new uncertainty quantification algorithms with valid guarantees and improved efficiency.