Yu Bai
About Me

I am currently a Researcher at OpenAI. Previously, I was a Senior Research Scientist at Salesforce AI Research in Palo Alto, CA. My research interests lie broadly in machine learning, including deep learning, large language models/foundation models, reinforcement learning, learning in games, and uncertainty quantification. Before joining Salesforce, I completed my PhD in Statistics at Stanford University (specializing in machine learning) in September 2019, where I was fortunate to be advised by Prof. John Duchi and was a member of the Machine Learning Group. Prior to Stanford, I was an undergraduate in mathematics at Peking University.

My research has focused on large language models; theoretical foundations of deep learning (blog post); reinforcement learning theory (slides on partially observable RL); multi-agent reinforcement learning and games (blog post, slides on MARL, slides on Extensive-Form Games); and uncertainty quantification (slides), among others.

Recent Work
Research Focus and Selected Publications

Foundation Models and Transformers
Our goal is to discover new capabilities of transformers and large language models and to develop new understanding of how they work.
Multi-Agent Reinforcement Learning Theory
We developed the first line of provably efficient algorithms for multi-agent reinforcement learning.
Deep Learning Theory
We developed optimization and generalization results for overparametrized neural networks beyond the Neural Tangent Kernel (NTK) regime, and identified provable advantages over the NTK regime.
Partially Observable Reinforcement Learning
We designed sharp sample-efficient algorithms and studied the fundamental limits for partially observable reinforcement learning.
Learning in Games
We designed near-optimal algorithms for learning equilibria in various multi-player games under bandit feedback.
Uncertainty Quantification in Machine Learning
We gave precise theoretical characterizations of the calibration and coverage of vanilla machine learning algorithms, and developed new uncertainty quantification algorithms with valid guarantees and improved efficiency.