Sharan Sahu

I am a second-year PhD student in Statistics and Machine Learning at Cornell University, advised by Martin Wells and Yuchen Wu.

I am interested in establishing rigorous foundations for statistical and machine learning methods, and in developing new algorithms guided by theoretical insights. I am fortunate to have been supported by a Cornell University Fellowship.

Before Cornell, I was an undergraduate at UC Berkeley studying Computer Science, where I was advised by Iain Carmichael and Ryan Tibshirani.

If you're an undergrad or Master's student at Cornell and are interested in collaborating, please reach out!


Research

Theory and methods in high-dimensional statistics, robust and stochastic optimization, reinforcement learning, language and diffusion models, sampling, and differential privacy.

Classifier Guidance

Provably Reliable Classifier Guidance through Cross-entropy Error Control

Sharan Sahu, Arisina Banerjee, and Yuchen Wu

Under mild smoothness, we show that controlling per-step cross-entropy (conditional KL) also controls guidance-vector error: \( \mathrm{KL} \le \varepsilon^{2} \) implies guidance MSE \( \widetilde{O}(d\varepsilon) \). This gives a sampling-error bound with a reverse log-Sobolev flavor.

arXiv
SGD Momentum

On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization

Sharan Sahu, Cameron J. Hogan, and Martin T. Wells

A sharp finite-time theory of when and why momentum fails: we decompose tracking error into initialization, noise, and drift terms, proving that momentum suppresses noise but amplifies drift, with matching minimax dynamic-regret lower bounds.

arXiv
DRO LLM Alignment

Online Distributionally Robust LLM Alignment via Regression to Relative Reward RL

Sharan Sahu and Martin T. Wells

A family of distributionally robust policy optimization algorithms for LLM alignment that reduce to simple relative-reward regressions, achieving minimax-optimal rates with significantly tighter constants than prior DRO-DPO methods.

arXiv
Differential Privacy

Towards Optimal Differentially Private Regret Bounds in Linear Markov Decision Processes

Sharan Sahu

Privatizing LSVI-UCB++ with a Bernstein-style bonus achieves state-of-the-art regret in linear MDPs under joint differential privacy, with minimal utility loss.

arXiv

Presentations & Talks

Towards Optimal Differentially Private Regret Bounds in Linear MDPs

Cornell University, 2025

View Slides →

The Machine Learning Problems Behind Large Language Models: Self-Supervision, Fine-Tuning, and Reinforcement Learning

University of North Carolina, Chapel Hill, 2025

View Slides →

Beyond RNNs: An Introduction to Transformers and LLMs

Cornell Tech (Break Through Tech), 2025

View Slides →

Unlocking the Power of Databases: The Crucial Role of Theory and Indices in Scalable Vector Databases for Machine Learning

Naval Postgraduate School, 2024

View Slides →

Miscellanea

SOP Video

Writing a Technical SOP for PhD / Research Master's Applications

A video with Sithija Manage breaking down how I approached graduate school applications and wrote my Statement of Purpose. We discuss what to include, how to describe research experiences with technical detail, and how to tailor your SOP for committee-driven vs. PI-driven admissions.

PhD Journey Interview

Interview with Sithija Manage: My PhD Application Journey

An interview about my journey through graduate school applications. We discuss my undergraduate research at Berkeley, Stanford, and USC, the admissions process, writing statements of purpose, building mentorships, and evaluating program fit.

Liftoff Podcast

Interview with Aman Manazir: From USAMO Math to Quant Trading, ML PhD, and AI Startups

A podcast episode on the Liftoff Podcast with Aman Manazir. I share how I got interested in competitive math, balanced internships and research at Berkeley, Stanford, and USC, and the decision between quant research and a PhD at Cornell.