Sharut Gupta

I am a fourth-year Ph.D. student at MIT CSAIL, advised by Prof. Phillip Isola and Prof. Stefanie Jegelka. I have spent time at Google DeepMind (Gemini) and at Meta Superintelligence Labs (FAIR), working on post-training large language models. Previously, I completed my undergraduate studies in Mathematics and Computing at the Indian Institute of Technology, Delhi (IIT Delhi), where I was mentored by Prof. Yoshua Bengio for my thesis.

My long-term research goal is to build intelligent systems that can understand, integrate, and reason over continuous, multimodal, real-world sensory inputs. To this end, I focus on two key intertwined paths:

Representation Learning: How do we pretrain models to learn from diverse, unpaired, heterogeneous data and discover a grounded representation that enables multimodal understanding?
Adaptive Intelligence: How do we design algorithms that enable efficient and robust adaptation under continuous distribution shifts, changing users, and novel tasks?

What's New

Check out the latest news about my research, talks and more.

Research | Jan 2026
Excited to organize the Any-to-Any Multimodal Learning (A2A-MML) Workshop at CVPR 2026!
Research | Jan 2026
Our work on Unpaired Multimodal Learning got accepted at ICLR 2026! (paper)
Research | Jan 2026
Our work on Prefix-Scannable Models got accepted at ICLR 2026! (paper)
Talks & Panels | Jan 2026
Gave an invited talk at the AI, ML and Computer Vision Meetup hosted by Microsoft and Voxel51 (recording).
Talks & Panels | Jan 2026
Serving on the panel for Artificial General Intelligence (AGI) at NCRC 2026 at Harvard University.
Talks & Panels | Oct 2025
Excited to attend and give a talk at the Aspen Meeting on Foundation Models!
Research | Sep 2025
Our work on Representation Guidance for Diffusion Models got accepted at NeurIPS 2025! (paper)
Awards & Press | Sep 2025
Awarded the MathWorks Engineering Fellowship; thank you, MathWorks!
Research | May 2025
Thrilled to be interning at Meta Superintelligence Labs this summer with Mohammad Pezeshki and Mark Ibrahim!
Awards & Press | May 2025
Got featured in MIT's CSAIL Alliances Student Spotlight!
Awards & Press | Apr 2025
Received the finalist award (top 3) for the Citadel GQS PhD Fellowship.
Awards & Press | Apr 2025
Received the Citadel Securities PhD Summit Award.
Talks & Panels | Feb 2025
Excited to join MIT's CSAIL Alliances podcast to discuss my recent work (recording).
Research | Jan 2025
Our work on Learning Disentangled Multimodal Representations got accepted at ICLR 2025! (paper)
Talks & Panels | Dec 2024
Gave a talk at the NeurIPS 2024 Workshop on Self-Supervised Learning – Theory and Practice (recording).
Talks & Panels | Nov 2024
Invited to speak at the MIT CSAIL Embodied Intelligence Seminar (recording).
Awards & Press | Dec 2024
Our NeurIPS 2024 paper, In-Context Symmetries, was featured by MIT News!
Awards & Press | Dec 2024
Received the Top Reviewer award at NeurIPS 2024.
Awards & Press | Dec 2024
Received the Honorable Mention Award at the UniReps Workshop, NeurIPS 2024.
Awards & Press | Apr 2024
Recognized as a finalist for the Jane Street Graduate Research Fellowship.
Awards & Press | Sep 2022
Received the MIT Presidential Fellowship!
Research | Sep 2024
Our work on In-Context Symmetries got accepted at NeurIPS 2024! (paper)
Research | Sep 2024
Our recent paper on the role of equivariance in SSL got accepted at NeurIPS 2024! (paper)
Talks & Panels | Oct 2024
Gave a talk at the MIT Machine Learning Tea (ML Tea) Seminar series.
Talks & Panels | Sep 2024
Invited to present at the TAG-DS Pacific Northwest Seminar on Topology, Algebra, and Geometry in Data Science.
Talks & Panels | May 2024
Spoke at the Quantitative Translational Imaging in Medicine (QTIM) Lab, Harvard University.
Talks & Panels | Mar 2024
Presented at the MIT LIDS and STATS Tea Talk series.
Research | Jun 2024
Thrilled to be interning in the Gemini Team at Google DeepMind this summer with Dilip Krishnan!
Research | Jan 2024
Our recent paper, Context is Environment, got accepted at ICLR 2024! (paper)
Research | Jan 2024
Our paper on rotationally equivariant contrastive learning got accepted at ICLR 2024! (paper)
Research | Jan 2024
Our work on removing biases from molecular representations got accepted at ICLR 2024! (paper)
Research | Jun 2023
Find me in Paris, interning at Meta AI with David Lopez-Paz and Kartik Ahuja.
Talks & Panels | Dec 2022
Gave a talk at the NeurIPS 2022 Workshop on Federated Learning: Recent Advances and New Challenges (recording).
Research | Aug 2022
I have officially started my PhD at MIT!

Publications

Full list at Google Scholar

* denotes equal contribution

Latent Chain of Thought Sharut Gupta, Jarred Barber, Dilip Krishnan (Gemini Team, Google DeepMind)

U.S. Patent

Developed an alternative to traditional chain-of-thought reasoning, improving the reasoning capabilities of large language models by around 18% on real-world reasoning benchmarks.

Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models Sharut Gupta, Shobhita Sundaram, Chenyu Wang, Stefanie Jegelka, Phillip Isola

ICLR 2026 | Blog post | Code

Also at NeurIPS'25 UniReps

ReasonCACHE: Teaching LLMs To Reason Without Weight Updates Sharut Gupta, Phillip Isola, Stefanie Jegelka, David Lopez-Paz, Kartik Ahuja, Mark Ibrahim, Mohammad Pezeshki

Blog post

Canonicalizing Multimodal Contrastive Representation Learning Sharut Gupta*, Sanyam Kansal*, Stefanie Jegelka, Phillip Isola, Vikas Garg

Blog post | Code

In-Context Symmetries: Self-Supervised Learning through Contextual World Models Sharut Gupta*, Chenyu Wang*, Yifei Wang*, Tommi Jaakkola, Stefanie Jegelka

NeurIPS 2024 | Oral at SSL | MIT News | MIT Podcast | Code

Oral Presentation (top 4) at NeurIPS'24 SSL

Context is Environment Sharut Gupta, Stefanie Jegelka, David Lopez-Paz, Kartik Ahuja

ICLR 2024 | Book Chapter | Code | Talk

Also at NeurIPS'23 DistShift, NeurIPS'23 R0-FoMo

Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning Sharut Gupta*, Joshua Robinson*, Derek Lim, Soledad Villar, Stefanie Jegelka

ICLR 2024 | Code | Talk

Also at ICML'23 TAG-ML, NeurIPS'23 SSL

Sequential-Parallel Duality in Prefix-Scannable Models Morris Yau*, Sharut Gupta*, Valerie Engelmayer, Kazuki Irie, Stefanie Jegelka, Jacob Andreas

ICLR 2026

Learning Diffusion Models with Flexible Representation Guidance Chenyu Wang*, Cai Zhou*, Sharut Gupta, Zongyu Lin, Stefanie Jegelka, Stephen Bates, Tommi Jaakkola

NeurIPS 2025 | Oral at FM4LS | Blog post | Code

Oral Presentation at ICML'25 FM4LS

An Information Criterion for Controlled Disentanglement of Multimodal Data Chenyu Wang*, Sharut Gupta*, Xinyi Zhang, Sana Tonekaboni, Stefanie Jegelka, Tommi Jaakkola, Caroline Uhler

ICLR 2025 | Oral at UniReps | Honorable Mention | Code

Oral Presentation (top 4) and the Honorable Mention Award at NeurIPS'24 UniReps

Understanding the Role of Equivariance in Self-supervised Learning Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka

NeurIPS 2024

Also at ICML'24 TF2M

Removing Biases from Molecular Representations via Information Maximization Chenyu Wang, Sharut Gupta, Caroline Uhler, Tommi Jaakkola

ICLR 2024 | Code

Also at NeurIPS'23 New Frontiers of AI for Drug Discovery and Development

Near-Optimal Algorithms for Group Distributionally Robust Optimization and Beyond Tasuku Soma, Khashayar Gatmiry, Sharut Gupta, Stefanie Jegelka
arXiv

Collaborative privacy-preserving approaches for distributed deep learning using multi-institutional data Sharut Gupta, Sourav Kumar, Ken Chang, Charles Lu, Praveer Singh, Jayashree Kalpathy-Cramer

RSNA RadioGraphics 2023

Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation Farshid Varno, Laya Rafiee, Sharut Gupta, Stan Matwin, Mohammad Havaei

ECCV 2022

FL Games: A federated learning framework for distribution shifts Sharut Gupta, Kartik Ahuja, Mohammad Havaei, Niladri Chatterjee, Yoshua Bengio

NeurIPS FL 2022 | Spotlight at FL

Spotlight Presentation at NeurIPS'22 FL Workshop

Addressing catastrophic forgetting for medical domain expansion Sharut Gupta, Praveer Singh, Ken Chang, Liangqiong Qu et al.

NeurIPS ML4H 2022 | Spotlight at ML4H | Code | Talk

Spotlight Presentation at NeurIPS'22 ML4H

Towards Trainable Saliency Maps in Medical Imaging Mehak Aggrawal, Nishanth Arun, Sharut Gupta, Ashwin Vaswani et al.

NeurIPS'20 ML4H (Spotlight)

Assessing the (un)trustworthiness of saliency maps for localizing abnormalities in medical imaging Nishanth Arun, Nathan Gaw, Praveer Singh, Ken Chang, Mehak Aggrawal, Bryan Chen, Katharina Hoebel, Sharut Gupta et al.
Radiology Artificial Intelligence 2021

Federated learning for breast density classification: A real-world implementation Holger R Roth, Ken Chang, Praveer Singh, Nir Neumark, Wenqi Li, Vikash Gupta, Sharut Gupta, Liangqiong Qu et al.
MICCAI'20 Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning

Exploring the forecasting approach for road accidents: Analytical measures with hybrid machine learning Mamoudou Sangare, Sharut Gupta, Samia Bouzefrane, Soumya Banerjee, Paul Muhlethaler
Journal of Expert Systems with Applications 2021

Improvement and multi-population generalizability of a deep learning-based chest radiograph severity score for COVID-19 Matthew D Li, Nishanth T Arun, Mehak Aggrawal, Sharut Gupta, Praveer Singh et al.
medRxiv 2020

Segmentation, Survival Prediction, and Uncertainty Estimation of Gliomas From Multimodal 3D MRI Using Selective Kernel Networks Jay Patel, Ken Chang, Katharina Hoebel, Mishka Gidwani, Nishanth T Arun, Sharut Gupta et al.
MICCAI'20 International Brainlesion Workshop

Research Outreach and Leadership

WiDS Cambridge Datathon | co-organizer of the WiDS Cambridge Datathon, part of the global WiDS Conference Datathon (video coverage is available for 2024 and 2023)

2023-present

TILOS AI Institute | co-organizer of the TILOS Social at NeurIPS'23 and NeurIPS'24

2023-2024

ML Tea, MIT | co-organizer of ML Tea, a weekly seminar series from members of the machine learning community around MIT

2023-2025

The Gradient | editor for the biweekly newsletter covering recent AI news and research on The Gradient's Substack.

2023-2025

NeurIPS | received the volunteer award for helping organize the NeurIPS conference.

2022

Graduate Teaching Assistant, Deep Learning at MIT (more info)

Course information can be found here.

2024

Instructor, Mysteries of Hilbert's Hotel, Splash MIT (more info)

Taught a class on the Mysteries of Hilbert's Infinite Hotel ("Room" for thought!) to 100 high school students. The slides for the class are available here.

2023

Undergraduate Teaching Assistant, Differential Equations at IIT Delhi

2021

Undergraduate Teaching Assistant, Analysis and Design of Algorithms at IIT Delhi

2021

Undergraduate Teaching Assistant, Probability and Stochastic Processes at IIT Delhi

2020

Executive Team, MIT EECS Graduate Application Assistance Program (GAAP) (more info)

A student-run initiative offered by PhD students in the MIT EECS department, pairing applicants with current student volunteers who mentor them 1:1 through the graduate application process. More information here.

2024-2026

Session Chair, ACM SIGKDD (more info)

Chaired the session on Responsible AI as part of 'Data Science in India', an ACM SIGKDD India Chapter event. More information here.

2022

Deputy General Secretary Mentorship, Board for Student Welfare (BSW), IIT Delhi (more info)

Initiated an auxiliary program to tackle issues of substance abuse, intellectual plagiarism and language barriers; co-established the Office of Accessible Education (OAE) for the disabled community; founded a research mentorship and journal club at IIT Delhi. More information here.

2020-2021

Student Representative, IIT Delhi Strategy and Vision Document 2030 Implementation Committee (more info)

IIT Delhi initiated the development of its vision and direction for 2030, backed by the IIT Delhi Endowment Fund launched by the President of India in 2019 with an initial commitment of INR 250 crore. More details here.

2020-2021

Core Team Member, Initiative for Gender Equity and Sensitisation (IGES), IIT Delhi (more info)

IGES aims to create a safe and violence-free educational atmosphere for all, irrespective of diversities in gender, sex, caste, class, ethnicity, language, race, disability and sexual orientation, and advocates a zero tolerance policy against sexual harassment. More information here.

2020-2021

Captain, National Baseball Championship

2016

Generative Coding

Disclaimer: I'm a novice to generative art, and I'm still finding my feet. But I think I'm learning and having fun. In case you find something cool and interesting, or are looking forward to collaborating, please feel free to get in touch with me! It would really mean a lot :)
Here are a few of my attempts at generative art using p5js.

Craters' On The Moon

The artwork represents the terrain of the moon. Each particle on this terrain has a variable lifespan, after which it fades and dies. The motion of each particle is constructed using 2D Perlin noise.
A Scenic Paradise

The artwork represents an evening at Newport Beach, a.k.a. Easton's Beach, in Newport, Rhode Island. The smallest building block used to construct it resembles a pinecone.
Tesseract

The artwork depicts the beauty of lines bent across various angles to create kaleidoscopic masterpieces. A continuous outward spiral can also be created from a bunch of straight lines.
Loss Landscape

A representation of the loss landscape of neural networks. It is constructed from triangle elements populated across a terrain generated using 2D Perlin noise. The color of each element is based on the height of the terrain.
Alone in Crowd: A Silent Pandemic

The artwork depicts the power of tiling. It represents the loneliness associated with urban living, owing to acute dependency on gadgets, materialistic desires, and instant gratification.
Cubic Beauty

The artwork shows squares. The building unit of this artwork is a 2D projection of a cube along its longest diagonal. This unit is translated using a cosine function of the distance from the left end of the window.
Algal Bloom

A depiction of the growth and accumulation of an algae population. Each algal unit is represented by a filled Bézier curve. The motion is generated using 2D Perlin noise and basic shapes.
Dark Rooms

The artwork depicts a house full of infinitely long dark rooms. The building blocks of this work are basic squares which are translated with varying gaps as we move inside a room.
Random Walk

The artwork depicts a random walk in which a particle takes a random step, generated by incrementing both x and y by a random number between -1 and 1. Neighbouring colors are chosen to ensure closeness in RGB space.
