Sharut Gupta

I am a second-year Ph.D. student at MIT CSAIL, advised by Prof. Stefanie Jegelka. I received my Bachelor's degree from the Indian Institute of Technology, Delhi, majoring in Mathematics and Computer Science. Previously, I worked at Meta AI, Google Research, Microsoft Research and Mila.

My research interests broadly lie in self-supervised learning, robustness and out-of-distribution generalization. I investigate mechanisms for learning representations that capture richer structural relationships in unlabeled data, are interpretable, and adapt efficiently to unseen data distributions.

Thank you for taking time out of your day to find out what I do with mine!

Recent News


Google Scholar

Context is Environment
Sharut Gupta, Stefanie Jegelka, David Lopez-Paz, Kartik Ahuja
NeurIPS: Distribution Shifts - New Frontiers with Foundation Models, 2023
NeurIPS: Robustness of zero/few-shot learning in foundation models (R0-FoMo), 2023
Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning
Sharut Gupta, Joshua Robinson, Derek Lim, Soledad Villar, Stefanie Jegelka
ICML: Topology, Algebra, and Geometry in Machine Learning (TAG-ML), 2023
NeurIPS: Self-Supervised Learning - Theory and Practice, 2023
Workshop | Code

Work Experience

Research Intern, Meta AI

David Lopez-Paz and Kartik Ahuja
Paris, France

A key challenge in AI research is to build systems that generalize across varying data distributions. Unfortunately, the bitter lesson so far is that no proposal convincingly outperforms a simple empirical risk minimization baseline. We posit that in-context learning holds the key to better domain generalization, reformulating the problem as next-token prediction in transformers. Via extensive theory and experiments, we show that paying attention to context (unlabeled examples as they arrive) allows our algorithm to amortize performance over many out-of-distribution tasks.

June 2023 - August 2023

Research Intern, Mila - Quebec AI Institute

Prof. Yoshua Bengio

In federated learning (FL), participating clients typically each hold data from a different distribution, so predictive models with strong in-distribution generalization can fail catastrophically on unseen domains. We propose FL Games, a game-theoretic framework for FL that learns causal features that are invariant across clients. FL Games scales well in the number of clients, requires significantly fewer communication rounds, and is agnostic to device heterogeneity. It also achieves high out-of-distribution performance on various benchmarks.

Sept 2021 - May 2022

Software Engineering Intern, Google Research

Dr. Sriram Lakshminarasimhan and Narayan Hegde
Bangalore, India

Wearables and trackers can measure several combinations of physiological markers, such as heart rate, SpO2, stress levels, sleep duration and physical activity, with varying levels of accuracy and often at different sampling frequencies. The aim of the study is to learn to coach users towards healthier lifestyle choices through their smartphones. We proposed algorithms and architectures that leverage the inter-correlations among different physiological markers for time series imputation, generation, and outlier and signature detection, and that are robust to aperiodic and erroneous data. We further improved the learnability of both deep learning and statistical algorithms for modeling highly stochastic time series data.

May 2021 - July 2021

Research Intern, Microsoft Research and Development

Dr. Mithun Das Gupta
Hyderabad, India

The capsule network is a recently introduced neural network architecture whose parameter count scales quadratically with the number of capsules per layer. We proposed a novel routing approach to improve the scalability of capsule networks. We attained state-of-the-art results with capsule networks on the CIFAR-10 and SVHN datasets with far fewer parameters. Our method also achieved state-of-the-art performance on benchmark tasks for testing robustness to affine transformations and recognizing highly overlapping digits. Our work received the Best Paper Award at MLADS-SYNAPSE 2020 and was also selected for an oral presentation.

May 2020 - July 2020

Research Intern, Quantitative Translational Imaging in Medicine (QTIM) Lab

Prof. Jayashree Kalpathy Cramer
Boston, USA

Model brittleness is a primary concern when deploying deep learning models in medical settings. A model that performs extremely well at one institution may degrade sharply when tested at another. This arises from inter-institution variation, such as patient demographics, as well as intra-institution variation, such as multiple scanner types. While combining the two datasets and re-training the model may seem like a simple solution, it is both time-consuming and fraught with data privacy limitations (for multi-institutional datasets). An alternative approach is to fine-tune the model on subsequent institutions after training it on the original institution. However, this corrodes model performance on the original dataset, a phenomenon called catastrophic forgetting. In this project, we propose a novel, simple yet effective approach that performs well on both the original and the target domain, allowing successful domain expansion while mitigating catastrophic forgetting.

Jan 2020 - May 2021

Research Intern, INRIA

Prof. Paul Muhlethaler and Prof. Samia Bouzefrane
Paris, France

It is a regrettable fact that the number of road traffic accidents continues to rise, largely due to rapid urban growth and the ever-increasing density of vehicles in cities and surrounding areas. However, with emerging sensor devices and the IoT, it has become feasible to equip future vehicles with safety sensors to prevent many of these accidents. Urban traffic forecasting models generally use either a Gaussian Mixture Model (GMM) or a Support Vector Classifier (SVC) to estimate the features of potential road accidents. Although SVC can perform well with less data than GMM, it incurs a higher computational cost. In this project, we propose a novel framework that combines the descriptive strength of the Gaussian Mixture Model with the high-performance classification capabilities of the Support Vector Classifier. Experimental results show that the approach compares very favorably with baseline statistical methods.

May 2019 - July 2019

Creative Coding

Disclaimer: I’m a novice to generative art, and I’m still finding my feet. But I think I’m learning and having fun. In case you find something cool and interesting, or are looking forward to collaborating, please feel free to get in touch with me! It would really mean a lot :)
A complete set of my artwork can be found at the Art Gallery

Here are a few of my attempts at generative art using p5js.


3D Rendered Ping-Pong Game

IIT Delhi - Computer Vision
Project Partner: Harkirat Singh Dhanoa
  • Estimated the camera calibration matrix using multiple views of a chessboard
  • Rendered 3D augmented reality objects (a ball and animals) over the chessboard
  • Used webcam video input, with two visual markers acting as paddles that reflect the ball off their plane according to the law of reflection
October 2019 - November 2019
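
The bounce step above can be illustrated with the standard mirror-reflection formula r = d - 2(d·n)n. This is only a minimal sketch under my own assumptions, not the project's code: the function name `reflect` and the plain-list vector representation are hypothetical, and the paddle is treated as an ideal plane with a known normal.

```python
def reflect(velocity, normal):
    """Reflect a velocity vector off a plane with the given normal.

    Implements r = d - 2 (d . n) n, where n is the unit normal.
    Vectors are plain lists of floats (an illustrative choice).
    """
    length = sum(c * c for c in normal) ** 0.5
    if length == 0:
        raise ValueError("normal must be non-zero")
    n = [c / length for c in normal]                    # normalise the normal
    d_dot_n = sum(v * c for v, c in zip(velocity, n))   # component along n
    return [v - 2.0 * d_dot_n * c for v, c in zip(velocity, n)]


# A ball moving right and down hits a horizontal paddle whose normal points up:
# the vertical component flips and the horizontal one is unchanged.
bounced = reflect([1.0, -1.0, 0.0], [0.0, 1.0, 0.0])
print(bounced)
```

The formula preserves speed, so any energy loss or spin would have to be modelled separately.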

CovidNet: Segmenting COVID-19 abnormalities

QTIM Lab - Deep Learning
  • Developed a CT segmentation algorithm that estimates the extent of abnormality in chest CTs from COVID-19 patients
  • Achieved a Dice score of 0.71 on the test set, with an intra-class correlation of 0.99 and a Spearman coefficient of 0.98
March 2020 - March 2020
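
For readers unfamiliar with the metric, the Dice score between a predicted mask A and a reference mask B is 2|A∩B| / (|A| + |B|). Below is a minimal sketch on flattened binary masks. The function name `dice_score`, the toy masks, and the empty-mask convention are assumptions for illustration, not the evaluation code used in the project.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: the masks agree on 2 foreground voxels out of 3 each.
pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(round(dice_score(pred, target), 3))  # 0.667
```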

My Exam Scribe

International Women’s Hackathon - Software Development
Project Partner: Sakshi Taparia


  • Built a mobile application on top of the Google Assistant using Dialogflow, Webhook and Firebase Cloud Database
  • Enabled visually impaired users to write exams without human scribes by reading out questions and storing answers
March 2019 - May 2019


Sinha Presidential Fellowship | awarded to the ~100 most outstanding students worldwide to pursue graduate studies at MIT
Prof. D.S. Varma Award | for obtaining the highest GPA amongst all graduating UG female students, IIT Delhi
Suman Upma Gupta Memorial Award | for obtaining the highest GPA amongst all UG/PG graduating female students, IIT Delhi
Suyash Chandra Memorial Award | for the best UG project among all graduating students in the department, IIT Delhi
Research and Innovative Project Award | first prize for the best ongoing research project by IIT Delhi Alumni Association
Quadeye Excellence Scholarship | among the top 50 students nationwide selected for the scholarship
Kishore Vaigyanik Protsahan Yojana (KVPY) | awarded the KVPY fellowship by IISc for exceptional aptitude in Science
Indian National Mathematics Olympiad (INMO) | ranked 34th amongst the top 800 students who qualified for INMO
Class XII National Standing | for placing in the nationwide top 1% in the CBSE XII Mathematics and Chemistry examinations

Research Outreach and Leadership

Research Outreach

WiDS Cambridge Datathon | co-organizer of the WiDS Cambridge Datathon as a part of the global WiDS Conference Datathon
ML Tea, MIT | co-organizer of ML Tea, a weekly seminar series from members of the machine learning community around MIT
The Gradient | editor for the biweekly newsletter covering recent AI news and research on The Gradient Substack
NeurIPS | received the volunteer award; helped at the registration desk and managed poster presentations and live sessions


Taught a class on the Mysteries of Hilbert's Infinite Hotel ("Room" for thought!) to a class of 100 high school students. The slides for the class are available here

Systems of differential equations, Existence and uniqueness theorems for initial value problems of semilinear and nonlinear ODEs, continuous dependence and well-posedness; Comparison theorems of Sturm, Sturm-Liouville eigenvalue problems; Phase-plane analysis, Linear and Non-linear stability, Lyapunov functions and applications; First order Partial differential equations, Method of characteristics, local and global solutions, envelope of solutions, complete and general solutions; Second order equations: Heat and Wave equation, fundamental solutions, method of eigenfunctions, Duhamel’s principle. Maximum principles for Heat and Laplace equations, Green's functions.

Models of computation: RAM and Turing Machines; Algorithm Analysis techniques; Basic techniques for designing algorithms: dynamic programming, divide-and-conquer and greedy; DFS, BFS and their applications; Some Basic Graph Algorithms; linear time sorting algorithms; NP-Completeness and Approximation Algorithms.

Axioms of probability, Probability space, Conditional probability, Independence, Bayes’ rule, Random variable, Some common discrete and continuous distributions, Distribution of functions of a random variable, Moments, Generating functions, Two and higher dimensional distributions, Functions of random variables, Order statistics, Conditional distributions, Covariance, Correlation coefficient, Conditional expectation, Modes of convergence, Laws of large numbers, Central limit theorem, Definition of Stochastic process, Classification and properties of stochastic processes, Simple Markovian stochastic processes, Gaussian processes, Stationary processes, Discrete and continuous time Markov chains, Classification of states, Limiting distribution, Birth and death process, Poisson process, Steady state and transient distributions, Simple Markovian queuing models (M/M/1, M/M/1/N, M/M/c/N, M/M/N/N, M/M/∞).


Social Engagements & Leadership

Chaired the session on Responsible AI as a part of 'Data Science in India', an ACM SIGKDD India Chapter event. More information about this event is available here.

  • Initiated an auxiliary program to tackle the crucial issues of substance abuse, intellectual plagiarism and language barriers
  • Co-established the Office of Accessible Education (OAE), which provides special assistance for the disabled community
  • Founded a research mentorship program and journal club at IIT Delhi, dedicated to fostering student research
More information about the board is available here.

The IIT Delhi Endowment Fund was launched by the Honourable President of India in 2019, backed by an initial commitment of INR 250 crore by alumni, with a stated goal of raising USD 1 billion over time. Subsequently, IIT Delhi began developing its vision and direction for 2030, with a focus on building and using the Endowment Fund to achieve these goals. More details can be found here.

Initiative for Gender Equity and Sensitisation (IGES), under Indian Institute of Technology, Delhi, aims to create a safe and violence-free educational atmosphere for all, irrespective of diversities in identities of gender, sex, caste, class, ethnicity, language, race, disability and sexual orientation. IGES also advocates a zero tolerance policy against sexual harassment. More information can be found here.