
Jerry Li

Stata Center, 32 Vassar Street
Cambridge, MA 02139
Office G32-578

jerryzli AT mit DOT edu

My CV (last updated 2/19/2018)


I am a Ph.D. student studying theoretical computer science at MIT. My advisor is Ankur Moitra. I did my master's at MIT under the wonderful supervision of Nir Shavit. I am partially supported by an NSF Graduate Research Fellowship. My primary research interests are in learning theory and distributed algorithms, but I am broadly interested in many other things in TCS. I particularly like applications of analysis and analytic techniques to TCS problems.

As an undergrad at the University of Washington, I worked on the complexity of branching programs and on proving hardness results for techniques used in naturally arising learning problems in database theory and AI.

In my free time I enjoy being remarkably mediocre at ultimate frisbee, chess, and piano, amongst other things.


Authors are ordered alphabetically unless stated otherwise.


  • SEVER: A Robust Meta-Algorithm for Stochastic Optimization
    Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

  • Fast and Sample-Efficient Algorithms for Learning Multidimensional Histograms
    Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

  • Asynchronous Balanced Allocations with Applications to Approximate Counting
    Dan Alistarh, Justin Kopinsky, Jerry Li, Giorgi Nadiradze

  • Towards Understanding the Dynamics of Generative Adversarial Networks
    Jerry Li, Aleksander Mądry, John Peebles, Ludwig Schmidt
    manuscript, preliminary version in PADL 2017

Conference and Workshop Papers

Journal Papers

  • Exact Model Counting of Query Expressions: Limitations of Propositional Methods
    Paul Beame, Jerry Li, Sudeepa Roy, Dan Suciu
    to appear, ACM Transactions on Database Systems


  • Efficient training of neural networks
    Dan Alistarh, Jerry Li, Ryota Tomioka, Milan Vojnovic
    in submission


Other Writing


  • QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

    • NIPS 2017, December 2017

  • Being Robust (in High Dimensions) can be Practical

    • ICML 2017, August 2017

  • Robust Proper Learning for Mixtures of Gaussians via Systems of Polynomial Inequalities

    • COLT 2017, July 2017

  • Efficient Robust Sparse Estimation in High Dimensions

    • COLT 2017, July 2017. Joint with Simon Du

  • Robust Estimators In High Dimensions without the Computational Intractability [slides]

    • TCS+, December 2016 [video]

    • FOCS, October 2016 [video]

    • ETH Theory Seminar, August 2016

    • UW Theory Lunch, July 2016

    • MIT Algorithms and Complexity Seminar, June 2016

  • Quantized Stochastic Gradient Descent

    • MIT ML Tea, October 2016

  • Fast Algorithms for Segmented Regression [slides]

  • Fast and Near-Optimal Algorithms for Approximating Distributions by Histograms [slides]

    • PODS 2015

  • Model Counting of Query Expressions: Limitations of Propositional Methods [slides]

    • ICDT 2015

    • MIT Theory Lunch, 2014


  1. TA for 6.852, Distributed Algorithms, Fall 2014.

  1. TA for the UW Math REU under Dr. James Morrow, Summer 2013.

  2. TA for MATH 334/5/6, Advanced Accelerated Second Year Honors Calculus, 2012-2013.

  3. TA for CS 373, Algorithms and Data Structures, Spring 2012.

  4. TA for CS 344, Databases, Winter 2012.


* This might be false