STOC'20 Workshop on Algorithms with Predictions

This workshop aims to cover recent developments in the emerging area of “learning-based” algorithms (also known as data-driven algorithms, algorithms with predictions, or learning-augmented algorithms). These methods incorporate machine learning “oracles” to adapt their behavior to the properties of the input distribution and thereby improve performance measures such as running time, space usage, or solution quality. The field has blossomed with applications to classical streaming algorithms, online scheduling, clustering, and many other problems. All of these methods guarantee improved performance when the predictions are good, and maintain nearly identical worst-case guarantees when they are not.
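
As a concrete illustration of this consistency/robustness tradeoff (a minimal sketch for this page, not taken from any of the talks), the snippet below implements ski rental with a predicted number of ski days, following the strategy of Purohit, Svitkina, and Kumar (NeurIPS 2018); the parameter lam controls how much the algorithm trusts the prediction.

```python
import math

# Minimal sketch: ski rental with a prediction. Buying skis costs b; renting
# costs 1 per day. The prediction p estimates the number of ski days, and
# lam in (0, 1] sets how much we trust it: (1 + lam)-competitive when the
# prediction is accurate, and never worse than (1 + 1/lam)-competitive.

def ski_rental_with_prediction(b, p, lam, actual_days):
    """Cost paid by the lam-robust strategy: buy early if the prediction
    says the season is long, hedge by buying late if it says short."""
    if p >= b:
        buy_day = math.ceil(lam * b)   # trust the prediction: buy early
    else:
        buy_day = math.ceil(b / lam)   # distrust a "rent" prediction: buy late
    if actual_days < buy_day:
        return actual_days             # rented every day, never bought
    return (buy_day - 1) + b           # rented until buy_day, then bought

# Good prediction: near-optimal. Bad prediction: the loss is still bounded.
print(ski_rental_with_prediction(b=10, p=100, lam=0.5, actual_days=100))  # 14, OPT = 10
print(ski_rental_with_prediction(b=10, p=1,   lam=0.5, actual_days=100))  # 29, OPT = 10
```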

The workshop will cover recent advances on this topic across several domains, including learning theory, online algorithms, streaming algorithms, and data structures.

When: June 26, 2020, 1pm -- 4pm EST
Where (to watch): The recording is available here
Organizers: Piotr Indyk (MIT), Yaron Singer (Harvard), Ali Vakilian (UW-Madison) and Sergei Vassilvitskii (Google NYC)
Invited Speakers:
Edith Cohen (Google Mountain View and Tel Aviv University)
Ravi Kumar (Google Mountain View)
Michael Mitzenmacher (Harvard University)
Tim Roughgarden (Columbia University)
Schedule
Friday, June 26th (all times EST)
01:00 -- 01:10 Sergei Vassilvitskii Opening Remarks
01:10 -- 01:50 Tim Roughgarden Data-Driven Algorithm Design

The best algorithm for a computational problem generally depends on the “relevant inputs”, a notion that varies by application domain and often defies formal articulation. While there is a large literature on empirical approaches to selecting the best algorithm for a given application domain, there has been surprisingly little theoretical analysis of the problem.

We adapt concepts from statistical and online learning theory to reason about application-specific algorithm selection. Our models are straightforward to understand, yet expressive enough to capture several existing approaches in the theoretical computer science and AI communities, ranging from self-improving algorithms to empirical performance models. We present one framework that models algorithm selection as a statistical learning problem, and show that dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context. We also study the online version of the algorithm selection problem, and give possibility and impossibility results for the existence of no-regret learning algorithms.

Joint work with Rishi Gupta.
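
As a toy illustration of the statistical-learning view, the sketch below performs empirical risk minimization over a parameterized family of greedy knapsack heuristics (a running example in this line of work); the parameter grid, instance sizes, and input distribution are hypothetical.

```python
import random

# Toy sketch of algorithm selection as empirical risk minimization: a family
# of greedy knapsack heuristics ranks items by value / size**rho, and rho is
# learned from sample instances drawn from the application's distribution.

def greedy_knapsack(values, sizes, capacity, rho):
    """Pack items greedily by the score value / size**rho; return total value."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / sizes[i] ** rho, reverse=True)
    total_value, used = 0.0, 0.0
    for i in order:
        if used + sizes[i] <= capacity:
            used += sizes[i]
            total_value += values[i]
    return total_value

def select_rho(sample_instances, grid):
    """Pick the rho with the best average value on the sampled instances."""
    def avg_value(rho):
        return sum(greedy_knapsack(v, s, c, rho)
                   for v, s, c in sample_instances) / len(sample_instances)
    return max(grid, key=avg_value)

random.seed(0)
samples = [([random.random() for _ in range(50)],
            [random.uniform(0.1, 1.0) for _ in range(50)],
            5.0)
           for _ in range(100)]
print(select_rho(samples, grid=[0.0, 0.5, 1.0, 1.5, 2.0]))
```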

01:50 -- 02:30 Edith Cohen Sketching Functions of Frequencies: Beyond the Worst Case

Recently there has been increased interest in using machine learning techniques to improve classical algorithms. We study when it is possible to construct compact, composable sketches for weighted sampling and statistics estimation according to functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and p-th frequency moments with p > 2, are known to require polynomial-size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is access to noisy advice on item frequencies, continuing the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice, extending the work on heavy hitters under Zipfian distributions in the seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show analytically and empirically that, “in practice,” small polylogarithmic-size sketches provide accurate estimates even for these “hard” functions.

Joint work with Ofir Geri (Stanford) and Rasmus Pagh (IT University of Copenhagen, BARC, and Google Research).
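
The first type of assumption can be illustrated with a small sketch in the spirit of Hsu et al.: items that an advice oracle flags as likely heavy hitters are counted exactly, while all other items share a compact Count-Min sketch. The oracle `predict_heavy` below stands in for a trained model and is purely hypothetical.

```python
import random

# Frequency estimation with learned advice: predicted-heavy items get exact
# counters; the remaining items share a small Count-Min sketch.

class CountMin:
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.salts = [rng.getrandbits(32) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        return [(row, hash((salt, item)) % self.width)
                for row, salt in enumerate(self.salts)]

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):   # overestimates only
        return min(self.table[row][col] for row, col in self._cells(item))

class LearnedSketch:
    def __init__(self, predict_heavy, width=256, depth=4):
        self.predict_heavy = predict_heavy  # advice oracle (assumed given)
        self.exact = {}                     # exact counts for predicted-heavy items
        self.cm = CountMin(width, depth)

    def add(self, item):
        if self.predict_heavy(item):
            self.exact[item] = self.exact.get(item, 0) + 1
        else:
            self.cm.add(item)

    def estimate(self, item):
        if item in self.exact:
            return self.exact[item]
        return self.cm.estimate(item)

sketch = LearnedSketch(predict_heavy=lambda w: len(w) <= 3)  # toy "model"
for w in ["the", "the", "algorithm", "the", "sketch"]:
    sketch.add(w)
print(sketch.estimate("the"))  # 3, counted exactly
```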

02:30 -- 02:40 Coffee Break
02:40 -- 03:20 Ravi Kumar Learning-Augmented Online Learning

We consider the problem of online linear optimization with hints and show near-optimal regret bounds in terms of the “goodness” of the hints.

Joint work with Aditya Bhaskara, Ashok Cutkosky, Manish Purohit.
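
The algorithm in the talk is more refined, but the basic mechanism by which a hint enters an online linear optimizer can be illustrated with a generic “optimistic” online gradient descent step over the unit ball, sketched below; the step size and data are hypothetical.

```python
import numpy as np

# Illustrative only: a hint (a guess at the next cost vector) shifts the
# iterate before the true cost arrives; good hints lower the total cost,
# and the standard gradient update keeps the learner sound regardless.

def project_to_unit_ball(x):
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

def optimistic_ogd(costs, hints, eta=0.1):
    """Play x_t = Pi(y_t - eta * h_t), then update y with the true cost c_t.
    Returns the total linear cost sum_t <c_t, x_t> (lower is better)."""
    y = np.zeros(len(costs[0]))
    total = 0.0
    for c, h in zip(costs, hints):
        x = project_to_unit_ball(y - eta * h)  # act on the hint before c_t arrives
        total += float(np.dot(c, x))
        y = y - eta * c                        # standard gradient update
    return total

rng = np.random.default_rng(0)
costs = [rng.standard_normal(5) for _ in range(200)]
noisy_hints = [c + rng.standard_normal(5) for c in costs]
print(optimistic_ogd(costs, costs))        # perfect hints
print(optimistic_ogd(costs, noisy_hints))  # noisy hints: typically worse
```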

03:20 -- 04:00 Michael Mitzenmacher Works in Progress: Queues with Predictions and the Partitioned Learned Bloom Filter

We review some past work on Algorithms with Predictions and provide some updates on work in progress. We first review recent results on queues that use predictions of job sizes, and then discuss work in progress on (1) queues with predictions in the power-of-two-choices setting and (2) queues with very small prediction-based advice. We also review the learned Bloom filter, explain how to make better use of the learned model by maintaining separate backup Bloom filters for different ranges of the model's score, and show that this partitioned learned Bloom filter can yield significantly better results.

The talk includes joint work with Matteo Dell'Amico (predictions with multiple queues), and with Kapil Vaidya, Tim Kraska, and Eric Knorr (partitioned learned Bloom filters).
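
A simplified sketch of the partitioned idea appears below: a learned score in [0, 1] routes each key to a score region, each region keeps its own backup Bloom filter, and the highest-scoring region keeps none. The actual construction optimizes the thresholds and per-region false-positive rates; here they are fixed by hand, and `score` stands in for a learned model.

```python
import math
import random

class Bloom:
    """Textbook Bloom filter sized for n_items at a target false-positive rate."""
    def __init__(self, n_items, fpr, seed=0):
        n_items = max(1, n_items)
        self.m = max(1, int(-n_items * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = [False] * self.m
        self.salts = [random.Random(seed + i).getrandbits(32) for i in range(self.k)]

    def _idx(self, item):
        return [hash((salt, item)) % self.m for salt in self.salts]

    def add(self, item):
        for i in self._idx(item):
            self.bits[i] = True

    def __contains__(self, item):
        return all(self.bits[i] for i in self._idx(item))

class PartitionedLBF:
    def __init__(self, keys, score, thresholds, fprs):
        # thresholds (ascending) split [0, 1] into len(thresholds) + 1 regions;
        # region i < len(fprs) gets a backup filter with budget fprs[i], and the
        # top region (highest scores) answers "yes" with no filter at all.
        self.score, self.thresholds = score, thresholds
        buckets = [[] for _ in fprs]
        for key in keys:
            region = self._region(score(key))
            if region < len(buckets):
                buckets[region].append(key)
        self.filters = [Bloom(len(b), f) for b, f in zip(buckets, fprs)]
        for filt, bucket in zip(self.filters, buckets):
            for key in bucket:
                filt.add(key)

    def _region(self, s):
        return sum(s >= t for t in self.thresholds)

    def __contains__(self, item):
        region = self._region(self.score(item))
        if region == len(self.filters):   # top region: model is confident
            return True
        return item in self.filters[region]

score = lambda w: min(len(w) / 10.0, 1.0)  # stand-in for a learned model
keys = ["algorithm", "prediction", "oracle", "sketch"]
plbf = PartitionedLBF(keys, score, thresholds=[0.5, 0.8], fprs=[0.01, 0.05])
print("oracle" in plbf, "cat" in plbf)     # True, False (with high probability)
```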

Resources:
  • Machine Learning for Algorithm Design by Eric Balkanski, Fall 2020 (Columbia University). [course webpage]
  • Learning-Augmented Algorithms (6.890) by Piotr Indyk and Costis Daskalakis, Spring 2019 (MIT). [course webpage]
  • Algorithms with Predictions by Michael Mitzenmacher and Sergei Vassilvitskii, survey paper (a chapter in the book Beyond the Worst-Case Analysis of Algorithms, edited by Tim Roughgarden), 2020. [survey]
  • Technical perspective: Algorithm selection as a learning problem by Avrim Blum, research highlights (CACM), May 2020. [short paper]
  • Data-driven algorithm design by Tim Roughgarden and Rishi Gupta, research highlights (CACM), May 2020. [short paper]
  • Application-Specific Algorithm Selection by Tim Roughgarden, Simons Institute Open Lecture, 2016. [video] [slides]
  • Data Driven Algorithm Design by Nina Balcan, plenary talk at the 24th Annual LIDS Student Conference, MIT, 2019. [video] [slides]
  • Learning in Algorithms by Sergei Vassilvitskii, survey talk at the 4th Highlights of Algorithms conference (HALG 2019). [slides]
  • Learning-Augmented Algorithms: How ML Can Lead to Provably Better Algorithms by Michael Mitzenmacher, keynote talk at ALGO 2019. [slides]
  • Workshop on Data-driven Algorithmics by Andreas Krause, Pavlos Protopapas and Yaron Singer, Harvard, 2015. [workshop webpage]
  • Workshop on Data-driven Algorithmics by Andreas Krause and Yaron Singer, Bertinoro, 2017. [workshop webpage]
  • Workshop on Automated Algorithm Design by Nina Balcan, Bistra Dilkina, Carl Kingsford and Paul Medvedev, TTIC, 2019. [workshop webpage]
  • Workshop on Learning-Based Algorithms by Piotr Indyk, Yaron Singer, Ali Vakilian and Sergei Vassilvitskii, TTIC, 2019. [workshop webpage]