Summer Workshop on Learning-Based Algorithms

This workshop will cover recent developments in using machine learning to improve the performance of “classical” algorithms by adapting their behavior to the properties of the input distribution. This can reduce their running time or space usage, or improve their accuracy, while (often) retaining worst-case guarantees.

The workshop will cover general approaches to designing such algorithms, as well as specific case studies. We plan to cover learning-augmented methods for designing data structures, streaming and sketching algorithms, online algorithms, compressive sensing and recovery, error-correcting codes, scheduling algorithms, and combinatorial optimization. The attendees span a diverse set of areas, including theoretical computer science, machine learning, algorithmic game theory, coding theory, databases, and systems.

When: 12 - 14 August 2019
Where: Toyota Technological Institute at Chicago (TTIC)
6045 S Kenwood Ave,
Chicago, IL 60637
Organizers: Piotr Indyk (MIT), Yaron Singer (Harvard), Ali Vakilian (MIT) and Sergei Vassilvitskii (Google NYC)
Invited Speakers:
Mohammad Alizadeh (MIT)
Nina Balcan (CMU)
Anupam Gupta (CMU)
Sreeram Kannan (University of Washington)
Tim Kraska (MIT)
Ravi Kumar (Google Mountain View)
Ben Moseley (CMU)
Ali Mousavi (Google NYC)
Debmalya Panigrahi (Duke University)
Eric Price (UT Austin)
Manish Purohit (Google Mountain View)
Ilya Razenshteyn (MSR Redmond)
Tim Roughgarden (Columbia University)
Zoya Svitkina (Google Mountain View)
Pramod Viswanath (UIUC)
D Sivakumar (Google Mountain View)
Le Song (Georgia Tech)
Schedule
Monday (8/12)
10:00 -- 10:40 Tim Kraska Towards Learned Algorithms, Data Structures, and Systems

All systems and applications are composed of basic data structures and algorithms, such as index structures, priority queues, and sorting algorithms. Most of these primitives have been around since the early beginnings of computer science (CS) and form the basis of every CS intro lecture. Yet, we might soon face an inflection point: recent results show that machine learning has the potential to alter the way those primitives, or systems at large, are implemented in order to provide optimal performance for specific applications.

In this talk, I will highlight several results from my research group on how machine learning is changing the way we build systems and outline different ways to build learned algorithms and data structures to achieve “instance-optimality” with a particular focus on data management systems.
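The learned-index idea can be sketched in a few lines: a simple model predicts where a key sits in sorted data, and a bounded local search corrects the prediction. The snippet below is an illustrative toy (a linear model standing in for the learned models in this line of work), not the actual implementation:

```python
import bisect

class LearnedIndex:
    """Toy learned index: linear model + error-bounded local search."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by least squares on (key, rank).
        xs, ys = self.keys, range(n)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs) or 1.0
        self.slope = sum((x - mean_x) * (y - mean_y)
                         for x, y in zip(xs, ys)) / var_x
        self.intercept = mean_y - self.slope * mean_x
        # Record the worst-case prediction error to bound the local search.
        self.err = max(abs(self._predict(x) - i) for i, x in enumerate(xs))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        n = len(self.keys)
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(n, p + self.err + 1)
        # Binary search only within the model's error window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else -1

idx = LearnedIndex([3, 8, 21, 34, 55, 89, 144])
```

On nearly linear key distributions the error window is tiny, so the final search touches only a few elements; this is the sense in which the structure adapts to the data.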

10:40 -- 11:20 Mohammad Alizadeh Learning to Schedule

Efficiently scheduling jobs on distributed compute clusters requires complex algorithms. Current systems use simple, generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this talk, I will discuss Decima, a scheduling system for data processing clusters that learns sophisticated workload-specific scheduling algorithms entirely through experience. Decima uses reinforcement learning (RL) and neural networks to learn a workload-specific scheduling algorithm without any human instruction beyond a high-level objective, such as minimizing average job completion time. I will highlight key aspects of Decima’s design, including novel RL training techniques for systems like Decima that must operate in the presence of a stochastic input process.

11:20 -- 12:00 Tim Roughgarden Application-Specific Algorithm Selection

The best algorithm for a computational problem generally depends on the “relevant inputs”, a concept that depends on the application domain and often defies formal articulation. While there is a large literature on empirical approaches to selecting the best algorithm for a given application domain, there has been surprisingly little theoretical analysis of the problem.

We adapt concepts from statistical and online learning theory to reason about application-specific algorithm selection. Our models are straightforward to understand, but also expressive enough to capture several existing approaches in the theoretical computer science and AI communities, ranging from self-improving algorithms to empirical performance models. We present one framework that models algorithm selection as a statistical learning problem, and our work here shows that dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context. We also study the online version of the algorithm selection problem, and give possibility and impossibility results for the existence of no-regret learning algorithms.

Joint work with Rishi Gupta.

[slides]
12:00 -- 1:30 Lunch (provided on site)
1:30 -- 2:10 Sreeram Kannan Learning Graph Algorithms that Generalize

Optimization problems on graphs occupy a central role in algorithmic design as well as in machine learning (clustering is a popular example). The success of machine learning methods in a variety of domains provides a new impetus to ask whether such algorithms can be “learnt” directly.

A key challenge is that such learnable algorithms need to generalize not only to (exponentially many) unseen instances but also to graphs of unseen (and much larger) sizes. We propose a reinforcement-learning based solution by representing graphs as randomized nonlinear dynamical systems. We show that such representations are guaranteed to be information-lossless. We also show how training on hard theoretical instances of small graph sizes can lead to generalization to much larger graphs (100x to 1000x). We show this on the optimization problems of minimum vertex cover, maximum cut, maximum independent set, and community detection, which vary across the complexity theoretic hardness spectrum.

[slides]
2:10 -- 2:50 Pramod Viswanath Inventing Communication Algorithms via Deep Learning

Coding theory is a central discipline underpinning wireline and wireless modems that are the workhorses of the information age. Progress in coding theory is largely driven by individual human ingenuity with sporadic breakthroughs over the past century. In this work we study whether it is possible to automate the discovery of communication algorithms via deep learning.

We present the first family of codes for channels with noisy feedback designed via learning. While the study of this problem was initiated by Shannon and feedback is known to improve performance, good codes are unknown and linear codes are known to be highly sub-optimal. Our learned codes outperform known codes in reliability by several orders of magnitude. For the channel without feedback, we show that we can construct near-optimal codes directly via learning. These codes are more robust and adaptive than state-of-the-art codes.

2:50 -- 3:10 Break
3:10 -- 3:50 Ali Mousavi From Supervised to Unsupervised Computational Sensing

Great progress has been made on sensing, perception, and signal processing over the last decades through the design of algorithms matched to the underlying physics and statistics of the task at hand. However, a host of difficult problems remain where the physics-based approach comes up short; for example, unrealistic image models stunt the performance of MRI and other computational imaging systems. Fortunately, the big data age has enabled the development of new kinds of machine learning algorithms that augment our understanding of the physics with models learned from large amounts of training data. In this talk, we will overview integrated physics+data algorithms for solving the kinds of inverse problems encountered in computational sensing. In particular, we show how data can be used to learn a more realistic signal model that boosts the performance of an iterative recovery algorithm in a supervised way. We then show how we can adopt statistical techniques to solve inverse problems in computational sensing even without having ground truth data and in an unsupervised way.

3:50 -- 4:30 Eric Price Compressed Sensing and Generative Models

The goal of compressed sensing is to make use of image structure to estimate an image from a small number of linear measurements. The structure is typically represented by sparsity in a well-chosen basis. We show how to achieve guarantees similar to standard compressed sensing but without employing sparsity at all -- instead, we suppose that vectors lie near the range of a generative model G: R^k -> R^n. Our main theorem is that, if G is L-Lipschitz, then roughly O(k log L) random Gaussian measurements suffice; this is O(kd log n) for typical d-layer neural networks.
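A minimal numerical sketch of this setup, with a random linear map standing in for the trained generator G (an assumption for illustration only): we recover the latent code z by gradient descent on ||A G(z) - y||^2, then decode x = G(z).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 100, 5, 25          # ambient dim, latent dim, number of measurements

W = rng.standard_normal((n, k))               # toy linear "generator" G(z) = W @ z
A = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian measurement matrix

z_true = rng.standard_normal(k)
y = A @ W @ z_true                            # m << n noiseless measurements

# Gradient descent on f(z) = 0.5 * ||A G(z) - y||^2.
M = A @ W
step = 1.0 / np.linalg.norm(M, 2) ** 2        # safe step size: 1 / L
z = np.zeros(k)
for _ in range(5000):
    z -= step * M.T @ (M @ z - y)

recovery_error = np.linalg.norm(W @ z - W @ z_true)
```

With a neural-network G the objective is nonconvex and the theory is correspondingly more delicate; the linear case above only shows the shape of the recovery procedure.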

The above result describes how to use a model to recover a signal from noisy data. But if the data is noisy, how can we learn the generative model in the first place? The second part of my talk will describe how to incorporate the measurement process in generative adversarial network (GAN) training. Even if the noisy data does not uniquely identify the non-noisy signal, the _distribution_ of noisy data may still uniquely identify the distribution of non-noisy signals.

This talk is based on joint works with Ashish Bora, Ajil Jalal, and Alex Dimakis.

4:30 -- 5:10 Ali Vakilian Learned Sketches for Frequency Estimation

Classical streaming algorithms typically do not leverage data properties or patterns in their input. We propose to augment such algorithms with a learning model that enables them to exploit data properties without being specific to a particular pattern or property. We focus on the problem of estimating the frequency of elements in a data stream, a fundamental problem with applications in network measurements, natural language processing, and security. We propose a new framework for designing frequency estimation streaming algorithms that automatically learn to leverage the properties of the input data. We present a theoretical analysis of the proposed algorithms and prove that, under natural assumptions, they have lower space complexity than prior algorithms. We also evaluate our algorithms on two real-world datasets and demonstrate empirically their performance gains.
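The high-level recipe can be illustrated with a toy "learned Count-Min": a predictor routes likely heavy hitters to exact dedicated counters, and everything else to a standard Count-Min sketch. Here the predictor is just a hard-coded set standing in for a trained model; this is a sketch of the idea, not the paper's algorithm.

```python
import hashlib

class LearnedCountMin:
    """Count-Min sketch with an oracle-designated set of exact counters."""

    def __init__(self, predicted_heavy, width=1024, depth=4):
        # Predicted heavy hitters get their own exact, collision-free buckets.
        self.exact = {h: 0 for h in predicted_heavy}
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        h = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big") % self.width

    def update(self, item, count=1):
        if item in self.exact:
            self.exact[item] += count
        else:
            for row in range(self.depth):
                self.table[row][self._hash(item, row)] += count

    def estimate(self, item):
        if item in self.exact:
            return self.exact[item]      # no collision error for heavy items
        # Standard Count-Min estimate: minimum over the rows (never underestimates).
        return min(self.table[row][self._hash(item, row)]
                   for row in range(self.depth))
```

Diverting the heavy items keeps them from inflating the shared counters, which is where the space savings in the analysis come from.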

This talk is based on joint works with Anders Aamand, Chen-Yu Hsu, Piotr Indyk and Dina Katabi.

[slides]
Tuesday (8/13)
10:00 -- 10:40 Nina Balcan An online learning approach to data-driven algorithm selection

Data-driven algorithm design for combinatorial problems is an important aspect of modern data science. Rather than using off-the-shelf algorithms that only have worst-case performance guarantees, practitioners typically optimize over large families of parametrized algorithms and tune the parameters of these algorithms based on past problem instances in order to perform well on future instances from that domain.

In this talk, I will discuss general techniques for providing formal guarantees for data-driven algorithm selection via online learning. We provide upper and lower bounds on regret for algorithm selection from infinite parametrized families of algorithms in online settings, where problems arrive sequentially and we must choose parameters online. The major technical challenge is that for many combinatorial problems (including subset selection problems, partitioning problems, and clustering), small differences between the parameters of two algorithms can lead to a cascade of changes in their behavior and dramatically change their performance. This leads to very interesting challenges that require new techniques that help significantly push the boundaries of learning theory as well.

[slides]
10:40 -- 11:20 Sergei Vassilvitskii Online algorithms with ML advice: caching & beyond

Classic algorithm design focuses on optimizing worst case complexity, resulting in algorithms that perform well for any possible input. In many applications, however, the typical input is not just far from worst case, but has some predictable characteristics. We first focus on the caching problem and give results that formally incorporate machine learned predictions into algorithm design. We then talk about the modeling choices and challenges faced when extending the work to weighted caching.
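One way to make "machine-learned predictions for caching" concrete is an eviction rule driven by predicted next-request times; with perfect predictions this recovers Belady's offline-optimal rule. The snippet below is an illustrative sketch of that rule, not the algorithm from the talk:

```python
def run_cache(requests, capacity, predict_next):
    """Count misses when evicting the page predicted to be needed latest."""
    cache, misses = set(), 0
    for t, page in enumerate(requests):
        if page in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            # Evict the page whose predicted next request is furthest away.
            victim = max(cache, key=lambda p: predict_next(p, t))
            cache.remove(victim)
        cache.add(page)
    return misses

def make_oracle(requests):
    """Perfect 'predictor' computed offline, standing in for a learned model."""
    def predict_next(page, t):
        for u in range(t + 1, len(requests)):
            if requests[u] == page:
                return u
        return float("inf")
    return predict_next

reqs = ["a", "b", "c", "a", "b", "c"]
misses = run_cache(reqs, capacity=2, predict_next=make_oracle(reqs))
```

The interesting regime is when `predict_next` is an imperfect learned model; the formal results bound how gracefully the competitive ratio degrades with the prediction error.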

11:20 -- 11:40 Break
11:40 -- 12:20 Manish Purohit Ski-Rental with Predictions: Three Models

I'll give an overview of recent work in augmenting traditional online algorithms with additional information such as machine-learnt predictions. This talk demonstrates the different design choices available to the algorithm designer with respect to the type and structure of the predictions used. Naturally, the flavor of theoretical results obtained depends greatly on the prediction models. Using the well-studied Ski-Rental problem as a running example, I'll cover three distinct prediction models and design online algorithms in these regimes with very different competitive ratios.
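As a concrete illustration of one such design choice, here is a sketch of a deterministic ski-rental strategy with a predicted number of ski days and a trust parameter lambda, in the spirit of this line of work; the exact thresholds here are a simplifying assumption, not the talk's algorithms.

```python
import math

def ski_rental_cost(b, y, lam, actual_days):
    """Cost of skiing `actual_days` days when buying costs b, renting 1/day.

    y is the predicted number of ski days; lam in (0, 1] trades consistency
    (trust the prediction) against robustness (worst-case competitiveness).
    """
    # If the prediction says "many days", buy early; otherwise buy late.
    buy_day = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
    if actual_days < buy_day:
        return actual_days            # rented every day, never bought
    return (buy_day - 1) + b          # rented buy_day - 1 days, then bought

b, lam = 100, 0.5
cost_trust = ski_rental_cost(b, y=150, lam=lam, actual_days=150)  # prediction right
cost_wrong = ski_rental_cost(b, y=150, lam=lam, actual_days=10)   # prediction wrong
```

With an accurate prediction the cost is within a (1 + lambda) factor of the offline optimum min(actual_days, b), while even an arbitrarily bad prediction cannot push it beyond a (1 + 1/lambda) factor.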

12:20 -- 1:00 Debmalya Panigrahi Learning Augmented Algorithms: The Case of Multiple Experts

Recent research has focused on incorporating the advice of ML models in improving the performance of online algorithms, without compromising the intrinsic robustness offered by traditional competitive analysis. This talk will focus on the case of multiple experts in this setting. In particular, we show how the advice provided by multiple experts can be incorporated into an online algorithm for the classical ski rental problem. The goal is to match the performance of the best expert without knowing her identity, a task similar in flavor to the classical experts problem in online learning. In contrast to online learning, however, this is not an iterative process, and the algorithm does not have the opportunity to learn the identity of the best expert over time. We do not make any assumptions on the quality of the advice provided by the experts, and obtain tight performance bounds in this context.

This is based on joint work with Sreenivas Gollapudi that appeared at ICML 2019.

1:00 -- 2:30 Lunch (provided on site)
2:30 -- 3:10 Ben Moseley Online Scheduling via Learned Weights

The use of machine learning and predictive models has produced a revolution in science and engineering. Online optimization problems are a natural setting where predictions can be used to manage uncertainty and improve performance. This talk studies how predictive techniques can be used to break through worst-case barriers in online scheduling.

The makespan minimization problem on unrelated machines and its special case, restricted assignment, are classic problems in online scheduling theory. Worst-case analysis of these problems yields Ω(log m) lower bounds on the competitive ratio in the online setting. In this work we construct non-trivial predictions for these problems and design algorithms that utilize these predictions to compute solutions online. Our predictions are compact in size, having dimension linear in the number of machines. The performance guarantees of our algorithms depend on the accuracy of the predictions, and moderately accurate predictions allow our techniques to beat the worst-case lower bounds. More precisely, the predictions can be used to construct O(log η)-competitive fractional assignments, where η is the error of the predictions. We then give an online algorithm that is O((log log m)^3)-competitive for rounding these fractional assignments, with a nearly matching Ω(log log m) lower bound for online rounding algorithms.

[slides]
3:10 -- 3:50 Zoya Svitkina Semi-Online Bipartite Matching

We consider the semi-online model that generalizes the classical online computational model. The semi-online model postulates that the unknown future has a predictable part and an adversarial part; these parts can be arbitrarily interleaved. An algorithm in this model operates as in the standard online model, i.e., makes an irrevocable decision at each step. We study bipartite matching in the semi-online model. Our main contributions are competitive algorithms for this problem and a near-matching hardness bound. The competitive ratio of the algorithms nicely interpolates between the truly offline setting (i.e., no adversarial part) and the truly online setting (i.e., no predictable part).

Joint work with Ravi Kumar, Manish Purohit, Aaron Schild, Erik Vee.

3:50 -- 4:10 Break
4:10 Rump Session
Wednesday (8/14)
10:00 -- 10:40 Anupam Gupta Robust Algorithms for Bandits and Secretaries

In the secretary problem, a set of items are revealed to us in random order, and we want to maximize the probability of picking the best among them. In the (stochastic) multi-armed bandit problem, we perform pulls from a set of arms, each with a fixed but unknown payoff probability, and want to minimize the regret. Both these problems are well-studied in the sequential decision-making literature.

We consider the semi-adversarial setting where an adversary is allowed to introduce some limited amount of corruptions. We show that classical algorithms fail badly in the presence of corruptions, and then we give algorithms that are more robust to corruptions.

This is based on joint works with Domagoj Bradac, Sahil Singla, and Goran Zuzic, and with Tomer Koren and Kunal Talwar.

10:40 -- 11:20 Yaron Singer Algorithms in the Era of Machine Learning: An Inconvenient Truth

The traditional approach in algorithm design assumes that there is an underlying objective that is known to the algorithm designer and the focus is on efficiently optimizing that objective. In many cases however, the objectives we aim to optimize are not known but rather learned from data. So what are the guarantees of the algorithms we develop and teach when the input is learned from data? In this talk we will address this question and discuss challenges at the intersection of machine learning and algorithms. We will present some stark impossibility results and argue for new algorithmic paradigms.

11:20 -- 11:40 Break
11:40 -- 12:20 Le Song Learning to Plan via Neural Exploration-Exploitation Trees

Sampling-based planning algorithms such as RRT and its variants are powerful tools for path planning problems in high-dimensional continuous state and action spaces. While these algorithms perform systematic exploration of the state space, they do not fully exploit past planning experiences from similar environments. In this paper, we design a meta path planning algorithm, called Neural Exploration-Exploitation Trees (NEXT), which can utilize prior experience to drastically reduce the sample requirement for solving new path planning problems. More specifically, NEXT contains a novel neural architecture which can learn from experience the dependency between task structures and promising path search directions. This learned prior is then integrated with a UCB-type algorithm to achieve an online balance between exploration and exploitation when solving a new problem. Empirically, we show that NEXT can complete the planning tasks with very small search trees and significantly outperforms the previous state of the art on several benchmark problems.

[slides]
12:20 -- 1:00 D Sivakumar Can we learn algorithms?

Deep Learning has been applied successfully to many basic human tasks such as object recognition and speech recognition, and increasingly to the more complex task of language understanding. In addition, deep learning has been extremely successful in the context of planning tasks in constrained environments (e.g., game playing).

We ask: Can deep learning be applied to design algorithms?

We formulate this question precisely, and show how to apply deep reinforcement learning to design online algorithms for a number of optimization problems such as the AdWords problem (choosing which advertisers to include in a keyword auction), the online knapsack problem, and optimal stopping problems. Our results indicate that the models have learned behaviours that are consistent with the traditional optimal algorithms for these problems.

Joint work with William Kong, Chris Liaw, and Aranyak Mehta; appeared in ICLR 2019.

1:00 -- 2:30 Lunch (provided on site)
2:30 -- 3:10 Ilya Razenshteyn Learning Space Partitions for Nearest Neighbor Search

I will talk about a new algorithm for finding high-quality space partitions useful for nearest neighbor data structures. It is based on balanced graph partitioning followed by supervised learning. When instantiated with KaHIP (Sanders, Schulz SEA 2013) and neural networks, respectively, the new algorithm consistently outperforms quantization-based and tree-based partitioning methods.
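The general partition-based search template this builds on can be sketched as follows; here a nearest-centroid rule stands in for both the balanced graph partitioner and the learned neural router (both stand-ins are assumptions for illustration):

```python
import numpy as np

def build_partitions(points, centers):
    """Assign each point to its closest center, yielding one part per center."""
    parts = [[] for _ in centers]
    for i, p in enumerate(points):
        d = np.linalg.norm(centers - p, axis=1)
        parts[int(d.argmin())].append(i)
    return parts

def query(q, points, centers, parts, probes=2):
    """Route the query to the `probes` most promising parts, then scan them."""
    order = np.argsort(np.linalg.norm(centers - q, axis=1))[:probes]
    candidates = [i for c in order for i in parts[c]]
    return min(candidates, key=lambda i: np.linalg.norm(points[i] - q))

rng = np.random.default_rng(0)
points = rng.standard_normal((50, 3))
centers = points[:4].copy()          # stand-in "partition centers"
parts = build_partitions(points, centers)
```

The quality of the router determines how few parts must be probed to find the true nearest neighbor, which is exactly what replacing the routing rule with a trained model aims to improve.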

Joint work with Yihe Dong (Microsoft Research), Piotr Indyk (MIT) and Tal Wagner (MIT).

3:10 -- 3:50 Piotr Indyk Learning-Based Low-Rank Approximations

We introduce a “learning-based” algorithm for the low-rank matrix decomposition problem: given an n × d matrix A, and a parameter k, compute a rank-k matrix A' that minimizes the approximation loss ||A - A'||_F. The algorithm uses a training set of input matrices in order to optimize its performance. Specifically, some of the most efficient approximate algorithms for computing low-rank approximations proceed by computing a projection SA, where S is a sparse random m × n “sketching matrix”, and then performing the singular value decomposition of SA. We show how to replace the random matrix S with a “learned” matrix of the same sparsity to reduce the error.

Our experiments show that, for multiple types of data sets, a learned sketch matrix can substantially reduce the approximation loss compared to a random matrix S, sometimes by an order of magnitude. We also study mixed matrices, where only some of the rows are trained and the remaining ones are random, and show that such matrices still offer improved performance while retaining worst-case guarantees.
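The sketch-and-solve pipeline described above, with a random CountSketch-style matrix S, can be written out as follows; the learned variant keeps S's sparsity pattern but optimizes its nonzero values over training matrices (this snippet shows only the random baseline):

```python
import numpy as np

def sketched_low_rank(A, k, m, rng):
    """Rank-k approximation of A via a sparse random sketch S and SVD of SA."""
    n, d = A.shape
    # CountSketch-style S: each column has a single random +/-1 entry.
    S = np.zeros((m, n))
    rows = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    S[rows, np.arange(n)] = signs
    # SVD of the small m x d sketch, then project A onto the top-k right
    # singular vectors of SA.
    _, _, Vt = np.linalg.svd(S @ A, full_matrices=False)
    V = Vt[:k].T
    return (A @ V) @ V.T

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 30))
A_k = sketched_low_rank(A, k=5, m=40, rng=rng)
err = np.linalg.norm(A - A_k)
```

Training the nonzero values of S amounts to minimizing this approximation error over a sample of matrices, which is where the learned variant gains over the random baseline.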

Joint work with Ali Vakilian and Yang Yuan.

[slides]
Resources:
  • Learning-Augmented Algorithms (6.890) by Piotr Indyk and Costis Daskalakis, Spring 2019 (MIT). [course webpage]
  • Data Driven Algorithm Design by Nina Balcan, Plenary talk at 24th Annual LIDS Student Conference, MIT 2019. [videos] [slides]
  • Learning in Algorithms by Sergei Vassilvitskii, Survey talk at 4th Highlights of Algorithms (HALG 2019). [slides]