The recent availability of large datasets has had a significant impact on the design of algorithms. On big data,
classical algorithms are often too inefficient: they are too slow or require too much space. This course focuses on
algorithms that are specifically designed for large datasets and covers the following topics.
- Some of the new computational models that capture various aspects of massive data computation, such as streaming algorithms and sub-linear time algorithms.
- Some of the algorithmic techniques and tools for solving problems over massive data, such as sampling, sketching, dimensionality reduction, and computing efficient summaries of the data (e.g., core-sets).

This is a theoretical course aimed at graduate students and advanced undergraduate students with a strong
background in algorithms and discrete mathematics.
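As a taste of the streaming model, here is a minimal Python sketch of the Morris counter listed for the first lecture. It is an illustration only, not course material: the class name `MorrisCounter` and the helper `average_estimate` are hypothetical names chosen for this sketch, and averaging independent counters is just one simple way to reduce the variance.

```python
import random


class MorrisCounter:
    """Approximate counter that stores only a small exponent x."""

    def __init__(self, rng=None):
        self.x = 0
        # rng is an assumption of this sketch: any random.Random instance.
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Increase the exponent with probability 2^(-x).
        if self.rng.random() < 2.0 ** (-self.x):
            self.x += 1

    def estimate(self):
        # 2^x - 1 is an unbiased estimate of the number of increments.
        return 2 ** self.x - 1


def average_estimate(n, trials, seed=0):
    """Average the estimates of `trials` independent counters after n increments."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counter = MorrisCounter(rng)
        for _ in range(n):
            counter.increment()
        total += counter.estimate()
    return total / trials
```

A single counter stores only the exponent `x`, which takes about log log n bits after n increments, at the cost of a high-variance estimate. Calling `average_estimate(1000, 400)` averages 400 such counters and should land reasonably close to 1000.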
| Date | Topic and materials |
| --- | --- |
| Monday, March 29th | Course Logistics, Introduction to the course, Distinct Elements, Morris Counter (Slides, Notes on Morris Counter from Jelani's Lecture) |
| Wednesday, March 31st | Norm and Frequency Estimation Streaming Algorithms (AMS, CountMin, CountSketch) (Slides, See also Piotr's slides: Lec3, Lec5, Lec6) |
| Monday, April 5th | Streaming Graph Algorithms (Connectivity using L_0 samplers) (Slides, See Andrew's course for more streaming graph algorithms.) |
| Wednesday, April 7th | Streaming Algorithms for Coverage Problems (Slides) |
| Monday, April 12th | Streaming Geometric Algorithms (Slides, See Piotr's lecture) |
| Wednesday, April 14th | Streaming Lower Bounds (Slides) |
| Monday, April 19th | Core-sets (definition, and core-set for diversity maximization) (Slides) |
| Wednesday, April 21st | Core-sets (for k-median) (Slides, See Dan Feldman's Videos on Core-sets) |
| Monday, April 26th | Dimension Reduction (Slides, Lecture Notes from Piotr and Jelani's course: Lec3, Lec5, Lec9) |
| Wednesday, April 28th | Nearest Neighbor Search (Slides, See the ANN paper by Har-Peled, Indyk, and Motwani) |
| Monday, May 3rd | Sub-linear Time Algorithms (Slides, See also Ronitt's slides: Lec1, Lec12, Lec13) |
| Wednesday, May 5th | Sub-linear Time Algorithms (Slides, See Ronitt's slides: Lec13) |
| Monday, May 10th | Property Testing (Testing on Distributions) (Slides, See Ronitt's slides: Lec2, Lec2 Notes, Lec4) |
| Wednesday, May 12th | Randomized Linear Algebra (Matrix Product Approximation) (Slides, See Jelani's Lecture) |
| Monday, May 17th | Randomized Linear Algebra |
| Wednesday, May 19th | Randomized Linear Algebra (Applications) |
| Monday, May 24th | Project Presentation |
| Wednesday, May 26th | Project Presentation |