Tal Wagner
Academic Homepage
I am a postdoc in the Machine Learning Foundations group at Microsoft Research Redmond.
I recently obtained my PhD in Computer Science at CSAIL, MIT, advised by Piotr Indyk.
My research interests are in designing algorithms for massive datasets and large-scale machine learning, especially in the following contexts:
 High-dimensional metric data and nearest neighbor search [1, 2, 3, 4, 5, 6, 7]
 Efficient linear algebra [1, 2, 3]
 Learning on data streams [1]
 Data-driven and learning-augmented algorithm design [1, 2, 3]
Email: firstname (dot) lastname (at) gmail (dot) com
Updates
[ recent | all ]
(Jan'22)  Our work on learning-augmented motif counting on graph streams is accepted to ICLR'22.  
(Jan'22)  Seminar talk at the Tel Aviv University School of Computer Science.  
(Jan'22)  Seminar talk at the Hebrew University Computer Science Colloquium.  
(Jan'22)  Seminar talk at the Ben-Gurion University Computer Science Department Seminar.  
(Jan'22)  Seminar talk at the Bar-Ilan University Computer Science Colloquium.  
(Jan'22)  Seminar talk at the Technion Faculty of Computer Science.  
(Dec'21)  Seminar talk at the Tel Aviv University Electrical Engineering-Systems Department Colloquium.  
(Dec'21)  Seminar talk at the Haifa University Computer Science Colloquium.  
(Dec'21)  Seminar talk at the Reichman University School of Computer Science.  
(Nov'21)  Our work on optimal metric compression is accepted to the SIAM Journal on Computing.  
(Nov'21)  Honored to have been selected for the Rising Stars in Data Science workshop at the University of Chicago. 
(Sep'21)  Our work on data-driven algorithms for low-rank approximation is accepted to NeurIPS'21.  
(May'21)  Our work on faster kernel matrix algebra is accepted to ICML'21.  
(Jan'21)  Our work on learning-based support size estimation is accepted to ICLR'21 as a spotlight.  
(Sep'20)  I joined the Machine Learning Foundations group in MSR Redmond as a postdoc.  
(Aug'20)  I completed my PhD at MIT.  
(Jul'20)  I spoke at the "Are we ready for semi-supervised learning for big data? From theory to practice" minisymposium at SIAM Imaging Science 2020.  
(Jul'20)  We presented our work on scalable nearest neighbor search for optimal transport at ICML'20. Here is the video. 
Publications
[ selected | all ]

Triangle and Four Cycle Counting with Predictions in Graph Streams
with Justin Chen, Talya Eden, Piotr Indyk, Honghao Lin, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, David Woodruff and Michael Zhang
in ICLR 2022
[link soon]

Optimal (Euclidean) Metric Compression
with Piotr Indyk
in SIAM Journal on Computing (SICOMP), to appear 2022
[arXiv] [video (30min)]

Few-Shot Data-Driven Algorithms for Low Rank Approximation
with Piotr Indyk and David Woodruff
in NeurIPS 2021
[link soon]

Faster Kernel Matrix Algebra via Density Estimation
with Arturs Backurs, Piotr Indyk and Cameron Musco
in ICML 2021
[arXiv] [video (5min)]

Learning-based Support Estimation in Sublinear Time
with Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld and Sandeep Silwal
in ICLR 2021, Spotlight presentation
[arXiv]

Scalable Nearest Neighbor Search for Optimal Transport
with Arturs Backurs, Yihe Dong, Piotr Indyk and Ilya Razenshteyn
in ICML 2020
Earlier version in OTML Workshop at NeurIPS 2019
[arXiv] [slides (PDF)] [video (12min)] [code]

Learning Space Partitions for Nearest Neighbor Search
with Yihe Dong, Piotr Indyk and Ilya Razenshteyn
in ICLR 2020
[arXiv] [slides (PDF)] [video (5min)] [code]

Eccentricity Heuristics through Sublinear Analysis Lenses
in APoCS 2020
Earlier version in SPAA 2019 (brief announcement)
[PDF] [brief announcement (PDF)] [slides (PDF)]

Multitasking Capacity: Hardness Results and Improved Constructions
with Noga Alon, Jonathan D. Cohen, Tom Griffiths, Pasin Manurangsi, Daniel Reichman, Igor Shinkar and Alexander Yu
in SIAM Journal on Discrete Mathematics (SIDMA), 2020
[arXiv] [doi]

Space and Time Efficient Kernel Density Estimation in High Dimensions
with Arturs Backurs and Piotr Indyk
in NeurIPS 2019
[PDF] [slides (PDF)] [poster (PDF)] [code]

Scalable Fair Clustering
with Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber and Ali Vakilian
in ICML 2019
[arXiv] [code]

Sample-Optimal Low-Rank Approximation of Distance Matrices
with Piotr Indyk, Ali Vakilian and David Woodruff
in COLT 2019
[arXiv] [slides (PDF)] [poster (PDF)] [video (10min)]

Semi-Supervised Learning on Data Streams via Temporal Label Propagation
with Sudipto Guha, Shiva Kasiviswanathan and Nina Mishra
in ICML 2018
[PDF] [proceedings] [slides (PDF)] [poster (PDF)] [video (10min)]

Approximate Nearest Neighbors in Limited Space
with Piotr Indyk
in COLT 2018
[arXiv] [slides (PDF)] [poster (PDF)] [video (10min)]

Practical Data-Dependent Metric Compression with Provable Guarantees
with Piotr Indyk and Ilya Razenshteyn
in NeurIPS 2017
[arXiv] [code] [poster (PDF)]

A Graph-Theoretic Approach to Multitasking
with Noga Alon, Jonathan D. Cohen, Biswadip Dey, Tom Griffiths, Sebastian Musslick, Kayhan Ozcimder, Daniel Reichman and Igor Shinkar
in NeurIPS 2017, Oral presentation
[arXiv] [poster (PDF)]

Near-Optimal (Euclidean) Metric Compression
with Piotr Indyk
in SODA 2017
[arXiv] [doi] [slides]

A Sampling-based Approach to Accelerating Queries in Log Management Systems
with Eric Schkufza and Udi Wieder
in SPLASH 2016
[doi]

Towards Resistance Sparsifiers
with Michael Dinitz and Robert Krauthgamer
in RANDOM 2015
[arXiv] [doi] [slides (PDF)]

Cheeger-Type Approximation for Sparsest st-Cut
with Robert Krauthgamer
in ACM Transactions on Algorithms (TALG), 2016
[arXiv] [doi]

Generalized Girth Problems in Graphs and Hypergraphs
with Uriel Feige
Manuscript 2013
[PDF]

Volume Regularization for Binary Classification
with Koby Crammer
in NeurIPS 2012, Spotlight presentation
[PDF]
Ph.D. Thesis: Efficient Metric Representations for Big Data
MIT, September 2020