Emmanouil (Manolis) Zampetakis

Welcome! I am a PhD student in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT). I am very fortunate to be advised by Constantinos Daskalakis.

My research interests include Algorithmic Game Theory, Machine Learning (particularly proving theoretical guarantees for machine learning algorithms), Learning Theory, and Algorithms.

Email: mzampet [at] mit [dot] edu
CV: pdf


I joined the graduate program at MIT in September 2014, in the Theory of Computation group at CSAIL. My research advisor is Costis Daskalakis, and we work on a wide range of problems, from algorithmic game theory to proving theoretical guarantees for machine learning algorithms.

Before MIT, I was an undergraduate student in the Department of Electrical Engineering and Computer Science at the National Technical University of Athens, where I completed my Diploma degree. During my time there, I was fortunate to work with Dimitris Fotakis on Mechanism Design with Verification and on Scheduling MapReduce Jobs.

Research Papers

Abstract. Banach's fixed point theorem for contraction maps has been widely used to analyze the convergence of iterative methods in non-convex problems. It is a common experience, however, that iterative maps fail to be globally contracting under the natural metric in their domain, making the applicability of Banach's theorem limited. We explore how generally we can apply Banach's fixed point theorem to establish the convergence of iterative methods when pairing it with carefully designed metrics.

Our first result is a strong converse of Banach's theorem, showing that it is a universal analysis tool for establishing global convergence of iterative methods to unique fixed points, and for bounding their convergence rate. In other words, we show that, whenever an iterative map globally converges to a unique fixed point, there exists a metric under which the iterative map is contracting and which can be used to bound the number of iterations until convergence. We illustrate our approach in the widely used power method, providing a new way of bounding its convergence rate through contraction arguments.

We next consider the computational complexity of Banach's fixed point theorem. Making the proof of our converse theorem constructive, we show that computing a fixed point whose existence is guaranteed by Banach's fixed point theorem is CLS-complete. We thus provide the first natural complete problem for the class CLS, which was defined in [DP11] to capture the complexity of problems such as P-matrix LCP, computing KKT-points, and finding mixed Nash equilibria in congestion and network coordination games.
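
The sketch below is an informal illustration of this contraction viewpoint (my own toy example, not the paper's construction): power iteration treated as an iterative map, with the distance to its fixed point, the top eigenvector, shrinking geometrically. The matrix and iteration count are arbitrary choices for the example.

# Toy illustration: power iteration as an iterative map T(v) = Av / ||Av||,
# whose distance to the fixed point (the top eigenvector) decays geometrically.
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])            # arbitrary symmetric example matrix
top = np.linalg.eigh(A)[1][:, -1]                 # reference: the true top eigenvector

rng = np.random.default_rng(0)
v = rng.standard_normal(2)
v /= np.linalg.norm(v)
for t in range(10):
    v = A @ v
    v /= np.linalg.norm(v)                        # one application of the iterative map
    err = np.sqrt(max(0.0, 1.0 - float(v @ top) ** 2))   # sine of the angle to the fixed point
    print(f"iteration {t + 1}: distance {err:.2e}")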

21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018
pdf abs

Abstract. We study the convergence properties of the Expectation-Maximization algorithm in the Naive Bayes model. We show that EM can get stuck in regions of slow convergence, even when the features are binary and i.i.d. conditional on the class label, and even under random (i.e., non-worst-case) initialization. In turn, we show that EM can be bootstrapped in a pre-training step that computes a good initialization. From this initialization we show theoretically and experimentally that EM converges exponentially fast to the true model parameters. Our bootstrapping method amounts to running the EM algorithm on appropriately centered iterates of small magnitude, which as we show corresponds to effectively performing power iteration on the covariance matrix of the mixture model, although power iteration is performed under the hood by EM itself. As such, we call our bootstrapping approach "power EM". Specifically for the case of two binary features, we show global exponentially fast convergence of EM, even without bootstrapping. Finally, as the Naive Bayes model is quite expressive, we show as corollaries of our convergence results that the EM algorithm globally converges to the true model parameters for mixtures of two Gaussians, recovering recent results of [XHM16, DTZ17].
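
As a rough illustration of the spectral-initialization idea described above (a minimal sketch of my own, not the paper's implementation), the snippet below runs EM for a balanced mixture of two product-Bernoulli distributions, with the initialization taken along the top eigenvector of the empirical covariance; the step size 0.1 and the clipping constants are arbitrary.

# Minimal EM sketch for a balanced two-component Naive Bayes (product-Bernoulli) mixture.
import numpy as np

def em_naive_bayes(X, num_iters=100):
    n, d = X.shape
    mean = X.mean(axis=0)
    # Spectral initialization: perturb the overall mean along the top covariance direction.
    _, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    direction = vecs[:, -1]
    p = np.clip(mean + 0.1 * direction, 1e-3, 1 - 1e-3)
    q = np.clip(mean - 0.1 * direction, 1e-3, 1 - 1e-3)
    for _ in range(num_iters):
        # E-step: posterior probability that each sample came from the first component.
        log_p = X @ np.log(p) + (1 - X) @ np.log(1 - p)
        log_q = X @ np.log(q) + (1 - X) @ np.log(1 - q)
        w = 1.0 / (1.0 + np.exp(log_q - log_p))
        # M-step: weighted feature frequencies per component.
        p = np.clip((w @ X) / w.sum(), 1e-3, 1 - 1e-3)
        q = np.clip(((1 - w) @ X) / (1 - w).sum(), 1e-3, 1 - 1e-3)
    return p, q

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=2000)
probs = np.where(labels[:, None] == 1, 0.8, 0.3)   # features i.i.d. given the class label
X = (rng.random((2000, 5)) < probs).astype(float)
print(em_naive_bayes(X))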

30th Conference on Learning Theory (COLT) 2017

Preliminary version: NIPS Workshop on Non-Convex Optimization for Machine Learning, 2016.

pdf abs

Abstract. The Expectation-Maximization (EM) algorithm is a widely used method for maximum likelihood estimation in models with latent variables. For estimating mixtures of Gaussians, its iteration can be viewed as a soft version of the k-means clustering algorithm. Despite its wide use and applications, there are essentially no known convergence guarantees for this method. We provide global convergence guarantees for mixtures of two Gaussians with known covariance matrices. We show that the population version of EM, where the algorithm is given access to infinitely many samples from the mixture, converges geometrically to the correct mean vectors, and provide simple, closed-form expressions for the convergence rate. As a simple illustration, we show that, in one dimension, ten steps of the EM algorithm initialized at infinity estimate the means to within less than 1% error. In the finite sample regime, we show that, under a random initialization, O(d/ϵ^2) samples suffice to compute the unknown vectors to within ϵ in Mahalanobis distance, where d is the dimension. In particular, the error rate of the EM-based estimator is O(√(d/n)), where n is the number of samples, which is optimal up to logarithmic factors.
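
For the one-dimensional symmetric case mentioned above (a balanced mixture of N(μ, 1) and N(−μ, 1)), the EM update with the means constrained to be ±μ collapses to μ ← mean(x · tanh(μ · x)). The toy sketch below (sample-based rather than population EM; the true mean, sample size, and far-away starting point are arbitrary choices) shows the rapid convergence.

# Ten EM iterations for a balanced 1-D mixture of N(mu, 1) and N(-mu, 1).
import numpy as np

rng = np.random.default_rng(0)
true_mu = 2.0
signs = rng.choice([-1.0, 1.0], size=100_000)
x = signs * true_mu + rng.standard_normal(100_000)

mu = 10.0                                 # deliberately far-away initialization
for t in range(10):
    mu = np.mean(x * np.tanh(mu * x))     # E-step and M-step combined
    print(f"iteration {t + 1}: mu = {mu:.4f}")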

28th ACM-SIAM Symposium on Discrete Algorithms (SODA), 2017.
pdf abs

Abstract. A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Canonne et al. 2014) has shown that having access to such a conditional sampling oracle requires only a polylogarithmic or even constant number of samples to solve distribution testing problems like identity and uniformity. This significantly improves over the standard sampling model, where polynomially many samples are necessary. Inspired by these results, we introduce a computational model based on conditional sampling to develop sublinear algorithms with exponentially faster runtimes compared to standard sublinear algorithms. We focus on geometric optimization problems over points in high-dimensional Euclidean space. Access to these points is provided via a conditional sampling oracle that takes as input a succinct representation of a subset of the domain and outputs a uniformly random point in that subset. We study two well-studied problems: k-means clustering and estimating the weight of the minimum spanning tree. In contrast to prior algorithms for the classic model, our algorithms have time, space and sample complexity that is polynomial in the dimension and polylogarithmic in the number of points. Finally, we comment on the applicability of the model and compare it with existing ones like the streaming, parallel, and distributed computational models.
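
For concreteness, here is a toy, in-memory stand-in for the conditional sampling oracle described above (my own illustration; a real oracle would answer without scanning all points, and subsets would be given by a succinct representation rather than an arbitrary predicate).

# Toy simulation of a conditional sampling oracle over a point set.
import random

class ConditionalSampler:
    def __init__(self, points):
        self.points = points

    def sample(self, in_subset):
        # Return a uniformly random stored point lying in the queried subset.
        candidates = [p for p in self.points if in_subset(p)]
        return random.choice(candidates) if candidates else None

points = [(0.1, 0.2), (0.8, 0.9), (0.4, 0.5), (0.7, 0.1)]
oracle = ConditionalSampler(points)
# e.g. a uniform point inside the axis-aligned box [0, 0.5] x [0, 0.5]
print(oracle.sample(lambda p: p[0] <= 0.5 and p[1] <= 0.5))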

Abstract. We introduce a general approach based on selective verification and obtain approximate mechanisms without money for maximizing the social welfare in the general domain of utilitarian voting. Having a good allocation in mind, a mechanism with verification selects few critical agents and detects, using a verification oracle, whether they have reported truthfully. If yes, the mechanism produces the desired allocation. Otherwise, the mechanism ignores any misreports and proceeds with the remaining agents. We obtain randomized truthful (or almost truthful) mechanisms without money that verify only O(ln m / ϵ) agents, where m is the number of outcomes, independently of the total number of agents, and are (1 − ϵ)-approximate for the social welfare. We also show that any truthful mechanism with a constant approximation ratio needs to verify Ω(log m) agents. A remarkable property of our mechanisms is robustness, namely that their outcome depends only on the reports of the truthful agents.
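
The snippet below sketches only the high-level flow described in this abstract, in my own words; the actual mechanism chooses the O(ln m / ϵ) critical agents through a careful randomized rule, and verify() stands in for the verification oracle.

# High-level flow of selective verification (illustrative sketch only).
def best_outcome(reports):
    # Welfare-maximizing outcome for the given reported valuations.
    outcomes = next(iter(reports.values())).keys()
    return max(outcomes, key=lambda o: sum(val[o] for val in reports.values()))

def selective_verification(reports, critical_agents, verify):
    """reports: dict agent -> dict outcome -> reported value (assumes at least one truthful agent)."""
    liars = {a for a in critical_agents if not verify(a)}
    if not liars:
        return best_outcome(reports)      # every verified agent was truthful
    honest = {a: v for a, v in reports.items() if a not in liars}
    return best_outcome(honest)           # ignore the misreports and use the rest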

8th International Symposium on Algorithmic Game Theory (SAGT), 2015.

Theory of Computing Systems, 2016. Invited to the Special Issue for SAGT 2015.

pdf abs

Abstract. We study mechanism design where the payments charged to the agents are not in the form of monetary transfers, but are effectively burned. In this setting, the objective is to maximize social utility, i.e., the social welfare minus the payments charged. We consider a general setting with m discrete outcomes and n multidimensional agents. We present two essentially orthogonal randomized truthful mechanisms that extract an O(log m) fraction of the maximum welfare as social utility. Moreover, the first mechanism achieves an O(log m)-approximation for the social welfare, which is improved to an O(1)-approximation by the second mechanism. An interesting feature of the second mechanism is that it optimizes over an appropriately “smoothed” space, thus achieving a nice and smooth tradeoff between welfare approximation and the payments charged.

Abstract. We propose a constant-approximation algorithm for generalizations of the Flexible Flow Shop (FFS) problem, which form a realistic model for non-preemptive scheduling in MapReduce systems. Our results concern the minimization of the total weighted completion time of a set of MapReduce jobs on unrelated processors and improve substantially on the model proposed by Moseley et al. (SPAA 2011) in two directions: (i) we consider jobs consisting of multiple Map and Reduce tasks, which is the key idea behind MapReduce computations, and (ii) we introduce into our model the crucial cost of the data shuffle phase, i.e., the cost for the transmission of intermediate data from Map to Reduce tasks. Moreover, we experimentally evaluate our algorithm compared with a lower bound on the optimal cost of our problem as well as with a fast algorithm, which combines a simple online assignment of tasks to processors with a standard scheduling policy. As we observe, for random instances that capture data locality issues, our algorithm achieves better performance.
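
As a point of reference for the simple baseline mentioned above (my own sketch, ignoring the Map-to-Reduce precedence constraints and shuffle costs), a greedy online assignment followed by Smith's rule on each processor looks roughly like this; weights are assumed positive.

# Toy baseline: assign each task online to the processor where it finishes earliest,
# then order each processor's tasks by Smith's rule (largest weight/processing-time
# ratio first) and sum the weighted completion times.
def greedy_schedule(tasks, num_machines):
    """tasks: list of (weight, [processing time on each machine])."""
    loads = [0.0] * num_machines
    assigned = [[] for _ in range(num_machines)]
    for weight, proc in tasks:
        m = min(range(num_machines), key=lambda i: loads[i] + proc[i])
        loads[m] += proc[m]
        assigned[m].append((weight, proc[m]))
    total = 0.0
    for machine in assigned:
        machine.sort(key=lambda wp: wp[1] / wp[0])   # Smith's rule
        t = 0.0
        for weight, p in machine:
            t += p
            total += weight * t                      # weighted completion time
    return total

tasks = [(3.0, [2.0, 4.0]), (1.0, [5.0, 1.0]), (2.0, [3.0, 3.0])]
print(greedy_schedule(tasks, num_machines=2))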

EDBT/ICDT Workshop on Algorithms for MapReduce and Beyond, 2014
pdf abs

Abstract. The MapReduce framework has been established as the standard approach for the parallel processing of massive amounts of data. In this work, we extend the model of MapReduce scheduling on unrelated processors (Moseley et al., SPAA 2011) and deal with the practically important case of jobs with any number of Map and Reduce tasks. We present a polynomial-time (32 + ϵ)-approximation algorithm for minimizing the total weighted completion time in this setting. To the best of our knowledge, this is the most general setting of MapReduce scheduling for which an approximation guarantee is known. Moreover, this is the first time that a constant approximation ratio is obtained for minimizing the total weighted completion time on unrelated processors under a nontrivial class of precedence constraints.

9th Conference on Web and Internet Economics (WINE), 2013.

ACM Transactions on Economics and Computation, 2015. Invited to the Special Issue for WINE 2013.

pdf abs

Abstract. We investigate the reasons that make symmetric partial verification essentially useless in virtually all domains. Departing from previous work, we consider any possible (finite or infinite) domain and general symmetric verification. We identify a natural property, namely that the correspondence graph of a symmetric verification M is strongly connected by finite paths along which the preferences are consistent with the preferences at the endpoints, and prove that this property is sufficient for the equivalence of truthfulness and M-truthfulness. In fact, defining appropriate versions of this property, we obtain this result for deterministic and randomized mechanisms with and without money. Moreover, we show that a slightly relaxed version of this property is also necessary for the equivalence of truthfulness and M-truthfulness. Our conditions provide a generic and convenient way of checking whether truthful implementation can take advantage of any symmetric verification scheme in any domain. Since the simplest case of symmetric verification is local verification, our results imply, as a special case, the equivalence of local truthfulness and global truthfulness in the setting without money. To complete the picture, we consider asymmetric verification, and prove that a social choice function f is M-truthfully implementable by some asymmetric verification M if and only if f does not admit a cycle of profitable deviations.
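
One simplified, single-agent reading of the last condition (my own sketch, not the paper's formal definition): build the directed graph whose edge t → s means that a type-t agent strictly prefers the outcome chosen on report s to the outcome chosen on its truthful report, and check whether this graph contains a directed cycle.

# Toy check for a cycle of profitable deviations (single-agent simplification).
def admits_profitable_cycle(types, utility, f):
    """utility(t, outcome): utility of a type-t agent; f(t): outcome chosen on report t."""
    edges = {t: [s for s in types
                 if s != t and utility(t, f(s)) > utility(t, f(t))]
             for t in types}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in types}

    def dfs(t):
        color[t] = GRAY
        for s in edges[t]:
            if color[s] == GRAY or (color[s] == WHITE and dfs(s)):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in types)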