On the communication and streaming complexity of maximum bipartite matching, SODA 2012
A. Goel, M. Kapralov and S. Khanna
[Abstract]
Consider the following communication problem. Alice holds a graph
$G_A=(P, Q, E_A)$ and Bob holds a graph $G_B=(P, Q, E_B)$, where
$|P|=|Q|=n$. Alice is allowed to send Bob a message $m$ that depends
only on the graph $G_A$. Bob must then output a matching $M\subseteq
E_A\cup E_B$. What is the minimum message size of the message $m$ that Alice
sends to Bob that allows Bob to recover a matching of size at least
$(1-\epsilon)$ times the maximum matching in $G_A\cup G_B$? The minimum
message length is the one-round communication complexity of
approximating bipartite matching. It is easy to see that the one-round
communication
complexity also gives a lower bound on the space needed by a one-pass
streaming algorithm to compute a $1-\epsilon$-approximate bipartite
matching. The focus of this work is to understand one-round
communication complexity and one-pass streaming complexity of maximum
bipartite matching. In particular, how well can one approximate these
problems with linear communication and space? Prior to our work, only a
$\frac{1}{2}$-approximation was known for both these problems.
In order to study these questions, we introduce the concept of an
$\epsilon$-matching cover of a bipartite graph $G$, which is a sparse
subgraph of the original graph that preserves the size of maximum
matching between every subset of vertices to within an additive $\epsilon n$
error. We give a polynomial time construction of a
$\frac{1}{2}$-matching cover of size $O(n)$ with some crucial additional
properties, thereby showing that Alice and Bob can achieve a
$\frac{2}{3}$-approximation with a message of size $O(n)$. While we do
not provide bounds on the size of $\epsilon$-matching covers for $\epsilon<1/2$, we
prove that in general, the size of the smallest $\epsilon$-matching cover of a
graph $G$ on $n$ vertices is essentially equal to the size of the
largest so-called $\epsilon$-Ruzsa Szemeredi graph on $n$ vertices. We use
this connection to show that for any $\delta>0$, a
$(\frac{2}{3}+\delta)$-approximation requires a communication complexity
of $n^{1+\Omega(1/\log\log n)}$.
We also consider the natural restrictingon of the problem in which $G_A$
and $G_B$ are only allowed to share vertices on one side of the
bipartition, which is motivated by applications to one-pass streaming
with vertex arrivals. We show that a $\frac{3}{4}$-approximation can be
achieved with a linear size message in this case, and this result is
best possible in that super-linear space is needed to achieve any better
approximation.
Finally, we build on our techniques for the restricted version above to
design one-pass streaming algorithm
for the case when vertices on one side are known in advance, and the
vertices on the other side arrive in a streaming manner together with
all their incident edges. This is precisely the setting of the
celebrated $(1- \frac{1}{e})$-competitive randomized algorithm of
Karp-Vazirani-Vazirani (KVV) for the online
bipartite matching problem. We present here the first
deterministic one-pass streaming $(1-\frac{1}{e})$-approximation
algorithm using $O(n)$ space for this setting.