Lijie Chen
Photo by Toru
Research Interests: Theoretical Computer Science
I have a broad interest in theoretical computer science. In particular, I am interested in Computational Complexity and FineGrained Complexity.
I am currently a fifthyear graduate student at MIT, and I am very fortunate to be advised by Ryan Williams. Prior to that, I received my bachelor's degree from Yao Class at Tsinghua University.
At Tsinghua University, I was advised by Prof. Jian Li, working on MultiArmed Bandits. During the Spring of 2016, I was visiting MIT, working under the supervision of Prof. Scott Aaronson on Quantum Complexity. [CV]
I was visiting Simons Institute in the Fall of 2018.
Twitter
Note: Click on [summary] or [highlight] to view summaries or highlight of the papers/projects!
Some of my projects:
Video/Slides/Summary of Recent Work
Truly LowSpace Element Distinctness and Subset Sum via Pseudorandom Hash Functions [arxiv]
[abstract]
We consider lowspace algorithms for the classic Element Distinctness problem: given an array of \(n\) input integers with \(O(\log n)\) bitlength, decide whether or not all elements are pairwise distinct.
Beame, Clifford, and Machmouchi [FOCS 2013] gave an \(\tilde O(n^{1.5})\)time randomized algorithm for Element Distinctness using only \(O(\log n)\) bits of working space. However, their algorithm assumes a random oracle (in particular, readonly random access to polynomially many random bits), and it was asked as an open question whether this assumption can be removed.
In this paper, we positively answer this question by giving an \(\tilde O(n^{1.5})\)time randomized algorithm using \(O(\log ^3 n\log \log n)\) bits of space, with oneway access to random bits. As a corollary, we also obtain a \(\textrm{poly}(n)\)space \(O^*(2^{0.86n})\)time randomized algorithm for the Subset Sum problem, removing the random oracles required in the algorithm of Bansal, Garg, Nederlof, and Vyas [STOC 2017]. The main technique underlying our results is a pseudorandom hash family based on iterative restrictions, which can fool the cyclefinding procedure in the algorithms of Beame et al. and Bansal et al.
Hardness vs Randomness, Revised: Uniform, NonBlackBox, and InstanceWise [eccc] [slides by me] [slides by Roei]
[short summary]
Textbook hardnesstorandomness converts circuit lower bounds into PRGs. But is this blackbox approach really necessary for derandomization? In this new work we revamp the classical hardnesstorandomness framework, showing how to convert new types of uniform lower bounds into nonblackbox derandomization, deducing conclusions such as promiseBPP = promiseP without PRGs. Moreover, we show that the same types of lower bounds are in fact necessary for any type of derandomization! This reveals a tight connection between any derandomization of promiseBPP (i.e., not necessarily a blackbox one) and the foregoing new types of uniform lower bounds.
Our framework also allows a flexible tradeoff between hardness and randomness. In an extreme setting, we show that plausible uniform lower bounds imply that "randomness is indistinguishable from useless". That is, every randomized algorithm can be derandomized with an arbitrarily small polynomial overhead, such that no polynomialtime algorithm can find a mistake with nonnegligible probability.
Simple and fast derandomization from very hard functions: Eliminating randomness at almost no cost [eccc] [slides by me] [slides by Roei] [video by Roei Tell at STOC 2021]
[short highlight]
Derandomization with linear overhead: Under plausible assumptions, we show that any \(T\)time randomized computation can be derandomized in \(T \cdot n\)time.
Conditional Optimality: Assuming NSETH, the \(n\) overhead above is optimal.
[longer summary]
Derandomization with a nearlinear overhead: Assuming (1) oneway function exists and (2) generic \(2^{kn}\)time computation cannot be speed up to \(2^{(k\varepsilon) n}\) time with \(2^{(1\varepsilon)n}\) bits of advice, we show that \(T(n)\)time randomized computation can be derandomized in roughly \(T(n) \cdot n^{1+\varepsilon}\) time.
Optimality assuming NSETH: Assuming Nondeterministic Strong Exponential Time Hypothesis (NSETH), the \(n\) overhead is optimal (upto \(n^{o(1)}\)) for every reasonable timebound \(T(n)\). Indeed, we only need a weaker version claiming that \(\#\mathsf{SAT}\) requires \(2^{(1o(1)) \cdot n}\) nondeterministic time (aka, \(\#\mathsf{NSETH}\)).
AverageCase Derandomization with basically no overhead Assuming similar assumptions, we show that \(T(n)\)time randomized computation can be derandomized in \(T(n) \cdot n^{o(1)}\)time with respect to every \(T(n)\)time samplable distribution \(S\) (meaning, for \(x \sim S\), the derandomization fails with \(n^{\omega(1)}\) probability).
Majority vs. Approximate Linear Sum and averagecase complexity below NC1 [eccc] [slides by Igor] [video by Xin]
[short highlight]
For any circuit class \(\mathcal{C}\) contained in \(\mathsf{NC}^1\), (among many other things) we show the following:
AverageCase hardness w.r.t. certain distributions \(\Leftrightarrow\) AverageCase hardness w.r.t. the uniform distribution:
If \(E\) is \(\delta\)averagecase hard against \(\mathcal{C}\) w.r.t. to some distribution \(\mathcal{D}\), then \(E\) is also \(\delta\)averagecase hard against \(\mathcal{C}\) under the uniform distribution.
Generic seedreduction for PRGs:
If there is an \((n1)\)seed PRG fooling \(\mathcal{C}\) with error \(n^{\omega(1)}\), then there is an \(n^{o(1)}\)seed PRG fooling \(\mathcal{C}\) with error \(n^{\omega(1)}\).
Almost Optimal SuperConstantPass Streaming Lower Bounds for Reachability [eccc] [video by Raghuvansh Saxena at STOC 2021]
[short highlight]
We prove that any \(o\left(\sqrt{\log n}\right)\)pass streaming algorithm requires \(n^{2o(1)}\) space to solve the Reachability problem for directed graphs (given two vertices \(s\) and \(t\), determine whether \(s\) can reach \(t\) in the given graph). By known reductions, our result implies the same almost quadratic lower bounds for other problems, including maximum matching, shortest path, matrix rank, and linear programming.
The construction of our hard instances makes crucial use of RuzsaSzemeŕedi graphs and lowdepth sorting networks.
Selected Publications
Derandomization
Hardness vs Randomness, Revised: Uniform, NonBlackBox, and InstanceWise [eccc] [slides]
[short summary]
Textbook hardnesstorandomness converts circuit lower bounds into PRGs. But is this blackbox approach really necessary for derandomization? In this new work we revamp the classical hardnesstorandomness framework, showing how to convert new types of uniform lower bounds into nonblackbox derandomization, deducing conclusions such as promiseBPP = promiseP without PRGs. Moreover, we show that the same types of lower bounds are in fact necessary for any type of derandomization! This reveals a tight connection between any derandomization of promiseBPP (i.e., not necessarily a blackbox one) and the foregoing new types of uniform lower bounds.
Our framework also allows a flexible tradeoff between hardness and randomness. In an extreme setting, we show that plausible uniform lower bounds imply that "randomness is indistinguishable from useless". That is, every randomized algorithm can be derandomized with an arbitrarily small polynomial overhead, such that no polynomialtime algorithm can find a mistake with nonnegligible probability.
Lijie Chen, Roei Tell. Foundations of Computer Science (FOCS 2021). Invited to the SICOMP Special Issue for FOCS 2021
Simple and fast derandomization from very hard functions: Eliminating randomness at almost no cost [eccc] [Oded's choice] [slides by me] [slides by Roei]
[short highlight]
Derandomization with linear overhead: Under plausible assumptions, we show that any \(T\)time randomized computation can be derandomized in \(T \cdot n\)time.
Conditional Optimality: Assuming NSETH, the \(n\) overhead above is optimal.
[longer summary]
Derandomization with a nearlinear overhead: Assuming (1) oneway function exists and (2) generic \(2^{kn}\)time computation cannot be speed up to \(2^{(k\varepsilon) n}\) time with \(2^{(1\varepsilon)n}\) bits of advice, we show that \(T(n)\)time randomized computation can be derandomized in roughly \(T(n) \cdot n^{1+\varepsilon}\) time.
Optimality assuming NSETH: Assuming Nondeterministic Strong Exponential Time Hypothesis (NSETH), the \(n\) overhead is optimal (upto \(n^{o(1)}\)) for every reasonable timebound \(T(n)\). Indeed, we only need a weaker version claiming that \(\#\mathsf{SAT}\) requires \(2^{(1o(1)) \cdot n}\) nondeterministic time (aka, \(\#\mathsf{NSETH}\)).
AverageCase Derandomization with basically no overhead Assuming similar assumptions, we show that \(T(n)\)time randomized computation can be derandomized in \(T(n) \cdot n^{o(1)}\)time with respect to every \(T(n)\)time samplable distribution \(S\) (meaning, for \(x \sim S\), the derandomization fails with \(n^{\omega(1)}\) probability).
Lijie Chen, Roei Tell. Symposium on the Theory of Computing (STOC 2021)
Circuit Lower Bounds from Algorithms
Almost Everywhere Circuit Lower Bounds from NonTrivial Derandomization [eccc] [video by Xin Lyu in FOCS 2020] [slides by Ryan]
[short highlight]
(Among many other things), we show that there is a function \(f \in \mathsf{E}^{\mathsf{NP}}\) such that \(f_n\) (\(f\) restricted to \(n\)bit inputs) cannot be (\(1/2 + 2^{n^{\varepsilon}}\))approximated by \(2^{n^{\varepsilon}}\)size \(\mathsf{ACC}^0\) circuits, for all sufficiently large input lengths \(n\).
Our lower bounds come from a generic framework showing that nontrivial derandomization of a circuit class \(\mathcal{C}\) implies \(\mathsf{E}^{\mathsf{NP}}\) is almosteverywhere hard for \(\mathcal{C}\).
Lijie Chen, Xin Lyu, Ryan Williams. Foundations of Computer Science (FOCS 2020)
Strong AverageCase Circuit Lower Bounds from Nontrivial Derandomization [eccc] [slides at Simons (for complexity theorist)] [slides at IAS (intense version)] [slides at Warwick (lightweight version)]
[short highlight]
(Among many other things) we show that \(\mathsf{NQP} = \mathsf{NTIME}[2^{\operatorname{polylog}(n)}]\) cannot be \((1/2 + 1/\operatorname{poly}(n))\)approximated by \(\mathsf{ACC}^0\).
Our lower bounds come from a generic framework showing that certain nontrivial derandomization of a circuit class \(\mathcal{C}\) implies \(\mathsf{NQP}\) cannot be \((1/2 + 1/\operatorname{poly}(n))\)approximated by \(\mathcal{C}\).
Lijie Chen, Hanlin Ren. Symposium on the Theory of Computing (STOC 2020). Invited to the SICOMP Special Issue for STOC 2020
Efficient Construction of Rigid Matrices Using an NP Oracle [pdf] [slides] [Oded's choice]
[short highlight]
Explicit construction of rigid matrices is a longstanding open question. We give a somewhat explicit (\(\mathsf{P}^{\mathsf{NP}}\)) construction of a familiy \(\{H_n\}_{n \in \mathbb{N}}\) of \(n \times n\) \(\mathbb{F}_2\)matrices such that for infinitely many \(n\), \(H_n\) is \(\Omega(n^2)\)far in Hamming distance to any \(2^{(\log n)^{1/4\varepsilon}}\)rank \(\mathbb{F}_2\)matrices.
Josh Alman, Lijie Chen. Foundations of Computer Science (FOCS 2019). Machtey Award for Best Student Paper. Invited to the SICOMP Special Issue for FOCS 2019
Hardness Magnification (Strong Lower Bounds from Much Weaker Lower Bounds)
Beyond Natural Proofs: Hardness Magnification and Locality [eccc] [arxiv] [notes by Igor]
[short highlight]
The natural proof barrier does not seem to apply here since Hardness Magnification (HM) theorems only work for special functions. So one natural question stemmed from the hardness magnification phenomenon is to understand whether there is another inherent barrier, which prevents us from using current techniques to prove the lower bounds required by HM theorems.
We formulated a concrete barrier called Locality Barrier. Roughly speaking, the locality barrier says that if your lower bounds methods are robust enough to handle smallfanin oracles, then it cannot be used to prove the lower bounds required by HM theorems. Unfortunately, it seems most lower bounds techniques we are aware of (random restrictions, approximation method, communication complexity based lower bounds, etc.) are subject to this barrier.
Lijie Chen, Shuichi Hirahara,Igor Oliveira, Jan Pich, Ninad Rajgopal, Rahul Santhanam. Innovations in Theoretical Computer Science (ITCS 2020)
Bootstrapping Results for Threshold Circuits “Just Beyond” Known Lower Bounds [eccc] [Oded's choice]
[short highlight]
For some natural \(\mathsf{NC}^1\)complete problem \(P\), we show that (among many other things) if one can show \(P\) requires depth\(d\) \(\mathsf{TC}^{0}\) circuit of size (in term of wires) \(n^{1+\exp(o(d))}\), then \(\mathsf{NC}^1 \ne \mathsf{TC}^0\). Previous work implies that \(P\) has no depth\(d\) \(\mathsf{TC}^{0}\) of size \(n^{1+c^{d}}\), for some constant \(c > 1\).
Lijie Chen, Roei Tell. Symposium on the Theory of Computing (STOC 2019). Danny Lewin Best Student Paper Award
Other Topics
(Streaming Lower Bounds) Almost Optimal SuperConstantPass Streaming Lower Bounds for Reachability [eccc]
Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh Saxena, Zhao Song, Huacheng Yu. Symposium on the Theory of Computing (STOC 2021). Invited to the SICOMP Special Issue for STOC 2021
(Finegrained Complexity) On The Hardness of Approximate and Exact (Bichromatic) Maximum Inner Product [eccc] [arxiv] [slides] [journal version]
[short highlight]
Under \(\mathsf{SETH}\), we give a characterization on when approxiamte Boolean MaxIP is hard (w.r.t approxiamtion ratios and vector dimensions). We also show quadratic time hardness for ZMaxIP with \(2^{\log^*n}\) dimensions (Recall that \(\log^* n\) is the number of logs it takes to reduce \(n\) to at most \(1\). It is an extremely slowgrowing function.)
One notable corollary is that finding the furthest pair among \(n\) points in \(2^{\log^*n}\) dimension Euclidian space requires essentialy \(n^2\) time under \(\mathsf{SETH}\).
Lijie Chen. Computational Complexity Conference (CCC 2018). Invited to the Toc Special Issue for CCC 2018
(Differential Privacy) On Distributed Differential Privacy and Counting Distinct Elements [arxiv] [long slides] [short slides]
[summary]
Problem: We study the CountDisticnt problem where there are \(n\) users each holding an element in a distributed setting, and the goal is to (approximately) count the total number of distinct elements.
Noninteractive Local Model: We show that no (1) \((\ln n  7 \ln\ln n, n^{\omega(1)})\)DP Local protocol can solve CountDisticnt with error \(n/(\ln n)^{\omega(1)}\) and (2) there is an \((\ln n)\)DP Local protocol solving CountDisticnt with error \(\tilde{O}(\sqrt{n})\).
Shuffle Model: We show that no (1) \((O(1), 2^{\log^9 n})\)DP singlemessage shuffle protocol can solve CountDisticnt with error \(n/(\ln n)^{\omega(1)}\) and (2) there is an \((O(1), 2^{\log^9 n})\)DP shuffle protocol solving CountDisticnt with error \(\tilde{O}(\sqrt{n})\), where each user sends at most \(1\) messages in expectation.
TwoParty Model: We also established an \(\tilde{\Omega}(n)\) vs. \(O(1)\) separation between error in twoparty DP and global sensitivity, answering an open question of McGregor et al. (2011).
Dominated Protocols: We also introduce a relaxation of localDP protocols called dominatedprotocols, and show that multimessage shuffleDP protocols are dominated. By proving lower bounds against dominated protocols, we also prove lower bounds for selection and learningparity against multimessage shuffleDP protocols.
Moment Matching and Poissonization: Inspired by the Poissonization and moment matching techniques in the context of property testing. Our lower bounds are proved by (1) constructing two hard distributions on datasets (2) expressing the histogram of applying any \((\ln n  7 \ln\ln n, n^{\omega(1)})\)DP on a distribution of datasets as a sum of many independent mixtures of multidimensional Poisson distributions. (3) Using the momentmatching technique to bound the statistical distance between two mixtures of multidimensional Poisson distributions.
Lijie Chen, Badih Ghazi, Ravi Kumar, Pasin Manurangsi. Innovations in Theoretical Computer Science (ITCS 2021)
