### Design of Fast Algorithms

We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in \(\tilde{O}(n^{\frac{4}{3}} m^{\frac{1}{2}} + n^2)\) time (The \(\tilde{O}\) notation hides polylog(n) factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, \(O(n^\omega)\). For the special case of unweighted graphs, this improves upon the best previously known running time of \(\tilde{O}(\min\{n^\omega, m\sqrt{n}, m^{\frac{4}{3}}\})\) for \(m \gg n^{\frac{5}{3}}\) (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15).

The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute ϵ-approximate effective resistances for a set S of vertex pairs via approximate Schur complements in \(\tilde{O}(m + (n+|S|)\epsilon^{-2})\) time, without using the Johnson-Lindenstrauss lemma which requires \(\tilde{O} \min\{(m+|S|)\epsilon^{-2}, m +n \epsilon^{-4} + |S| \epsilon^{-2}\} \) time. We combine this approximation procedure with an error correction procedure for handing edges where our estimate isn't sufficiently accurate.

A spectral sparsifier of a graph G is a sparser graph H that approximately preserves the quadratic form of G, i.e., for all vectors x, \(x^{\top}L_G x \approx x^{\top}L_H x\) , where \(L_G, L_H\) denote the respective graph Laplacians. Spectral sparsifiers generalize cut sparsifiers, and have found several applications in designing graph algorithms. In recent years, there has been interest in computing spectral sparsifiers in the semi-streaming and dynamic settings. Natural algorithms in these settings involve repeated sparsification of a dynamic graph. We present a framework for analyzing algorithms for graph sparsification that perform repeated sparsifications. The framework yields analyses that avoid a worst-case error accumulation across various resparsification steps, and only incur the error corresponding to a single sparsification step leading to better results. As an application, we show how to maintain a spectral sparsifier of a graph, with \(O(n \log n)\) edges in a semi-streaming setting: We present a simple algorithm that, for a graph G on n vertices and m edges, computes a spectral sparsifier of G with \(O(n \log n)\) edges in a single pass over G, using only \(O(n \log n)\) space, and \(O(m \log^2 n)\) total time. This improves on previous best constructions in the semi-streaming set- ting for both spectral and cut sparsifiers by a factor of log n in both space and runtime. The algorithm also extends to semi-streaming row sampling for general PSD matrices. As another example, we use this framework to combining an algorithm due to Koutis with improved spanner constructions to give a parallel algorithm for construction \(O(n \log^2 n \log \log n)\) sized spectral sparsifiers in \(O(m \log^2 n \log \log n)\) time. This is the best combinatorial graph sparsification algorithm to date, and the size of the sparsifiers produced is only a factor \(\log n \log \log n\) more than numerical ones.

Sampling points from the uniform distribution on a polytope is a well-studied problem, and is an important ingredient in several computational tasks involving polytopes, such as volume estimation. This is achieved by setting up a random walk inside the polytope, with its stationary distribution being uniform in the interior of the polytope. Kannan-Narayanan and Narayanan proposed the Dikin walk based on interior point methods, where the next point is sampled, roughly, from the Dikin ellipsoid at the current point. In this paper, we give a simple proof of the mixing time of the Dikin walk, using well-known properties of Gaussians, and concentration of Gaussian polynomials.

Fast, Sparse, and Simple

We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian by a matrix with a sparse Cholesky factorization, the version of Gaussian elimination for symmetric matrices. This is the first nearly linear time solver for Laplacian systems that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. The crux of our analysis is a novel concentration bound for matrix martingales where the differences are sums of conditionally independent variables.

We introduce the sparsified Cholesky and sparsified multigrid algorithms for solving systems of linear equations. These algorithms accelerate Gaussian elimination by sparsifying the nonzero matrix entries created by the elimination process.

We use these new algorithms to derive the first nearly linear time algorithms for solving systems of equations in connection Laplacians—a generalization of Laplacian matrices that arise in many problems in image and signal processing.

We also prove that every connection Laplacian has a linear sized approximate inverse. This is an LU factorization with a linear number of nonzero entries that is a strong approximation of the original matrix. Using such a factorization one can solve systems of equations in a connection Laplacian in linear time. Such a factorization was unknown even for ordinary graph Laplacians.

Given a directed acyclic graph G, and a set of values y on the vertices, the Isotonic Regression of y is a vector x that respects the partial order described by G, and minimizes \(\|x−y\|\), for a specified norm. This paper gives improved algorithms for computing the Isotonic Regression for all weighted \(\ell_{p}\)-norms with rigorous performance guarantees. Our algorithms are quite practical, and their variants can be implemented to run fast in practice.

We develop fast algorithms for solving regression problems on graphs where one is given the value of a function at some vertices, and must find its smoothest possible extension to all vertices. The extension we compute is the absolutely minimal Lipschitz extension, and is the limit for large p of p-Laplacian regularization. We present an algorithm that computes a minimal Lipschitz extension in expected linear time, and an algorithm that computes an absolutely minimal Lipschitz extension in expected time \(\tilde{O}(mn)\). The latter algorithm has variants that seem to run much faster in practice. These extensions are particularly amenable to regularization: we can perform \(\ell_{0}\)-regularization on the given values in polynomial time and \(\ell_{1}\)-regularization on the initial function values and on graph edge weights in time \(\tilde{O}(m^{\frac{3}{2}})\).

Faster Algorithms via Approximation Theory illustrates how classical and modern techniques from approximation theory play a crucial role in obtaining results that are relevant to the emerging theory of fast algorithms. The key lies in the fact that such results imply faster ways to approximate primitives such as \(A^s v, A^{-1}v, \exp(-A)v\), eigenvalues and eigenvectors, which are fundamental to many spectral algorithms.

The first half of the book is devoted to the ideas and results from approximation theory that are central, elegant, and may have wider applicability in theoretical computer science. These include not only techniques relating to polynomial approximations but also those relating to approximations by rational functions and beyond. The remaining half illustrates a variety of ways that these results can be used to design fast algorithms.

Faster Algorithms via Approximation Theory is self-contained and should be of interest to researchers and students in theoretical computer science, numerical linear algebra, and related areas.

for Gaussian Mixtures and Autoencoders

We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees. In particular, suppose we are given samples of the form y = Ax + \(\eta\) where A is an unknown n X n matrix and x is chosen uniformly at random from \(\{+1, -1\}^n\), \(\eta\) is an n-dimensional Gaussian random variable with unknown covariance \(\Sigma\): We give an algorithm that provable recovers A and \(\Sigma\) up to an additive \(\epsilon\) whose running time and sample complexity are polynomial in n and \(1 / \epsilon\). To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. We also give a general framework for finding all local optima of a function (given an oracle for approximately finding just one) and this is a crucial step in our algorithm, one that has been overlooked in previous attempts, and allows us to control the accumulation of error when we find the columns of A one by one via local search.

an \(\tilde{O}(m)\)-Time Spectral Algorithm for Balanced Separator

We give a novel spectral approximation algorithm for the balanced separator problem that, given a graph G, a constant balance b \(\in (0,\frac{1}{2}],\) and a parameter \(\gamma,\) either finds an \(\Omega(b)\)-balanced cut of conductance \(O(\sqrt{\gamma})\) in G, or outputs a certificate that all b-balanced cuts in G have conductance at least \(\gamma,\) and runs in time \(\tilde{O}(m).\) This settles the question of designing asymptotically optimal spectral algorithms for balanced separator. Our algorithm relies on a variant of the heat kernel random walk and requires, as a subroutine, an algorithm to compute exp(-L)v where L is the Laplacian of a graph related to G and v is a vector. Algorithms for computing the matrix-exponential-vector product efficiently comprise our next set of results. Our main result here is a new algorithm which computes a good approximation to exp(-A)v for a class of symmetric positive semidefinite (PSD) matrices A and a given vector v, in time roughly \(\tilde{O}(m_A),\) where \(m_A\) is the number of non-zero entries of A. This uses, in a non-trivial way, the breakthrough result of Spielman and Teng on inverting symmetric and diagonally-dominant matrices in \(\tilde{O}(m_A)\) time. Finally, we prove that \(e^{-x}\) can be uniformly approximated up to a small additive error, in a non-negative interval [a,b] with a polynomial of degree roughly \(\sqrt{b-a}.\) While this result is of independent interest in approximation theory, we show that, via the Lanczos method from numerical analysis, it yields a simple algorithm to compute exp(-A)v for symmetric PSD matrices that runs in time roughly \(O(t_A\cdot \sqrt{\|A\|}),\) where \(t_A\) is time required for the computation of the vector Aw for given vector w. As an application, we obtain a simple and practical algorithm, with output conductance \(O(\sqrt{\gamma}),\) for balanced separator that runs in time \(\tilde{O}(\frac{m}{\sqrt{\gamma}}).\) This latter algorithm matches the running time, but improves on the approximation guarantee of the Evolving-Sets-based algorithm by Andersen and Peres for balanced separator.

We prove that the inverse of a positive-definite matrix can be approximated by a weighted-sum of a small number of matrix exponentials. Combining this with a previous result [OSV12], we establish an equivalence between matrix inversion and exponentiation up to polylogarithmic factors. In particular, this connection justifies the use of Laplacian solvers for designing fast semi-definite programming based algorithms for certain graph problems. The proof relies on the Euler-Maclaurin formula and certain bounds derived from the Riemann zeta function.

### Hardness of Approximation

k-Uniform k-Partite Hypergraphs

We study the problem of computing the minimum vertex cover on k-uniform k-partite hypergraphs when the k-partition is given. On bipartite graphs (k=2), the minimum vertex cover can be computed in polynomial time. For k\(\ge 3,\) this problem is known to be NP-hard. For general k, the problem was studied by Lovász, who gave a \(\frac{k}{2}\)-approximation based on the standard LP relaxation. Subsequent work by Aharoni, Holzman, and Krivelevich showed a tight integrality gap of \(\frac{k}{2} - o(1))\) for the LP relaxation. We further investigate the inapproximability of minimum vertex cover on k-uniform k-partite hypergraphs and present the following results (here \(\varepsilon > 0\) is an arbitrarily small constant): NP-hardness of obtaining an approximation factor of \((\frac{k}{4} - \varepsilon)\) for even k and \((\frac{k}{4} - \frac{1}{4k} - \varepsilon)\) for odd k, NP-hardness of obtaining a nearly optimal approximation factor of \((\frac{k}{2}-1+\frac{1}{2k}-\varepsilon)\), and an optimal unique games-hardness for approximation within factor \((\frac{k}{2} - \varepsilon)\), showing the optimality of Lovász's algorithm if one assumes the Unique Games conjecture. The first hardness result is based on a reduction from minimum vertex cover in r-uniform hypergraphs, for which NP-hardness of approximating within \(r - 1 -\varepsilon\) was shown by Dinur, Guruswami, Khot, and Regev. We include it for its simplicity, despite it being subsumed by the second hardness result. The unique games-hardness result is obtained by applying the results of Kumar, Manokaran, Tulsiani, and Vishnoi, with a slight modification, to the LP integrality gap due to Aharoni, Holzman, and Krivelevich. The modification ensures that the reduction preserves the desired structural properties of the hypergraph. The reduction for the nearly optimal NP-hardness result relies on the multilayered PCP of Dinur, Guruswami, Khot, and Regev and uses a gadget based on biased long codes adapted from the LP integrality gap of Aharoni, Holzman, and Krivelevich. Our reduction requires the analysis of several long codes with different biases, for which we prove structural properties of the so-called cross-intersecting collections of set families, variants of which have been studied in extremal set theory

via Structural Hardness for Hypergraph Vertex Cover

This work studies the inapproximability of Minimum Vertex Cover on
uniform hypergraphs from a specific structural perspective. Our
study is motivated by applications to two well known scheduling
problems: *Concurrent Open Shop* and the *Assembly Line*
problem. For both these problems, Bansal and Khot [BK10]
obtained tight \((2-\epsilon)\)-factor inapproximability, assuming the
Unique Games Conjecture (UGC). In this work, we prove optimal
\((2-\epsilon)\)-factor NP-hardness of approximation for both these
problems *unconditionally*, *i.e.*, without assuming UGC.

Our results for the scheduling problems follow from a structural hardness for Minimum Vertex Cover on hypergraphs -- an unconditional but weaker analog of a similar result of Bansal and Khot [BK10] which however, is based on UGC.

\(k\)-Partite Hypergraphs

We study the problem of computing the minimum vertex cover on \(k\)-uniform \(k\)-partite hypergraphs when the \(k\)-partition is given. On bipartite graphs \((k=2)\), the minimum vertex cover can be computed in polynomial time. For general \(k\), the problem was studied by Lovász, who gave a \(\frac{k}{2}\)-approximation based on the standard LP relaxation. Subsequent work by Aharoni, Holzman and Krivelevich showed a tight integrality gap of \(\frac{k}{2} - o(1)\) for the LP relaxation. While this problem was known to be NP-hard for \(k\geq 3\), the first non-trivial NP-hardness of approximation factor of \(\frac{k}{4}-\epsilon\) was shown in a recent work by Guruswami and Saket. They also showed that assuming Khot's Unique Games Conjecture yields a \(\frac{k}{2}-\epsilon\) inapproximability for this problem, implying the optimality of Lovász's result.

In this work, we
show that this problem is NP-hard to approximate within
\(\frac{k}{2}-1+\frac{1}{2k}-\epsilon\). This hardness factor
is off from the optimal
by an additive constant of at most 1 for \(k\geq
4\). Our reduction relies on the

The \(k\)-fold Cartesian product of a graph \(G\) is defined as a graph on tuples \((x_1,\ldots,x_k)\) where two tuples are connected if they form an edge in one of the positions and are equal in the rest. Starting with \(G\) as a single edge gives \(G^{\Box k}\) as a \(k\)-dimensional hypercube. We study the distributions of edges crossed by a cut in \(G^{\Box k}\) across the copies of \(G\) in different positions. This is a generalization of the notion of influences for cuts on the hypercube.

We show the analogues of the statements of Kahn, Kalai and Linial (KKL theorem) and that of Friedgut (Friedgut's Junta Theorem), for the setting of Cartesian products of arbitrary graphs. We also extends the work on studying isoperimetric constants for these graphs to the value of semidefinite relaxations for expansion. We connect the optimal values of the relaxations for computing expansion, given by various semidefinite hierarchies, for \(G\) and \(G^{\Box k}\).

### Other Publications

In the simultaneous
Max-Cut problem, we are given \(k\) weighted graphs on the
same set of \(n\) vertices, and the goal is to find a cut
of the vertex set so that the minimum, over the \(k\)
graphs, of the cut value is as large as possible. Previous
work [BKS15] gave a polynomial time algorithm which
achieved an approximation factor of \(\frac{1}{2}-o(1)\)
for this problem (and an approximation factor of
\(\frac{1}{2} + \epsilon_k\) in the unweighted case, where
\(\epsilon_k \to 0\) as \(k \to \infty\)).

In this
work, we give a polynomial time approximation algorithm
for simultaneous Max-Cut with an approximation factor of
0.8780 (for all constant \(k\)). The natural SDP
formulation for simultaneous Max-Cut was shown to have an
integrality gap of \(\frac{1}{2} + \epsilon_k\) in
[BKS15]. In achieving the better approximation guarantee,
we use a stronger Sum-of-Squares hierarchy SDP relaxation
and a rounding algorithm based on Raghavendra-Tan [RT12],
in addition to techniques from [BKS15].

We study whether a depth two neural network can learn another depth two network using gradient descent. Assuming a linear output node, we show that the question of whether gradient descent converges to the target function is equivalent to the following question in electrodynamics: Given k fixed protons in \(\mathbb{R}^d\), and k electrons, each moving due to the attractive force from the protons and repulsive force from the remaining electrons, whether at equilibrium all the electrons will be matched up with the protons, up to a permutation. Under the standard electrical force, this follows from the classic Earnshaw's theorem. In our setting, the force is determined by the activation function and the input distribution. Building on this equivalence, we prove the existence of an activation function such that gradient descent learns at least one of the hidden nodes in the target network. Iterating, we show that gradient descent can be used to learn the entire network one node at a time.

Given k collections
of 2SAT clauses on the same set of variables V, can we find
one assignment that satisfies a large fraction of clauses
from

Our main result is that for every CSP F, for \(k <
\tilde{O}(\log^{\frac{1}{4}} n)\), there is a polynomial time
constant factor

These problems are a natural meeting point for the theory of constraint satisfaction problems and multiobjective optimization. We also suggest a number of interesting directions for future research.

Fast Graph Algorithms and Inapproximability

Sept 2013

For several basic optimization problems, it is NP-hard to find an exact solution. As a result, understanding the best possible trade-off between the running time of an algorithm and its approximation guarantee, is a fundamental question in theoretical computer science, and the central goal of the theory of approximation.

There are two aspects to the theory of approximation : (1) efficient approximation algorithms that establish trade-offs between approximation guarantee and running time, and (2) inapproximability results that give evidence against them. In this thesis, we contribute to both facets of the theory of approximation.

In the first part of this thesis, we present the first near-linear-time algorithm for Balanced Separator - given a graph, partition its vertices into two roughly equal parts without cutting too many edges - that achieves the best approximation guarantee possible for algorithms in its class. This is a classic graph partitioning problem and has deep connections to several areas of both theory and practice, such as metric embeddings, Markov chains, clustering, etc.

As an important subroutine for our algorithm for Balanced Separator, we provide a near-linear-time algorithm to simulate the heat-kernel random walk on a graph, equivalent to computing \(\exp(-L)v\), where \(L\) is the Laplacian of the graph, and \(v\) is a vector. This algorithm combines techniques from approximation theory and numerical linear algebra to reduce the problem of approximating the matrix exponential to solving a small number of Laplacian systems. We also give a reduction in the reverse direction, from matrix inversion to matrix exponentiation, hence justifying the use of Laplacian system solvers.

In the second part of this thesis, we prove
inapproximability results for several basic optimization
problems. We address some classic scheduling problems,

We give an arithmetic version of the recent proof of the triangle removal lemma by Fox [Fox11], for the group \(\mathbb{F}_2^n\).

A triangle in \(\mathbb{F}_2^n\) is a tuple \((x,y,z)\) such that \(x+y+z = 0\). The triangle removal lemma for \(\mathbb{F}_2^n\) states that for every \(\epsilon > 0\) there is a \(\delta > 0\), such that a subset \(A\) of \(\mathbb{F}_2^n\) which is \(\epsilon\)-far from being triangle free, must contain at least \(\delta \cdot 2^{2n}\) triangles. We give a direct proof which gives an improved lower bound for \(\delta\), analogous to the one obtained by Fox for triangle removal in graphs.

The improved lower bound was already known to follow (for triangle-removal in all groups), using Fox's removal lemma for directed cycles and a reduction by Kral, Serra and Vena. The purpose of this note is to provide a direct Fourier-analytic proof for the group \(\mathbb{F}_2^n\).

Suppose we are given an oracle that claims to approximate
the permanent for most matrices X, where X is chosen from the
*Gaussian ensemble* the matrix entries are i.i.d.~univariate
complex Gaussians). Can we test that the oracle satisfies this claim?
This paper gives a polynomial-time algorithm for the task.

The oracle-testing problem is of interest because a recent paper of Aaronson and Arkhipov showed that if there is a polynomial-time algorithm for simulating boson-boson interactions in quantum mechanics, then an approximation oracle for the permanent (of the type described above) exists in BPP^NP. Since computing the permanent of even 0/1 matrices is #P-complete, this seems to demonstrate more computational power in quantum mechanics than Shor's factoring algorithm does. However, unlike factoring, which is in NP, it was unclear previously how to test the correctness of an approximation oracle for the permanent, and this is the contribution of the paper.

The technical difficulty overcome here is that univariate polynomial
self-correction, which underlies similar oracle-testing algorithms
for permanent over* finite fields* ---and whose discovery led to
a revolution in complexity theory---does not seem to generalize to
complex (or even, real) numbers. We believe that
this tester will motivate further progress on understanding the
permanent of Gaussian matrices.

Toward a Rigorous Approach

A community in a social network is usually understood to be a group of nodes more densely connected with each other than with the rest of the network. This is an important concept in most domains where networks arise: social, technological, biological, etc. For many years algorithms for ﬁnding communities implicitly assumed communities are nonoverlapping (leading to use of clustering-based approaches) but there is increasing interest in ﬁnding overlapping communities. A barrier to ﬁnding communities is that the solution concept is often deﬁned in terms of an NP-complete problem such as Clique or Hierarchical Clustering.

This paper seeks to initiate a rigorous
approach to the problem of ﬁnding overlapping communities, where
“rigorous” means that we clearly state the following: (a) the object
sought by our algorithm (b) the assumptions about the underlying
network (c) the (worst-case) running time.

Our assumptions about the
network lie between worst-case and average-case. An average-case
analysis would require a precise probabilistic model of the network,
on which there is currently no consensus. However, some plausible
assumptions about network parameters can be gleaned from a long body
of work in the sociology community spanning ﬁve decades focusing on
the study of individual communities and ego-centric networks (in graph
theoretic terms, this is the subgraph induced on a node’s
neighborhood). Thus our assumptions are somewhat “local” in
nature. Nevertheless they suﬃce to permit a rigorous analysis of
running time of algorithms that recover global
structure.

Our algorithms use random sampling similar to that in property testing and algorithms for dense graphs. We note however that our networks are not necessarily dense graphs, not even in local neighborhoods. Our algorithms explore a local-global relationship between ego-centric and socio-centric networks that we hope will provide a fruitful framework for future work both in computer science and sociology.

In a recent paper, Lee and Moharrami construct a family of metrics \((X_n,d_n)\) with \(|X_n|=N\), such that \((X_n,\sqrt{d_n})\) embeds into \(\ell_2\) with constant distortion, but embedding \((X_n,d_n)\) into a metric of negative type requires distortion \(\Omega((\log N)^{\frac{1}{4}})\). In this paper, we build on their analysis, and improve their result by showing a \(\Omega((\log N)^{\frac{1}{3}})\) lower bound for embedding \((X_n,d_n)\) into a metric of negative type. Moreover, we show that this analysis is essentially tight by constructing a map that has distortion \(O((\log N)^{\frac{1}{3}})\).

This result implies a lower bound of \(\Omega((\log N)^{\frac{1}{3}})\) for the integrality gap of the relaxed version of Goemans-Linial semidefinite program with weak triangle inequalities for Non-uniform Sparsest Cut.

In a well-known paper, Arora, Rao and Vazirani obtained an \(O(\sqrt{\log n})\) approximation to the Balanced Separator problem and Uniform Sparsest Cut. At the heart of their result is a geometric statement about sets of points that satisfy triangle inequalities, which also underlies subsequent work on approximation algorithms and geometric embeddings. In this note, we give an equivalent formulation of the Structure theorem in [ARV] in terms of the expansion of large sets in geometric graphs on sets of points satisfying triangle inequalities.