This is me!

Assistant Professor
University of Toronto

2016-2017 Research Scientist, Google
Department of Computer Science, Yale University
Postdoctoral Associate, Hosted by Daniel Spielman
Fall 2013
Simons Institute, UC Berkeley
Simons Research Fellow, Real Analysis in CS program
Department of Computer Science, Princeton University
Ph.D. advised by Sanjeev Arora
CSE Dept., IIT Bombay
B.Tech. thesis advised by Sundar Vishwanathan

Research Interests

Algorithms, and its connections to optimization, machine learning, and statistics.

Focus areas: Design of fast algorithms, particularly for graph problems; Approximation algorithms; Hardness of approximation; Analysis of Boolean functions

Selected Publications

Approximate Gaussian Elimination for Laplacians:
Fast, Sparse, and Simple
with Rasmus Kyng

We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian by a matrix with a sparse Cholesky factorization, the version of Gaussian elimination for symmetric matrices. This is the first nearly linear time solver for Laplacian systems that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. The crux of our analysis is a novel concentration bound for matrix martingales where the differences are sums of conditionally independent variables.

Sparsified Cholesky and Multigrid Solvers for Connection Laplacians
with Rasmus Kyng, Yin Tat Lee, Richard Peng, Daniel A. Spielman

We introduce the sparsified Cholesky and sparsified multigrid algorithms for solving systems of linear equations. These algorithms accelerate Gaussian elimination by sparsifying the nonzero matrix entries created by the elimination process.

We use these new algorithms to derive the first nearly linear time algorithms for solving systems of equations in connection Laplacians—a generalization of Laplacian matrices that arise in many problems in image and signal processing.

We also prove that every connection Laplacian has a linear sized approximate inverse. This is an LU factorization with a linear number of nonzero entries that is a strong approximation of the original matrix. Using such a factorization one can solve systems of equations in a connection Laplacian in linear time. Such a factorization was unknown even for ordinary graph Laplacians.

Algorithms for Lipschitz Learning on Graphs
with Rasmus Kyng, Anup Rao, Daniel Spielman

We develop fast algorithms for solving regression problems on graphs where one is given the value of a function at some vertices, and must find its smoothest possible extension to all vertices. The extension we compute is the absolutely minimal Lipschitz extension, and is the limit for large p of p-Laplacian regularization. We present an algorithm that computes a minimal Lipschitz extension in expected linear time, and an algorithm that computes an absolutely minimal Lipschitz extension in expected time \(\tilde{O}(mn)\). The latter algorithm has variants that seem to run much faster in practice. These extensions are particularly amenable to regularization: we can perform \(\ell_{0}\)-regularization on the given values in polynomial time and \(\ell_{1}\)-regularization on the initial function values and on graph edge weights in time \(\tilde{O}(m^{\frac{3}{2}})\).

Faster Algorithms via Approximation Theory
with Nisheeth K. Vishnoi

Faster Algorithms via Approximation Theory illustrates how classical and modern techniques from approximation theory play a crucial role in obtaining results that are relevant to the emerging theory of fast algorithms. The key lies in the fact that such results imply faster ways to approximate primitives such as \(A^s v, A^{-1}v, \exp(-A)v\), eigenvalues and eigenvectors, which are fundamental to many spectral algorithms.
The first half of the book is devoted to the ideas and results from approximation theory that are central, elegant, and may have wider applicability in theoretical computer science. These include not only techniques relating to polynomial approximations but also those relating to approximations by rational functions and beyond. The remaining half illustrates a variety of ways that these results can be used to design fast algorithms.
Faster Algorithms via Approximation Theory is self-contained and should be of interest to researchers and students in theoretical computer science, numerical linear algebra, and related areas.

Provable ICA with Unknown Gaussian Noise,
and Implications for Gaussian Mixtures and Autoencoders
with Sanjeev Arora, Rong Ge, Ankur Moitra

We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees. In particular, suppose we are given samples of the form y = Ax + \(\eta\) where A is an unknown n X n matrix and x is chosen uniformly at random from \(\{+1, -1\}^n\), \(\eta\) is an n-dimensional Gaussian random variable with unknown covariance \(\Sigma\): We give an algorithm that provable recovers A and \(\Sigma\) up to an additive \(\epsilon\) whose running time and sample complexity are polynomial in n and \(1 / \epsilon\). To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. We also give a general framework for finding all local optima of a function (given an oracle for approximately finding just one) and this is a crucial step in our algorithm, one that has been overlooked in previous attempts, and allows us to control the accumulation of error when we find the columns of A one by one via local search.

Approximating the Exponential, the Lanczos Method and
an \(\tilde{O}(m)\)-Time Spectral Algorithm for Balanced Separator
with Lorenzo Orecchia, Nisheeth K. Vishnoi

We give a novel spectral approximation algorithm for the balanced separator problem that, given a graph G, a constant balance b \(\in (0,\frac{1}{2}],\) and a parameter \(\gamma,\) either finds an \(\Omega(b)\)-balanced cut of conductance \(O(\sqrt{\gamma})\) in G, or outputs a certificate that all b-balanced cuts in G have conductance at least \(\gamma,\) and runs in time \(\tilde{O}(m).\) This settles the question of designing asymptotically optimal spectral algorithms for balanced separator. Our algorithm relies on a variant of the heat kernel random walk and requires, as a subroutine, an algorithm to compute exp(-L)v where L is the Laplacian of a graph related to G and v is a vector. Algorithms for computing the matrix-exponential-vector product efficiently comprise our next set of results. Our main result here is a new algorithm which computes a good approximation to exp(-A)v for a class of symmetric positive semidefinite (PSD) matrices A and a given vector v, in time roughly \(\tilde{O}(m_A),\) where \(m_A\) is the number of non-zero entries of A. This uses, in a non-trivial way, the breakthrough result of Spielman and Teng on inverting symmetric and diagonally-dominant matrices in \(\tilde{O}(m_A)\) time. Finally, we prove that \(e^{-x}\) can be uniformly approximated up to a small additive error, in a non-negative interval [a,b] with a polynomial of degree roughly \(\sqrt{b-a}.\) While this result is of independent interest in approximation theory, we show that, via the Lanczos method from numerical analysis, it yields a simple algorithm to compute exp(-A)v for symmetric PSD matrices that runs in time roughly \(O(t_A\cdot \sqrt{\|A\|}),\) where \(t_A\) is time required for the computation of the vector Aw for given vector w. As an application, we obtain a simple and practical algorithm, with output conductance \(O(\sqrt{\gamma}),\) for balanced separator that runs in time \(\tilde{O}(\frac{m}{\sqrt{\gamma}}).\) This latter algorithm matches the running time, but improves on the approximation guarantee of the Evolving-Sets-based algorithm by Andersen and Peres for balanced separator.

Inapproximability of Minimum Vertex Cover on
k-Uniform k-Partite Hypergraphs
with Venkatesan Guruswami, Rishi Saket

We study the problem of computing the minimum vertex cover on k-uniform k-partite hypergraphs when the k-partition is given. On bipartite graphs (k=2), the minimum vertex cover can be computed in polynomial time. For k\(\ge 3,\) this problem is known to be NP-hard. For general k, the problem was studied by Lovász, who gave a \(\frac{k}{2}\)-approximation based on the standard LP relaxation. Subsequent work by Aharoni, Holzman, and Krivelevich showed a tight integrality gap of \(\frac{k}{2} - o(1))\) for the LP relaxation. We further investigate the inapproximability of minimum vertex cover on k-uniform k-partite hypergraphs and present the following results (here \(\varepsilon > 0\) is an arbitrarily small constant): NP-hardness of obtaining an approximation factor of \((\frac{k}{4} - \varepsilon)\) for even k and \((\frac{k}{4} - \frac{1}{4k} - \varepsilon)\) for odd k, NP-hardness of obtaining a nearly optimal approximation factor of \((\frac{k}{2}-1+\frac{1}{2k}-\varepsilon)\), and an optimal unique games-hardness for approximation within factor \((\frac{k}{2} - \varepsilon)\), showing the optimality of Lovász's algorithm if one assumes the Unique Games conjecture. The first hardness result is based on a reduction from minimum vertex cover in r-uniform hypergraphs, for which NP-hardness of approximating within \(r - 1 -\varepsilon\) was shown by Dinur, Guruswami, Khot, and Regev. We include it for its simplicity, despite it being subsumed by the second hardness result. The unique games-hardness result is obtained by applying the results of Kumar, Manokaran, Tulsiani, and Vishnoi, with a slight modification, to the LP integrality gap due to Aharoni, Holzman, and Krivelevich. The modification ensures that the reduction preserves the desired structural properties of the hypergraph. The reduction for the nearly optimal NP-hardness result relies on the multilayered PCP of Dinur, Guruswami, Khot, and Regev and uses a gadget based on biased long codes adapted from the LP integrality gap of Aharoni, Holzman, and Krivelevich. Our reduction requires the analysis of several long codes with different biases, for which we prove structural properties of the so-called cross-intersecting collections of set families, variants of which have been studied in extremal set theory