Sushant Sachdeva

Design of Fast Algorithms

Eulerian Graph Sparsification by Effective Resistance Decomposition

SODA 2025

with A. Jambulapati, A. Sidford, K. Tian, Y. Zhao

Abstract

Conference

arXiv

We provide an algorithm that, given an $n$-vertex $m$-edge Eulerian graph with polynomially bounded weights, computes an $\breve{O}(n\log^{2} n \cdot \varepsilon^{-2})$-edge $\varepsilon$-approximate Eulerian sparsifier with high probability in $\breve{O}(m\log^3 n)$ time (where $\breve{O}(\cdot)$ hides $\polyloglog(n)$ factors). Due to a reduction from [Peng-Song, STOC '22], this yields an $\breve{O}(m\log^3 n + n\log^6 n)$-time algorithm for solving $n$-vertex $m$-edge Eulerian Laplacian systems with polynomially-bounded weights with high probability, improving upon the previous state-of-the-art runtime of $\Omega(m\log^8 n + n\log^{23} n)$. We also give a polynomial-time algorithm that computes $O(\min(n\log n \cdot \varepsilon^{-2} + n\log^{5/3} n \cdot \varepsilon^{-4/3}, n\log^{3/2} n \cdot \varepsilon^{-2}))$-edge sparsifiers, improving the best such sparsity bound of $O(n\log^2 n \cdot \varepsilon^{-2} + n\log^{8/3} n \cdot \varepsilon^{-4/3})$ [Sachdeva-Thudi-Zhao, ICALP '24]. In contrast to prior Eulerian graph sparsification algorithms which used either short cycle or expander decompositions, our algorithms use a simple efficient effective resistance decomposition scheme we introduce. Our algorithms apply a natural sampling scheme and electrical routing (to achieve degree balance) to such decompositions. Our analysis leverages new asymmetric variance bounds specialized to Eulerian Laplacians and tools from discrepancy theory.

Almost-Linear Time Algorithms for Decremental Graphs:
Min-Cost Flow and More via Duality

FOCS 2024

with Jvd. Brand, L. Chen, R. Kyng, Y. P. Liu, S. Meierhans, M. P. Gutenberg

Abstract

Conference

arXiv

We give the first almost-linear total time algorithm for deciding if a flow of cost at most F still exists in a directed graph, with edge costs and capacities, undergoing decremental updates, i.e., edge deletions, capacity decreases, and cost increases. This implies almost-linear time algorithms for approximating the minimum-cost flow value and s-t distance on such decremental graphs. Our framework additionally allows us to maintain decremental strongly connected components in almost-linear time deterministically. These algorithms also improve over the current best known runtimes for statically computing minimum-cost flow, in both the randomized and deterministic settings.

We obtain our algorithms by taking the dual perspective, which yields cut-based algorithms. More precisely, our algorithm computes the flow via a sequence of $m^{1+o(1)}$ dynamic min-ratio cut problems, the dual analog of the dynamic min-ratio cycle problem that underlies recent fast algorithms for minimum-cost flow. Our main technical contribution is a new data structure that returns an approximately optimal min-ratio cut in amortized $m^{o(1)}$ time by maintaining a tree-cut sparsifier. This is achieved by devising a new algorithm to maintain the dynamic expander hierarchy of [Goranci-Räcke-Saranurak-Tan, SODA 2021] that also works in capacitated graphs. All our algorithms are deterministc, though they can be sped up further using randomized techniques while still working against an adaptive adversary.

Better Sparsifiers for Directed Eulerian Graphs

ICALP 2024

with A. Thudi, Y. Zhao

Abstract

Conference

arXiv

Spectral sparsification for directed Eulerian graphs is a key component in the design of fast algorithms for solving directed Laplacian linear systems. Directed Laplacian linear system solvers are crucial algorithmic primitives to fast computation of fundamental problems on random walks, such as computing stationary distributions, hitting and commute times, and personalized PageRank vectors. While spectral sparsification is well understood for undirected graphs and it is known that for every graph $G,$ $(1+\varepsilon)$-sparsifiers with $O(n\varepsilon^{-2})$ edges exist [Batson-Spielman-Srivastava, STOC '09] (which is optimal), the best known constructions of Eulerian sparsifiers require $\Omega(n\varepsilon^{-2}\log^4 n)$ edges and are based on short-cycle decompositions [Chu et al., FOCS '18].

In this paper, we give improved constructions of Eulerian sparsifiers, specifically:

1. We show that for every directed Eulerian graph $\vec{G},$ there exists an Eulerian sparsifier with $O(n\varepsilon^{-2} \log^2 n \log^2\log n + n\varepsilon^{-4/3}\log^{8/3} n)$ edges. This result is based on combining short-cycle decompositions [Chu-Gao-Peng-Sachdeva-Sawlani-Wang, FOCS '18, SICOMP] and [Parter-Yogev, ICALP '19], with recent progress on the matrix Spencer conjecture [Bansal-Meka-Jiang, STOC '23].

2. We give an improved analysis of the constructions based on short-cycle decompositions, giving an $m^{1+\delta}$-time algorithm for any constant $\delta > 0$ for constructing Eulerian sparsifiers with $O(n\varepsilon^{-2}\log^3 n)$ edges.

Optimal Electrical Oblivious Routing on Expanders

ICALP 2024

with C. Florescu, R. Kyng, M. P. Gutenberg

Abstract

Conference

arXiv

"In this paper, we investigate the question of whether the electrical flow routing is a good oblivious routing scheme on an $m$-edge graph $G = (V, E)$ that is a $\Phi$-expander, i.e. where $|\partial S| \geq \Phi \cdot \text{vol}vol(S)$ for every $S \subseteq V, \text{vol}(S) \leq \text{vol}(V)/2$. Beyond its simplicity and structural importance, this question is well-motivated by the current state-of-the-art of fast algorithms for $\ell_{\infty}$ oblivious routings that reduce to the expander-case which is in turn solved by electrical flow routing.

Our main result proves that the electrical routing is an $O(\Phi^{-1} \log m)$-competitive oblivious routing in the $\ell_1$- and $\ell_\infty$-norms. We further observe that the oblivious routing is $O(\log^2 m)$-competitive in the $\ell_2$-norm and, in fact, $O(\log m)$-competitive if $\ell_2$-localization is $O(\log m)$ which is widely believed.

Using these three upper bounds, we can smoothly interpolate to obtain upper bounds for every $p \in [2, \infty]$ and $q$ given by $1/p + 1/q = 1$. Assuming $\ell_2$-localization in $O(\log m)$, we obtain that in $\ell_p$ and $\ell_q$, the electrical oblivious routing is $O(\Phi^{-(1-2/p)}\log m)$ competitive. Using the currently known result for $\ell_2$-localization, this ratio deteriorates by at most a sublogarithmic factor for every $p, q \neq 2$.

We complement our upper bounds with lower bounds that show that the electrical routing for any such $p$ and $q$ is $\Omega(\Phi^{-(1-2/p)}\log m)$-competitive. This renders our results in $\ell_1$ and $\ell_{\infty}$ unconditionally tight up to constants, and the result in any $\ell_p$- and $\ell_q$-norm to be tight in case of $\ell_2$-localization in $O(\log m)$.

Universal Matrix Sparsifiers and Fast Deterministic Algorithms for Linear Algebra

ITCS 2024

with R. Bhattacharjee, G. Dexter, C. Musco, A. Ray, D. P. Woodruff

Abstract

Conference

arXiv

Let $S\in\mathbb{R}^{n\times n}$ satisfy $\|\mathbb{1}-S\|_2 \le \varepsilon n$, where $\mathbb{1}$ is the all ones matrix and $\|\cdot\|_2$ is the spectral norm. It is well-known that there exists such an $S$ with just $O(n/\varepsilon^2)$ non-zero entries: we can let $S$ be the scaled adjacency matrix of a Ramanujan expander graph. We show that such an $S$ yields a universal sparsifier for any positive semidefinite (PSD) matrix. In particular, for any PSD $A\in\mathbb{R}^{n\times n}$ with entries bounded in magnitude by 1, $\|A-A\circ S\|_2 \le \varepsilon n$, where $\circ$ denotes the entrywise (Hadamard) product. Our techniques also give universal sparsifiers for non-PSD matrices. In this case, letting $S$ be the scaled adjacency matrix of a Ramanujan graph with $\tilde{O}(n/\varepsilon^4)$ edges, we have $\|A-A\circ S\|_2 \le \varepsilon \cdot \max(n,\|A\|_1)$, where $\|A\|_1$ is the nuclear norm. We show that the above bounds for both PSD and non-PSD matrices are tight up to log factors.

Since $A\circ S$ can be constructed deterministically, our result for PSD matrices derandomizes and improves upon known results for randomized matrix sparsification, which require randomly sampling $O(n\log n \varepsilon^{-2})$ entries. We also leverage our results to give the first deterministic algorithms for several problems related to singular value approximation that run in faster than matrix multiplication time.

Finally, if $A\in\{−1,0,1\}^{n\times n}$ is PSD, we show that $\tilde{A}$ with $\|A-\tilde{A}\|_2 \le \varepsilon n$ can be obtained by deterministically reading $\tilde{O}(n/\varepsilon)$ entries of $A$. This improves the $1/\varepsilon$ dependence on our result for general PSD matrices and is near-optimal.

Electrical Flows for Polylogarithmic Competitive Oblivious Routing

ITCS 2024

with G. Goranci, M. Henzinger, H. Räcke, A. R. Sricharan

Abstract

Conference

arXiv

Oblivious routing is a well-studied paradigm that uses static precomputed routing tables for selecting routing paths within a network. Existing oblivious routing schemes with polylogarithmic competitive ratio for general networks are tree-based, in the sense that routing is performed according to a convex combination of trees. However, this restriction to trees leads to a construction that has time quadratic in the size of the network and does not parallelize well.

In this paper we study oblivious routing schemes based on electrical routing. In particular, we show that general networks with n vertices and m edges admit a routing scheme that has competitive ratio $O( \log^2 n)$ and consists of a convex combination of only ${O}(\sqrt{m})$ electrical routings. This immediately leads to an improved construction algorithm with time $\tilde{O}(m^{3/2})$ that can also be implemented in parallel with $\tilde{O}(\sqrt{m})$ depth.

Fast Algorithms for Separable Linear Programs

SODA 2024

with S. Dong, G. Goranci, L. Li, G. Ye

Abstract

Conference

arXiv

In numerical linear algebra, considerable effort has been devoted to obtaining faster algorithms for linear systems whose underlying matrices exhibit structural properties. A prominent success story is the method of generalized nested dissection [Lipton-Rose-Tarjan’79] for separable matrices. On the other hand, the majority of recent developments in the design of efficient linear program (LP) solves do not leverage the ideas underlying these faster linear system solvers nor consider the separable structure of the constraint matrix.

We give a faster algorithm for separable linear programs. Specifically, we consider LPs of the form $\min_{Ax=b,\ell\le x \le u} c^{\top}x$, where the graphical support of the constraint matrix $A \in R^{n \times m}$ is $O(n^\alpha)$-separable. These include flow problems on planar graphs and low treewidth matrices among others. We present an $\tilde{O}((m + m^{1/2+2\alpha}) \log (1/\varepsilon)$ time algorithm for these LPs, where $\varepsilon$ is the relative accuracy of the solution.

Our new solver has two important implications: for the $k$-multicommodity flow problem on planar graphs, we obtain an algorithm running in eO(k5/2m3/2) time in the high accuracy regime; and when the support of A is $O(n^\alpha)$-separable with $\alpha \le 1/4$ 1/4, our algorithm runs in $\tilde{O}(m)$ time, which is nearly optimal. The latter significantly improves upon the natural approach of combining interior point methods and nested dissection, whose time complexity is lower bounded by $\Omega(\sqrt{m}(m + m^{\alpha \omega} )) = \Omega(m^{3/2})$, where $\omega$ is the matrix multiplication constant. Lastly, in the setting of low-treewidth LPs, we recover the results of [ DLY21a] and [ GS22] with significantly simpler data structure machinery.

Incremental Approximate Maximum Flow on Undirected Graphs
in Subpolynomial Update Time

SODA 2024

with Jvd. Brand, L. Chen, R. Kyng, Y. P. Liu, R. Peng, M. P. Gutenberg, A. Sidford

Abstract

Conference

arXiv

We provide an algorithm which, with high probability, maintains a $(1 — \varepsilon)$-approximate maximum flow on an undirected graph undergoing m-edge additions in amortized $m^{o(1)}\varepsilon^{-3}$ time per update. To obtain this result, we provide a more general algorithm that solves what we call the incremental, thresholded, p-norm flow problem that asks to determine the first edge-insertion in an undirected graph that causes the minimum $\ell_p$-norm flow to decrease below a given threshold in value. Since we solve this thresholded problem, our data structure succeeds against an adaptive adversary that can only see the data structure's output. Furthermore, since our algorithm holds for p = 2, we obtain improved algorithms for dynamically maintaining the effective resistance between a pair of vertices in an undirected graph undergoing edge insertions.

Our algorithm builds upon previous dynamic algorithms for approximately solving the minimum-ratio cycle problem that underlie previous advances on the maximum flow problem [Chen-Kyng-Liu-Peng-Probst Gutenberg-Sachdeva, FOCS ‘22] as well as recent dynamic maximum flow algorithms [v.d.Brand-Liu-Sidford, STOC ‘23]. Instead of using interior point methods, which were a key component of these recent advances, our algorithm uses an optimization method based on $\ell_p$-norm iterative refinement and the multiplicative weight update method. This ensures a monotonicity property in the minimum-ratio cycle subproblems that allows us to apply known data structures and bypass issues arising from adaptive queries.

A Deterministic Almost-Linear Time Algorithm for Minimum-Cost Flow

FOCS 2023

with Jvd. Brand, L. Chen, R. Kyng, Y. P. Liu, R. Peng, M. P. Gutenberg, A. Sidford

Abstract

Conference

arXiv

We give a deterministic $m^{1+o(1)}$ time algorithm that computes exact maximum flows and minimum-cost flows on directed graphs with m edges and polynomially bounded integral demands, costs, and capacities. As a consequence, we obtain the first running time improvement for deterministic algorithms that compute maximum-flow in graphs with polynomial bounded capacities since the work of Goldberg-Rao [J.ACM ’98].Our algorithm builds on the framework of Chen-Kyng-Liu-Peng-Gutenberg-Sachdeva [FOCS ’22] that computes an optimal flow by computing a sequence of $m^{1+o(1)}$-approximate undirected minimum-ratio cycles. We develop a deterministic dynamic graph data-structure to compute such a sequence of minimum-ratio cycles in an amortized $m^{o(1)}$ time per edge update. Our key technical contributions are deterministic analogues of the vertex sparsification and edge sparsification components of the data-structure from Chen et al. For the vertex sparsification component, we give a method to avoid the randomness in Chen et al. which involved sampling random trees to recurse on. For the edge sparsification component, we design a deterministic algorithm that maintains an embedding of a dynamic graph into a sparse spanner. We also show how our dynamic spanner can be applied to give a deterministic data structure that maintains a fully dynamic low-stretch spanning tree on graphs with polynomially bounded edge lengths, with subpolynomial average stretch and subpolynomial amortized time per edge update.

A Simple and Efficient Parallel Laplacian Solver

SPAA 2023

with Y. Zhao

Abstract

Conference

arXiv

A symmetric matrix is called a Laplacian if it has nonpositive off-diagonal entries and zero row sums. Since the seminal work of Spielman and Teng (2004) on solving Laplacian linear systems in nearly linear time, several algorithms have been designed for the task. Yet, the work of Kyng and Sachdeva (2016) remains the simplest and most practical sequential solver. They presented a solver purely based on random sampling and without graph-theoretic constructions such as low-stretch trees and sparsifiers.

In this work, we extend the result of Kyng and Sachdeva to a simple parallel Laplacian solver with $O(m \log^3 n\log\log n)$ or $O((m+n\log^5 n)\log n\log\log n)$ work and $O(\log^2 n\log\log n)$ depth using the ideas of block Cholesky factorization from Kyng et al. (2016). Compared to the best known parallel Laplacian solvers that achieve polylogarithmic depth due to Lee et al. (2015), our solver achieves both better depth and, for dense graphs, better work.

A New Approach to Estimating Effective Resistances
and Counting Spanning Trees in Expander Graphs

SODA 2023

with Lawrence Li

Abstract

Conference

arXiv

We demonstrate that for expander graphs, for all $\varepsilon > 0$, there exists a data structure of size $\widetilde{O}(n\varepsilon^{-1})$ which can be used to return $(1+\varepsilon)$-approximations to effective resistances in $\widetilde{O}(1)$ time per query. Short of storing all effective resistances, previous best approaches could achieve $\widetilde{O}(n\varepsilon^{-2})$ size and $\widetilde{O}(\varepsilon^{-2})$ time per query by storing Johnson–Lindenstrauss vectors for each vertex, or $\widetilde{O}(n\varepsilon^{-1})$ size and $\widetilde{O}(n\varepsilon^{-1})$ time per query by storing a spectral sketch.
Our construction is based on two key ideas: 1) $\varepsilon^{-1}$-sparse, $\varepsilon$-additive approximations to ${\sigma_u}$ for all $u$, vectors similar to $DL^+1_u$ can be used to recover $(1 + \varepsilon)$-approximations to the effective resistances, 2) In expander graphs, only $\widetilde{O}(\varepsilon^{-1})$ coordinates of ${\sigma_u}$ are larger than $\varepsilon$. We give an efficient construction for such a data structure in $\widetilde{O}(m + n\varepsilon^{-2})$ time via random walks. This results in an algorithm on expander graphs for computing $(1+\varepsilon)$-approximate effective resistances for $s$ vertex pairs that runs in $\widetilde{O}(m + n\varepsilon^{-2} + s)$ time, improving over the previously best known running time of $m^{1 + o(1)} + (n + s)n^{o(1)}\varepsilon^{-1.5}$ for $s = \omega(n\varepsilon^{-0.5})$.
We employ the above algorithm to compute a $(1+\delta)$-approximation to the number of spanning trees in an expander graph, or equivalently, approximating the (pseudo)determinant of its Laplacian in $\widetilde{O}(m + n^{1.5}\delta^{-1})$ time. This improves on the previously best known result of $m^{1+o(1)} + n^{1.875+o(1)}\delta^{-1.75}$ time, and matches the best known size of determinant sparsifiers.

A Simple Framework for Finding Balanced Sparse Cuts via APSP

SOSA 2023

with Li Chen, Rasmus Kyng, Maximilian Probst Gutenberg

Abstract

Conference

arXiv

We present a very simple and intuitive algorithm to find balanced sparse cuts in a graph via shortest-paths. Our algorithm combines a new multiplicative-weights framework for solving unit-weight multi-commodity flows with standard ball growing arguments. Using Dijkstra's algorithm for computing the shortest paths afresh every time gives a very simple algorithm that runs in time $\widetilde{O}(m^2/\phi)$ and finds an $\widetilde{O}(\phi)$-sparse balanced cut, when the given graph has a $\phi$-sparse balanced cut. Combining our algorithm with known deterministic data-structures for answering approximate All Pairs Shortest Paths (APSP) queries under increasing edge weights (decremental setting), we obtain a simple deterministic algorithm that finds $m^{o(1)}\phi$-sparse balanced cuts in $m^{1+o(1)}/\phi$ time. Our deterministic almost-linear time algorithm matches the state-of-the-art in randomized and deterministic settings up to subpolynomial factors, while being significantly simpler to understand and analyze, especially compared to the only almost-linear time deterministic algorithm, a recent breakthrough by Chuzhoy-Gao-Li-Nanongkai- Peng-Saranurak (FOCS 2020).

Maximum Flow and Minimum-Cost Flow in Almost-Linear Time
Best Paper Award, Frontiers of Science Award

FOCS 2022
CACM 2023

with Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg

Abstract

Conference

arXiv

Short Video

Long Video

Quanta

We give an algorithm that computes exact maximum flows and minimum-cost flows on directed graphs with m edges and polynomially bounded integral demands, costs, and capacities in $m^{1+o(1)}$ time. Our algorithm builds the flow through a sequence of $m^{1+o(1)}$ approximate undirected minimum-ratio cycles, each of which is computed and processed in amortized $m^{o(1)}$ time using a new dynamic graph data structure.
Our framework extends to algorithms running in $m^{1+o(1)}$ time for computing flows that minimize general edge-separable convex functions to high accuracy. This gives almost-linear time algorithms for several problems including entropy-regularized optimal transport, matrix scaling, p-norm flows, and p-norm isotonic regression on arbitrary directed acyclic graphs.

Nested Dissection Meets IPMs: Planar Min-Cost Flow in Nearly-Linear Time

SODA 2022

with Sally Dong, Yu Gao, Gramoz Goranci, Yin Tat Lee, Richard Peng, Guanghao Ye

Abstract

Conference

arXiv

We present a nearly-linear time algorithm for finding a minimum-cost flow in planar graphs with polynomially bounded integer costs and capacities. The previous fastest algorithm for this problem was based on interior point methods (IPMs) and worked for general sparse graphs in $O(n^{1.5} \textrm{poly}(\log n))$ time [Daitch-Spielman, STOC'08].
Intuitively, $\Omega(n^{1.5})$ is a natural runtime barrier for IPM based methods, since they require iterations, each routing a possibly-dense electrical flow. To break this barrier, we develop a new implicit representation for flows based on generalized nested-dissection [Lipton-Rose-Tarjan, JSTOR'79] and approximate Schur complements [Kyng-Sachdeva, FOCS'16]. This implicit representation permits us to design a data structure to route an electrical flow with sparse demands in roughly update time, resulting in a total running time of $O(n \cdot \textrm{poly}(\log n))$.
Our results immediately extend to all families of separable graphs.

Almost-linear Time Weighted $\ell_p$-norm Solvers in Slightly Dense Graphs

ICALP 2021

with Deeksha Adil, Brian Bullins, Rasmus Kyng

Abstract

Conference

arXiv

Video

We give almost-linear-time algorithms for constructing sparsifiers with $n\cdot \text{poly}(\log n)$ edges that approximately preserve weighted $\ell^{2}_2 + \ell^{p}_p$ flow or voltage objectives on graphs. For flow objectives, this is the first sparsifier construction for such mixed objectives beyond unit $\ell_p$ weights, and is based on expander decompositions. For voltage objectives, we give the first sparsifier construction for these objectives, which we build using graph spanners and leverage score sampling. Together with the iterative refinement framework of [Adil et al, SODA 2019], and a new multiplicative-weights based constant-approximation algorithm for mixed-objective flows or voltages, we show how to find $(1+2^{-\text{poly}(\log n)})$ approximations for weighted $\ell_p$-norm minimizing flows or voltages in $p(m^{1+o(1)} + n^{4/3 + o(1)})$ time for $p=\omega(1)$, which is almost-linear for graphs that are slightly dense ($m \ge n^{4/3 + o(1)}$).

Graph Sparsification, Spectral Sketches, and Faster Resistance Computation,
via Short Cycle Decompositions

SICOMP 2023
FOCS 2018

with Timothy Chu, Yu Gao, Richard Peng, Saurabh Sawlani, Junxing Wang

Abstract

Journal

Conference

arXiv

Video

We develop a framework for graph sparsification and sketching, based on a new tool, short cycle decomposition -- a decomposition of an unweighted graph into an edge-disjoint collection of short cycles, plus few extra edges. A simple observation gives that every graph $G$ on $n$ vertices with m edges can be decomposed in $O(mn)$ time into cycles of length at most $2 \log n$, and at most $2n$ extra edges. We give an $m^{1+o(1)}$ time algorithm for constructing a short cycle decomposition, with cycles of length $n^{o(1)}$, and $n^{1+o(1)}$ extra edges. These decompositions enable us to make progress on several open questions:

* We give an algorithm to find $(1\pm\epsilon)$-approximations to effective resistances of all edges in time $m^{1+o(1)}\epsilon^{-1.5}$, improving over the previous best of $\widetilde{O}(\min\{m\epsilon^{-2},n^2 \epsilon^{-1}\})$. This gives an algorithm to approximate the determinant of a Laplacian up to $(1\pm\epsilon)$ in $m^{1 + o(1)} + n^{15/8+o(1)}\epsilon^{-7/4}$ time.

* We show existence and efficient algorithms for constructing graphical spectral sketches -- a distribution over sparse graphs H such that for a fixed vector $x$, we have w.h.p. $x'L_Hx=(1\pm\epsilon)x'L_Gx$ and $x'L_H^+x=(1\pm\epsilon)x'L_G^+x$. This implies the existence of resistance-sparsifiers with about $n\epsilon^{-1}$ edges that preserve the effective resistances between every pair of vertices up to $(1\pm\epsilon).$

* By combining short cycle decompositions with known tools in graph sparsification, we show the existence of nearly-linear sized degree-preserving spectral sparsifiers, as well as significantly sparser approximations of directed graphs. The latter is critical to recent breakthroughs on faster algorithms for solving linear systems in directed Laplacians.

Improved algorithms for constructing short cycle decompositions will lead to improvements for each of the above results.

Faster p-norm minimizing flows, via smoothed q-norm problems

SODA 2020

with Deeksha Adil

Abstract

Conference

arXiv

We present faster high-accuracy algorithms for computing $\ell_p$-norm minimizing flows. On a graph with $m$ edges, our algorithm can compute a $(1+1/\text{poly}(m))$-approximate unweighted $\ell_p$-norm minimizing flow with $pm^{1 + \frac{1}{p-1} + o(1)}$ operations, for any $p \ge 2$, giving the best bound for $p \gtrsim 5.24$. Combined with the algorithm from (Adil et al., SODA '19), we can now compute such flows for any $2 \le p \le \poly(\log m)$ in time at most $O(m^{1.24})$. In comparison, the previous best running time for computing unweighted $\ell_p$-norm minimizing flows was $\Omega(m^{1.33})$ for large constant $p$. For $p \sim \delta^{-1}\log m,$ our algorithm computes a $(1+\delta)$-approximate maximum flow on undirected graphs using $m^{1+o(1)}\delta^{-1}$ operations, matching the current best bound, albeit only for unit-capacity graphs. We also give an algorithm for solving general $\ell_{p}$-norm regression problems for large $p$. Our algorithm makes $p m^{\frac{1}{3} + o(1)} \log^{2} (1/\varepsilon)$ calls to a linear solver. This gives the first high-accuracy algorithm for computing weighted $\ell_{p}$-norm minimizing flows that runs in time $o(m^{1.5})$ for $p = n^{\Omega(1)}$. Our main technical tool is smoothed $\ell_p$-norm problems introduced by Adil et al. (SODA '19), and our key technical contribution is to show that these problems are interreducible for different values of $p$. We show that $\ell_p$-norm problems can be solved to high-accuracy by making $p m^{\max\{\frac{1}{p-1}, \frac{1}{q}\}} \log^{2} \frac{m}{\varepsilon}$ calls to an algorithm for solving smoothed $\ell_q$-norm problems.

Flows in Almost Linear Time via Adaptive Preconditioning

STOC 2019

with Richard Peng, Rasmus Kyng, Di Wang

Abstract

Conference

arXiv

We present algorithms for solving a large class of flow problems on unit weighted graphs to high accuracy in almost-linear time. Our framework is inspired by the routing based solver for Laplacian linear systems by Spielman and Teng (STOC '04, SIMAX '14), and has a similar overall recursive structure. Our adaptation of it to flow problems requires several new components including adaptive non-linear preconditioning, tree routings that give approximations in two norms simultaneously, and gradient-preserving sparsification for wide class of objective functions that interpolate between $\ell_2$ and $\ell_{\infty}$ norms.

Using this framework, we obtain algorithms for computing $1 + 1/\textrm{poly}(n)$-approximate p-norm minimizing flows on unit weighted graphs in about $2^{p^{3/2}} m^{1+1/{\sqrt{p}}}$ time when $p > 2$. For $p = \sqrt{\log n} $ this is an almost linear time algorithm. Using this algorithm as an oracle inside multiplicative weights update gives: (1) a new approach for computing approximate maximum flows on undirected graphs that does not use oblivious routings; (2) the first almost-linear time high-accuracy algorithms for $p$-norm semi-supervised learning on graphs when $p < 2$; and (3) the first almost-linear time algorithm for approximate total variation minimization.

Iterative Refinement for $\ell_p$-norm Regression

SODA 2019

with Deeksha Adil, Rasmus Kyng, Richard Peng

Abstract

Conference

arXiv

We give improved algorithms for the $\ell_p$-regression problem, $\min \|x\|_p$ such that $Ax = b$, for all $p \in (1,2) \cup (2,\infty)$. Our algorithms obtain a high accuracy solution in $\widetilde{O}_p(m^{\frac{|p-2|}{2p+|p-2|}}) \le \widetilde{O}_p(m^{{1}/{3}})$ iterations, where each iteration requires solving an $m \times m$ linear system, with $m$ being the dimension of the ambient space.

Incorporating a procedure for maintaining an approximate inverse of the linear systems that we need to solve at each iteration, we give algorithms for solving $\ell_p$-regression to 1/poly(n) accuracy that runs in time $\widetilde{O}_p(m^{\max\{\omega, 7/3\}}) $, where $\omega$ is the matrix multiplication constant. For the current best value of $\omega > 2.37$, this means that we can solve $\ell_p$ regression as fast as $\ell_p$ regression, for all constant $p$ bounded away from 1.

Our algorithms can be combined with nearly-linear time solvers for linear systems in graph Laplacians to give minimum $\ell_p$-norm flow / voltage solutions to 1/poly(n) accuracy on an undirected graph with $m$ edges in $\widetilde{O}_p(m^{1+\frac{|p-2|}{2p+|p-2|}}) \le \widetilde{O}_p(m^{{4}/{3}})$ time.

For sparse graphs and for matrices with similar dimensions, our iteration counts and running times improve upon the p-norm regression algorithm by [Bubeck-Cohen-Lee-Li STOC‘18], as well as general purpose convex optimization algorithms. At the core of our algorithms is an iterative refinement scheme for $\ell_p$-norms, using the quadratically-smoothed $\ell_p$-norms introduced in the work of Bubeck et al. Formally, given an initial solution, we construct a problem that seeks to minimize a quadratically-smoothed $\ell_p$ norm over a subspace, such that a crude solution to this problem allows us to improve the initial solution by a constant factor, leading to algorithms with fast convergence.

Short Cycles via Low-Diameter Decompositions

SODA 2019

with Yang P. Liu, Zejun Yu

Abstract

Conference

arXiv

We present improved algorithms for short cycle decomposition of a graph. Short cycle decompositions were introduced in the recent work of Chu et al, and were used to make progress on several questions in graph sparsification.

For all constants $\delta \in (0,1]$, we give an $O(mn^\delta)$ time algorithm that, given a graph G, partitions its edges into cycles of length $O(\log n)^{\frac{1}{\delta}}$, with $O(n)$ extra edges not in any cycle. This gives the first subquadratic, in fact almost linear time, algorithm achieving polylogarithmic cycle lengths. We also give an $m\exp(O(\sqrt{\log n}))$ time algorithm that partitions the edges of a graph into cycles of length $exp(O(\sqrt{\log n}\log \log n))$, with $O(n)$ extra edges not in any cycle. This improves on the short cycle decomposition algorithms given in Chu et al in terms of all parameters, and is significantly simpler.

As a result, we obtain faster algorithms and improved guarantees for several problems in graph sparsification -- construction of resistance sparsifiers, graphical spectral sketches, degree preserving sparsifiers, and approximating the effective resistances of all edges.

Sampling Random Spanning Trees Faster than Matrix Multiplication

STOC 2017

with David Durfee, Rasmus Kyng, John Peebles, Anup B. Rao

Abstract

Conference

arXiv

We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $\widetilde{O}(n^{\frac{4}{3}} m^{\frac{1}{2}} + n^2)$ time (The $\widetilde{O}$ notation hides polylog(n) factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, $O(n^\omega)$. For the special case of unweighted graphs, this improves upon the best previously known running time of $\widetilde{O}(\min\{n^\omega, m\sqrt{n}, m^{\frac{4}{3}}\})$ for $m \gg n^{\frac{5}{3}}$ (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15).

The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute ϵ-approximate effective resistances for a set S of vertex pairs via approximate Schur complements in $\widetilde{O}(m + (n+|S|)\epsilon^{-2})$ time, without using the Johnson-Lindenstrauss lemma which requires $\widetilde{O} \min\{(m+|S|)\epsilon^{-2}, m +n \epsilon^{-4} + |S| \epsilon^{-2}\} $ time. We combine this approximation procedure with an error correction procedure for handing edges where our estimate isn't sufficiently accurate.

A Framework for Analyzing Resparsification Algorithms

SODA 2017

with Rasmus Kyng, Jakub Pachocki, Richard Peng

Abstract

Conference

arXiv

A spectral sparsifier of a graph G is a sparser graph H that approximately preserves the quadratic form of G, i.e., for all vectors x, $x^{\top}L_G x \approx x^{\top}L_H x$ , where $L_G, L_H$ denote the respective graph Laplacians. Spectral sparsifiers generalize cut sparsifiers, and have found several applications in designing graph algorithms. In recent years, there has been interest in computing spectral sparsifiers in the semi-streaming and dynamic settings. Natural algorithms in these settings involve repeated sparsification of a dynamic graph. We present a framework for analyzing algorithms for graph sparsification that perform repeated sparsifications. The framework yields analyses that avoid a worst-case error accumulation across various resparsification steps, and only incur the error corresponding to a single sparsification step leading to better results. As an application, we show how to maintain a spectral sparsifier of a graph, with $O(n \log n)$ edges in a semi-streaming setting: We present a simple algorithm that, for a graph G on n vertices and m edges, computes a spectral sparsifier of G with $O(n \log n)$ edges in a single pass over G, using only $O(n \log n)$ space, and $O(m \log^2 n)$ total time. This improves on previous best constructions in the semi-streaming set- ting for both spectral and cut sparsifiers by a factor of log n in both space and runtime. The algorithm also extends to semi-streaming row sampling for general PSD matrices. As another example, we use this framework to combining an algorithm due to Koutis with improved spanner constructions to give a parallel algorithm for construction $O(n \log^2 n \log \log n)$ sized spectral sparsifiers in $O(m \log^2 n \log \log n)$ time. This is the best combinatorial graph sparsification algorithm to date, and the size of the sparsifiers produced is only a factor $\log n \log \log n$ more than numerical ones.

The mixing time of the Dikin walk in a polytope—A simple proof

OR Letters 2016

with Nisheeth K. Vishnoi

Abstract

Journal

arXiv

Sampling points from the uniform distribution on a polytope is a well-studied problem, and is an important ingredient in several computational tasks involving polytopes, such as volume estimation. This is achieved by setting up a random walk inside the polytope, with its stationary distribution being uniform in the interior of the polytope. Kannan-Narayanan and Narayanan proposed the Dikin walk based on interior point methods, where the next point is sampled, roughly, from the Dikin ellipsoid at the current point. In this paper, we give a simple proof of the mixing time of the Dikin walk, using well-known properties of Gaussians, and concentration of Gaussian polynomials.

Approximate Gaussian Elimination for Laplacians:
Fast, Sparse, and Simple

FOCS 2016

with Rasmus Kyng

Abstract

Conference

arXiv

Video

We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian by a matrix with a sparse Cholesky factorization, the version of Gaussian elimination for symmetric matrices. This is the first nearly linear time solver for Laplacian systems that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. The crux of our analysis is a novel concentration bound for matrix martingales where the differences are sums of conditionally independent variables.

Sparsified Cholesky and Multigrid Solvers for Connection Laplacians

STOC 2016

with Rasmus Kyng, Yin Tat Lee, Richard Peng, Daniel A. Spielman

Abstract

Conference

arXiv

We introduce the sparsified Cholesky and sparsified multigrid algorithms for solving systems of linear equations. These algorithms accelerate Gaussian elimination by sparsifying the nonzero matrix entries created by the elimination process.

We use these new algorithms to derive the first nearly linear time algorithms for solving systems of equations in connection Laplacians—a generalization of Laplacian matrices that arise in many problems in image and signal processing.

We also prove that every connection Laplacian has a linear sized approximate inverse. This is an LU factorization with a linear number of nonzero entries that is a strong approximation of the original matrix. Using such a factorization one can solve systems of equations in a connection Laplacian in linear time. Such a factorization was unknown even for ordinary graph Laplacians.

Faster Algorithms via Approximation Theory

FnTTCS Monograph

with Nisheeth K. Vishnoi

Abstract

Now Publishers

PDF

Faster Algorithms via Approximation Theory illustrates how classical and modern techniques from approximation theory play a crucial role in obtaining results that are relevant to the emerging theory of fast algorithms. The key lies in the fact that such results imply faster ways to approximate primitives such as $A^s v, A^{-1}v, \exp(-A)v$, eigenvalues and eigenvectors, which are fundamental to many spectral algorithms.

The first half of the book is devoted to the ideas and results from approximation theory that are central, elegant, and may have wider applicability in theoretical computer science. These include not only techniques relating to polynomial approximations but also those relating to approximations by rational functions and beyond. The remaining half illustrates a variety of ways that these results can be used to design fast algorithms.

Faster Algorithms via Approximation Theory is self-contained and should be of interest to researchers and students in theoretical computer science, numerical linear algebra, and related areas.

Approximating the Exponential, the Lanczos Method and
an $\widetilde{O}(m)$-Time Spectral Algorithm for Balanced Separator

STOC 2012

with Lorenzo Orecchia, Nisheeth K. Vishnoi

Abstract

Conference

arXiv

Video

We give a novel spectral approximation algorithm for the balanced separator problem that, given a graph G, a constant balance b $\in (0,\frac{1}{2}],$ and a parameter $\gamma,$ either finds an $\Omega(b)$-balanced cut of conductance $O(\sqrt{\gamma})$ in G, or outputs a certificate that all b-balanced cuts in G have conductance at least $\gamma,$ and runs in time $\widetilde{O}(m).$ This settles the question of designing asymptotically optimal spectral algorithms for balanced separator. Our algorithm relies on a variant of the heat kernel random walk and requires, as a subroutine, an algorithm to compute exp(-L)v where L is the Laplacian of a graph related to G and v is a vector. Algorithms for computing the matrix-exponential-vector product efficiently comprise our next set of results. Our main result here is a new algorithm which computes a good approximation to exp(-A)v for a class of symmetric positive semidefinite (PSD) matrices A and a given vector v, in time roughly $\widetilde{O}(m_A),$ where $m_A$ is the number of non-zero entries of A. This uses, in a non-trivial way, the breakthrough result of Spielman and Teng on inverting symmetric and diagonally-dominant matrices in $\widetilde{O}(m_A)$ time. Finally, we prove that $e^{-x}$ can be uniformly approximated up to a small additive error, in a non-negative interval [a,b] with a polynomial of degree roughly $\sqrt{b-a}.$ While this result is of independent interest in approximation theory, we show that, via the Lanczos method from numerical analysis, it yields a simple algorithm to compute exp(-A)v for symmetric PSD matrices that runs in time roughly $O(t_A\cdot \sqrt{\|A\|}),$ where $t_A$ is time required for the computation of the vector Aw for given vector w. As an application, we obtain a simple and practical algorithm, with output conductance $O(\sqrt{\gamma}),$ for balanced separator that runs in time $\widetilde{O}(\frac{m}{\sqrt{\gamma}}).$ This latter algorithm matches the running time, but improves on the approximation guarantee of the Evolving-Sets-based algorithm by Andersen and Peres for balanced separator.

Matrix Inversion is as easy as Exponentiation

Manuscript

with Nisheeth K. Vishnoi

Abstract

arXiv

We prove that the inverse of a positive-definite matrix can be approximated by a weighted-sum of a small number of matrix exponentials. Combining this with a previous result [OSV12], we establish an equivalence between matrix inversion and exponentiation up to polylogarithmic factors. In particular, this connection justifies the use of Laplacian solvers for designing fast semi-definite programming based algorithms for certain graph problems. The proof relies on the Euler-Maclaurin formula and certain bounds derived from the Riemann zeta function.

Algorithmic Questions in Machine Learning and Statistics

Optimal Methods for Higher-Order Smooth Monotone Variational Inequalities

Preprint

with Deeksha Adil, Brian Bullins, Arun Jambulapati

Abstract

arXiv

In this work, we present new simple and optimal algorithms for solving the variational inequality (VI) problem for pth-order smooth, monotone operators -- a problem that generalizes convex optimization and saddle-point problems. Recent works (Bullins and Lai (2020), Lin and Jordan (2021), Jiang and Mokhtari (2022)) present methods that achieve a rate of $\widetilde{O}(\varepsilon^{-2/(p+1)})$ for $p \ge 1$, extending results by (Nemirovski (2004)) and (Monteiro and Svaiter (2012)) for p=1,2. A drawback to these approaches, however, is their reliance on a line search scheme. We provide the first pth-order method that achieves a rate of $\widetilde{O}(\varepsilon^{-2/(p+1)})$. Our method does not rely on a line search routine, thereby improving upon previous rates by a logarithmic factor. Building on the Mirror Prox method of Nemirovski (2004), our algorithm works even in the constrained, non-Euclidean setting. Furthermore, we prove the optimality of our algorithm by constructing matching lower bounds. These are the first lower bounds for smooth MVIs beyond convex optimization for p>1. This establishes a separation between solving smooth MVIs and smooth convex optimization, and settles the oracle complexity of solving pth-order smooth MVIs.

Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization

NeurIPS 2021

with Deeksha Adil, Brian Bullins

Abstract

Conference

arXiv

We provide several algorithms for constrained optimization of a large class of convex problems, including softmax, $\ell_p$ regression, and logistic regression. Central to our approach is the notion of width reduction, a technique which has proven immensely useful in the context of maximum flow [Christiano et al., STOC'11] and, more recently, ℓp regression [Adil et al., SODA'19], in terms of improving the iteration complexity from $O(m^{1/2})$ to $\widetilde{O}(m^{1/3})$, where $m$ is the number of rows of the design matrix, and where each iteration amounts to a linear system solve. However, a considerable drawback is that these methods require both problem-specific potentials and individually tailored analyses. As our main contribution, we initiate a new direction of study by presenting the first unified approach to achieving $m^{1/3}$-type rates. Notably, our method goes beyond these previously considered problems to more broadly capture quasi-self-concordant losses, a class which has recently generated much interest and includes the well-studied problem of logistic regression, among others. In order to do so, we develop a unified width reduction method for carefully handling these losses based on a more general set of potentials. Additionally, we directly achieve $m^{1/3}$-type rates in the constrained setting without the need for any explicit acceleration schemes, thus naturally complementing recent work based on a ball-oracle approach [Carmon et al., NeurIPS'20].

Regularized linear autoencoders recover the principal components, eventually

NeurIPS 2020

with Xuchan Bao, James Lucas, Roger Grosse

Abstract

Conference

arXiv

Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs). We show that when trained with proper regularization, LAEs can directly learn the optimal representation -- ordered, axis-aligned principal components. We analyze two such regularization schemes: non-uniform L2 regularization and a deterministic variant of nested dropout [Rippel et al, ICML' 2014]. Though both regularization schemes converge to the optimal representation, we show that this convergence is slow due to ill-conditioning that worsens with increasing latent dimension. We show that the inefficiency of learning the optimal representation is not inevitable -- we present a simple modification to the gradient descent update that greatly speeds up convergence empirically.

Faster Graph Embeddings via Coarsening

ICML 2020

with Matthew Fahrbach, Gramoz Goranci, Richard Peng, Chi Wang

Abstract

Conference

arXiv

Graph embeddings are a ubiquitous tool for machine learning tasks, such as node classification and link prediction, on graph-structured data. However, computing the embeddings for large-scale graphs is prohibitively inefficient even if we are interested only in a small subset of relevant vertices. To address this, we present an efficient graph coarsening approach, based on Schur complements, for computing the embedding of the relevant vertices. We prove that these embeddings are preserved exactly by the Schur complement graph that is obtained via Gaussian elimination on the non-relevant vertices. As computing Schur complements is expensive, we give a nearly-linear time algorithm that generates a coarsened graph on the relevant vertices that provably matches the Schur complement in expectation. Our experiments involving prediction tasks on graphs demonstrate that computing embeddings on the coarsened graph, rather than the entire graph, leads to significant time savings without sacrificing accuracy.

Fast, Provably convergent IRLS Algorithm for p-norm Linear Regression

Neurips 2019

with Deeksha Adil, Richard Peng

Abstract

Conference

arXiv

Video

Code

Linear regression in $\ell_p$-norm is a canonical optimization problem that arises in several applications, including sparse recovery, semi-supervised learning, and sig- nal processing. Generic convex optimization algorithms for solving lp-regression are slow in practice. Iteratively Reweighted Least Squares (IRLS) is an easy to implement family of algorithms for solving these problems that has been studied for over 50 years. However, these algorithms often diverge for p > 3, and since the work of Osborne (1985), it has been an open problem whether there is an IRLS algorithm that is guaranteed to converge rapidly for p > 3. We propose p-IRLS, the first IRLS algorithm that provably converges geometrically for any $p \in [2,\infty)$. Our algorithm is simple to implement and is guaranteed to find a high accuracy solution in a sub-linear number of iterations. Our experiments demonstrate that it performs even better than our theoretical bounds, beats the standard Matlab/CVX implementation for solving these problems by 10–50x, and is the fastest among available implementations in the high-accuracy regime.

Which Algorithmic Choices Matter at Which Batch Sizes?
Insights From a Noisy Quadratic Model

Neurips 2019

Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse

Abstract

Conference

arXiv

Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns. In this work, we study how the critical batch size changes based on properties of the optimization algorithm, including acceleration and preconditioning, through two different lenses: large scale experiments, and analysis of a simple noisy quadratic model (NQM). We experimentally demonstrate that optimization algorithms that employ preconditioning, specifically Adam and K-FAC, result in much larger critical batch sizes than stochastic gradient descent with momentum. We also demonstrate that the NQM captures many of the essential features of real neural network training, despite being drastically simpler to work with. The NQM predicts our results with preconditioned optimizers, previous results with accelerated gradient descent, and other results around optimal learning rates and large batch training, making it a useful tool to generate testable predictions about neural network optimization.

Multi-component Graph based Semi-Supervised Learning

AISTATS 2019

with Krishnamurthy Viswanathan, Andrew Tomkins, Sujith Ravi

Abstract

Conference

We present a new approach for graph based semi-supervised learning based on a multi- component extension to the Gaussian MRF model. This approach models the observa- tions on the vertices as a Gaussian ensemble with an inverse covariance matrix that is a weighted linear combination of multiple matrices. Building on randomized matrix trace estimation and fast Laplacian solvers, we de- velop fast and efficient algorithms for comput- ing the best-fit (maximum likelihood) model and the predicted labels using gradient de- scent. Our model is considerably simpler, with just tens of parameters, and a single hyperparameter, in contrast with state-of-the- art approaches using deep learning techniques. Our experiments on benchmark citation net- works show that the best-fit model estimated by our algorithm leads to significant improve- ments on all datasets compared to baseline models. Further, our performance compares favorably with several state-of-the-art methods on these datasets, and is comparable with the best performances.

Convergence Results for Neural Networks via Electrodynamics

ITCS 2018

with Rina Panigrahy, Ali Rahimi, Qiuyi Zhang

Abstract

Conference

arXiv

We study whether a depth two neural network can learn another depth two network using gradient descent. Assuming a linear output node, we show that the question of whether gradient descent converges to the target function is equivalent to the following question in electrodynamics: Given k fixed protons in $\mathbb{R}^d$, and k electrons, each moving due to the attractive force from the protons and repulsive force from the remaining electrons, whether at equilibrium all the electrons will be matched up with the protons, up to a permutation. Under the standard electrical force, this follows from the classic Earnshaw's theorem. In our setting, the force is determined by the activation function and the input distribution. Building on this equivalence, we prove the existence of an activation function such that gradient descent learns at least one of the hidden nodes in the target network. Iterating, we show that gradient descent can be used to learn the entire network one node at a time.

Fast, Provable Algorithms for Isotonic Regression in all $\ell_{p}$-norms

NIPS 2015

with Rasmus Kyng, Anup Rao

Abstract

Conference

arXiv

Code

Given a directed acyclic graph G, and a set of values y on the vertices, the Isotonic Regression of y is a vector x that respects the partial order described by G, and minimizes $\|x−y\|$, for a specified norm. This paper gives improved algorithms for computing the Isotonic Regression for all weighted $\ell_{p}$-norms with rigorous performance guarantees. Our algorithms are quite practical, and their variants can be implemented to run fast in practice.

Algorithms for Lipschitz Learning on Graphs

COLT 2015

with Rasmus Kyng, Anup Rao, Daniel Spielman

Abstract

Conference

arXiv

Video

Code

We develop fast algorithms for solving regression problems on graphs where one is given the value of a function at some vertices, and must find its smoothest possible extension to all vertices. The extension we compute is the absolutely minimal Lipschitz extension, and is the limit for large p of p-Laplacian regularization. We present an algorithm that computes a minimal Lipschitz extension in expected linear time, and an algorithm that computes an absolutely minimal Lipschitz extension in expected time $\widetilde{O}(mn)$. The latter algorithm has variants that seem to run much faster in practice. These extensions are particularly amenable to regularization: we can perform $\ell_{0}$-regularization on the given values in polynomial time and $\ell_{1}$-regularization on the initial function values and on graph edge weights in time $\widetilde{O}(m^{\frac{3}{2}})$.

Provable ICA with Unknown Gaussian Noise, and Implications
for Gaussian Mixtures and Autoencoders

Algorithmica 2015
NIPS 2012

with Sanjeev Arora, Rong Ge, Ankur Moitra

Abstract

Journal

Conference

arXiv

We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees. In particular, suppose we are given samples of the form y = Ax + $\eta$ where A is an unknown n X n matrix and x is chosen uniformly at random from $\{+1, -1\}^n$, $\eta$ is an n-dimensional Gaussian random variable with unknown covariance $\Sigma$: We give an algorithm that provable recovers A and $\Sigma$ up to an additive $\epsilon$ whose running time and sample complexity are polynomial in n and $1 / \epsilon$. To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. We also give a general framework for finding all local optima of a function (given an oracle for approximately finding just one) and this is a crucial step in our algorithm, one that has been overlooked in previous attempts, and allows us to control the accumulation of error when we find the columns of A one by one via local search.

Finding Overlapping Communities in Social Networks:
Toward a Rigorous Approach

EC 2012

with Sanjeev Arora, Rong Ge, Grant Schoenebeck

Abstract

Conference

arXiv

A community in a social network is usually understood to be a group of nodes more densely connected with each other than with the rest of the network. This is an important concept in most domains where networks arise: social, technological, biological, etc. For many years algorithms for ﬁnding communities implicitly assumed communities are nonoverlapping (leading to use of clustering-based approaches) but there is increasing interest in ﬁnding overlapping communities. A barrier to ﬁnding communities is that the solution concept is often deﬁned in terms of an NP-complete problem such as Clique or Hierarchical Clustering.

This paper seeks to initiate a rigorous approach to the problem of ﬁnding overlapping communities, where “rigorous” means that we clearly state the following: (a) the object sought by our algorithm (b) the assumptions about the underlying network (c) the (worst-case) running time.
Our assumptions about the network lie between worst-case and average-case. An average-case analysis would require a precise probabilistic model of the network, on which there is currently no consensus. However, some plausible assumptions about network parameters can be gleaned from a long body of work in the sociology community spanning ﬁve decades focusing on the study of individual communities and ego-centric networks (in graph theoretic terms, this is the subgraph induced on a node’s neighborhood). Thus our assumptions are somewhat “local” in nature. Nevertheless they suﬃce to permit a rigorous analysis of running time of algorithms that recover global structure.

Our algorithms use random sampling similar to that in property testing and algorithms for dense graphs. We note however that our networks are not necessarily dense graphs, not even in local neighborhoods. Our algorithms explore a local-global relationship between ego-centric and socio-centric networks that we hope will provide a fruitful framework for future work both in computer science and sociology.

Hardness of Approximation

Inapproximability of Minimum Vertex Cover on
k-Uniform k-Partite Hypergraphs

SIDMA 2015

with Venkatesan Guruswami, Rishi Saket

Abstract

Journal

ECCC

We study the problem of computing the minimum vertex cover on k-uniform k-partite hypergraphs when the k-partition is given. On bipartite graphs (k=2), the minimum vertex cover can be computed in polynomial time. For k$\ge 3,$ this problem is known to be NP-hard. For general k, the problem was studied by Lovász, who gave a $\frac{k}{2}$-approximation based on the standard LP relaxation. Subsequent work by Aharoni, Holzman, and Krivelevich showed a tight integrality gap of $\frac{k}{2} - o(1))$ for the LP relaxation. We further investigate the inapproximability of minimum vertex cover on k-uniform k-partite hypergraphs and present the following results (here $\varepsilon > 0$ is an arbitrarily small constant): NP-hardness of obtaining an approximation factor of $(\frac{k}{4} - \varepsilon)$ for even k and $(\frac{k}{4} - \frac{1}{4k} - \varepsilon)$ for odd k, NP-hardness of obtaining a nearly optimal approximation factor of $(\frac{k}{2}-1+\frac{1}{2k}-\varepsilon)$, and an optimal unique games-hardness for approximation within factor $(\frac{k}{2} - \varepsilon)$, showing the optimality of Lovász's algorithm if one assumes the Unique Games conjecture. The first hardness result is based on a reduction from minimum vertex cover in r-uniform hypergraphs, for which NP-hardness of approximating within $r - 1 -\varepsilon$ was shown by Dinur, Guruswami, Khot, and Regev. We include it for its simplicity, despite it being subsumed by the second hardness result. The unique games-hardness result is obtained by applying the results of Kumar, Manokaran, Tulsiani, and Vishnoi, with a slight modification, to the LP integrality gap due to Aharoni, Holzman, and Krivelevich. The modification ensures that the reduction preserves the desired structural properties of the hypergraph. The reduction for the nearly optimal NP-hardness result relies on the multilayered PCP of Dinur, Guruswami, Khot, and Regev and uses a gadget based on biased long codes adapted from the LP integrality gap of Aharoni, Holzman, and Krivelevich. Our reduction requires the analysis of several long codes with different biases, for which we prove structural properties of the so-called cross-intersecting collections of set families, variants of which have been studied in extremal set theory

Optimal Inapproximability for Scheduling Problems
via Structural Hardness for Hypergraph Vertex Cover

CCC 2013

with Rishi Saket

Abstract

Conference

This work studies the inapproximability of Minimum Vertex Cover on uniform hypergraphs from a specific structural perspective. Our study is motivated by applications to two well known scheduling problems: Concurrent Open Shop and the Assembly Line problem. For both these problems, Bansal and Khot [BK10] obtained tight $(2-\epsilon)$-factor inapproximability, assuming the Unique Games Conjecture (UGC). In this work, we prove optimal $(2-\epsilon)$-factor NP-hardness of approximation for both these problems unconditionally, i.e., without assuming UGC.

Our results for the scheduling problems follow from a structural hardness for Minimum Vertex Cover on hypergraphs -- an unconditional but weaker analog of a similar result of Bansal and Khot [BK10] which however, is based on UGC.

Nearly Optimal NP-Hardness of Vertex Cover on $k$-Uniform
$k$-Partite Hypergraphs

Approx 2011

with Rishi Saket

Abstract

Conference

arXiv

We study the problem of computing the minimum vertex cover on $k$-uniform $k$-partite hypergraphs when the $k$-partition is given. On bipartite graphs $(k=2)$, the minimum vertex cover can be computed in polynomial time. For general $k$, the problem was studied by Lovász, who gave a $\frac{k}{2}$-approximation based on the standard LP relaxation. Subsequent work by Aharoni, Holzman and Krivelevich showed a tight integrality gap of $\frac{k}{2} - o(1)$ for the LP relaxation. While this problem was known to be NP-hard for $k\geq 3$, the first non-trivial NP-hardness of approximation factor of $\frac{k}{4}-\epsilon$ was shown in a recent work by Guruswami and Saket. They also showed that assuming Khot's Unique Games Conjecture yields a $\frac{k}{2}-\epsilon$ inapproximability for this problem, implying the optimality of Lovász's result.

In this work, we show that this problem is NP-hard to approximate within $\frac{k}{2}-1+\frac{1}{2k}-\epsilon$. This hardness factor is off from the optimal by an additive constant of at most 1 for $k\geq 4$. Our reduction relies on the Multi-Layered PCP of Dinur et al. and uses a gadget -- based on biased Long Codes -- adapted from the LP integrality gap of Aharoni et al. The nature of our reduction requires the analysis of several Long Codes with different biases, for which we prove structural properties of the so called cross-intersecting collections of set families -- variants of which have been studied in extremal set theory.

Cuts in Cartesian Products of Graphs

Manuscript

with Madhur Tulsiani

Abstract

arXiv

Video

The $k$-fold Cartesian product of a graph $G$ is defined as a graph on tuples $(x_1,\ldots,x_k)$ where two tuples are connected if they form an edge in one of the positions and are equal in the rest. Starting with $G$ as a single edge gives $G^{\Box k}$ as a $k$-dimensional hypercube. We study the distributions of edges crossed by a cut in $G^{\Box k}$ across the copies of $G$ in different positions. This is a generalization of the notion of influences for cuts on the hypercube.

We show the analogues of the statements of Kahn, Kalai and Linial (KKL theorem) and that of Friedgut (Friedgut's Junta Theorem), for the setting of Cartesian products of arbitrary graphs. We also extends the work on studying isoperimetric constants for these graphs to the value of semidefinite relaxations for expansion. We connect the optimal values of the relaxations for computing expansion, given by various semidefinite hierarchies, for $G$ and $G^{\Box k}$.

Other Publications

Near-optimal approximation algorithm for simultaneous Max-Cut

SODA 2018

with Amey Bhangale, Subhash Khot, Swastik Kopparty, Devanathan Thiruvenkatachari

Abstract

Conference

arXiv

In the simultaneous Max-Cut problem, we are given $k$ weighted graphs on the same set of $n$ vertices, and the goal is to find a cut of the vertex set so that the minimum, over the $k$ graphs, of the cut value is as large as possible. Previous work [BKS15] gave a polynomial time algorithm which achieved an approximation factor of $\frac{1}{2}-o(1)$ for this problem (and an approximation factor of $\frac{1}{2} + \epsilon_k$ in the unweighted case, where $\epsilon_k \to 0$ as $k \to \infty$).
In this work, we give a polynomial time approximation algorithm for simultaneous Max-Cut with an approximation factor of 0.8780 (for all constant $k$). The natural SDP formulation for simultaneous Max-Cut was shown to have an integrality gap of $\frac{1}{2} + \epsilon_k$ in [BKS15]. In achieving the better approximation guarantee, we use a stronger Sum-of-Squares hierarchy SDP relaxation and a rounding algorithm based on Raghavendra-Tan [RT12], in addition to techniques from [BKS15].

Simultaneous Approximation of Constraint Satisfaction Problems

ICALP 2015

with Amey Bhangale, Swastik Kopparty

Abstract

Conference

arXiv

ECCC

Given k collections of 2SAT clauses on the same set of variables V, can we find one assignment that satisfies a large fraction of clauses from each collection? We consider such simultaneous constraint satisfaction problems, and design the first nontrivial approximation algorithms in this context.

Our main result is that for every CSP F, for $k < \widetilde{O}(\log^{\frac{1}{4}} n)$, there is a polynomial time constant factor Pareto approximation algorithm for k simultaneous Max-F-CSP instances. Our methods are quite general, and we also use them to give an improved approximation factor for simultaneous Max-w-SAT (for $k <\widetilde{O}(\log^{\frac{1}{3}} n)$). In contrast, for $k=\omega(\log n)$, no nonzero approximation factor for k simultaneous Max-F-CSP instances can be achieved in polynomial time (assuming the Exponential Time Hypothesis).

These problems are a natural meeting point for the theory of constraint satisfaction problems and multiobjective optimization. We also suggest a number of interesting directions for future research.

New Results in the Theory of Approximation:
Fast Graph Algorithms and Inapproximability

Thesis
Sept 2013

Abstract

PDF

For several basic optimization problems, it is NP-hard to find an exact solution. As a result, understanding the best possible trade-off between the running time of an algorithm and its approximation guarantee, is a fundamental question in theoretical computer science, and the central goal of the theory of approximation.

There are two aspects to the theory of approximation : (1) efficient approximation algorithms that establish trade-offs between approximation guarantee and running time, and (2) inapproximability results that give evidence against them. In this thesis, we contribute to both facets of the theory of approximation.

In the first part of this thesis, we present the first near-linear-time algorithm for Balanced Separator - given a graph, partition its vertices into two roughly equal parts without cutting too many edges - that achieves the best approximation guarantee possible for algorithms in its class. This is a classic graph partitioning problem and has deep connections to several areas of both theory and practice, such as metric embeddings, Markov chains, clustering, etc.

As an important subroutine for our algorithm for Balanced Separator, we provide a near-linear-time algorithm to simulate the heat-kernel random walk on a graph, equivalent to computing $\exp(-L)v$, where $L$ is the Laplacian of the graph, and $v$ is a vector. This algorithm combines techniques from approximation theory and numerical linear algebra to reduce the problem of approximating the matrix exponential to solving a small number of Laplacian systems. We also give a reduction in the reverse direction, from matrix inversion to matrix exponentiation, hence justifying the use of Laplacian system solvers.

In the second part of this thesis, we prove inapproximability results for several basic optimization problems. We address some classic scheduling problems, viz. Concurrent Open Shop and the Assembly Line problem, and variants of the Hypergraph Vertex Cover problem. For all these problems, optimal inapproximability results were previously known under the Unique Games Conjecture. We are able to prove near-optimal inapproximability results for these problems without using the conjecture.

An Arithmetic Analogue of Fox's Triangle Removal Argument

OJAC

with Pooya Hatami, Madhur Tulsiani

Abstract

Journal

arXiv

Video

We give an arithmetic version of the recent proof of the triangle removal lemma by Fox [Fox11], for the group $\mathbb{F}_2^n$.

A triangle in $\mathbb{F}_2^n$ is a tuple $(x,y,z)$ such that $x+y+z = 0$. The triangle removal lemma for $\mathbb{F}_2^n$ states that for every $\epsilon > 0$ there is a $\delta > 0$, such that a subset $A$ of $\mathbb{F}_2^n$ which is $\epsilon$-far from being triangle free, must contain at least $\delta \cdot 2^{2n}$ triangles. We give a direct proof which gives an improved lower bound for $\delta$, analogous to the one obtained by Fox for triangle removal in graphs.

The improved lower bound was already known to follow (for triangle-removal in all groups), using Fox's removal lemma for directed cycles and a reduction by Kral, Serra and Vena. The purpose of this note is to provide a direct Fourier-analytic proof for the group $\mathbb{F}_2^n$.

Testing Permanent Oracles - Revisited

Random 2012

with Sanjeev Arora, Arnab Bhattacharyya, Rajsekar Manokaran

Abstract

Conference

arXiv

ECCC

Suppose we are given an oracle that claims to approximate the permanent for most matrices X, where X is chosen from the Gaussian ensemble the matrix entries are i.i.d.~univariate complex Gaussians). Can we test that the oracle satisfies this claim? This paper gives a polynomial-time algorithm for the task.

The oracle-testing problem is of interest because a recent paper of Aaronson and Arkhipov showed that if there is a polynomial-time algorithm for simulating boson-boson interactions in quantum mechanics, then an approximation oracle for the permanent (of the type described above) exists in BPP^NP. Since computing the permanent of even 0/1 matrices is #P-complete, this seems to demonstrate more computational power in quantum mechanics than Shor's factoring algorithm does. However, unlike factoring, which is in NP, it was unclear previously how to test the correctness of an approximation oracle for the permanent, and this is the contribution of the paper.

The technical difficulty overcome here is that univariate polynomial self-correction, which underlies similar oracle-testing algorithms for permanent over finite fields ---and whose discovery led to a revolution in complexity theory---does not seem to generalize to complex (or even, real) numbers. We believe that this tester will motivate further progress on understanding the permanent of Gaussian matrices.

The Power of Weak v/s Strong Triangle Inequalities

Manuscript

with Mohammad Moharrami

Abstract

PDF

In a recent paper, Lee and Moharrami construct a family of metrics $(X_n,d_n)$ with $|X_n|=N$, such that $(X_n,\sqrt{d_n})$ embeds into $\ell_2$ with constant distortion, but embedding $(X_n,d_n)$ into a metric of negative type requires distortion $\Omega((\log N)^{\frac{1}{4}})$. In this paper, we build on their analysis, and improve their result by showing a $\Omega((\log N)^{\frac{1}{3}})$ lower bound for embedding $(X_n,d_n)$ into a metric of negative type. Moreover, we show that this analysis is essentially tight by constructing a map that has distortion $O((\log N)^{\frac{1}{3}})$.

This result implies a lower bound of $\Omega((\log N)^{\frac{1}{3}})$ for the integrality gap of the relaxed version of Goemans-Linial semidefinite program with weak triangle inequalities for Non-uniform Sparsest Cut.

A Reformulation of the Arora-Rao-Vazirani Structure Theorem

Manuscript

with Sanjeev Arora, James Lee

Abstract

arXiv

In a well-known paper, Arora, Rao and Vazirani obtained an $O(\sqrt{\log n})$ approximation to the Balanced Separator problem and Uniform Sparsest Cut. At the heart of their result is a geometric statement about sets of points that satisfy triangle inequalities, which also underlies subsequent work on approximation algorithms and geometric embeddings. In this note, we give an equivalent formulation of the Structure theorem in [ARV] in terms of the expansion of large sets in geometric graphs on sets of points satisfying triangle inequalities.