Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Q: Consider the following algorithm to check connectivity of a graph defined by

ID: 672627 • Letter: Q

Question

Q: Consider the following algorithm to check connectivity of a graph defined by its adjacency matrix.

ALGORITHM Connected (A [0..n1,0...n1])

//Input: Adjacency matrix A[0..n1,0..n1]) of an undirected graph G

//Output: 1 (true) if G is connected and 0 (false) if it is not

if n=1 return 1 //one-vertex graph is connected by definition

else

      if not Connected (A[0..n2,0..n2]) return 0

      else for j0 to n2 do

            if A[n1,j] return 1

     return 0

Does this algorithm work correctly for every undirected graph with n> 0 vertices? If you answer yes, indicate the algorithm’s efficiency class in the worst case; if you answer no, explain why.

Explanation / Answer

The adjacency matrix model, where the graph is given as the adjacency matrix M {0, 1} n×n, with Mij = 1 if and only if (vi , vj ) E. The adjacency array model, where we are given the out-degrees of the vertices d + 1 , . . . , d+ n and for every vertex u an array with its neighbors fi : [d + i ] [n]. So fi(j) returns the j th neighbor of vertex i, according to some arbitrary but fixed numbering of the outgoing edges of i. In this paper the upper bounds for this model are all at least n, so we assume henceforth that the degrees are given as part of the input and we account only queries to the arrays fi . In addition the arrays satisfy the simple graph promise i [n], j, j [k], j 6= j : fi(j) 6= fi(j ) ensuring the graph is not a multigraph, i.e. does not have multiple edges between any two vertices. For undirected graphs we require an additional promise on the input, namely that M is symmetric in the matrix model, and for the array model that i, i [n] if j [k] : fi(j) = i then j [k] : fi (j ) = i. Note that in the matrix model this symmetry assumption does not make undirected graph problems, promise problems since we may assume that the input is upper triangular. Weighted graphs are encoded by a weight matrix, where for convenience we set Mij = if (vi , vj ) 6 E. In the adjacency array model, the graph is encoded by a sequence of functions fi : [d + i ] [n] × N, such that if fi(j) = (i , w) then there is an edge (vi , vi ) and it has weight w. We emphasize that the array model is different from the standard list model. In the latter, we have access to the neighbors of a given vertex only as a list, and thus querying the i th neighbor requires i accesses to the list. This is also true on a quantum computer, so its speedup is quite restricted. Many other query models are of course possible, for example we could be given an array of edges f : [m] [n] × [n], or an ordered array (which is up to O(n) preprocessing the same as the adjacency array model). For simplicity, we use the array model as presented above. For the quantum query complexity of general monotone graph properties, a lower bound of ( n) is known in the matrix model, as shown by Buhrman, Cleve, de Wolf and Zalka [9].1 We are not aware of any quantum nor classical lower bounds in the array model. In this paper we show that the quantum query complexity of Connectivity is (n 3/2 ) in the matrix model and (n) in the array model. The classical randomized query complexity of Connectivity in the matrix model is (n 2 ) by a sensitivity argument: Distinguishing the graph consisting of two length n/2 paths from the graph consisting of those two paths, plus an additional edge connecting them, (n 2 ) queries are required. We study the complexity of three other problems. In Strong Connectivity we are given a directed graph and have to decide if there is a directed path between any pair of vertices. In Minimum Spanning Tree we are given a weighted graph and have to compute a spanning tree with minimal total edge weight. In Single Source Shortest Paths we have to compute the shortest paths according to the total edge weight from a given source vertex to every other vertex. The quantum query complexity of these three problems is (n 3/2 ) in the matrix model and ( nm) in the array model. We give almost tight upper bounds. problem matrix model array model minimum spanning tree (n 3/2 ) ( nm) connectivity (n 3/2 ) (n) strong connectivity (n 3/2 ) ( nm), O( nm log n) single src. short. paths (n 3/2 ), O(n 3/2 log2 n) ( nm), O( nm log2 n) Table 1: Quantum query complexity of some graph problems We note that for graphs with a large number of edges (m = (n 2 )), the complexities are (almost) the same in the matrix and array model for all problems but Connectivity. However, the models still differ in that case. For example the test (u, v) E costs a single query in the matrix model and O( q min{d + u , d+ v }) queries in the array model since we do not assume any order on the arrays fu and fv. The time complexities of the algorithms are the same as the query complexities up to log-factors. The algorithms given for connectivity and strong connectivity can be altered to also output the (strongly) connected components without increasing the asymptotic complexity. The space requirement is O(log n) qubits and O(n log n) classical bits. If we constraint the space (both classical and quantum) to O(log n) qubits, the problems may be solved by random walks. Quantum random walks has been the subject of several papers [1, 10, 15], in particular for the st-Connectivity problem [22]. Other work on the query complexity of graphs problems has been done independently of us. For example [4] shows that testing whether a given graph is bi-partite needs (n 3/2 ) queries in the matrix model. Note that a graph is bi-partite iff it does not contain an odd cycle. For the array model a lower bound can be constructed from our lower bound for connectivity, showing that ( nm) queries are necessary for any bounded error quantum algorithm which distinguishes a single even cycle from two disjoint odd cycles. Matching upper bounds follow from our connectivity algorithms: simply construct a spanning forest of the graph. Then color alternating black-white the nodes of each tree. And finally use the quantum search procedure to find an edge with endpoints of the same color. Such an edge creates an odd cycle, and exists iff the graph contains an odd cycle. 2 Tools used in this paper We use two fundamental tools. The first is amplitude amplification [7, 8], which we use when proving the upper bounds, the second is Ambainis’ lower bound technique [2]. Amplitude amplification is a generalization of Lov Grover’s search algorithm [13]. Since it is the most important tool used in our algorithms, we restate the exact results we require. We are given a boolean function F defined on a domain of size n. The function is given as a black box so that the only way we can obtain information about F is via evaluating F on elements in the domain. The search problem considered by Grover is to find an element x for which F(x) = 1, provided one exists. We say that x is a solution to the search problem, and that x is good. We use three generalizations of the search algorithm—all of which we refer to as “the search algorithm”. • If there are t elements mapped to 1 under F, with t > 0, the search algorithm returns a solution after an expected number of at most 9 10 p n/t queries to F. The output of the algorithm is chosen uniformly at random among the t solutions. The algorithm does not require prior knowledge of t [6]. • A second version uses O( n) queries to F in the worst case and outputs a solution with probability at least a constant, provided there is one [6]. • A third version uses O( p n log 1/) queries to F and finds a solution with probability at least 1 , provided there is one [9]. We note that for very sparse graphs given in the adjacency matrix model, it may for some applications be efficient to initially learn all entries of the matrix by reiterating the first version of the search algorithm, for instance as formalized in Fact 1. Fact 1 Let k be given. There is a quantum algorithm that takes as input an n × n boolean matrix M, uses O(n k) queries to M, and outputs a set S of 1-entries of M so that with probability at least 1 2 , S is of cardinality at least k or contains all 1-entries of M in case M has less than k 1-entries. Our lower bounds uses a technique introduced by Andris Ambainis. Theorem 2 ([2, theorem 6]) Let L {0, 1} be a decision problem. Let X L be a set of positive instances and Y L a set of negative instances. Let R X × Y be a relation between instances of same size. Let values m, m , x,i and y,i for x, y {0, 1} n and x X, y Y , i [n] be so that • for every x X there are at least m different y Y in relation with x, • for every y Y there are at least m different x X in relation with y,• for every x X and i [n] there are at most x,i different y Y in relation with x which differ from x at entry i, • for every y Y and i [n] there are at most y,i different x X in relation with y which differ from y at entry i. Then the quantum query complexity of L is (p mm/max) where max is the maximum of x,i y,i subject to xRy and xi 6= yi. .
3 Minima finding Many graph problems are optimization problems, as are finding a minimum spanning tree, single source shortest paths, and largest connected components. Most quantum algorithms for such optimization problems utilize the search algorithm discussed above. A very basic and abstract optimization problem is as follows. Suppose we are given a function f defined on a domain of size n, and we want to find an index i so that f(i) is a minimum in the image of f. This minimization problem was considered in [11] which gives an optimal quantum algorithm that uses O( n) queries to f and finds such an i with constant probability. It is very simple to analyze, and we present it now, since it will make the rest of the section easier to understand. 1. Initially let j [N] be an index chosen uniformly at random. 2. Repeat forever (a) Find an index i [N] such that f(i) < f(j). (b) Set j := i. Theorem 3 ([11]) The expected number of queries to f, until j contains the index of a minimum in the image of f is O( N). Proof: Without loss of generality assume that f is injective. Now every index j [N] has a rank, which we define as the number of indexes i such that f(i) f(j). Let pr be the probability that at some moment of the algorithm, the index j will have rank r. We claim that pr = 1/r. Consider the first moment when j will have rank less or equal r. This moment will happen with probability 1. At that moment, because of the uniform choice in step 1 and since step 2(a) uses the quantum search procedure, j will be uniformly chosen among all indexes with rank less or equal r, so pr = 1/r. If j has rank r, the search procedure of step 2(a) will require c p N/(r 1) expected number of queries, for some constant c. Therefore the total number of queries until j contains the solution is X N r=2 prc p N/(r 1) < c N N X1 r=1 r 3/2 < c N 1 + Z N1 r=1 r 3/2dr ! = O( N) Let c N be the expected number of queries to f until j contains the index of a minima. Stopping the algorithm after 2c queries gives a quantum algorithm with error probability upper bounded by 1/2. For the purposes of this paper, we require the following generalizations of the minimum finding problem, illustrated in Figure 1. Problem 1 (Find d smallest values of a function) Let N denote N {}. Given function f : [N] N and an integer d [N], we wish to find d distinct indexes mapping to smallest values, i.e. a subset I [N] of cardinality d such that for any j [N] I we have that f(i) f(j) for all i I

In the rest of this section, we assume d N/2. In the following problem we are given a different function g : [N] N, such that g(j) defines the type of j. Let e = |{g(j) : j [N]}| be the number of different types. Problem 2 (Find d elements of different type) Given function g and an integer d we wish to find integer d = min{d , e} and a subset I [N] of cardinality d such that g(i) 6= g(i ) for all distinct i, i I. Now we present a generalization of both previous problems. Problem 3 (Find d smallest values of different type) Given two functions f, g and an integer d we wish to find integer d = min{d , e} and a subset I [N] of cardinality d such that g(i) 6= g(i ) for all distinct i, i I and such that for all j [N] I and i I, if f(j) < f(i) then f(i ) f(j) for some i I with g(i ) = g(j).

It is clear that Problems 1 and 2 are special cases of Problem 3. In this section, we give an upper bound of O( dN) for Problem 3. In Section 8, we then show a lower bound of ( dN) for Problems 1 and 2, implying that all three problems are of complexity ( dN). We prove the upper bound by a simple greedy algorithm. Consider a subset I [N] of d indices of different types. We say an index j [N] is good for I if 1. either g(j) = g(i) and f(j) < f(i) for some i I, 2. or g(j) 6 g(I) and f(j) < f(i) for some i I. In the former case we say j is a good index of known type, in the latter that j is a good index of unknown type. In each iteration of the greedy algorithm, we find a good index j by the search algorithm and then improve I by replacing some index in I by j. 1. Initially, let I = {N + 1, . . . , N + d } be a set of artificial indices of unique different types and unique maximal value. 2. Repeat forever (a) Let t denote the number of good elements for I. (Note: This step is not required, but only included for the purpose of simplifying the analysis of the algorithm.) (b) Use the first version of the search algorithm to find a good element j [N] for I. (c) Set I = improve(I, j) where we improve I by replacing with j the element in I that has the same type as j if j is of known type, and by replacing with j some element in I with largest f-value if j is of unknown type. The next lemma shows we only need an expected number of O(d) iterations of the main loop to eliminate a constant fraction of the remaining good elements.

Lemma 4 Let I [N] be any subset of d indices of different types with t > 0 good elements of e types. After an expected number of O(d) iterations of the main loop there are at most 3 4 t good elements for I. Here d = min{d , e}. Proof: For notational simplicity assume f is injective. Set I0 = I and let T0 = T be the set of good elements for I. Let Tj denote the set of good elements after j iterations of the main loop, for j > 0. Similarly, let Ij denote the selected index-set after j iterations, for j > 0. Set tk = |Tk|. In particular I0 = I and t0 = t. Let ymid denote the t/2 th smallest of the t elements according to f. For any subset S [N + d ], let low(S) denote the number of elements in S that are no bigger than ymid according to f. Note that initially • low(T0) = t/2 and • low(I0) < d. By the nature of the greedy algorithm, low(Tk+1) low(Tk) and low(Ik+1) low(Ik) for any k 0. Note that • if low(Tk) < t 4 , then we have eliminated at least a fraction of 1 4 of the initially t good elements for I, and similarly, • if low(Ik) = d, then we have eliminated at least a fraction of 1 2 of the initially t good elements for I. We claim that in each iteration of the main loop, as long as low(Tk) t 4 , with probability at least 1 32 , at least one of the following two events happens • low(Tk+1) low(Tk) 1 1 32d , • low(Ik+1) = low(Ik) + 1. Assume low(Tk) t 4 , since otherwise we are done. Consider the element j picked in Step 2b. First suppose the majority of the low(Tk) indices are of unknown type with respect to Ik. Then, with probability at least 1 8 , index j is among these, in which case low(Ik+1) = low(Ik) + 1. Now suppose the majority of the low(Tk) indices are of known type with respect to Ik. Then, with probability at least 1 8 , index j is among these. Conditioned on this happens, with probability at least 1 2 , there are at least low(Tk) 4d good elements for Ik of the same type as j. With probability at least 1 2 , at least half of these are not good for Ik+1. Thus, with probability at least 1 32 , we have eliminated at least t 32d of the remaining elements in Tj. This proves the claim. It follows that after an expected number of O(d) iterations of the main loop, we have eliminated at least a fraction of 1 4 of the initially t good elements. The above lemma implies that, for t > 2d, after an expected number of O(d p N/t ) applications of function f, the number of good elements is at most t 2 . Hence, for any t > 2d, the expected number applications of function f required till we have that t 2d for the first time is in the order of d r N d + r N 2d + r N 4d + r N 8d + · · · O( dN). Once t 2d for the first time, the expected number of applications of f required before t = 0 for the first time is in the order of P2d j=1 p N/j which is in O( dN ). Corollary 5 In the greedy algorithm given above, after an expected number of O( dN) applications of function f, there are no good elements for I, that is, t = 0. The next theorem follows immediately.

Theorem 6 The problem Find d Smallest Values of Different Type has bounded error quantum query complexity O( dN). Nayak and Wu give in [19] a bounded error quantum algorithm that given a function f : [N] N and two integers d and , outputs an index i such that the rank of f(i) is between d and d + . The query complexity of their algorithm is O(M log M log log M) where M = p N/ + p d(N d)/. Setting = 1 2 it would find the d th smallest element with O( dN log N log log N) queries. Nayak [18] later improved this algorithm to O( dN), matching the lower bound given in [19]. His method is different from ours. Remark 1 The algorithm above uses c dN queries for some constant c and outputs the solution with probability at least 1/2. In order to reduce the error probability to 1/2 k one could run the algorithm k times and among the dk resulting indices, output the d smallest values of different type. However starting each run with randomly chosen points of different type regardless of the previous outcome, is a waste of information. So it is much more clever to run the algorithm only once and stop it after kc dN queries.

4 Minimum Spanning Tree In this section we consider undirected graphs with weighted edges. In Minimum Spanning Tree we wish to compute a cycle-free edge set of maximal cardinality that has minimum total weight. To be precise if the graph is not connected this is actually a spanning forest. Classically, there are a number of different approaches to finding minimum spanning trees efficiently, including the algorithms of Boruvka [5, 20], Kruskal [17], and Prim [21]. To construct an efficient quantum algorithm, we use Boruvka’s algorithm since it is of a highly parallel nature. This allows us to use the minima finding algorithms given in Section 3. Boruvka’s algorithm consists of at most log n iterations. In brief, initially it starts with a collection of n spanning trees, each tree containing a single vertex. In each iteration, it finds a minimum weight edge out of each tree in the collection, adds the edges to the trees, and merges them into larger and fewer trees. After at most log n iterations, there is only one tree left, which is a minimum spanning tree. The correctness of Boruvka’s algorithm rests on the following simple fact about spanning trees. Fact 7 Let U V be a set of vertices of a connected graph G = (V, E) and let e be a minimum weight edge of (U × U) E. Then there is a minimum spanning tree containing e. In our quantum version of Boruvka’s algorithm, we make a few adjustments to keep the overall error probability small without sacrificing in the number of queries. We adjust it slightly so that the th iteration errs with probability at most 1 2 +2 , ensuring that the overall error is at most 1 4 . This increases the cost of the th iteration by a factor of , but since the cost of the first few iterations dominates, this is asymptotically negligible. The details follow. 1. Let T1, T2, . . . , Tk be a spanning forest. Initially, k = n and each tree Tj contains a single vertex. 2. Set = 0. 3. Repeat until there is only a single spanning tree (i.e., k = 1). (a) Increment . (b) Find edges e1, e2, . . . , ek satisfying that ej is a minimum weight edge leaving Tj. Interrupt when the total number of queries is ( + 2)c km for some appropriate constant c. (c) Add the edges e j to the trees, merging them into larger trees. 4. Return the spanning tree T1.

To find the minimum edges e1, . . . , ek in Step 3b, we use the following functions. In the array model, any edge (u, v) is coded twice, u appears as neighbor of v, but v also appears as neighbor of u. Enumerate the directed edges from 0 to 2m 1. Let f : [2m] N denote the function that maps every directed edge (u, v) to its weight if u and v belong to different trees of the current spanning forest and to otherwise. Let g : [2m] [k] denote the function that maps every directed edge (u, v) to the index j of the tree Tj containing u. We then apply the algorithm for Finding k smallest values of different type, interrupting it after ( + 2)c km queries to obtain an error probability at most 1/2 +2, see Remark 1. Theorem 8 Given an undirected graph with weighted edges, the algorithm above outputs a spanning tree that is minimum with probability at least 1 4 . The algorithm uses O( nm) queries in the array model and O(n 3/2 ) queries in the matrix model. Proof: To simplify the proof, consider the matrix model an instance of the array model with m = n(n1) edges. At the beginning of the th iteration of the main loop, the number of trees k is at most n/2 1 , and thus it uses at most ( + 2)c p nm/2 1 queries. Summing over all iterations, the total number of queries is at most P 1 ( + 2)c p nm/2 1, which is in O( nm). The th iteration introduces an error with probability at most 1 2+2 . The overall error probability is thus upper bounded by P 1 1 2 +2 1 4 . For our algorithms, a O(log n) factor applies when considering the bit computational model, rather than the algebraic computational model. Apart from that, the time complexity is the same as the query complexity by using an appropriate data structure. Each vertex holds a pointer to another vertex in the same component, except for a unique vertex per component that holds a null pointer. This vertex is called the canonical representative of the component. To decide if two vertices are in the same component, we need only to determine the canonical representative of each vertex by pointer chasing. To merge two components, we change the pointer of one of the canonical representative to point to the other. Using pointer chasing, the time complexity of the th iteration is a factor of larger than its query complexity. However, as in the case of the error reduction, this is insignificant: P 1 ( + 2)c p nm/2 1 is also in O( nm). Thus, the time complexity is asymptotically the same as the query complexity.

5 Connectivity A special case of Minimum Spanning Tree when all edge weights are equal, is Graph Connectivity. The input is an undirected graph and the output is a spanning tree, provided the graph is connected. For the matrix model, the algorithm for minimum spanning tree given in the previous section implies an O(n 3/2 ) upper bound for graph connectivity as well. Below, we give a somewhat simpler and arguably more natural quantum algorithm of query complexity O(n 3/2 ), which is optimal by the lower bound given in Section 8 below. For the array model, we give a quantum algorithm that uses only O(n) queries. Both algorithms start with a collection of n connected components, one for each vertex, and greedily construct a spanning tree by repeatedly picking an edge that connects two of the components. Theorem 9 Given the adjacency matrix M of an undirected graph G, the algorithm below outputs a spanning tree for G after an expected number of O(n 3/2 ) queries to M, provided G is connected, and otherwise runs forever. Proof: Consider the following algorithm. 1. Initially the edge set A is empty. 2. Repeat until A connects the graph.

(a) Search for a good edge, i.e., an edge that connects two different components in A, and add it to A. Use the version of the search algorithm that returns a solution in expected O(n 2/t) queries if there are t > 0 good edges and otherwise runs forever. 3. Return the edge set A. Suppose the graph is connected and consider the expected total number of queries used by the algorithm. There are exactly n 1 iterations of the main loop. The number of good edges is at least k 1 when A consists of k components, and thus the expected total number of queries is in the order of Pn k=2 p n2/(k 1), which is in O(n 3/2 ). When implementing the above algorithm, we maintain an appropriate data structure containing information about the connected components in the graph induced by A. This introduces an additional O(n log n) term in the running time of the algorithm which is negligible compared to O(n 3/2 ). We may choose to stop the algorithm after twice the expected total number of queries, giving an O(n 3/2 ) query algorithm with bounded one-sided error. 5.1 The array model Lemma 10 Given an undirected graph G in the array model, we can in O(n) classical queries construct a set of connected components {C1, . . . , Ck} for some integer k, so that for each component C, its total degree mC = P iC di is no more than |C| 2 . Proof: The algorithm is classical and is as follows. 1. Initially the edge set A is empty. 2. Let S = V be the set of vertices not yet placed in some component. 3. Let k = 0 be the number of components constructed thus far. 4. While S is non-empty (a) Take the vertex v of highest degree in S and set D = {v}. (b) Go through v’s list of neighbors one by one, each time adding the neighbor w to D and the edge (v, w) to A, until one of two events happens: (1) We reach the end of the list, or (2) we reach a neighbor w already assigned to some component Cj with j k. (c) In case (1), set k = k + 1, Ck = D, and remove D from S. In case (2), add D to Cj , and remove D from S. 5. Output k, A, and C1, C2, . . . , Ck. The algorithm uses nk queries in total, one query for each vertex but the first added to each component. Edge set A contains the union of spanning trees of the components C1 through Ck. To show correctness, let v be the vertex chosen in Step 4a and d its degree. Then d |Cj | for each components constructed so far, since the size of a freshly created component is the degree of one of its vertices, which by choice in Step 4a must be no less than d, and components can only grow. To show that the total degree of every component Cj is no more than |Cj | 2 , consider the two cases in Step 4b. In case (1), D is the set of v and its neighbors, each neighbor having degree no larger than d, implying the total degree is at most d(d + 1) which is strictly less than (d + 1)2 = |D| 2 . In case (2), let a be the size of the component Cj to which D is merged, and b the size of D. Then b d a. The total degree is no more than a 2 + bd which is strictly less than (a + b) 2 . Theorem 11 Given an undirected graph G in the array model, the algorithm below outputs a spanning tree for G using an expected number of O(n) queries, provided G is connected, and otherwise runs forever. 9 Proof: Consider the following algorithm. 1. Construct the edge set A using the above lemma. 2. Repeat until A connects the graph. (a) Pick a connected component C in A with smallest total degree, i.e, a component minimizing mC = P iC di . (b) Search for an edge out of C, i.e., an edge that connects C to some other component in A, and add it to A. Use the version of the search algorithm that returns a solution in expected O( mC ) queries if there is at least one such edge and otherwise runs forever. 3. Return the edge set A. Suppose the graph is connected and consider the expected total number of queries used by the algorithm. We first construct k components, each component C having total degree mC at most |C| 2 . In each iteration of the main loop, we pick the component with smallest total degree and search for an edge out of C. The expected cost of finding such an edge is at most mC for some constant . We distribute this cost evenly among each of the mC edge endpoints in C, each endpoint paying /mC . Fix an arbitrary edge endpoint. Enumerate from 0 up to at most log m the successive components that were chosen by the algorithm for a search and that contain this fixed edge endpoint. Let mi be the number of edge endpoint in the i th component. Then mi+1 2mi . The total cost assigned to our fixed edge endpoint is upper bounded by log Xm i=0 mi log Xm i=0 2 im0 4 m0 . Let C be any of the k components constructed in the first step. The total cost assigned over all edge endpoints in C is thus upper bounded by 4 mC , which is at most 4|C|. Summing over all k components, the total cost assigned in the main loop is at most 4n, which is linear in n.

6 Strong Connectivity We give two quantum algorithms for strong connectivity, first one for the matrix model and then one for the array model. The input is a directed graph and the output is a set of at most 2(n 1) edges that proves the graph is strongly connected, provided it is. It follows from the discussions below that such sets always exist. Theorem 12 Given the adjacency matrix M of a directed graph G and a vertex v0, the algorithm below uses O( nm log n) queries to M and outputs a directed tree A E rooted at v0. With probability at least 9 10 , A spans all of G, provided such a spanning tree exists. Proof: Consider the following simple algorithm. 1. Initially the edge set A is empty. 2. Let S = {v0} be a set of reachable vertices, and T = {v0} a stack of vertices to be processed. 3. While T 6= {} do (a) Let u be the top most vertex of stack T . (b) Search for a neighbor v of u not in S. Use the version of the search algorithm that uses O( p d + u log n) queries and outputs a solution with probability at least 1 1 20n , provided one exists. (c) If succeed, add (u, v) to A, add v to S, and push v onto T . (d) Otherwise, remove u from T . 4. Return edge set A. For any vertex u let b + u the out-degree in the tree A produced by the algorithm. Then the total number of queries spent in finding the b + u neighbors of u is in the order of Pb + u t=1 q d + u /t log n which is in O( p b + u d + u log n). Summing over all vertices u this gives X uV q b + u d + u log n sX u b + u sX u d + u p log n = O( p nm log n). The first inequality follows from the general statement that the inner product of two vectors is upper bounded by the product of their 2-norms. The second inequality uses the fact that a tree has only O(n) edges. The algorithm spends in addition O( P u p d + u log n) queries for the unsuccessful neighbor searches, but this is dominated by the previous cost. The overall error probability is upper bounded by 1 10 since each of the at most 2n 1 searches has error at most 1 20n . Theorem 12 implies that Strong Connectivity, too, can be solved using an expected number of queries in O(n 3/2 log n) in the adjacency matrix model. In fact the previous algorithm can be used for the matrix model with m = n(n 1). We first check that some fixed vertex v0 can reach any other vertex u, producing some spanning tree rooted at v0. We then check that all vertices can reach v0 by repeating the previous step on the transposed adjacency matrix. The two stages produces a set at most 2(n 1) edges that proves the graph is strongly connected. This is not possible in array model, since we store for any vertex u only the neighbors at which edges are pointing, and there is no easy access to vertices which are connected with a directed edge to u. The following theorem circumvents this obstacle. Theorem 13 Given a directed graph G in the array model and a vertex v0, the algorithm below uses O( nm log n) queries and outputs an edge set E E covering v0. If G is strongly connected, then with probability at least 1 4 E is strongly connected. Proof: In a first stage we use the previous algorithm to construct a directed depth first spanning tree A E rooted in v0. Assume vertexes to be named according to the order they are added to T . Then in a second stage we search for every vertex vi V , the neighbor vj with smallest index. The result is a set of backward edges B E. We claim that the graph G(V, E) is strongly connected iff its subgraph G (V, A B) is strongly connected. Clearly if G is strongly connected then so is G since A B E. Therefore to show the converse assume G strongly connected. For a proof by contradiction let vi be the vertex with smallest index, which is not connected to v0 in G . However by assumption there is a path in G from vi to v0. Let (vl , vl ) be its first edge with l i and l < i. We use the following property of depth first search. Lemma 14 Let vl and vl be two vertexes in the graph G with l < l . If there is a path from vl to vl in G(V, E) then vl is in the subtree of G(V, A) with root vl. Therefore we can replace in the original path the portion from vi to vl by a path only using edges from A. Let vl be the neighbor of vl with smallest index. Clearly l l < i. By the choice of vi , there exists a path from vl to v0 in G . Together this gives a path from vi to v0 in G contradicting the assumption and therefore concluding the correctness of the algorithm. Now we analyze the complexity. During the first stage, set A is computed in time O( p n3/2 log n). The second stage can be done with O( nm) queries using the minima finding for the mapping from an edge number in [1, m] to the source-target vertex pair. Both stages can be made succeed with probability at least 7/8.

6.1 The matrix model Here all we need is to construct an oriented tree, rooted in some vertex v0, and it not need to be depth-first. We want this tree to cover all vertices reachable by v0. There is a tricky method for constructing such a tree with bounded error, without a log-factor in the running time as the previous algorithm. The idea is to classify vertices covered by the current tree, into sets T0, . . . , Tq such that the confidence that vertices from Ti have no new neighbors is increasing with i. Whenever a search of an edge (u, v) with u R and v 6 T0 . . . Tq is successful, for some subset R Ti , the vertices R and u will be moved into T0, otherwise, R will be moved into Ti+1. We make it formal now. 1. Let S be the tree consisting of the single vertex v0. Let be the partitioning of the vertex set covered by S into T0 = {v0} and T1 = . . . = Tq = {}, for q = log2 (n) + 1. 2. While there is a set Ti with |Ti | 2 i do (a) Let i be the smallest index such that |Ti | 2 i . (b) If |Ti | < 2 i+1 , R = Ti otherwise R is an arbitrary subset of Ti with |R| = 2i . (c) Remove R from Ti . (d) Search for an edge (u, v) with u R and v 6 S in search space of size O(2in) with the version of the quantum search procedure which uses O(23i/4 n) queries and find a solution with probability 1 1/2 2i+2 provided such an edge exists. (e) If the search was successful, add (u, v) to S and R {v0} to T0, otherwise add R to Ti+1. 3. Output S. Now we show some properties of the algorithm. For convenience we define tj = |T0| + |T1| + . . . + |Tj|. Lemma 15 At the beginning and the end of an iteration we have the following invariant. Let k be the smallest index and be the largest index of a non-empty set Tj . Then |T| 2 1 (1) k j < : tj 2 j (2) Proof: by induction on the iterations of the algorithm. Initially, when T0 is the unique non-empty set, the claim holds. Assume the claim holds before an iteration. First observe that by the induction assumption (2) if k < then |Tk| 2 k , and so the index chosen by the algorithm is always i = k. Now if the search was successful, we have already t0 2 k , and therefore tj 2 k for all 0 j k and for all j > k, tj increased by one, preserving condition (2). If the search was not successful, then by the choice of R, after the iteration either tk = 0 or tk 2 k . For all values j > k, tj is not modified, preserving condition (2). Condition (1) is preserved because after decreasing every set Tj satisfies either |Tj | = 0 or |Tj| 2 j , and because whenever a set Tj becomes non-empty, it contains at least 2j1 elements. As a consequence, when the algorithm stops, there is a unique non-empty set Ti and moreover 2i1 |Ti | < 2 i . Also since at most n 1 searches can be successful, the algorithm stops after O(n log n) iterations. Lemma 16 When the algorithm stops, S covers all vertices reachable by v0 with probability at least 2/3. Proof: Suppose the algorithm failed so there is an edge (u, v) with u S and v 6 S which was not found for each call to the search procedure with a set Tj containing u. The probability of this event is the product of the failure probability over all calls. It is roughly upper bounded by the failure probability of the last 6.1 The matrix model Here all we need is to construct an oriented tree, rooted in some vertex v0, and it not need to be depth-first. We want this tree to cover all vertices reachable by v0. There is a tricky method for constructing such a tree with bounded error, without a log-factor in the running time as the previous algorithm. The idea is to classify vertices covered by the current tree, into sets T0, . . . , Tq such that the confidence that vertices from Ti have no new neighbors is increasing with i. Whenever a search of an edge (u, v) with u R and v 6 T0 . . . Tq is successful, for some subset R Ti , the vertices R and u will be moved into T0, otherwise, R will be moved into Ti+1. We make it formal now. 1. Let S be the tree consisting of the single vertex v0. Let be the partitioning of the vertex set covered by S into T0 = {v0} and T1 = . . . = Tq = {}, for q = log2 (n) + 1. 2. While there is a set Ti with |Ti | 2 i do (a) Let i be the smallest index such that |Ti | 2 i . (b) If |Ti | < 2 i+1 , R = Ti otherwise R is an arbitrary subset of Ti with |R| = 2i . (c) Remove R from Ti . (d) Search for an edge (u, v) with u R and v 6 S in search space of size O(2in) with the version of the quantum search procedure which uses O(23i/4 n) queries and find a solution with probability 1 1/2 2i+2 provided such an edge exists. (e) If the search was successful, add (u, v) to S and R {v0} to T0, otherwise add R to Ti+1. 3. Output S. Now we show some properties of the algorithm. For convenience we define tj = |T0| + |T1| + . . . + |Tj|. Lemma 15 At the beginning and the end of an iteration we have the following invariant. Let k be the smallest index and be the largest index of a non-empty set Tj . Then |T| 2 1 (1) k j < : tj 2 j (2) Proof: by induction on the iterations of the algorithm. Initially, when T0 is the unique non-empty set, the claim holds. Assume the claim holds before an iteration. First observe that by the induction assumption (2) if k < then |Tk| 2 k , and so the index chosen by the algorithm is always i = k. Now if the search was successful, we have already t0 2 k , and therefore tj 2 k for all 0 j k and for all j > k, tj increased by one, preserving condition (2). If the search was not successful, then by the choice of R, after the iteration either tk = 0 or tk 2 k . For all values j > k, tj is not modified, preserving condition (2). Condition (1) is preserved because after decreasing every set Tj satisfies either |Tj | = 0 or |Tj| 2 j , and because whenever a set Tj becomes non-empty, it contains at least 2j1 elements. As a consequence, when the algorithm stops, there is a unique non-empty set Ti and moreover 2i1 |Ti | < 2 i . Also since at most n 1 searches can be successful, the algorithm stops after O(n log n) iterations. Lemma 16 When the algorithm stops, S covers all vertices reachable by v0 with probability at least 2/3. Proof: Suppose the algorithm failed so there is an edge (u, v) with u S and v 6 S which was not found for each call to the search procedure with a set Tj containing u. The probability of this event is the product of the failure probability over all calls. It is roughly upper bounded by the failure probability of the last 6.1 The matrix model Here all we need is to construct an oriented tree, rooted in some vertex v0, and it not need to be depth-first. We want this tree to cover all vertices reachable by v0. There is a tricky method for constructing such a tree with bounded error, without a log-factor in the running time as the previous algorithm. The idea is to classify vertices covered by the current tree, into sets T0, . . . , Tq such that the confidence that vertices from Ti have no new neighbors is increasing with i. Whenever a search of an edge (u, v) with u R and v 6 T0 . . . Tq is successful, for some subset R Ti , the vertices R and u will be moved into T0, otherwise, R will be moved into Ti+1. We make it formal now. 1. Let S be the tree consisting of the single vertex v0. Let be the partitioning of the vertex set covered by S into T0 = {v0} and T1 = . . . = Tq = {}, for q = log2 (n) + 1. 2. While there is a set Ti with |Ti | 2 i do (a) Let i be the smallest index such that |Ti | 2 i . (b) If |Ti | < 2 i+1 , R = Ti otherwise R is an arbitrary subset of Ti with |R| = 2i . (c) Remove R from Ti . (d) Search for an edge (u, v) with u R and v 6 S in search space of size O(2in) with the version of the quantum search procedure which uses O(23i/4 n) queries and find a solution with probability 1 1/2 2i+2 provided such an edge exists. (e) If the search was successful, add (u, v) to S and R {v0} to T0, otherwise add R to Ti+1. 3. Output S. Now we show some properties of the algorithm. For convenience we define tj = |T0| + |T1| + . . . + |Tj|. Lemma 15 At the beginning and the end of an iteration we have the following invariant. Let k be the smallest index and be the largest index of a non-empty set Tj . Then |T| 2 1 (1) k j < : tj 2 j (2) Proof: by induction on the iterations of the algorithm. Initially, when T0 is the unique non-empty set, the claim holds. Assume the claim holds before an iteration. First observe that by the induction assumption (2) if k < then |Tk| 2 k , and so the index chosen by the algorithm is always i = k. Now if the search was successful, we have already t0 2 k , and therefore tj 2 k for all 0 j k and for all j > k, tj increased by one, preserving condition (2). If the search was not successful, then by the choice of R, after the iteration either tk = 0 or tk 2 k . For all values j > k, tj is not modified, preserving condition (2). Condition (1) is preserved because after decreasing every set Tj satisfies either |Tj | = 0 or |Tj| 2 j , and because whenever a set Tj becomes non-empty, it contains at least 2j1 elements. 6.1 The matrix model Here all we need is to construct an oriented tree, rooted in some vertex v0, and it not need to be depth-first. We want this tree to cover all vertices reachable by v0. There is a tricky method for constructing such a tree with bounded error, without a log-factor in the running time as the previous algorithm. The idea is to classify vertices covered by the current tree, into sets T0, . . . , Tq such that the confidence that vertices from Ti have no new neighbors is increasing with i. Whenever a search of an edge (u, v) with u R and v 6 T0 . . . Tq is successful, for some subset R Ti , the vertices R and u will be moved into T0, otherwise, R will be moved into Ti+1. We make it formal now. 1. Let S be the tree consisting of the single vertex v0. Let be the partitioning of the vertex set covered by S into T0 = {v0} and T1 = . . . = Tq = {}, for q = log2 (n) + 1. 2. While there is a set Ti with |Ti | 2 i do (a) Let i be the smallest index such that |Ti | 2 i . (b) If |Ti | < 2 i+1 , R = Ti otherwise R is an arbitrary subset of Ti with |R| = 2i . (c) Remove R from Ti . (d) Search for an edge (u, v) with u R and v 6 S in search space of size O(2in) with the version of the quantum search procedure which uses O(23i/4 n) queries and find a solution with probability 1 1/2 2i+2 provided such an edge exists. (e) If the search was successful, add (u, v) to S and R {v0} to T0, otherwise add R to Ti+1. 3. Output S. Now we show some properties of the algorithm. For convenience we define tj = |T0| + |T1| + . . . + |Tj|. Lemma 15 At the beginning and the end of an iteration we have the following invariant. Let k be the smallest index and be the largest index of a non-empty set Tj . Then |T| 2 1 (1) k j < : tj 2 j (2) Proof: by induction on the iterations of the algorithm. Initially, when T0 is the unique non-empty set, the claim holds. Assume the claim holds before an iteration. First observe that by the induction assumption (2) if k < then |Tk| 2 k , and so the index chosen by the algorithm is always i = k. Now if the search was successful, we have already t0 2 k , and therefore tj 2 k for all 0 j k and for all j > k, tj increased by one, preserving condition (2). If the search was not successful, then by the choice of R, after the iteration either tk = 0 or tk 2 k . For all values j > k, tj is not modified, preserving condition (2). Condition (1) is preserved because after decreasing every set Tj satisfies either |Tj | = 0 or |Tj| 2 j , and because whenever a set Tj becomes non-empty, it contains at least 2j1 elements.

Lemma 16 When the algorithm stops, S covers all vertices reachable by v0 with probability at least 2/3. Proof: Suppose the algorithm failed so there is an edge (u, v) with u S and v 6 S which was not found for each call to the search procedure with a set Tj containing u. The probability of this event is the product of the failure probability over all calls. It is roughly upper bounded by the failure probability of the last 12 call, which is at most pi := 2i/2 2i+2 , where i = |S|. Let qi be the probability that the algorithm outputs a vertex set S with |S| = i. Then the failure probability is upper bounded by Xn i=0 qipi n max i=0 pi = p1 1/3. Now we analyze the complexity of the algorithm. Lemma 17 The expected number of queries done by the algorithm is O(|S| n). Proof: To analyze the total number of queries done by the search procedures, we group the calls of the search procedures into sequences of unsuccessful searches ending with a success, plus the last sequence of unsuccessful searches. For the first case, let (u, v) be an arbitrary edge found by the algorithm. Then the probability that it was found when u Ti is upper bounded by the probability that it was not found when u Ti1, which is 1/2 2 i . The cost of this search and the i 1 unsuccessful searches over sets R containing u is order of X i j=0 2 3i/4 n = O(23i/4 n). Therefore the expected cost of finding (u, v) is at most Xq i=0 2 3i/4 2 2 i n = O( n). To complete the analysis we upper bound the total work of all the O(log n) unsuccessful searches which were made after the last successful search. Let i be such that 2i1 |S| < 2 i . There where at most 2i/2 j searches for sets R with |R| = 2j . Therefore the total work is order of X i i=0 2 ij 2 3j/4 n = O(|S| n). This concludes the proof.

Single source shortest path:

Let G be a directed graph with non-negative edge weights and a fixed vertex v0. We want to compute for every vertex v a shortest path from v0 to v. It may happen that it is not unique. Using for example the lexicographical ordering on vertex sequences, we choose to compute a single canonical shortest path. From now on assume that different paths have different lengths. As a result, the union over all vertices v of the shortest paths from v0 to v is a shortest path tree. Let (u, v) be the weight of edge (u, v) and (v0, v) the shortest path length from v0 to v. Classically Single Source Shortest Path may be solved by Dijkstra’s algorithm. It maintains a subtree T with the “shortest path subtree” invariant: for any vertex v T , the shortest path from v0 to v uses only vertices from T . An edge (u, v) is called a border edge (of T ) if u T and v 6 T , and u is called the source vertex, v the target vertex. The cost of (u, v) is (v0, u) + (u, v). Dijkstra’s algorithm starts with T = {v0} and iteratively adds the cheapest border edge to it. Our improvement lays in the selection of the cheapest border edge. We give the algorithm for the array model. Setting m = n 2 implies the required bound for the matrix model. 13 Theorem 18 The bounded error query complexity of single source shortest path in the array model is O( nm log3/2 n). Proof: As in Dijkstra’s algorithm we construct iteratively a tree T , such that for every vertex v T , the shortest path from v0 to v is in T . We also maintain a partition of the vertices covered by T , into a set sequence. Its length is denoted by l. 1. T = {v0}, l = 1, P1 = {v0} 2. Repeat until T covers the graph (a) For Pl compute up to |Pl | cheapest border edges with disjoint target vertices. For this purpose set N = P vPl d(v), and number all edges with source in Pl from 1 to N. Define the functions f : [N] N and g : [N] V , where g(i) is target vertex of the i th edge and f(i) is its weight if g(i) 6 T and otherwise. Apply the algorithm of section 3 on f and g with d = |Pl | to find the d lowest cost edges with distinct target vertices. Let Al be the resulting edge set. (b) Let (u, v) be the minimal weighted edge of A1 . . .Al with v 6 P1 . . . Pl . Set T = T {(u, v)}, Pl+1 = {v} and l = l + 1. (c) As long as l 2 and |Pl1| = |Pl |, merge Pl into Pl1, and set l = l 1. All steps but 2(b) constructed a vertex set sequence P1, . . . , Pl , the cardinality of each being a power of 2, and of strictly decreasing sizes. Figure 2 shows an example of this partitioning of the vertices in T . v0 v1 v2 v3 v4 v5 v6 v0 v1 v2 v3 v4 v5 v6 A2 A1 A3 P1 P2 P3 P1 P2 P3 Figure 2: Left: example of the set decomposition for |T | = 7 into powers of 2. Right: example of corresponding edge sets. The closest border edge of T belongs to one of A1, A2, A3. Therefore each set Pi is strictly larger than the union of all the following sets, since Pk1 i=0 2 i = 2k 1. If Ai contained |Pi | edges, than at least one of them has its target vertex outside of P1 . . . , Pl . Let (u, v) be the cheapest border edge of T . Let Pi be the vertex set containing u. Then Ai must contain this edge, and step 2(b) selects it. Only step 2(a) generates queries to the graph. What is the total number of queries related to sets Pi of some size s? There are at most n/s sets of this size s. Therefore total work is order of Pn/s j=1 smj, where mj is the number of edges with source in the j th vertex set. We have Pn/s j=1 mj = m. This worst case is when mi = sm/n. In that case the total work is O( nm) for the fixed size s. There are log n different set sizes in the algorithm. Each of the O(n log n) queries to the minimum finding procedures should succeed with probability 1 1/2n log n at least This introduces a O(log n) factor, see Remark 1, and we obtain the claimed complexity.