New Algorithms for Maximum Disjoint Paths Based on Tree-Likeness

We study the classical NP-hard problems of finding maximum-size subsets from given sets of $k$ terminal pairs that can be routed via edge-disjoint paths (MaxEDP) or node-disjoint paths (MaxNDP) in a given graph. The approximability of MaxEDP/NDP is currently not well understood; the best known lower bound is $\Omega(\log^{1/2-\epsilon}{n})$, assuming NP$~\not\subseteq~$ZPTIME$(n^{\mathrm{poly}\log n})$. This constitutes a significant gap to the best known approximation upper bound of $O(\sqrt{n})$ due to Chekuri et al. (2006) and closing this gap is currently one of the big open problems in approximation algorithms. In their seminal paper, Raghavan and Thompson (Combinatorica, 1987) introduce the technique of randomized rounding for LPs; their technique gives an $O(1)$-approximation when edges (or nodes) may be used by $O(\frac{\log n}{\log\log n})$ paths. In this paper, we strengthen the above fundamental results. We provide new bounds formulated in terms of the feedback vertex set number $r$ of a graph, which measures its vertex deletion distance to a forest. In particular, we obtain the following. * For MaxEDP, we give an $O(\sqrt{r}\cdot \log^{1.5}{kr})$-approximation algorithm. As $r\leq n$, up to logarithmic factors, our result strengthens the best known ratio $O(\sqrt{n})$ due to Chekuri et al. * Further, we show how to route $\Omega(\mathrm{OPT})$ pairs with congestion $O(\frac{\log{kr}}{\log\log{kr}})$, strengthening the bound obtained by the classic approach of Raghavan and Thompson. * For MaxNDP, we give an algorithm that gives the optimal answer in time $(k+r)^{O(r)}\cdot n$. If $r$ is at most triple-exponential in $k$, this improves the best known algorithm for MaxNDP with parameter $k$, by Kawarabayashi and Wollan (STOC 2010). We complement these positive results by proving that MaxEDP is NP-hard even for $r=1$, and MaxNDP is W$[1]$-hard for parameter $r$.

In this paper, we strengthen the fundamental results above. We provide new bounds formulated in terms of the feedback vertex set number $r$ of a graph, which measures its vertex deletion distance to a forest. In particular, we obtain the following results:
- For MaxEDP, we give an $O(\sqrt{r}\log(kr))$-approximation algorithm. Up to a logarithmic factor, our result strengthens the best known ratio $O(\sqrt{n})$ due to Chekuri et al., as $r \le n$.
- Further, we show how to route $\Omega(\mathrm{OPT}^*)$ pairs with congestion bounded by $O(\log(kr)/\log\log(kr))$, strengthening the bound obtained by the classic approach of Raghavan and Thompson.
- For MaxNDP, we give an algorithm that gives the optimal answer in time $(k+r)^{O(r)} \cdot n$. This is a substantial improvement on the run time of $2^k r^{O(r)} \cdot n$, which can be obtained via an algorithm by Scheffler.
We complement these positive results by proving that MaxEDP is NP-hard even for $r = 1$, and MaxNDP is W[1]-hard when $r$ is the parameter. This shows that neither problem is fixed-parameter tractable in $r$ unless FPT = W[1] and that our approximability results are relevant even for very small constant values of $r$.

Introduction
In this paper, we study disjoint paths routing problems. In this setting, we are given an undirected graph $G$ and a collection $\mathcal{M} = \{(s_1,t_1),\dots,(s_k,t_k)\}$ of vertex pairs, called terminal pairs, that can be thought of as source-destination pairs. The goal is to select a maximum-sized subset $\mathcal{M}' \subseteq \mathcal{M}$ of the pairs that can be feasibly routed, where a routing of $\mathcal{M}'$ is a collection $\mathcal{P}$ of paths such that, for each pair $(s_i,t_i) \in \mathcal{M}'$, there is a path in $\mathcal{P}$ connecting $s_i$ to $t_i$. In the Maximum Edge Disjoint Paths (MaxEDP) problem, a routing $\mathcal{P}$ is feasible if its paths are pairwise edge-disjoint, and in the Maximum Node Disjoint Paths (MaxNDP) problem, a routing $\mathcal{P}$ is feasible if its paths are pairwise node-disjoint. Throughout this paper, a solution to MaxEDP or MaxNDP is a feasible routing $\mathcal{P}$ of a subset $\mathcal{M}' \subseteq \mathcal{M}$. Disjoint paths problems are fundamental problems with a long history and significant connections to optimization and structural graph theory. The decision versions EDP of MaxEDP and NDP of MaxNDP ask whether all of the pairs can be routed. When the number of pairs is part of the input, EDP and NDP are NP-complete [22,29]. In undirected graphs, MaxEDP and MaxNDP are solvable in polynomial time when the number of pairs is a fixed constant; this is a very deep result of Robertson and Seymour [42] that builds on several fundamental results in structural graph theory from their graph minors project.
In this paper, we consider the optimization problems MaxEDP and MaxNDP when the number of pairs is part of the input. In this setting, the best approximation ratio for MaxEDP is achieved by an $O(\sqrt{n})$-approximation algorithm [11,35], that is, by an algorithm that routes $\Omega(\mathrm{OPT}/\sqrt{n})$ pairs, where OPT is the number of pairs in an optimum routing and $n$ is the number of nodes. However, the best known lower bound for undirected graphs is only $2^{\Omega(\sqrt{\log n})}$, assuming NP $\not\subseteq$ DTIME$(n^{O(\log n)})$ [19]. Bridging this gap is a fundamental open problem that seems quite challenging.
Most of the results for routing on disjoint paths use a natural multi-commodity flow relaxation as a starting point. A well-known integrality gap instance due to Garg et al. [26] shows that this relaxation has an integrality gap of $\Omega(\sqrt{n})$, and this is the main obstacle for improving the $O(\sqrt{n})$-approximation ratio in general graphs. This led Chekuri et al. [15] to study the approximability of MaxEDP with respect to the treewidth of the underlying graph. In particular, they pose the following conjecture:
Conjecture 1 [12] The integrality gap of the standard multi-commodity flow relaxation for MaxEDP is $\Theta(w)$, where $w$ is the treewidth of the graph.
Recently, Ene et al. [21] showed that MaxEDP admits an $O(w^3)$-approximation algorithm on graphs of treewidth at most $w$. Theirs is the best known approximation ratio in terms of $w$, improving on an earlier $O(w \cdot 3^w)$-approximation algorithm due to Chekuri et al. [15]. This suggests that the problem is more tractable on "tree-like" graphs.
However, for $w = \omega(n^{1/6})$, the bound is weaker than the bound of $O(\sqrt{n})$. In fact, EDP remains NP-hard even for graphs of constant treewidth, namely treewidth $w = 2$ [39]. This further rules out the existence of a fixed-parameter algorithm for MaxEDP parameterized by treewidth, assuming P $\neq$ NP. Therefore, to obtain fixed-parameter tractability results as well as better approximation guarantees, one needs to resort to parameters stronger than treewidth.
Another route to bridge the large gap between approximation lower and upper bounds for MaxEDP is to allow the paths to have congestion $c$: that is, instead of requiring the routed paths to be pairwise disjoint, at most $c$ paths can use an edge. Equivalently, we can think of each edge as having capacity $c$; thus, on unit-capacity graphs we ask for solutions without congestion. In their groundbreaking work, Raghavan and Thompson [40] introduced the technique of randomized rounding of LPs to obtain polynomial-time approximation algorithms for combinatorial problems. Their approach makes it possible to route $\Omega(\mathrm{OPT}^*)$ pairs with congestion $O(\log n/\log\log n)$, where $\mathrm{OPT}^*$ denotes the value of an optimum solution to the multi-commodity flow relaxation. An extensive line of research on routing with congestion [2,17,31] has culminated in a $\log^{O(1)} k$-approximation algorithm with congestion 2 for MaxEDP [20]. A slightly weaker result also holds for MaxNDP [10].

Motivation and contribution
The goal of this work is to study disjoint paths problems under another natural measure for how "far" a graph is from being a tree. In particular, we propose to examine MaxEDP and MaxNDP under the feedback vertex set number. It denotes the smallest size $r$ of a feedback vertex set of a graph $G$, which is a subset $R$ of nodes for which $G - R$ is a forest. Note that the treewidth of $G$ is at most $r + 1$. Therefore, given the NP-hardness of EDP for treewidth $w = 2$ and the current gap between the best known upper bound $O(w^3)$ and the linear upper bound suggested by Conjecture 1, it is interesting to study the stronger restriction of bounding the feedback vertex set number $r$ of the input graph. Our approach is further motivated by the fact that MaxEDP is efficiently solvable on trees by means of the algorithm of Garg et al. [26]. Similarly, MaxNDP is easy on trees (see Theorem 3). Throughout this work, the parameter $r$ will denote the feedback vertex set number of a graph.
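The definition of a feedback vertex set can be made concrete with a small check. The following is only an illustrative sketch, not part of the algorithms of this paper; the function name and the encoding of the graph (nodes labelled 0..n-1, edges as pairs) are assumptions made for the example. It tests with a union-find structure whether deleting a candidate set R leaves a forest.

```python
def is_feedback_vertex_set(n, edges, R):
    """Return True if deleting the nodes in R leaves a forest.

    n     -- number of nodes, labelled 0..n-1 (hypothetical encoding)
    edges -- iterable of undirected edges (u, v)
    R     -- candidate feedback vertex set
    """
    R = set(R)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if u in R or v in R:
            continue  # the edge disappears together with its deleted endpoint
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v are already connected: the edge closes a cycle
        parent[ru] = rv
    return True
```

For two triangles that share node 0, the set {0} passes the check, whereas {1} does not, because the second triangle survives.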
Our main insight is that one can in fact obtain bounds in terms of r that either strengthen the best known bounds or are almost tight (see Table 1). It therefore seems that the parameter r correlates quite well with the "difficulty" of disjoint paths problems.
Our first result allows the paths to have small congestion: in this setting, we strengthen the result, obtained by the classic randomized LP-rounding approach of Raghavan and Thompson [40], that one can always route $\Omega(\mathrm{OPT}^*)$ pairs with congestion $O(\log n/\log\log n)$ with constant probability.
Theorem 1 There is a polynomial-time algorithm for MaxEDP that produces, with constant probability, a routing of $\Omega(\mathrm{OPT}^*)$ paths with congestion $O(\log(kr)/\log\log(kr))$, where $\mathrm{OPT}^*$ is the value of an optimum solution to the multi-commodity flow relaxation, $k$ is the number of terminal pairs and $r$ is the feedback vertex set number.
In other words, we show that there is an $O(1)$-approximation algorithm for MaxEDP with congestion $O(\log(kr)/\log\log(kr))$.
Our second main result builds upon Theorem 1 and uses it as a subroutine. We show how to use a routing for MaxEDP with low congestion to obtain a polynomial-time approximation algorithm for MaxEDP without congestion that performs well in terms of $r$.
Theorem 2 There is a polynomial-time algorithm for MaxEDP that produces, with constant probability, a routing of $\mathrm{OPT}^*/O(\sqrt{r}\log(kr))$ paths with no congestion, where $\mathrm{OPT}^*$ is the value of an optimum solution to the multi-commodity flow relaxation, $k$ is the number of terminal pairs and $r$ is the feedback vertex set number.
In particular, our algorithm strengthens the best known approximation algorithm for MaxEDP on general graphs [11], as $r \le n$ always holds, and indeed it matches that algorithm's performance up to a logarithmic factor. Substantially improving upon our bounds would also improve the current state of the art of MaxEDP. Conversely, the result implies that it suffices to study graphs with close to linear feedback vertex set number in order to improve the currently best upper bound of $O(\sqrt{n})$ on the approximation ratio [11].
Our algorithmic approaches harness the forest structure of G − R for any feedback vertex set R. However, the technical challenge comes from the fact that the edge set running between G − R and R is unrestricted. Therefore, the "interaction" between R and G − R is non-trivial, and flow paths may run between the two parts in an arbitrary manner and multiple times. In fact, we show that MaxEDP is already NP-hard if R consists of a single node (Theorem 5); this contrasts the efficient solvability on forests [26].
In order to overcome the technical hurdles, we propose several new concepts, which we believe could be of interest in future studies of disjoint paths or routing problems.
In the randomized rounding approach of Raghavan and Thompson [40], it is shown that the probability that the congestion on any fixed edge is larger than $c \log n/\log\log n$ for some constant $c$ is at most $1/n^{O(1)}$. Combining this with the fact that there are at most $n^2$ edges yields that every edge has bounded congestion with high probability. The number of edges in the graph may, however, be unbounded in terms of $r$ and $k$. Hence, in order to prove Theorem 1, we propose a non-trivial preprocessing step of the optimum LP solution that is applied prior to the randomized rounding. In this step, we aggregate the flow paths by a careful rerouting so that the flow "concentrates" in $O(kr^2)$ nodes (so-called hot spots) in the sense that if all edges incident on hot spots have low congestion, then so have all edges in the graph. Unfortunately, for any such hot spot the number of incident edges carrying flow may still be unbounded in terms of $k$ and $r$. We are, however, able to give a refined probabilistic analysis that suitably relates the probability of exceeding the congestion bound to the amount of flow on the respective edge. Since the total amount of flow traversing any given hot spot is at most $k$, the probability that there is an edge incident on this hot spot that violates the congestion bound is inverse polynomial in $r$ and $k$.
The known O( √ n)-approximation algorithm for MaxEDP by Chekuri et al. [11] employs a clever LP-rounding approach. If there are many long flow paths in the LP solution, then there must be a single node carrying a significant fraction of the total flow and a good fraction of this flow can be realized by integral paths by solving a single-source flow problem. If the LP solution contains many short flow paths, then greedily routing these short paths yields the bound. Essentially, this follows from the fact that routing a short path blocks only a small amount of flow. In order to prove Theorem 2, we also distinguish two cases. We are interested, however, in the number of nodes in R that a flow path is visiting rather than in its length. In the first case, there are many paths, each of which is visiting a large number of nodes in R. Here, we reduce to a single-source flow problem in a similar way to the approach of Chekuri et al. The second case where a majority of the flow paths visit only a few nodes in R turns out to be more challenging, since any such path may still visit an unbounded number of edges in terms of k and r . We use two main ingredients to overcome these difficulties. First, we apply our Theorem 1 as a building block to obtain a solution with logarithmic congestion while losing only a constant factor in the approximation ratio. Secondly, we introduce the concept of irreducible routings with low congestion which allows us to exploit the structural properties of the graph and the congestion property to identify a sufficiently large number of flow paths blocking only a small amount of flow.
Note that the natural greedy approach of always routing the shortest conflict-free path gives only an approximation ratio of O( √ m) for MaxEDP, where m is the number of edges. We believe that it is non-trivial to obtain our bounds via a more direct or purely combinatorial approach.
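For comparison, the natural greedy approach mentioned above can be sketched as follows. This is a minimal illustration under assumptions of our own (a simple undirected graph given as a dictionary of neighbour sets, and a hypothetical function name); it is not one of the algorithms analysed in this paper. In each round it routes the pair whose shortest path in the remaining graph is shortest and then deletes the edges of that path.

```python
from collections import deque


def greedy_edp(adj, pairs):
    """Greedy heuristic for MaxEDP on a simple undirected graph.

    adj   -- dict mapping every node to the set of its neighbours
    pairs -- list of terminal pairs (s, t)
    Returns a list of pairwise edge-disjoint paths, each given as a node list.
    """
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    remaining = list(pairs)
    routed = []

    def shortest_path(s, t):
        # Plain BFS in the remaining graph; None if t is unreachable from s.
        prev = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                path = []
                while u is not None:
                    path.append(u)
                    u = prev[u]
                return path[::-1]
            for v in adj[u]:
                if v not in prev:
                    prev[v] = u
                    queue.append(v)
        return None

    while remaining:
        candidates = [(p, shortest_path(*p)) for p in remaining]
        candidates = [(p, path) for p, path in candidates if path is not None]
        if not candidates:
            break  # no remaining pair can be connected any more
        pair, path = min(candidates, key=lambda c: len(c[1]))
        routed.append(path)
        remaining.remove(pair)
        for u, v in zip(path, path[1:]):
            # Delete the used edges so that later paths stay edge-disjoint.
            adj[u].discard(v)
            adj[v].discard(u)
    return routed
```

On the path 0-1-2-3 with pairs (0, 3) and (1, 2), the sketch routes the short pair (1, 2) first, after which (0, 3) can no longer be connected.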
Our third result is a fixed-parameter algorithm for MaxNDP in $k + r$ (Theorem 3); its run time is $(k + r)^{O(r)} \cdot n$, which is polynomial for constant $r$. We also note that, for small $r$, our algorithm is asymptotically significantly faster than the fastest known algorithm for NDP, by Kawarabayashi and Wollan [30], which requires time at least quadruple-exponential in $k$ [1]. Namely, if $r$ is asymptotically less than triple-exponential in $k$, our algorithm is asymptotically faster than theirs. We achieve this result by the idea of so-called essential pairs and realizations, which characterizes the "interaction" between the feedback vertex set $R$ and the paths in an optimum solution. Note that in our algorithm of Theorem 3 the parameter $k$ does not appear in the exponent of the run time at all. Hence, whenever $r = o(k/\log k)$, our algorithm is asymptotically faster than reducing MaxNDP to NDP by guessing the subset of pairs to be routed (at an expense of $2^k$ in the run time) and using Scheffler's [43] algorithm for NDP with run time $2^{O(r \log r)} \cdot n$; for $r = \Omega(k/\log k)$, our algorithm is asymptotically not slower. Once a fixed-parameter algorithm for a problem has been obtained, the question of whether a polynomial-size kernel exists comes up. Here we note that MaxNDP does not admit a polynomial kernel for the combined parameter $k + r$, unless NP $\subseteq$ coNP/poly [7].
Another natural question is whether the run time $f(k, r) \cdot n$ in Theorem 3 can be improved to $f(r) \cdot n^{O(1)}$. We answer this question in the negative, ruling out the existence of a fixed-parameter algorithm for MaxNDP parameterized by $r$ (assuming FPT $\neq$ W[1]):

Theorem 4 MaxNDP in unit-capacity graphs is W[1]-hard parameterized by feedback vertex set number.
This contrasts the known result that NDP is fixed-parameter tractable in feedback vertex set number [43]-which further stresses the relevance of understanding this parameter.
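For MaxEDP, we prove that the situation is, in a sense, even worse: the problem is NP-hard already if the feedback vertex set consists of a single node, that is, for $r = 1$ (Theorem 5). This theorem also shows that our algorithms are relevant for small values of $r$, and that they nicely complement the NP-hardness for MaxEDP in capacitated trees [26].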
Our results are summarized in Table 1.
Related work
Our study of the parameter feedback vertex set number is in line with the general attempt to obtain bounds for MaxEDP (or related problems) that are independent of the input size. Besides the above-mentioned works that provide bounds in terms of the treewidth of the input graph, Günlük [27] and Chekuri et al. [16] give bounds on the flow-cut gap for the closely related integer multi-commodity flow problem; their bounds are logarithmic with respect to the vertex cover number of a graph. This improved upon earlier bounds of $O(\log n)$ [36] and $O(\log k)$ [4,37]. As every vertex cover is in particular a feedback vertex set of a graph, our results for disjoint path problems address a generalization of graphs with bounded vertex cover number. Bodlaender et al. [7] showed that NDP does not admit a polynomial kernel parameterized by vertex cover number and the number $k$ of terminal pairs, unless NP $\subseteq$ coNP/poly; therefore, NDP is unlikely to admit a polynomial kernel in $k + r$ either. Ene et al. [21] showed that MaxNDP is W[1]-hard parameterized by tree-depth, which is another restriction of treewidth that is incomparable to feedback vertex set number.
The basic gap in understanding the approximability of MaxEDP has led to several improved results for special graph classes, and our results can also be seen in this light. For example, polylogarithmic approximation algorithms are known for graphs whose global minimum cut value is $\Omega(\log^5 n)$ [41], for bounded-degree expanders [8,9,25,32,36], and for Eulerian planar or 4-connected planar graphs [31]. Constant factor approximation algorithms are known for capacitated trees [13,26], grids and grid-like graphs [3,5,33,34]. For planar graphs, there is a constant-factor approximation algorithm with congestion 2 [44]. Very recently, Chuzhoy et al. [18] gave an $\tilde{O}(n^{9/19})$-approximation algorithm for MaxNDP on planar graphs. However, improving the $O(\sqrt{n})$-approximation algorithm for MaxEDP remains elusive even for planar graphs.

Preliminaries
We use standard graph theoretic notation. For a graph G, let V (G) denote its vertex set and E(G) its edge set. The length of a path is the number of its edges. A feedback vertex set of a graph G is a set R ⊆ V (G) such that G − R is a forest. A minor of a graph G is a graph H that is obtained by successively contracting edges from a subgraph of G (and deleting any occurring loops). A class G of graphs is minor-closed if for any graph in G also all its minors belong to G.
For an instance (G, M) of MaxEDP/MaxNDP, we refer to the vertices participating in the pairs M as terminals. It is convenient to assume that M forms a matching on the terminals; this can be ensured by making several copies of the terminals and attaching them as leaves. Hence, we can also assume that all terminals are leaves.

Multi-commodity flow relaxation
We use the following standard multi-commodity flow relaxation for MaxEDP that we will call MaxEDP LP (there is an analogous relaxation for MaxNDP). We use $\mathcal{P}(u, v)$ to denote the set of all paths in $G$ from $u$ to $v$, for each pair $(u, v)$ of nodes. Since the pairs in $\mathcal{M}$ form a matching, the sets $\mathcal{P}(s_i, t_i)$ are pairwise disjoint. Let $\mathcal{P} = \bigcup_{i=1}^{k} \mathcal{P}(s_i, t_i)$. The LP has a variable $f(P)$ for each path $P \in \mathcal{P}$ representing the amount of flow on $P$. For each pair $(s_i, t_i) \in \mathcal{M}$, the LP has a variable $x_i$ denoting the total amount of flow routed for the pair (in the corresponding integer program, $x_i$ denotes whether the pair is routed or not). The LP imposes the constraint that there is a flow from $s_i$ to $t_i$ of value $x_i$. Additionally, the LP has constraints that ensure that the total amount of flow on paths using a given edge (respectively node for MaxNDP) is at most 1.
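Written out, the relaxation described above takes the following path-based form. This display is assembled from the verbal description and is meant as a sketch; in particular, the explicit upper bound $x_i \le 1$ is an assumption of this presentation (it is implied by the edge constraints when every terminal is a leaf).

```latex
\begin{align*}
\text{maximize}\quad   & \sum_{i=1}^{k} x_i \\
\text{subject to}\quad & \sum_{P \in \mathcal{P}(s_i,t_i)} f(P) = x_i
                         && \text{for all } i \in \{1,\dots,k\},\\
                       & \sum_{P \in \mathcal{P}\colon e \in P} f(P) \le 1
                         && \text{for all } e \in E(G),\\
                       & 0 \le x_i \le 1
                         && \text{for all } i \in \{1,\dots,k\},\\
                       & f(P) \ge 0
                         && \text{for all } P \in \mathcal{P}.
\end{align*}
```

For MaxNDP, the edge constraints are replaced by the analogous constraints $\sum_{P \in \mathcal{P}\colon v \in P} f(P) \le 1$ for all nodes $v \in V(G)$.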
It is well-known that the relaxation MaxEDP LP can be solved in polynomial time, since there is an efficient separation oracle for the dual LP (alternatively, one can write a compact relaxation). We use ( f, x) to denote a feasible solution to MaxEDP LP for an instance (G, M) of MaxEDP.
As noted in the introduction, MaxEDP LP has an integrality gap of $\Omega(\sqrt{n})$ as shown by Garg et al. [26]. The integrality gap instance on a $\sqrt{n} \times \sqrt{n}$ grid (of treewidth $\Theta(\sqrt{n})$) exploits a topological obstruction in the plane that prevents a large integral routing; see Fig. 1.
Fig. 1 An instance with an integrality gap of $\Omega(\sqrt{n})$ for MaxEDP [26]: Any integral routing routes at most one pair, whereas a fractional multi-commodity flow can send 1/2 unit of flow for each pair $(s_i, t_i)$ along the canonical path from $s_i$ to $t_i$ in the grid
We will use the following result by Chekuri et al. [11,Sect. 3.1]; see also Proposition 3.3 of Chekuri et al. [14].
Proposition 1 (Chekuri et al. [11]) Let $(f, x)$ be a fractional solution to the LP relaxation of a MaxEDP instance $(G, \mathcal{M})$. If some node $v$ is contained in all flow paths of $f$, then we can find an integral routing of size at least $\sum_i x_i/12$ in polynomial time.
As a corollary of Theorem 2, we immediately obtain the following proposition about the integrality gap of MaxEDP LP.

Corollary 1 The integrality gap of the multi-commodity flow relaxation for MaxEDP with $k$ terminal pairs is $O(\sqrt{r}\log(kr))$ for graphs with feedback vertex set number $r$.
Let $f$ be a multi-commodity flow assigning to each path $P \in \mathcal{P}$ a nonnegative flow value $f(P)$. The flow $f$ is said to have congestion $c$ if it satisfies a modification of MaxEDP LP where we replace, for each edge $e \in E(G)$, the constraint $\sum_{P \in \mathcal{P}\colon e \in P} f(P) \le 1$ with $\sum_{P \in \mathcal{P}\colon e \in P} f(P) \le c$. In the particular case where $f$ is integral we also speak of a routing $f$ with congestion $c$.
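To illustrate the definition, the congestion of a given set of paths can be computed by accumulating the load on every edge. The helper below is a hypothetical sketch (its name and signature are not from the paper); it assumes that paths are given as node lists, that the graph is simple and undirected, and that no path uses an edge twice. Optional weights play the role of the flow values $f(P)$.

```python
from collections import Counter


def congestion(paths, weights=None):
    """Largest total weight of paths sharing a single undirected edge.

    paths   -- list of paths, each a sequence of nodes
    weights -- optional flow values f(P); defaults to 1 per path (a routing)
    """
    if weights is None:
        weights = [1] * len(paths)
    load = Counter()
    for path, w in zip(paths, weights):
        for u, v in zip(path, path[1:]):
            # frozenset makes the edge key independent of the direction of travel
            load[frozenset((u, v))] += w
    return max(load.values(), default=0)
```

A routing is feasible for MaxEDP exactly when this value is at most 1.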

Bi-criteria approximation for MaxEDP with low congestion
We present a randomized rounding algorithm that will lead to the proof of Theorem 1. First we will modify a fractional solution to the multi-commodity flow relaxation and then run a randomized rounding procedure.

Algorithm
Consider an instance (G, M) of MaxEDP. Let k denote the number of terminal pairs in M, and let R be a feedback vertex set of G that we construct by taking the union of the terminals in M and any 2-approximate minimum feedback vertex set; note that such an approximation can be obtained in polynomial time [6]. Thus, |R| ≤ 2r + 2k.
First, solve the corresponding MaxEDP LP. We obtain an optimal extreme point solution $(f, x)$. For each $(s_i, t_i) \in \mathcal{M}$, this gives us a set $\mathcal{P}'(s_i, t_i)$ of positive weighted paths that satisfy the LP constraints. Formally, $\mathcal{P}'(s_i, t_i) = \{P \in \mathcal{P}(s_i, t_i) : f(P) > 0\}$. Since we have an extreme point solution, the number of tight constraints is not smaller than the number of variables. Hence, given the numbers of constraints and variables, the number of constraints that are not tight is polynomially bounded in the input size. Consequently, the same bound holds for the cardinality of the set $\mathcal{P}' = \bigcup_{i=1}^{k} \mathcal{P}'(s_i, t_i)$. In what follows, we will modify $\mathcal{P}'$ and then select an (unweighted) subset $\mathcal{P}'_{Sol}$ of $\mathcal{P}'$ that will form our integral solution.
Each $P \in \mathcal{P}'$ has the form $(r_1, \dots, r_2, \dots, r_\ell)$ where $r_1, \dots, r_\ell$ are the nodes in $R$ that are traversed by $P$ in this order. For every $j$ with $1 \le j \le \ell - 1$, we call the path $(r_j, \dots, r_{j+1})$ a subpath of $P$. For every subpath $P'$ of $P$, we set $f(P') = f(P)$. Let $S$ be the multi-set of all subpaths of all paths in $\mathcal{P}'$. Let $F = G - R$ be the forest obtained by removing $R$.
We now modify some paths in $\mathcal{P}'$, one by one, and at the same time, we incrementally construct a subset $H_{Alg} \subseteq V(F)$ in several steps. We will refer to the nodes in $H_{Alg}$ as hot spots. When the construction of $H_{Alg}$ is complete, every subpath in $S$ will contain at least one hot spot, that is, a node in $H_{Alg}$.
Fig. 2 Example of the flow aggregation step: (a) A subpath $P$ (highlighted in dashed gray) enters a tree (solid black edges) where $h(P)$ (white node) is its closest node to the root. A path $P'$ (highlighted in solid gray) contains a different subpath with the same endpoints $u, v \in R$ as $P$. (b) We reroute $P'$ by replacing its subpath between $u$ and $v$ with a copy of $P$
Initially, let $H_{Alg} = \emptyset$. Consider any tree $T$ in $F$ and fix any of its nodes as a root. Then let $S_T$ be the multi-set of all subpaths in $S$ that, excluding the endpoints, are contained in $T$. For each subpath $P \in S_T$, define its highest node $h(P)$ as the node on $P$ closest to the root. Note that $P \cap T$ equals $P \cap F$ and that $P \cap T$ is a path. Now, pick a subpath $P \in S_T$ that does not contain any node in $H_{Alg}$ and whose highest node $h(P)$ is farthest away from the root. Consider the multi-set $S[P]$ of all subpaths in $S_T$ that are identical to $P$ (but may be subpaths of different flow paths in $\mathcal{P}'$).

Note that the weight $f(S[P])$ of $S[P]$, defined as $\sum_{P' \in S[P]} f(P')$, is at most 1 by the constraints of the LP. Let $u, v \in R$ be the endpoints of $P$. We define $S_{uv}$ as the set of all subpaths in $S \setminus S[P]$ that have $u$ and $v$ as their endpoints and that do not contain any node in $H_{Alg}$.
Intuitively speaking, we now aggregate flow on $P$ by rerouting as much flow as possible from $S_{uv}$ to $P$. To this end, we repeatedly perform the following operation as long as $f(S[P]) < 1$ and $S_{uv} \neq \emptyset$. We pick a path $P'$ in $\mathcal{P}'$ that contains a subpath in $S_{uv}$; see Fig. 2. We reroute flow from $P'$ by creating a new path $P''$ that arises from $P'$ by replacing its subpath between $u$ and $v$ with a new path identical to $P$, and assign it the weight $f(P'')$ equal to $\min\{f(P'), 1 - f(S[P])\}$. Then we set the weight of (the original path) $P'$ to $\max\{0, f(P') + f(S[P]) - 1\}$. We update the sets $\mathcal{P}'$, $\mathcal{P}'(s_i, t_i)$, $S$, $S_T$, $S[P]$ and $S_{uv}$ accordingly.
As soon as f (S[P]) = 1 or S uv = ∅, we mark h(P) as a hot spot and add it to H Alg . Then, we proceed with the next P ∈ S T that does not contain a hot spot and whose highest node h(P) is farthest away from the root. If no such P is left, we consider the next tree T in F.
At the end, we create our solution $\mathcal{P}'_{Sol}$ by randomized rounding: We route every terminal pair $(s_i, t_i)$ with probability $x_i$. In case $(s_i, t_i)$ is routed, we randomly select a path from $\mathcal{P}'(s_i, t_i)$ and add it to $\mathcal{P}'_{Sol}$, where the probability that the path $P$ is taken is $f(P)/x_i$.
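The rounding step itself is simple to state in code. The following minimal sketch assumes a hypothetical input format (a dictionary mapping each pair index to its list of flow paths with positive weights) and a hypothetical function name; the flow aggregation that precedes the rounding is not shown.

```python
import random


def round_flow(flow_paths, rng=random):
    """Randomized rounding of a fractional routing.

    flow_paths -- dict: pair index i -> list of (path, f(P)) with f(P) > 0,
                  where x_i, the sum of the f(P), is at most 1
    rng        -- source of randomness offering random() and choices()
    Returns a dict mapping every routed pair index to its selected path.
    """
    solution = {}
    for i, weighted in flow_paths.items():
        x_i = sum(f for _, f in weighted)
        # Route pair i with probability x_i (random() is uniform on [0, 1)).
        if x_i <= 0 or rng.random() >= x_i:
            continue
        paths = [p for p, _ in weighted]
        weights = [f for _, f in weighted]
        # Path P is selected with probability f(P) / x_i.
        solution[i] = rng.choices(paths, weights=weights, k=1)[0]
    return solution
```

Overall, a fixed path $P$ of pair $i$ ends up in the solution with probability $x_i \cdot f(P)/x_i = f(P)$, and distinct pairs are rounded independently.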

Analysis
First, observe that $x$ did not change during our modifications of the paths, as the total flow between any terminal pair did not change. Thus, the expected number of pairs routed in our solution $\mathcal{P}'_{Sol}$ is $\sum_{i=1}^{k} x_i \ge \mathrm{OPT}^*$. Using the Chernoff bound, the probability that we route less than $\mathrm{OPT}^*/2$ pairs is at most $e^{-\mathrm{OPT}^*/8} < 1/2$, assuming $\mathrm{OPT}^* > 8$.
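For completeness, the bound above follows from the standard lower-tail Chernoff bound $\Pr[X \le (1-\varepsilon)\mu] \le e^{-\varepsilon^2\mu/2}$ for a sum $X$ of independent binary random variables with mean $\mu$. The symbols $X$, $\mu$ and $\varepsilon$ are introduced here only for this sketch; $X$ denotes the number of routed pairs.

```latex
\begin{align*}
\mu &= \mathbb{E}[X] = \sum_{i=1}^{k} x_i \ge \mathrm{OPT}^*, \\
\Pr\!\left[X < \tfrac{1}{2}\,\mathrm{OPT}^*\right]
    &\le \Pr\!\left[X \le \left(1 - \tfrac{1}{2}\right)\mu\right]
     \le \exp\!\left(-\frac{(1/2)^2\,\mu}{2}\right)
     = e^{-\mu/8}
     \le e^{-\mathrm{OPT}^*/8}.
\end{align*}
```

For $\mathrm{OPT}^* > 8$, the right-hand side is smaller than $e^{-1} < 1/2$.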
In the above algorithm, we guarantee that when we aggregate flow on a path P, then the total amount of all flow paths containing P as a subpath has increased to at most 1.
Nevertheless, the flow $f$ may have congestion greater than 1 after this modification. This is because $P$ may intersect flow paths that contain only a proper subset of the edges of $P$. For instance, consider the situation where we increase $f(S[P'])$ for a subpath $P'$ that initially contained a tight edge $e$ (that is, an edge $e$ with $\sum_{P \in \mathcal{P}\colon e \in P} f(P) = 1$). After increasing $f(S[P'])$, the total amount of flow paths going through $e$ is greater than 1. However, the congestion of the modified flow $f$ is always at most 2, as shown by the following lemma.

Lemma 1 The congestion of the flow f is at most 2.
Proof In our algorithm, we increase the flow only along flow subpaths that are pairwise edge-disjoint. To see this, consider two distinct flow subpaths $P$ and $P'$ on which we increase the flow. If there were an edge $e$ lying on $P$ and $P'$, then both subpaths would traverse the same tree in the forest $F$. Assume, without loss of generality, that $P$ was considered before $P'$ by the algorithm. Then the path from $e$ to the root would first visit $h(P)$ and then $h(P')$. Hence, $h(P)$ would be an internal node of $P'$. This yields a contradiction, as $h(P)$ was already marked as a hot spot when $P'$ was considered. This shows that we increased the flow along any edge by at most one unit. Hence, $f$ has congestion at most 2.
We now bound the congestion of the integral solution obtained by randomized rounding. In the algorithm, we constructed a set $H_{Alg}$ of hot spots. As a part of the analysis, we will now extend this set to a set $H$ as follows. Initially, $H = H_{Alg}$. We build a sub-forest $F'$ of $F$ consisting of all edges of $F$ that lie on a path connecting two hot spots. Then we add to $H$ all nodes that have degree at least 3 in $F'$. Since the number of nodes of degree at least 3 in any forest is at most its number of leaves and since every leaf of $F'$ is a hot spot, it follows that this can at most double the size of $H$ to $2|H_{Alg}|$. Finally, we add all nodes of the feedback vertex set $R$ to $H$ and mark all nodes in $H$ as hot spots.
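The extension of the hot spot set by branching nodes can be sketched as follows. The function name and the representation of the forest as a dictionary of neighbour sets are assumptions made for this illustration. The sketch obtains the sub-forest spanned by the paths between hot spots by repeatedly pruning nodes of degree at most 1 that are not hot spots, and then collects the nodes of degree at least 3. Adding the nodes of R, the final step of the construction, is left out.

```python
def extend_hot_spots(forest_adj, hot):
    """Add to the hot spots all branching nodes of the sub-forest they span.

    forest_adj -- dict: node -> set of neighbours in the forest F = G - R
    hot        -- iterable of initial hot spots (nodes of the forest)
    Returns the initial hot spots together with all nodes of degree >= 3 in
    the sub-forest formed by the edges on paths between two hot spots.
    """
    adj = {u: set(vs) for u, vs in forest_adj.items()}  # work on a copy
    hot = set(hot)
    # Prune nodes of degree <= 1 that are not hot spots until none is left;
    # the surviving edges are exactly those on paths between two hot spots.
    stack = [u for u in adj if len(adj[u]) <= 1 and u not in hot]
    while stack:
        u = stack.pop()
        if u not in adj:
            continue  # already pruned via a duplicate stack entry
        for v in adj.pop(u):
            adj[v].discard(u)
            if len(adj[v]) <= 1 and v not in hot:
                stack.append(v)
    branching = {u for u, vs in adj.items() if len(vs) >= 3}
    return hot | branching
```

In a tree with three arms 0-1-2, 0-3-4 and 0-5-6 and an extra leaf 7 attached to node 0, the hot spots {2, 4, 6} are extended by the branching node 0.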

Lemma 2 The number $|H|$ of hot spots is at most $2k|R|^2 + |R|$.
Proof Fix two nodes $u, v \in R$ and consider the set of flow subpaths with endpoints $u$ and $v$ for which we added their hot spots to $H_{Alg}$. Due to the aggregation of flows in our algorithm, all except possibly one of the subpaths are saturated, that is, they carry precisely one unit of flow. Since no two of these subpaths are contained in the same flow path of $f$ and since the flow value of $f$ is bounded from above by $k$, we added at most $k$ hot spots for the pair $u, v$. Since there are at most $|R|^2$ pairs in $R$, we have $|H_{Alg}| \le k|R|^2$. Extending $H_{Alg}$ to $H$ at most doubles this number and then adds the $|R|$ nodes of $R$, so the claim follows.
Definition 1 A hot spot u ∈ H is good if the congestion on any edge incident on u is bounded by 12 log(k|R|)/ log log(k|R|); otherwise, u is bad.

Lemma 3 Let $u \in H$ be a hot spot. The probability that $u$ is bad is bounded from above by $1/(k^2|R|^3)$.
Proof Let $e_1 = uv_1, \dots, e_\ell = uv_\ell$ be the edges incident on $u$ and, for each $i$ with $1 \le i \le \ell$, let $f_i$ be the total flow on the edge $uv_i$. Since any flow path visits at most two of the edges incident on $u$, the total flow $\sum_{i=1}^{\ell} f_i$ on the edges incident on $u$ is at most $2k$.
For any $i$ with $1 \le i \le \ell$, we have $f_i = \sum_{P \colon P \ni e_i} f(P)$, where $P$ runs over the set of all paths connecting some terminal pair and containing $e_i$. For $1 \le j \le k$, we define $f_{ij} = \sum_{P \in \mathcal{P}'(s_j, t_j) \colon P \ni e_i} f(P)$ as the total amount of flow sent across $e_i$ by the terminal pair $(s_j, t_j)$. Recall that $x_j$ is the total flow sent for the terminal pair $(s_j, t_j)$. The probability that the randomized rounding procedure picks a certain path $P \in \mathcal{P}'(s_j, t_j)$ is precisely $x_j \cdot f(P)/x_j = f(P)$. Given the disjointness of the respective events, the probability that the pair $(s_j, t_j)$ routes a path across $e_i$ is precisely $f_{ij}$. Let $X_{ij}$ be the binary random variable indicating whether the pair $(s_j, t_j)$ routes a path across $e_i$. Then $\Pr[X_{ij} = 1] = f_{ij}$. Let $X_i = \sum_{j=1}^{k} X_{ij}$ be the number of paths routed across $e_i$ by the algorithm. By linearity of expectation, $\mathbb{E}[X_i] = \sum_{j=1}^{k} \mathbb{E}[X_{ij}] = \sum_{j=1}^{k} f_{ij} = f_i$. In the following, we assume that $k$ is sufficiently big ($k \ge e^{e^e}$). Note that this assumption is feasible as MaxEDP can be efficiently solved when $k$ is constant [42]. Fix any edge $e_i$. Set $\delta = 6 \log(k|R|)/\log\log(k|R|)$ and $\delta' = 2\delta/f_i - 1$, so that $(1 + \delta') f_i = 2\delta$. Note that, for fixed $i$, the variables in $\{X_{ij} \mid 1 \le j \le k\}$ are independent. Hence, the Chernoff bound applies to $X_i$ and bounds the probability $\Pr[X_i > (1 + \delta') f_i] = \Pr[X_i > 2\delta]$ in terms of the flow $f_i$ on $e_i$. Now, applying the union bound over the edges incident on $u$ and using $\sum_{i=1}^{\ell} f_i \le 2k$, we can infer that the probability that any of the edges incident on $u$ carries more than $2\delta$ paths, that is, more than $12 \log(k|R|)/\log\log(k|R|)$ paths, is at most $1/(k^2|R|^3)$.
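One way to carry out the Chernoff calculation in the proof above is sketched below. This is a reconstruction under stated assumptions, and the intermediate steps and constants need not coincide with those of the original derivation: we assume $f_i > 0$, write $L = \ln(k|R|)$, read $\log$ as the natural logarithm, and use $f_i \le 2$ (Lemma 1) together with the upper-tail bound $\Pr[X \ge (1+\delta')\mu] \le \left(e^{\delta'}/(1+\delta')^{1+\delta'}\right)^{\mu}$ with $\mu = f_i$.

```latex
\begin{align*}
\Pr[X_i > 2\delta]
  &\le \left(\frac{e^{\delta'}}{(1+\delta')^{1+\delta'}}\right)^{f_i}
   = e^{2\delta - f_i}\left(\frac{f_i}{2\delta}\right)^{2\delta}
   \le \left(\frac{e f_i}{2\delta}\right)^{2\delta}
   \le \frac{f_i}{2}\left(\frac{e}{\delta}\right)^{2\delta}
   && \text{(using } f_i^{2\delta-1} \le 2^{2\delta-1}\text{)}, \\
2\delta \ln\frac{\delta}{e}
  &= \frac{12L}{\ln L}\,\bigl(\ln 6 - 1 + \ln L - \ln\ln L\bigr)
   \ge \frac{12L}{\ln L}\cdot\frac{\ln L}{2} = 6L
   && \text{(using } L \ge (\ln L)^2\text{)}, \\
\Pr[X_i > 2\delta]
  &\le \frac{f_i}{2}\, e^{-6L}
   \le \frac{f_i}{2\,k^3|R|^3}, \\
\sum_{i=1}^{\ell} \Pr[X_i > 2\delta]
  &\le \frac{1}{2\,k^3|R|^3}\sum_{i=1}^{\ell} f_i
   \le \frac{2k}{2\,k^3|R|^3}
   = \frac{1}{k^2|R|^3}.
\end{align*}
```

The key point is that the bound for a single edge is proportional to $f_i$, so the union bound depends only on the total flow around $u$ and not on the number $\ell$ of incident edges.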

Lemma 4 If every hot spot is good, then the congestion on every edge is bounded from above by 24 log(k|R|)/ log log(k|R|).
Proof Consider an arbitrary edge e = uv that is not incident on any hot spot. In particular, this means that e lies in the forest F = G − R. A hot spot z in F is called direct to e if the path in F from z to e excluding e does not contain any hot spot other than z.
We claim that there can be at most two distinct hot spots $z, z'$ direct to $e$. If there were a third hot spot $z''$ direct to $e$, then consider the unique node $z_0 \in V(F)$ such that no two of the hot spots $z, z', z''$ are connected in $F - z_0$. Such a node $z_0$ exists, since $z, z', z''$ cannot lie on a common path in $F$ as they are all direct to $e$. The node $z_0$, however, would have been added as a hot spot at the latest when $H$ was built. This is a contradiction, because then one of the paths connecting $z$, $z'$ or $z''$ to $e$ would contain $z_0$ and thus one of these hot spots would not be direct to $e$. Now we show the lemma assuming that there are two distinct hot spots $z, z'$ direct to $e$. If there is only one or no hot spot direct to $e$, a similar argument applies. Let $P$ be an arbitrary path that is routed by our algorithm and that traverses $e$, and let $P' \in S$ be the subpath of $P$ visiting $e$; see Fig. 3.
Consider the two paths in $F$ connecting $z$ to $e$ and $z'$ to $e$. Let $e_z$ and $e_{z'}$ be the edges on these paths incident on $z$ and $z'$, respectively. By our construction, $P'$ must visit a hot spot in $F$. If $P'$ visited neither $z$ nor $z'$, then $P'$ would contain a hot spot direct to $u$ or to $v$ that is distinct from $z$ and $z'$, a contradiction. Hence $P'$ and thus also $P$ visit $e_z$ or $e_{z'}$. The claim now follows from the facts that, first, this holds for any path traversing $e$, secondly, $z$ and $z'$ are good, and, thirdly, therefore altogether at most $2 \cdot 12\log(k|R|)/\log\log(k|R|)$ paths visit $e_z$ or $e_{z'}$. Now we are ready to prove Theorem 1.

Proof of Theorem 1
We show that the algorithm presented in Sect. 3.1 produces, with constant probability, a routing with $\Omega(\mathrm{OPT}^*)$ paths with congestion $O(\log(kr)/\log\log(kr))$. As argued above, the probability that we route fewer than $\mathrm{OPT}^*/2$ paths is at most $1/2$. By Lemma 2, the number of hot spots is at most $2k|R|^2 + |R| \le 3k|R|^2$. Thus, Lemma 3 together with the union bound implies an upper bound of $3k|R|^2/(k^2|R|^3) = 3/(k|R|)$ on the probability that at least one of these hot spots is bad. Hence, by Lemma 4, we route with probability at least $1 - 1/2 - 3/(k|R|)$ at least $\mathrm{OPT}^*/2$ pairs with congestion at most $24\log(k|R|)/\log\log(k|R|)$. Since this probability is bounded from below by a positive constant for sufficiently large $k$, the statement of the theorem follows by using $|R| \le 2r + 2k$ and $|R| \ge r$.
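The rounding step analysed in Lemma 3 can be illustrated by a minimal sketch. It assumes that the fractional solution is given as a dictionary mapping each pair index to a list of (path, flow value) entries; this representation and the names `round_flow` and `congestion` are hypothetical. The sketch covers only the plain path selection, not the modification of the LP solution performed in Sect. 3.1.

```python
import random
from collections import Counter

def round_flow(flow, rng):
    """flow: {pair_index: [(path_as_node_list, flow_value), ...]} (assumed format).
    For each pair, path P is selected with probability f(P); with the
    remaining probability 1 - x_j the pair stays unrouted."""
    routing = {}
    for j, paths in flow.items():
        t = rng.random()
        acc = 0.0
        for path, value in paths:
            acc += value
            if t < acc:
                routing[j] = path
                break
    return routing

def congestion(routing):
    """Maximum number of selected paths sharing one undirected edge."""
    load = Counter()
    for path in routing.values():
        for a, b in zip(path, path[1:]):
            load[frozenset((a, b))] += 1
    return max(load.values(), default=0)

# Toy fractional solution: pair 0 has x_0 = 1, pair 1 has x_1 = 0.5.
flow = {
    0: [(["s0", "u", "t0"], 0.5), (["s0", "w", "t0"], 0.5)],
    1: [(["s1", "u", "t1"], 0.5)],
}
routing = round_flow(flow, random.Random(0))
```

Since the events "pair $j$ selects path $P$" are disjoint for a fixed $j$, the probability that pair $j$ crosses a given edge is the sum of the flow values of its paths through that edge, which is the quantity $f_i^j$ used in the proof.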

Refined approximation bound for MaxEDP
In this section, we provide an improved approximation guarantee for MaxEDP without congestion, thereby proving Theorem 2.

Irreducible routings with low congestion
We first develop the concept of irreducible routings with low congestion, which is (besides Theorem 1) a key ingredient of our strengthened bound on the approximability of MaxEDP based on feedback vertex set number.
Consider any multigraph $G$ and any set $P$ of (not necessarily simple) paths in $G$ with congestion $c$. We say that an edge $e$ is redundant in $P$ if there is an edge $e' \neq e$ such that the set of paths in $P$ covering (containing) $e$ is a subset of the set of paths in $P$ covering $e'$. For instance, if $G$ contains at least two edges, then any edge that is not covered by any path in $P$ is redundant in $P$.

Definition 2
The set P is called an irreducible routing with congestion c if each edge belongs to at most c paths of P and there is no edge redundant in P.
In contrast to a feasible routing of a MaxEDP instance, we do not require an irreducible routing to connect a set of terminal pairs. If there is an edge $e$ redundant in $P$, we can apply the following reduction rule: we contract $e$ in $G$ and we contract $e$ in every path of $P$ that covers $e$. By this, we obtain a minor $G'$ of $G$ and a set $P'$ of paths that consists of all the contracted paths and of all paths in $P$ that were not contracted. Thus, there is a one-to-one correspondence between the paths in $P$ and $P'$.
We make the following observation about $P$ and $P'$.

Observation 1 A subset of paths in $P'$ is edge-disjoint in $G'$ if and only if the corresponding subset of paths in $P$ is edge-disjoint in $G$.
As applying the reduction rule strictly decreases the number of redundant edges, an iterative application of this rule yields an irreducible routing on a minor of the original graph.
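The iterated reduction rule can be sketched as follows, under the simplifying assumption that edges carry unique labels and each path is stored as its sequence of edge labels. Contracting an edge then amounts to deleting its label from the edge list and from every path; node bookkeeping for the resulting minor is omitted. The function names are hypothetical.

```python
def cover(paths, e):
    """Indices of the paths that contain edge label e."""
    return frozenset(i for i, p in enumerate(paths) if e in p)

def find_redundant(edges, paths):
    """Return an edge e whose cover is contained in the cover of
    some other edge e2, or None if no such edge exists."""
    for e in edges:
        ce = cover(paths, e)
        for e2 in edges:
            if e2 != e and ce <= cover(paths, e2):
                return e
    return None

def make_irreducible(edges, paths):
    """Apply the reduction rule until no redundant edge remains."""
    edges = list(edges)
    paths = [list(p) for p in paths]
    while True:
        e = find_redundant(edges, paths)
        if e is None:
            return edges, paths
        edges.remove(e)                                    # contract e in the graph
        paths = [[x for x in p if x != e] for p in paths]  # and in every path

edges = ["a", "b", "c"]                    # a path graph with three edges
paths = [["a", "b"], ["b", "c"], ["b"]]
reduced_edges, reduced_paths = make_irreducible(edges, paths)
```

In this toy instance the edges `a` and `c` are redundant because every path covering them also covers `b`, so only `b` survives. The paths keep their indices, which reflects the one-to-one correspondence used in Observation 1.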

Theorem 6 Let $\mathcal{G}$ be a minor-closed class of multigraphs and let $p_{\mathcal{G}}$ be a positive integer. If for each graph $G \in \mathcal{G}$ and every non-empty irreducible routing $P$ on $G$ with congestion $c$ there exists a path in $P$ of length at most $p_{\mathcal{G}}$, then the average length of the paths in $P$ is at most $c \cdot p_{\mathcal{G}}$.
Proof Take a path $P_0$ of length at most $p_{\mathcal{G}}$. Contract all edges of $P_0$ in $G$ and obtain a minor $G' \in \mathcal{G}$ of $G$. For each path in $P$ contract all edges shared with $P_0$ to obtain a set $P'$ of paths. Remove $P_0$ along with all degenerated paths from $P'$; thus $|P'| < |P|$. Note that $P'$ is an irreducible routing on $G'$ with congestion $c$. We repeat this reduction procedure recursively on $G'$ and $P'$ until $P'$ is empty; this happens after at most $|P|$ steps. At each step, we decrease the total path length by at most $c \cdot p_{\mathcal{G}}$, since each of the at most $p_{\mathcal{G}}$ contracted edges is covered by at most $c$ paths. Hence, the total length of paths in $P$ is at most $|P| \cdot c \cdot p_{\mathcal{G}}$.
As a consequence of Theorem 6, we get the following result for forests.

Lemma 5 Let F be a forest and let P be a non-empty irreducible routing on F with congestion c. The average path length in P is at most 2c.
Proof We show that $P$ contains a path of length at most 2. Then the lemma follows immediately by applying Theorem 6 and using the fact that (simple) forests are minor-closed.
Take any tree in $F$, root it at an arbitrary node and consider a leaf $v$ of maximum depth. If $v$ is adjacent to the root, then the tree is a star and every path in the tree has length at most 2. Otherwise, let $e_1$ and $e_2$ be the first two edges on the path from $v$ to the root. By the definition of irreducible routing, the set of all paths covering $e_1$ is not a subset of the paths covering $e_2$; hence, $e_1$ is covered by a path which does not cover $e_2$. Since all other edges incident to $e_1$ end in a leaf, this path has length at most 2.
Note that the bound provided in Lemma 5 is actually tight up to a constant. Let $c$ be an arbitrary integer greater than one. Consider a graph that is a path of length $c - 1$ with a star of $c - 1$ leaves attached to one of its endpoints. The $c - 1$ paths of length $c$ together with the $2c - 2$ paths of length 1 form an irreducible routing with congestion $c$. The average path length is $\frac{c(c-1) + (2c-2)}{3c-3} = \frac{c+2}{3} = \Theta(c)$.
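The tightness example can be checked mechanically. The sketch below uses one natural reading of the construction, which is an assumption: each long path consists of the whole spine plus one star edge, and the short paths are the $2c - 2$ single edges. Edge labels and function names are hypothetical.

```python
from fractions import Fraction

def tight_instance(c):
    """Spine of c-1 edges with a star of c-1 edges at one endpoint."""
    spine = [("p", i) for i in range(c - 1)]
    star = [("s", i) for i in range(c - 1)]
    long_paths = [spine + [s] for s in star]      # c-1 paths of length c
    short_paths = [[e] for e in spine + star]     # 2c-2 paths of length 1
    return spine + star, long_paths + short_paths

def congestion(edges, paths):
    return max(sum(e in p for p in paths) for e in edges)

def is_irreducible(edges, paths):
    cov = {e: {i for i, p in enumerate(paths) if e in p} for e in edges}
    return not any(cov[e] <= cov[f] for e in edges for f in edges if e != f)

def average_length(paths):
    return Fraction(sum(len(p) for p in paths), len(paths))

edges, paths = tight_instance(5)
avg = average_length(paths)
```

Each spine edge is covered by all $c - 1$ long paths and by its own short path, which gives congestion exactly $c$, while the short paths keep every edge non-redundant.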

Approximation algorithm
Consider an instance $(G, M)$ of MaxEDP with $k$ terminal pairs. Let $R$ be a 2-approximate minimum feedback vertex set in $G$; recall that we can obtain $R$ in polynomial time [6]. Furthermore, let $c = O(\log(kr)/\log\log(kr))$ be the bound on the congestion of our algorithm in Theorem 1. We solve the corresponding MaxEDP LP and obtain an optimal extreme point solution $(f, x)$ of total flow $|f| = \mathrm{OPT}^*$. By the same argument as in Sect. 3, the number of all paths with a positive flow value is polynomially bounded in the input size. Let $\rho = \sqrt{|R|}/c$ and let $P$ be the set of all paths with a positive flow value that visit at most $\rho$ nodes of $R$.
Below we argue how to use $R$, $P$ and $f$ to obtain a feasible routing of $\Omega(|f|/(c\sqrt{|R|}))$ paths. This routing yields an overall approximation ratio of $O(\sqrt{r}\log(kr))$ and will prove Theorem 2. We distinguish the following two cases.

Case 1
The total flow of $P$ is at least $|f|/2$. We compute a new flow $(f', x')$, where we set $f'(Q) = f(Q)$ for every path $Q$ in $P$, and $f'(Q) = 0$ for any other path $Q$. Thus, we have $|f'| \ge |f|/2$. By applying our algorithm of Sect. 3 on $(f', x')$, we efficiently compute with constant probability a routing $\mathcal{P}$ with congestion $c$ containing $\Omega(|f'|) = \Omega(|f|)$ paths. Note that all paths in $\mathcal{P}$ visit at most $\rho$ nodes of $R$. Initialize $\mathcal{P}'$ with $\mathcal{P}$. As long as there is an edge $e$ not adjacent to $R$ that is redundant in $\mathcal{P}'$, we iteratively apply the reduction rule (see Sect. 4.1) on $e$ by contracting $e$ in the graph as well as in every path that covers it. Let $G'$ be the obtained minor of $G$ with forest $F' = G' - R$.
Note that $F'$ is simple (in contrast to $G'$, which might contain multiple edges) as we contracted edges only in the (simple) forest $G - R$. The obtained set $\mathcal{P}'$ is a set of (not necessarily simple) paths in $G'$ corresponding to $\mathcal{P}$. In order to obtain a feasible routing for $(G, M)$ of size $\Omega(|f|/(c^2\rho)) = \Omega(|f|/(c\sqrt{|R|}))$, it suffices by iterated application of Observation 1 to $\mathcal{P}$ and $\mathcal{P}'$ that we efficiently find a subset $\mathcal{P}'_{\mathrm{Sol}} \subseteq \mathcal{P}'$ of pairwise edge-disjoint paths of size $|\mathcal{P}'_{\mathrm{Sol}}| = \Omega(|\mathcal{P}|/(c^2\rho))$.
To obtain $\mathcal{P}'_{\mathrm{Sol}}$, we first bound the total path length in $\mathcal{P}'$. Removing $R$ from $G'$ "decomposes" the set $\mathcal{P}'$ into a set $S$ of maximal subpaths lying in $F'$. Observe that $S$ is an irreducible routing on $F'$ with congestion $c$, as the reduction rule is not applicable anymore. (Note that a single path in $\mathcal{P}'$ may lead to many paths in $S$, which are considered distinct.) Thus, by Lemma 5, the average path length in $S$ is at most $2c$.
Let $P$ be an arbitrary path in $\mathcal{P}'$. Each edge on $P$ that is not in a subpath in $S$ is incident on a node in $R$, and each node in $R$ is incident on at most two edges in $P$. Together with the fact that $P$ visits fewer than $\rho$ nodes in $R$, there are fewer than $2\rho$ edges of $P$ outside $S$. By the same fact, $P$ contributes at most $\rho$ subpaths to $S$. Given that the average length of the subpaths in $S$ is at most $2c$, we can upper bound the total path length $\sum_{P \in \mathcal{P}'} |P|$ by $|\mathcal{P}'|\rho(2c + 2)$. Let $\mathcal{P}''$ be the set of the $|\mathcal{P}'|/2$ shortest paths in $\mathcal{P}'$. Hence, each path in $\mathcal{P}''$ has length at most $4\rho(c + 1)$.
We greedily construct a feasible solution $\mathcal{P}'_{\mathrm{Sol}}$ by iteratively picking an arbitrary path $P$ from $\mathcal{P}''$, adding it to $\mathcal{P}'_{\mathrm{Sol}}$ and removing all paths from $\mathcal{P}''$ that share some edge with $P$ (including $P$ itself). We stop when $\mathcal{P}''$ is empty. As $\mathcal{P}''$ has congestion $c$, we remove at most $4\rho c(c + 1)$ paths from $\mathcal{P}''$ per iteration. Thus, $|\mathcal{P}'_{\mathrm{Sol}}| \ge |\mathcal{P}''|/(4\rho c(c + 1)) = \Omega(|\mathcal{P}|/(c\sqrt{|R|}))$.
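The greedy selection can be sketched as follows, assuming that each path is given as a collection of edge labels (a hypothetical representation) and that "an arbitrary path" is simply the first remaining one.

```python
def greedy_disjoint(paths):
    """Return indices of pairwise edge-disjoint paths, chosen greedily.

    Each iteration picks the first remaining path and discards every
    remaining path that shares an edge with it (including itself)."""
    edge_sets = [frozenset(p) for p in paths]
    remaining = list(range(len(paths)))
    solution = []
    while remaining:
        i = remaining[0]
        solution.append(i)
        remaining = [j for j in remaining[1:]
                     if not (edge_sets[i] & edge_sets[j])]
    return solution

paths = [["a", "b"], ["b", "c"], ["c", "d"], ["e"]]
solution = greedy_disjoint(paths)
```

If every path has length at most $L$ and every edge lies on at most $c$ paths, one iteration discards at most $L \cdot c$ paths, which is the counting argument behind the bound $|\mathcal{P}''|/(4\rho c(c+1))$ with $L = 4\rho(c+1)$.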

Case 2
The flow of $P$ is less than $|f|/2$. Then, the flow of all paths visiting at least $\rho$ nodes of $R$ is at least $|f|/2$. Let $P'$ be the set of these paths and let $f'$ be the sum of all these flows. Note that $f'$ provides a feasible solution to the relaxation MaxEDP LP for $(G, M)$ of value at least $|f|/2$. Since every flow path in $f'$ visits at least $\rho$ nodes of $R$, the total inflow of the nodes in $R$ is at least $|f'|\rho$. By averaging, there must be a node $v \in R$ of inflow at least $\rho|f'|/|R| = |f'|/(c\sqrt{|R|})$. Let $f''$ be the subflow of $f'$ consisting of all flow paths visiting $v$. This subflow corresponds to a feasible solution $(f'', x'')$ of the LP relaxation of value at least $|f'|/(c\sqrt{|R|}) \ge |f|/(2c\sqrt{|R|})$.
Using Proposition 1, we can recover an integral feasible routing of size $\Omega(|f|/(c\sqrt{|R|}))$. This completes the proof of Theorem 2.

Fixed-parameter algorithm for MaxNDP
We give a fixed-parameter algorithm for MaxNDP that solves any instance $(G, M)$ in time $(k + r)^{O(r)} \cdot n$, where $r$ denotes the feedback vertex set number of $G$, $k = |M|$ and $n = |V(G)|$. A feedback vertex set $R$ of size $r$ can be computed in time $r^{O(r)} \cdot n$ [38]. By the matching assumption (see Sect. 2), each terminal in $M$ is a leaf. We can thus assume that none of the terminals is contained in $R$.
Consider an optimal routing $P$ of the given MaxNDP instance and the set $M_R \subseteq M$ of terminal pairs that are connected via $P$ by a path that visits at least one node in $R$. Let $P \in P$ be a path connecting a terminal pair $(s_i, t_i) \in M_R$. This path has the form $(s_i, \dots, r_1, \dots, r_2, \dots, r_\ell, \dots, t_i)$, where $r_1, \dots, r_\ell$ are the nodes in $R$ that are traversed by $P$ in this order. The pairs $(s_i, r_1)$ and $(r_\ell, t_i)$ as well as $(r_j, r_{j+1})$ for $j = 1, \dots, \ell - 1$ are called essential pairs for $P$. A node pair is called essential if it is essential for some path in $P$. Let $M_e$ be the set of essential pairs.
Let F be the forest that arises when deleting R from the input graph G. Let (u, v) be any pair of nodes in G. A path P in G with endpoints u and v is said to realize (u, v) if all internal nodes of P lie in F. A set P of paths is said to realize a set of node pairs if every pair in this set is realized by some path in P and if two paths in P can only intersect at their endpoints. Note that the optimal routing P induces a realization of M e in a natural way: The realization consists of all maximal subpaths of paths in P whose internal nodes all lie in F. Conversely, for any realization P of M e , we can concatenate paths in P to obtain a feasible routing that connects all terminal pairs in M R . Therefore, we consider P (slightly abusing notation) also as a feasible routing for M R .
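The relation between a routed path, its essential pairs and its realization can be illustrated with a short sketch. It assumes that a path is a list of nodes whose endpoints are terminals outside $R$ and that the path visits at least one node of $R$; the function names are hypothetical.

```python
def essential_pairs(path, R):
    """Pairs (s, r_1), (r_1, r_2), ..., (r_l, t) for a path that visits R."""
    hits = [v for v in path if v in R]
    if not hits:
        return []
    anchors = [path[0]] + hits + [path[-1]]
    return list(zip(anchors, anchors[1:]))

def realization(path, R):
    """Maximal subpaths of `path` whose internal nodes all avoid R."""
    pieces, current = [], [path[0]]
    for v in path[1:]:
        current.append(v)
        if v in R:
            pieces.append(current)
            current = [v]
    if len(current) > 1:
        pieces.append(current)
    return pieces

def concatenate(pieces):
    """Glue consecutive pieces together at their shared endpoints."""
    path = list(pieces[0])
    for piece in pieces[1:]:
        path.extend(piece[1:])
    return path

R = {"r1", "r2"}
path = ["s", "a", "r1", "b", "r2", "t"]
pairs = essential_pairs(path, R)
pieces = realization(path, R)
```

Each piece realizes exactly one essential pair, and concatenating the pieces in order recovers the original path, which mirrors the two directions of the correspondence described above.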
In our algorithm, we first guess the set $M_e$ of essential pairs, which implies the set $M_R$ as well as the set $\overline{M_R}$ that we define as $\overline{M_R} = M \setminus M_R$. Then, by dynamic programming, we construct two sets of paths, $P_e$ and $P_F$, where $P_e$ realizes $M_e$ and $P_F$ routes in $F$ a subset of $\overline{M_R}$. In our algorithm, the set $P_e \cup P_F$ forms a feasible routing that maximizes $|P_F|$ and routes all pairs in $M_R$. Recall that we consider the realization $P_e$ of $M_e$ as a feasible routing for $M_R$. Now assume that we correctly guessed $M_e$. Below, we describe an algorithm that uses a dynamic programming table to compute an optimum routing. Recall that only leaf nodes can be terminals or neighbors of $R$. For ease of presentation, we first describe how to compute the cardinality of such a routing. Then we argue how to find such a routing without a significant increase in the run time.

Dynamic programming table
Before we describe the dynamic programming table, we make several technical assumptions that help to simplify the presentation. First, we modify the input instance as follows. We subdivide every edge incident on a node in R by introducing a single new node on this edge. Note that this yields an instance equivalent to the input instance. As a result, every neighbor of a node in R that lies in F, that is, every node in N G (R), is a leaf in F. Moreover, the set R is an independent set in G. Also recall that we assumed that every terminal is a leaf and that therefore R does not contain any terminal. We also assume that the forest F is a rooted tree by introducing a dummy node (which plays the role of the root) and arbitrarily connecting this node to every connected component of F by an edge. In our dynamic programming table, we will take care that no path visits this root node. We also assume that F is an ordered tree by introducing an arbitrary order among the children of every node.
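The subdivision step can be sketched as follows, assuming the graph is given as a list of edges; the labels of the new nodes and the helper names are hypothetical.

```python
def subdivide_edges_at(edges, R):
    """Subdivide every edge that has an endpoint in R by one new node."""
    new_edges = []
    for idx, (u, v) in enumerate(edges):
        if u in R or v in R:
            mid = ("sub", idx)          # hypothetical label of the new node
            new_edges += [(u, mid), (mid, v)]
        else:
            new_edges.append((u, v))
    return new_edges

def forest_degree(edges, R):
    """Degrees in F = G - R, counting only edges that avoid R."""
    deg = {}
    for u, v in edges:
        if u not in R and v not in R:
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
    return deg

R = {"r"}
edges = [("r", "a"), ("a", "b"), ("r", "b")]
new_edges = subdivide_edges_at(edges, R)
deg = forest_degree(new_edges, R)
neighbors_of_R = {v if u in R else u
                  for u, v in new_edges if (u in R) != (v in R)}
```

In the toy instance the forest after subdivision is the path `sub0 - a - b - sub2`, so both neighbors of `r` are leaves of $F$, as required by the table construction.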
For any node $v$, let $F_v$ be the subtree of $F$ rooted at $v$. Let $c_v$ be the number $\deg_F(v) - 1$ of children of $v$ and let $v_1, \dots, v_{c_v}$ be the (ordered) children of $v$. Let $F_v^i$ denote the subtree of $F_v$ induced by $v$ together with the subtrees rooted at the first $i$ children of $v$.
We introduce a dynamic programming table $T$. It contains an entry for every $F_v^i$ and every subset $M_e'$ of $M_e$. Roughly speaking, the value of such an entry is the solution to the subproblem where we restrict the forest to $F_v^i$ and the set of essential pairs to $M_e'$. More precisely, table $T$ has five parameters: parameters $v$ and $i$ describing $F_v^i$, a parameter $M_e'$ describing the set of essential pairs, and two more parameters $u$ and $b$. The parameter $u$ is either a terminal or a node in $R$, and $b$ is in one of the three states free, to-be-used, or blocked. The value $T[v, i, M_e', u, b]$ is the maximum cardinality of a set $P_F$ of paths with the following properties:
1. The set $P_F$ is a feasible routing of some subset of $\overline{M_R}$.
2. The set $P_F$ is completely contained in $F_v^i$.
3. There is an additional set $P_e$ of paths with the following properties:
(a) The set $P_e$ is a realization of $M_e' \cup \{(u, v)\}$ if $b$ = to-be-used. Otherwise, it is a realization of $M_e'$.
(b) The set $P_e$ is completely contained in $F_v^i \cup R$ and node-disjoint from the paths in $P_F$.
4. If $b$ = free, there is no path in $P_e \cup P_F$ visiting $v$.
Note that the parameter $u$ is only relevant when $b$ = to-be-used (otherwise, it can just be ignored). One can think of the three states of $b$ as follows: If $b$ = free, then there is no path in $P_e \cup P_F$ visiting $v$; hence, in the future we might add a path through $v$. If $b$ = to-be-used, then $v$ is visited by some path in $P_e$ (connecting $u$ to $v$) and we cannot add a new path through $v$. Finally, if $b$ = blocked, we may add a path to $P_e \cup P_F$ that goes through $v$. Hence, $v$ is "blocked" for the future because it may already have been visited. Thus, we have $T[v, i, M_e', u, \text{free}] \le T[v, i, M_e', u, \text{blocked}]$.
Below, we describe how to compute the entries of $T$ in a bottom-up manner. Having computed $T$, we obtain the cardinality of the optimum routing $P$ by $|M_R| + T[v, c, M_e, u, \text{free}]$, where $v$ is the dummy root node, $c$ is its number of children, and $u$ is an arbitrary terminal.
Base case. In the base case, the node $v$ is a leaf and we have $P_F = \emptyset$. Thus, every entry for $v$ has value either $0$ or $-\infty$, depending on whether $M_e'$ can be routed. When $b$ = free, no path can visit $v$ and, hence, also $P_e = \emptyset$. Thus we set the entry for $b$ = free to $0$ if $M_e' = \emptyset$. Then we set the entry for $b$ = blocked to $0$ if $M_e'$ is either empty, or consists of a single pair of nodes in $R \cap N_G(v)$, or consists of a single pair where one node is $v$ and the other one is in $R \cap N_G(v)$. Finally, we set […]. For all the other cases where $v$ is a leaf, we set the entry to $-\infty$.
Induction step. For the inductive step, we first consider $i = 1$. We have […], since the path in $P_e$ realizing $(u, v)$ has to start at a leaf node of $F_{v_1}$. For the other states of $b$, recall that every path in $P_e \cup P_F$ connects two leaves in $F_v^1$. Since $v$ has degree 1 in $F_v^1$, there is no path in $P_e \cup P_F$ visiting $v$, and we have […].
Now, let $i$ be greater than 1. At a high level, we guess which part of $M_e'$ is realized in $F_v^{i-1} \cup R$ and which part is realized in $F_{v_i} \cup R$. For this, we consider every partition $M_{e1}' \uplus M_{e2}'$ of $M_e'$. By our dynamic programming table, we find a partition that maximizes our objective. In the following, we assume that we guessed $M_{e1}' \uplus M_{e2}'$ correctly.
Let us consider the different states of $b$ in more detail.
1. When $b$ = free, node $v$ is not allowed to be visited by any path. […]
2. When $b$ = to-be-used, the pair $(u, v)$ has to be realized. For this, there are two possibilities: Either $(u, v)$ is realized by a path in $F_v^{i-1} \cup R$, or there is a realizing path that first goes through $F_{v_i} \cup R$ and then reaches $v$ via the edge $(v_i, v)$. Hence, for the first possibility, we consider […]; for the second possibility, we consider […]. Maximizing over both, we obtain $T[v, i, M_e', u, \text{to-be-used}]$.
3. When $b$ = blocked, we also consider two cases. In the first one, there is no path in $P_e \cup P_F$ going through the edge $(v_i, v)$; hence, we get the term […]. In the second case, there is a path $P$ in $P_e \cup P_F$ going through the edge $(v_i, v)$. Since $P$ connects two leaves in $F_v^i$, a part of $P$ is in $F_v^{i-1} \cup R$ and the other part is in $F_{v_i} \cup R$. If $P \in P_e$, then it realizes a pair of $M_e'$. Hence, for every pair $(u_1, u_2) \in M_e'$, we have to consider the term […] and the symmetric term where we swap $u_1$ and $u_2$. If $P \in P_F$, then it realizes a terminal pair of $\overline{M_R}$. Hence, for every pair $(u_1, u_2) \in \overline{M_R}$, we get the term […] and the symmetric term where we swap $u_1$ and $u_2$. Note that we count the path realizing $(u_1, u_2)$ in our objective. Maximizing over all the terms of the two cases, we obtain $T[v, i, M_e', u, \text{blocked}]$.

Analysis
Let us analyze the run time of the algorithm described above. Given $R$, the forest $F$ can be computed in time $O(r \cdot n)$. In order to guess $M_e$, we enumerate all potential sets of essential pairs. To bound the number of potential sets of essential pairs, first recall that each pair contains at least one node in $R$. On the other hand, each node in $R$ appears in at most two pairs and, consequently, $|M_e| \le 2r$. Thus, an upper bound on the number of potential sets for $M_e$ is the number of ways to choose up to two pairs for each node in $R$. As each node in $R$ is paired with a terminal node or another node in $R$, there are at most $2k + r - 1$ candidate pairs for it. Hence, there are at most $(2k + r)^{2r}$ candidate sets to consider. For each particular guess for $M_e$, we run the dynamic program above. The number of entries in $T$, as specified by the five parameters $v$, $i$, $M_e'$, $u$ and $b$, for each fixed guess for $M_e$ is at most $O(n) \cdot 2^{2r} \cdot (2k + r) \cdot 3$. Among the different entries, those with $b$ = blocked and $i > 1$ have the highest run time in the worst case. There, we do not only consider all partitions of $M_e'$, but for every partition we also consider every possible node pair that is either an essential pair in $M_e'$ or a terminal pair in $\overline{M_R}$. As there are at most $2^{2r}$ partitions of $M_e'$, at most $2r$ essential pairs in $M_e'$ and at most $k$ terminal pairs in $\overline{M_R}$, we consider at most $2^{2r} + 2 \cdot 2^{2r} \cdot (k + 2r) \le 2^{2r+1} \cdot (2k + 2r)$ different terms, including the symmetric terms, for computing an entry. For each term, we need constant time for look-up. Hence, multiplying the number of guesses, the number of entries and the number of terms per entry, this gives a run time of at most $(8k + 8r)^{2r+3} \cdot O(n)$, assuming that $R$ is given. By computing $R$ in time $r^{O(r)} \cdot n$, we can bound the total run time by $(k + r)^{O(r)} \cdot n$.
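The counting of terms per entry can be sanity-checked numerically. The following sketch only evaluates the two sides of the inequality stated above for small values of $k$ and $r$; the function names are hypothetical.

```python
def terms_per_blocked_entry(k, r):
    """2^(2r) partitions without a path through (v_i, v), plus, for every
    partition, two symmetric terms for each of at most k + 2r pairs."""
    partitions = 2 ** (2 * r)
    return partitions + 2 * partitions * (k + 2 * r)

def stated_bound(k, r):
    return 2 ** (2 * r + 1) * (2 * k + 2 * r)

def candidate_sets(k, r):
    """Upper bound on the number of guesses for the essential pairs."""
    return (2 * k + r) ** (2 * r)

checks = [(k, r, terms_per_blocked_entry(k, r) <= stated_bound(k, r))
          for k in range(1, 21) for r in range(1, 11)]
```

The inequality reduces to $1 + 2k + 4r \le 4k + 4r$, which holds whenever $k \ge 1$.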

Reconstruction of an optimal routing
Above, we computed only the cardinality of the routing P. Now we discuss how to compute an optimal routing of size |P| without asymptotically increasing the total run time. For every non-leaf entry of T , we take a term that maximized its value and define the (at most two) entries appearing in the term as its children. We can do this while computing T without increasing the asymptotic run time. By considering all the children that (recursively) contributed to the entry with the optimum value of the root node, we obtain a computation tree. Going over the computation tree from bottom to top, we compute for each entry of the tree its set of paths P e ∪ P F . We store the set as a linked list with pointers to the paths which themselves are stored as linked lists of their nodes. Whenever we concatenate two lists, we will not create a new copy but reuse one of them. This will give us constant time for concatenation. Note that for almost all entries we obtain P e ∪ P F by just taking the union of the paths of its children. Hence, we just concatenate the lists of its children (at most two) in constant time. The only exception are entries where b = blocked and a path P is going through the node given by the first parameter v of the entry. Here, we obtain P by concatenating two paths, where each one belongs to a different child of the entry. Then we add the concatenated path to the union of the remaining paths of the children. The operation to find the two paths that we want to concatenate takes O(|P e ∪ P F |) = O(k + r ) time. The remaining steps to compute P e ∪ P F also take constant time. Thus, for each entry of the tree, we can bound the run time by O(k + r ). Note that in the computation tree there is exactly one entry for each subtree F i v , hence, in total there are O(n) entries. Thus, our approach takes additional time of (k + r ) · O(n) to compute the paths P e ∪ P F . 
Finally, concatenating the paths in $P_e$ accordingly to get a routing for $M_R$ takes at most $O(|P_e|^2) = O(r^2)$ time. Hence, in time $(8k + 8r)^{2r+3} \cdot O(n)$ we can compute an optimal routing, asymptotically matching the time needed to compute its cardinality.
This finishes the proof of Theorem 3.

Parameterized intractability of MAXNDP for the Parameter r
In this section, we prove Theorem 4, that is, we show that MaxNDP is W[1]-hard parameterized by feedback vertex set number. This reduction was originally devised for the parameter tree-depth by Ene et al. [21]; we notice that the same reduction also works for the parameter $r$. (Both tree-depth and feedback vertex set number are restrictions of treewidth, but they are incomparable to each other.) For the sake of completeness, we include the reduction here, and argue about the feedback vertex set number of the reduced graph.
The reduction is from the W[1]-hard Multicolored Clique problem [23], where, given a graph $G$ and a partition of $V(G)$ into $q$ independent sets $V_1, \dots, V_q$, we are to check whether there exists a $q$-clique in $G$ with exactly one vertex in every set $V_i$. By adding dummy vertices, we can assume $q \ge 2$ and $|V_i| = n$ for some $n$ with $n \ge 2$ and every $i$ with $1 \le i \le q$.
We start by constructing for every set $V_i$ a gadget $W_i$ as follows. First, for every $v \in V_i$, we construct a path $X_v^i$ of length $q - 2$ on the vertex set […], where the vertices are connected in any order. Let $\mathrm{first}(X_v^i)$ denote any one of the two endpoints of $X_v^i$, and let $\mathrm{last}(X_v^i)$ denote the other endpoint of $X_v^i$. Secondly, we select an arbitrary vertex $u_i \in V_i$. Thirdly, for every $v \in V_i \setminus \{u_i\}$, we add a vertex $s_v^i$ and a vertex $t_v^i$. We make $s_v^i$ adjacent to $\mathrm{first}(X_v^i)$ and to $\mathrm{first}(X_{u_i}^i)$. Similarly, we make $t_v^i$ adjacent to $\mathrm{last}(X_v^i)$ and to $\mathrm{last}(X_{u_i}^i)$. […]
To encode adjacencies in $G$, we proceed as follows. For every $i$ and $j$ with $1 \le i < j \le q$, we add a vertex $p_{i,j}$ adjacent to all vertices in […]. Finally, we set the required number of paths $\ell$ to $q(n - 1) + \binom{q}{2}$. This concludes the description of the instance $(H, M, \ell)$.
From a clique to disjoint paths. Assume that the given instance of Multicolored Clique is a "yes"-instance, and let $\{v_i \mid i \in \{1, \dots, q\}\}$ be a clique in $G$ with $v_i \in V_i$ for each $i \in \{1, \dots, q\}$. We construct a family of $\ell$ node-disjoint paths as follows. First, for every $i \in \{1, \dots, q\}$ and every $v \in V_i \setminus \{u_i\}$, we route a path from […]. Note that in this step we have created $q(n - 1)$ node-disjoint paths connecting terminal pairs, and in every gadget $W_i$ the only unused vertices are the vertices on the path $X_{v_i}^i$. To construct the remaining $\binom{q}{2}$ paths, for every $i$ and $j$ with $1 \le i < j \le q$, we take the 3-vertex […] is indeed a terminal pair in $M$.
From disjoint paths to a clique. In the other direction, let $\mathcal{P}$ be a family of $\ell$ node-disjoint paths connecting terminal pairs in $H$. Let $\mathcal{P}_{st} \subseteq \mathcal{P}$ be the set of paths connecting terminal pairs from $M_{st}$, and, in an analogous way, let $\mathcal{P}_x \subseteq \mathcal{P}$ be the set of paths connecting terminal pairs from $M_x$. Finally, let $P = \{p_{i,j} \mid 1 \le i < j \le q\}$. First, observe that $P$ separates every terminal pair from $M_x$. Hence, every path from $\mathcal{P}_x$ contains at least one vertex from $P$. Since $|P| = \binom{q}{2}$, we have $|\mathcal{P}_x| \le \binom{q}{2}$, and, consequently, $|\mathcal{P}_{st}| = \ell - |\mathcal{P}_x| \ge q(n - 1)$. Thus, $\mathcal{P}_{st}$ routes all terminal pairs in $M_{st}$ and $\mathcal{P}_x$ routes $\binom{q}{2}$ pairs from $M_x$. Since $|\mathcal{P}_x| = |P|$, every vertex in $P$ is contained in a path from $\mathcal{P}_x$. Consequently, the paths in $\mathcal{P}_{st}$ cannot use any vertex in $P$. Therefore, every path in $\mathcal{P}_{st}$ lies inside one gadget $W_i$.
Observe that a shortest path between terminals $s_v^i$ and $t_v^i$ inside $W_i$ is either $X_{u_i}^i$ or $X_v^i$ prolonged with the terminals at the endpoints, and thus contains $q + 1$ vertices. Furthermore, a shortest path between two terminals in $M_x$ contains three vertices. We infer that the total number of vertices on paths in $\mathcal{P}$ is at least $q(n - 1)(q + 1) + 3\binom{q}{2} = |V(H)|$. We infer that every path in $\mathcal{P}_{st}$ consists of $q + 1$ vertices, and every path in $\mathcal{P}_x$ consists of three vertices. In particular, for every $i \in \{1, \dots, q\}$ and every $v \in V_i \setminus \{u_i\}$, the path in $\mathcal{P}_{st}$ that connects $s_v^i$ and $t_v^i$ goes either through $X_v^i$ or $X_{u_i}^i$. Consequently, for each $i \in \{1, \dots, q\}$ there exists a vertex $v_i \in V_i$ such that the path $X_{v_i}^i$ is not contained in any path from $\mathcal{P}_{st}$. Moreover, $X_{v_i}^i$ contains all the vertices of $W_i$ that do not lie on any path from $\mathcal{P}_{st}$.
We claim that $\{v_i \mid i = 1, \dots, q\}$ is a clique in $G$. To this end, consider any $p_{i,j} \in P$. Since $|\mathcal{P}_x| = |P|$, there exists a path in $\mathcal{P}_x$ that goes through $p_{i,j}$. Moreover, this path has exactly three vertices. Since the only neighbors of $p_{i,j}$ that are not used by […] is a terminal pair in $M$ and, consequently, $v_i v_j \in E(G)$. This concludes the proof of the correctness of the construction.
Bounding the feedback vertex set number. We are left with a proof that the feedback vertex set number of $H$ is bounded in terms of $q$.
First, observe that $H - P$ consists of $q$ components, where each component is a gadget $W_i$, for some $i \in \{1, \dots, q\}$. Secondly, consider the endpoints of the path $X_{u_i}^i$ […]. This finishes the proof of Theorem 4.

Hardness of edge-disjoint paths in almost-forests
In this section, we show that EDP is NP-hard already in graphs that become forests after deleting two nodes. Though this immediately implies NP-hardness for MaxEDP in such graphs, we show that MaxEDP is NP-hard even in graphs that become forests after deleting just one node. Thus, we prove Theorem 5.

Proof of Theorem 5
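We first show NP-hardness of EDP for $r = 2$. We reduce from the problem Edge 3-Coloring in cubic graphs, which is NP-hard [28]. Given a cubic graph $H$, we construct a complete bipartite graph $G$ with one side of the bipartition being $\{v_1, v_2, v_3\}$ and the other being $V(H)$; the set $M$ of terminal pairs is specified by the edges of $H$. The resulting graph $G$ has feedback vertex set number $r = 2$. We claim that $H$ is 3-edge-colorable if and only if all terminal pairs in $M$ can be routed in $G$. In the forward direction, suppose that $H$ is 3-edge-colorable by a proper coloring $\varphi : E(H) \to \{1, 2, 3\}$. For $c \in \{1, 2, 3\}$, let $E_c \subseteq E(H)$ be the set of edges that receive color $c$ under $\varphi$. Then there is a routing in $G$ that, for every $c \in \{1, 2, 3\}$, routes all terminal pairs $\{(s, t) \in M \mid \{s, t\} \in E_c\}$ exclusively via the node $v_c$ (and thus via paths of length 2). Note that this routing indeed yields edge-disjoint paths. Otherwise, there would be an edge $\{s, v_c\}$ in $E(G)$ contained in at least two paths that route two terminal pairs $\{s, t_1\}$ and $\{s, t_2\}$. Hence, the two edges in $E(H)$ corresponding to $\{s, t_1\}$ and $\{s, t_2\}$ would receive the same color $c$ in $\varphi$, a contradiction to $\varphi$ being a proper edge-coloring, as both edges are incident on $s$.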
In the backward direction, suppose that all terminal pairs in M can be routed in G. Since H is cubic, any node s ∈ V (H ) is contained in three terminal pairs. Therefore, no path of the routing can have a node in V (H ) as an internal node and thus all paths in the routing have length 2. Then this routing naturally corresponds to a proper 3-edgecoloring ϕ of H , where any terminal pair {s, t} routed via v c ∈ {v 1 , v 2 , v 3 } means that we color the edge {s, t} ∈ E(H ) with color c under ϕ.
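The forward direction of the reduction for $r = 2$ can be illustrated on the cubic graph $K_4$. The sketch assumes that the hub nodes are named `v1`, `v2`, `v3` and that a coloring is a dictionary from edges of $H$ to colors; these names are hypothetical.

```python
def route_via_hubs(h_edges, coloring, hubs):
    """Route each terminal pair {s, t} (an edge of H) over the
    length-2 path s - v_c - t, where c is the color of the edge."""
    routing = {}
    for (s, t) in h_edges:
        c = coloring[(s, t)]
        if c in hubs:
            routing[(s, t)] = [s, hubs[c], t]
    return routing

def edge_disjoint(routing):
    """True if no edge of G is used by two routed paths."""
    used = set()
    for path in routing.values():
        for a, b in zip(path, path[1:]):
            e = frozenset((a, b))
            if e in used:
                return False
            used.add(e)
    return True

# K4 with a proper 3-edge-coloring (three perfect matchings).
h_edges = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
coloring = {(0, 1): 1, (2, 3): 1, (0, 2): 2, (1, 3): 2, (0, 3): 3, (1, 2): 3}
hubs = {1: "v1", 2: "v2", 3: "v3"}
routing = route_via_hubs(h_edges, coloring, hubs)

# An improper coloring makes two paths share the edge {0, v1}.
bad_coloring = dict(coloring)
bad_coloring[(0, 2)] = 1
bad_routing = route_via_hubs(h_edges, bad_coloring, hubs)
```

Two paths collide on an edge $\{s, v_c\}$ exactly when two edges of $H$ at $s$ share the color $c$, which is the contradiction used in the proof.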
In order to show NP-hardness of MaxEDP for r = 1, we also reduce from Edge 3-Coloring in cubic graphs and perform a similar construction as described above: This time, we construct a bipartite graph G with one subset of the partition being {v 1 , v 2 }, the other being V (H ), and the set M of terminal pairs being again specified by the edges of H ; see Fig. 6a, c. This completes the reduction. The resulting graph G has feedback vertex set number r = 1.
We claim that $H$ is 3-edge-colorable if and only if we can route $n = |V(H)|$ pairs in $G$.
In the forward direction, suppose that $H$ is 3-edge-colorable by a proper coloring $\varphi : E(H) \to \{1, 2, 3\}$. For $c \in \{1, 2, 3\}$, let $E_c \subseteq E(H)$ be the set of edges that receive color $c$ under $\varphi$. Then there is a routing in $G$ that, for every $c \in \{1, 2\}$, routes all terminal pairs $\{(s, t) \in M \mid \{s, t\} \in E_c\}$ exclusively via the node $v_c$ (and thus via paths of length 2). Note that the terminal pairs corresponding to edges receiving color 3 remain unrouted. The reasoning that the resulting routing is feasible is analogous to the case of $r = 2$. To see that precisely $n$ terminal pairs are routed overall, observe that, for each of the $n$ terminals, exactly two of the three terminal pairs are routed, and each routed pair is counted at both of its terminals.
In the backward direction, suppose that n terminal pairs in M can be routed in G. Since every terminal v in G has degree two, at most two paths can be routed for v. As n terminal pairs are realized, this also means that exactly two paths are routed for each terminal. Hence, none of the paths in the routing has length more than two. Otherwise, it would contain an internal node in V (H ), which then could not be part of two other paths in the routing. Then this routing naturally corresponds to a partial edge-coloring of H , where any terminal pair {s, t} routed via v c ∈ {v 1 , v 2 } implies that we color the edge {s, t} ∈ E(H ) with color c. Since each terminal v in V (H ) is involved in exactly two paths in the routing, exactly one terminal pair for v remains unrouted. Hence, exactly one edge incident on v in H remains uncolored in the partial coloring. We color all uncolored edges in H by color 3 to obtain a proper 3-edge-coloring.
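The backward direction for $r = 1$ can be sketched in the same toy setting. The code assumes a routing of $n$ pairs via the hubs `v1` and `v2` is given as a dictionary from edges of $H$ to paths; the representation and function names are hypothetical.

```python
def coloring_from_routing(h_edges, routing, hub_color):
    """Color an edge by the hub its pair is routed through;
    unrouted pairs receive color 3."""
    coloring = {}
    for e in h_edges:
        path = routing.get(e)
        coloring[e] = hub_color[path[1]] if path else 3
    return coloring

def is_proper(h_edges, coloring):
    """True if no two edges sharing an endpoint have the same color."""
    seen = set()
    for (s, t) in h_edges:
        for v in (s, t):
            key = (v, coloring[(s, t)])
            if key in seen:
                return False
            seen.add(key)
    return True

# K4 has n = 4 nodes; four pairs are routed over paths of length 2.
h_edges = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
routing = {
    (0, 1): [0, "v1", 1], (2, 3): [2, "v1", 3],
    (0, 2): [0, "v2", 2], (1, 3): [1, "v2", 3],
}
coloring = coloring_from_routing(h_edges, routing, {"v1": 1, "v2": 2})
```

Every node of $K_4$ takes part in exactly two routed pairs, so exactly one incident edge stays unrouted and can safely receive color 3.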
Thus, we almost close the complexity gap for EDP with respect to the size of a minimum feedback vertex set, only leaving the complexity of the case r = 1 open.

Concluding remarks
In this paper, we examined the problems of routing terminal pairs by edge- and node-disjoint paths in graphs of bounded feedback vertex set number $r$. We observed that the approximability bounds we obtained, expressed in terms of $r$, either strengthen the best known bounds or are almost tight. This leads us to the conclusion that the parameter $r$ in fact captures the "difficulty" of disjoint paths problems.
In particular, for MaxEDP, we obtained a constant-factor approximation algorithm with congestion logarithmic in $k + r$, where $k$ is the number of terminal pairs. This strengthens the bound obtained by directly applying the randomized rounding technique for LPs introduced by Raghavan and Thompson [40]. Although we also applied this technique, we first modified the fractional LP solution appropriately by making use of the forest that one obtains when removing the feedback vertex set from the graph. For our next result, we used the solution above to extract $\mathrm{OPT}^*/O(\sqrt{r}\log(kr))$ edge-disjoint paths from it, where $\mathrm{OPT}^*$ denotes the value of an optimum fractional solution. This strengthens, up to a logarithmic factor, the best known bound of $\mathrm{OPT}^*/O(\sqrt{n})$ [11]. We achieved our result by contracting "redundant" edges in the input graph and in the routing, which led to an "irreducible" routing from which we could greedily pick our solution. The result shows that in order to improve the best known bound it suffices to focus only on graphs with feedback vertex set number close to $n$.
We also complemented the upper bounds with hardness results. We observed that the complexities of the two problems, routing node-disjoint paths and routing edge-disjoint paths, differ when $r$ is constant. Whereas NDP [43] and MaxNDP are efficiently solvable for any constant $r$, EDP and MaxEDP are NP-hard even for $r = 2$ and $r = 1$, respectively. Here, the complexity of EDP remains open for $r = 1$ and we conjecture that this case can be solved in polynomial time. When considering $r$ as part of the input, we can separate NDP and MaxNDP (if FPT $\neq$ W[1]). We showed W[1]-hardness of MaxNDP when parameterized by $r$, whereas NDP is fixed-parameter tractable in $r$ [43]. However, we were able to provide a fixed-parameter algorithm for the combined parameter $k + r$.