New Inapproximability Bounds for TSP

Marek Karpinski∗ Michael Lampis† Richard Schmied‡

arXiv:1303.6437v2 [cs.CC] 10 Jun 2013

∗ Dept. of Computer Science and the Hausdorff Center for Mathematics, University of Bonn. Supported in part by DFG grants and the Hausdorff Grant EXC59-1/2. Email: marek@cs.uni-bonn.de
† KTH Royal Institute of Technology. Research supported by ERC Grant 226203. Email: mlampis@kth.se
‡ Dept. of Computer Science, University of Bonn. Work supported by Hausdorff Doctoral Fellowship. Email: schmied@cs.uni-bonn.de

Abstract

In this paper, we study the approximability of the metric Traveling Salesman Problem (TSP) and prove new explicit inapproximability bounds for that problem. The best previously known hardness of approximation bounds were 185/184 for the symmetric case (due to Lampis) and 117/116 for the asymmetric case (due to Papadimitriou and Vempala). We construct here two new bounded occurrence CSP reductions which improve these bounds to 123/122 and 75/74, respectively. The latter bound is the first improvement in more than a decade for the case of the asymmetric TSP. One of our main tools, which may be of independent interest, is a new construction of a bounded degree wheel amplifier used in the proof of our results.

1 Introduction

The Traveling Salesman Problem (TSP) is one of the best known and most fundamental problems in combinatorial optimization. Determining how well it can be approximated in polynomial time is therefore a major open problem, albeit one for which the solution still seems elusive. On the algorithmic side, the best known efficient approximation algorithm for the symmetric case is still a 35-year-old algorithm due to Christofides [C76] which achieves an approximation ratio of 3/2. However, recently there has been a string of improved results for the interesting special case of Graphic TSP, improving the ratio to 7/5 [OSS11, MS11, M12, SV12]. For the asymmetric case (ATSP), it is not yet known if a constant-factor approximation is even possible, with the best known algorithm achieving a ratio of O(log n / log log n) [AGM+10].

Unfortunately, there is still a huge gap between the algorithmic results mentioned above and the best currently known hardness of approximation results for TSP and ATSP. For both problems, the known inapproximability thresholds are small constants (185/184 and 117/116 (cf. [L12, PV00]), respectively). In this paper, we try to improve this situation by giving modular hardness reductions that slightly improve the hardness bounds for both problems to 123/122 and 75/74, respectively. The latter bound is the first improvement in more than a decade over the bound of Papadimitriou and Vempala [PV00] for the ATSP. Our method differs essentially from that of [PV00] and uses some new paradigms of bounded occurrence optimization which could also be of independent interest in other applications.

Similarly to [L12], the hope is that the modularity of our construction, which goes through an intermediate stage of a bounded-occurrence Constraint Satisfaction Problem (CSP), will allow an easier analysis and simplify future improvements. Indeed, one of the main new ideas we rely on is a certain new variation of the wheel amplifiers first defined by Berman and Karpinski [BK01] to establish inapproximability for 3-regular CSPs. This construction, which may be of independent interest, allows us to establish inapproximability for a 3-regular CSP with a special structure. This special structure then makes it possible to simulate many of the constraints in the produced graph essentially "for free", without using gadgets to represent them. Thus, even though for the remaining constraints we mostly reuse gadgets which have already appeared in the literature, we are still able to obtain improved bounds.

Let us now recall some of the previous work on the hardness of approximation of TSP and ATSP.
Papadimitriou and Yannakakis [PY93] were the first to construct a reduction that, combined with the PCP Theorem [ALM+98], gave a constant inapproximability threshold, though the constant was not more than 1 + 10^{-6} for the TSP with distances either one or two. Engebretsen [E03] gave the first explicit approximation lower bound of 5381/5380 for the problem. The inapproximability factor was improved to 3813/3812 by Böckenhauer and Seibert [BS00], who studied the restricted version of the TSP with distances one, two and three. Papadimitriou and Vempala [PV00] proved that it is NP-hard to approximate the TSP within a factor better than 220/219. Presently, the best known approximation lower bound is 185/184 due to Lampis [L12].

The important restriction of the TSP, in which we consider instances with distances between cities being values in {1, ..., B}, is often referred to as the (1, B)-TSP. The best known efficient approximation algorithm for the (1, 2)-TSP has an approximation ratio of 8/7 and is due to Berman and Karpinski [BK06]. As for lower bounds, Engebretsen and Karpinski [EK06] gave inapproximability thresholds for the (1, B)-TSP problem of 741/740 for B = 2 and 389/388 for B = 8. More recently, Karpinski and Schmied [KS12, KS13] obtained improved inapproximability factors for the (1, 2)-TSP and the (1, 4)-TSP of 535/534 and 337/336, respectively.

For ATSP, the best previously known approximation lower bound was 117/116 due to Papadimitriou and Vempala [PV00]. When we restrict the problem to distances with values in {1, ..., B}, there is a simple approximation algorithm with approximation ratio B that constructs an arbitrary tour as solution. Bläser [B04] gave an efficient approximation algorithm for the (1, 2)-ATSP with approximation ratio 5/4. Karpinski and Schmied [KS12, KS13] proved that it is NP-hard to approximate the (1, 2)-ATSP and the (1, 4)-ATSP within any factor less than 207/206 and 141/140, respectively.
For the case B = 8, Engebretsen and Karpinski [EK06] gave an inapproximability threshold of 135/134.

Overview: In this paper we give a hardness proof which proceeds in two steps. First, we start from the MAX-E3-LIN2 problem, in which we are given a system of linear equations mod 2 with exactly three variables in each equation and we want to find an assignment that maximizes the number of satisfied equations. Optimal inapproximability results for this problem were shown by Håstad [H01]. We reduce this problem to a special case where variables appear exactly 3 times and the linear equations have a particular structure. The main tool here is a new variant of the wheel amplifier graphs of Berman and Karpinski [BK01].

In the second step, we reduce this 3-regular CSP to TSP and ATSP. The general construction is similar in both cases, though of course we use different gadgets for the two problems. The gadgets we use are mostly variations of gadgets which have already appeared in previous reductions. Nevertheless, we manage to obtain an improvement by exploiting the special properties of the 3-regular CSP. In particular, we show that it is only necessary to construct gadgets for roughly one third of the constraints of the CSP instance, while the remaining constraints are simulated without additional cost using the consistency properties of our gadgets. This idea may be useful in improving the efficiency of approximation-hardness reductions for other problems.

Thus, overall we follow an approach unlike that of [PV00], where the reduction is performed in one step, and closer to [L12]. The improvement over [L12] comes mainly from the idea mentioned above, which is made possible using the new wheel amplifiers, as well as several other tweaks. The end result is a more economical reduction which improves the bounds for both TSP and ATSP.
An interesting question may be whether our techniques can also be used to derive improved inapproximability results for variants of the ATSP and TSP (cf. [EK06], [KS13] and [KS12]) or other graph problems, such as the Steiner Tree problem.

2 Preliminaries

In the following, we give some definitions concerning directed (multi-)graphs and omit the corresponding definitions for undirected (multi-)graphs if they follow from the directed case. Given a directed graph G = (V(G), E(G)) and E′ ⊆ E(G), for e = (x, y) ∈ E(G), we define V(e) = {x, y} and V(E′) = ⋃_{e∈E′} V(e). For convenience, we abbreviate a sequence of edges (x_1, x_2), (x_2, x_3), ..., (x_{n−1}, x_n) by x_1 → x_2 → x_3 → ... → x_{n−1} → x_n. In the undirected case, we sometimes use x_1 − x_2 − x_3 − ... − x_{n−1} − x_n instead of {x_1, x_2}, {x_2, x_3}, ..., {x_{n−1}, x_n}.

Given a directed (multi-)graph G, an Eulerian cycle in G is a directed cycle that traverses all edges of G exactly once. We refer to G as Eulerian if there exists an Eulerian cycle in G. For a multiset E_T of directed edges and v ∈ V(E_T), we define the outdegree (indegree) of v with respect to E_T, denoted by outd_T(v) (ind_T(v)), to be the number of edges in E_T that are outgoing from (incoming to) v. The balance of a vertex v with respect to E_T is defined as bal_T(v) = ind_T(v) − outd_T(v). In the case of a multiset E_T of undirected edges, we define the balance bal_T(v) of a vertex v ∈ V(E_T) to be one if the number of incident edges in E_T is odd and zero otherwise. We refer to vertices v ∈ V(E_T) with bal_T(v) = 0 as balanced with respect to E_T.

It is well known that a (directed) (multi-)graph G = (V(G), E(G)) is Eulerian if and only if all edges are in the same (weakly) connected component and all vertices v ∈ V(G) are balanced with respect to E(G). Given a multiset of edges E_T, we denote by con_T the number of (weakly) connected components in the graph induced by E_T.
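The definitions of balance, con_T and the Eulerian criterion can be illustrated with a minimal Python sketch for the directed case. The function names (`balances`, `num_components`, `is_eulerian`) and the representation of a multiset of edges as a list of pairs are assumptions made for illustration, not part of the paper.

```python
from collections import defaultdict


def balances(edges):
    """bal_T(v) = ind_T(v) - outd_T(v) for a multiset of directed edges."""
    bal = defaultdict(int)
    for x, y in edges:
        bal[x] -= 1  # edge leaves x
        bal[y] += 1  # edge enters y
    return dict(bal)


def num_components(edges):
    """con_T: weakly connected components of the graph induced by the edges."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in edges:
        parent[find(x)] = find(y)
    return len({find(v) for v in list(parent)})


def is_eulerian(edges):
    """All edges in one weak component and every vertex balanced."""
    return (bool(edges)
            and num_components(edges) == 1
            and all(b == 0 for b in balances(edges).values()))


cycle = [(1, 2), (2, 3), (3, 1)]
path = [(1, 2), (2, 3)]
print(is_eulerian(cycle), is_eulerian(path))
```

A directed triangle is Eulerian, while a directed path is not because its endpoints are unbalanced.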
A quasi-tour E_T in a (directed) graph G is a multiset of edges from E(G) such that all vertices are balanced with respect to E_T and V(E_T) = V(G). We refer to a quasi-tour E_T in G as a tour if con_T = 1. Given a cost function w : E(G) → R^+, the cost of a quasi-tour E_T in G is defined by

$$\sum_{e \in E_T} w(e) + 2(\mathrm{con}_T - 1).$$

In the Asymmetric Traveling Salesman problem (ATSP), we are given a directed graph G = (V(G), E(G)) with positive weights on edges and we want to find an ordering v_1, ..., v_n of the vertices that minimizes

$$d_G(v_n, v_1) + \sum_{i \in [n-1]} d_G(v_i, v_{i+1}),$$

where d_G denotes the shortest path distance in G. In this paper, we will use the following equivalent reformulation of the ATSP: Given a directed graph G with weights on edges, we want to find a tour E_T in G, that is, a spanning connected multiset of edges that balances all vertices, with minimum cost.

The metric Traveling Salesman problem (TSP) is the special case of the ATSP in which instances are undirected graphs with positive weights on edges.

3 Bi-Wheel Amplifiers

In this section, we define the bi-wheel amplifier graphs which will be our main tool for proving hardness of approximation for a bounded occurrence CSP with some special properties. Bi-wheel amplifiers are a simple variation of the wheel amplifier graphs given in [BK01]. Let us first recall some definitions (see also [BK03]). If G is an undirected graph and X ⊂ V(G) a set of vertices, we say that G is a ∆-regular amplifier for X if the following conditions hold:

• All vertices of X have degree ∆ − 1 and all vertices of V(G)\X have degree ∆.
• For every non-empty subset U ⊂ V(G), we have |E(U, V(G)\U)| ≥ min{ |U ∩ X|, |(V(G)\U) ∩ X| }, where E(U, V(G)\U) is the set of edges with exactly one endpoint in U.

We refer to the set X as the set of contact vertices and to V(G)\X as the set of checker vertices.
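For very small graphs, the two amplifier conditions can be checked directly by enumerating all cuts. The following Python sketch does exactly that; it runs in exponential time and is meant only to make the definition concrete. The function name `is_amplifier` and the edge-list representation are illustrative assumptions.

```python
from itertools import combinations


def is_amplifier(vertices, edges, contacts, delta):
    """Brute-force check of the delta-regular amplifier conditions."""
    vertices = list(vertices)
    contacts = set(contacts)

    # Degree condition: contacts have degree delta - 1, checkers delta.
    degree = {v: 0 for v in vertices}
    for x, y in edges:
        degree[x] += 1
        degree[y] += 1
    for v in vertices:
        expected = delta - 1 if v in contacts else delta
        if degree[v] != expected:
            return False

    # Cut condition for every non-empty proper subset U.
    for size in range(1, len(vertices)):
        for subset in combinations(vertices, size):
            u = set(subset)
            cut = sum(1 for x, y in edges if (x in u) != (y in u))
            inside = len(u & contacts)
            if cut < min(inside, len(contacts) - inside):
                return False
    return True


# K4 minus the edge {c, d}: c and d are contacts, a and b are checkers.
k4_minus = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]
print(is_amplifier("abcd", k4_minus, {"c", "d"}, 3))
```

In the example, K4 with one edge removed satisfies both conditions for ∆ = 3. Two triangles joined by a single edge meet the degree condition but fail the cut condition, since one edge separates two contacts from two others.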
Amplifier graphs are useful in proving inapproximability for CSPs in which every variable appears a bounded number of times. Here, we will rely on 3-regular amplifiers. A probabilistic argument for the existence of such graphs was given in [BK01], with the definition of wheel amplifiers. A wheel amplifier with 2n contact vertices is constructed as follows: first construct a cycle on 14n vertices. Number the vertices 1, ..., 14n and select uniformly at random a perfect matching of the vertices whose number is not a multiple of 7. The matched vertices will be our checker vertices, and the rest our contacts. It is easy to see that the degree requirements are satisfied.

Berman and Karpinski [BK01] gave a probabilistic argument to prove that with high probability the above construction indeed produces an amplifier graph, that is, all partitions of the sets of vertices give large cuts. Here, we will use a slight variation of this construction, called a bi-wheel. A bi-wheel amplifier with 2n contact vertices is constructed as follows: first construct two disjoint cycles, each on 7n vertices, and number the vertices of each 1, ..., 7n. The contacts will again be the vertices whose number is a multiple of 7, while the remaining vertices will be checkers. To complete the construction, select uniformly at random a perfect matching from the checkers of one cycle to the checkers of the other.

Intuitively, the reason that amplifiers are a suitable tool here is that, given a CSP instance, we can use a wheel amplifier to replace a variable that appears 2n times with 14n new variables (one for each wheel vertex), each of which appears 3 times. Each appearance of the original variable is represented by a contact vertex and for each edge of the wheel we add an equality constraint between the corresponding variables. We can then use the property that all partitions give large cuts to argue that in an optimal assignment all the new vertices take the same value.
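The bi-wheel construction described above can be sketched in a few lines of Python. This is only an illustration of the sampling procedure: the function name `build_biwheel`, the encoding of vertices as (side, label) pairs and the `seed` parameter are assumptions introduced here. Shuffling the checkers of one cycle against a fixed order of the other yields a uniformly random perfect matching between the two checker sets.

```python
import random


def build_biwheel(n, seed=None):
    """Sample a bi-wheel with 2n contacts.

    Vertices are pairs (side, label), side in {0, 1}, label in 1..7n.
    Returns (cycle_edges, matching_edges, contacts).
    """
    rng = random.Random(seed)
    size = 7 * n

    # Two disjoint cycles on 7n vertices each.
    cycle_edges = [((side, label), (side, label % size + 1))
                   for side in (0, 1)
                   for label in range(1, size + 1)]

    # Contacts are the vertices whose label is a multiple of 7.
    contacts = [(side, label)
                for side in (0, 1)
                for label in range(7, size + 1, 7)]

    # Random perfect matching between the checkers of the two cycles.
    checkers0 = [(0, l) for l in range(1, size + 1) if l % 7 != 0]
    checkers1 = [(1, l) for l in range(1, size + 1) if l % 7 != 0]
    rng.shuffle(checkers1)
    matching_edges = list(zip(checkers0, checkers1))

    return cycle_edges, matching_edges, contacts


cycle_edges, matching_edges, contacts = build_biwheel(2, seed=1)
print(len(cycle_edges), len(matching_edges), len(contacts))
```

For n = 2 this gives 28 cycle edges, 12 matching edges and 4 contacts; every contact has degree 2 and every checker degree 3, as the degree condition of a 3-regular amplifier requires. Whether a particular sample also satisfies the cut condition is what Theorem 1 addresses probabilistically.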
We use the bi-wheel amplifier in our construction in a similar way. The main difference is that while cycle edges will correspond to equality constraints, matching edges will correspond to inequality constraints. The contacts of one cycle will represent the positive appearances of the original variable, and the contacts of the other the negative ones. The reason we do this is that we can encode inequality constraints more efficiently than equality with a TSP gadget, while the equality constraints that arise from the cycles will be encoded in our construction "for free" using the consistency of the inequality gadgets.

Before we apply the construction, however, we have to prove that the bi-wheel amplifiers still have the desired amplification properties.

Theorem 1. With high probability, bi-wheels are 3-regular amplifiers.

Proof. Exploiting the similarity between bi-wheels and the standard wheel amplifiers of [BK01], we will essentially reuse the proof given there. First, some definitions: We say that U is a bad set if the size of its cut is too small, violating the second property of amplifiers. We say that it is a minimal bad set if U is bad but removing any vertex from U gives a set that is not bad.

Recall the strategy of the proof from [BK01]: for each partition of the vertices into U and V(G)\U, they calculate the probability (over the random matchings) that this partition gives a minimal bad set. Then, they take the sum of these probabilities over all potentially minimal bad sets and prove that the sum is at most γ^n for some constant γ < 1. It follows by the union bound that with high probability no set is a minimal bad set and therefore the graph is a proper amplifier.

Our first observation is the following: consider a wheel amplifier on 14n vertices where, rather than selecting uniformly at random a perfect matching among the checkers, we select uniformly at random a perfect matching from checkers with labels in the set {1, ..., 7n − 1} to checkers with labels in the set {7n + 1, ..., 14n − 1}. This graph is almost isomorphic to a bi-wheel. More specifically, for each bi-wheel, we can obtain a graph of this form by rewiring two edges, and vice versa. It easily follows that properties that hold for this graph asymptotically with high probability also hold for the bi-wheel.

Thus, we just need to prove that a wheel amplifier still has the amplification property if, rather than selecting a random perfect matching, we select a random matching from one half of the checker vertices to the other. We will show this by proving that, for each set of vertices S, the probability that S is a minimal bad set is roughly the same in both cases. After establishing this fact, we can simply rely on the proof of [BK01].

Recall that the wheel has 12n checker vertices. Given a set S of checkers with |S| = u, what is the probability that exactly c matching edges have exactly one endpoint in S? In a standard wheel amplifier the probability is

$$P(u, c) = \binom{u}{c} \binom{12n - u}{c} \, \frac{c! \, (u - c)!! \, (12n - u - c)!!}{(12n)!!},$$

where we denote by n!! the product of all odd natural numbers less than or equal to n, and we assume without loss of generality that u − c is even. Let us explain this: the probability that exactly c edges cross the cut in this graph is equal to the number of ways we can choose their endpoints in S and in its complement, times the number of ways we can match the endpoints, times the number of matchings of the remaining vertices, divided by the number of matchings overall.

How much does this probability change if we only allow matchings from one half of the checkers to the other? Intuitively, we need to consider two possibilities: one is that S is a balanced set, containing an equal number of checkers from each side, while the other is that S is unbalanced. It is not hard to see that if S is unbalanced, then the cut must be large.
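The counting formula for P(u, c) in the standard wheel can be sanity-checked numerically: for fixed n and u, the values over all admissible c must sum to 1. The sketch below evaluates the formula with exact rational arithmetic. The helper names are illustrative assumptions, and `odd_double_factorial` follows the convention used here (product of all odd natural numbers up to n), which differs from the usual double factorial for even n.

```python
from fractions import Fraction
from math import comb, factorial


def odd_double_factorial(n):
    """Product of all odd natural numbers <= n (the n!! of this section)."""
    result = 1
    for k in range(1, n + 1, 2):
        result *= k
    return result


def wheel_cut_probability(n, u, c):
    """P(u, c) for a uniformly random perfect matching on 12n checkers."""
    total = 12 * n
    if c > u or c > total - u or (u - c) % 2:
        return Fraction(0)
    ways = (comb(u, c) * comb(total - u, c) * factorial(c)
            * odd_double_factorial(u - c)
            * odd_double_factorial(total - u - c))
    return Fraction(ways, odd_double_factorial(total))


n, u = 1, 4
probabilities = [wheel_cut_probability(n, u, c) for c in range(u + 1)]
print(probabilities, sum(probabilities))
```

For n = 1 and u = 4 the three non-zero terms are 315/10395, 5040/10395 and 5040/10395, which add up to 1.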
Thus, the main interesting case is the balanced one (and we will establish this fact more formally). Suppose that |S| = u and S contains exactly u/2 checkers from each side. Then the probability that there are exactly c edges crossing the cut is

$$P'(u, c) = \binom{u/2}{c/2}^{2} \binom{(12n - u)/2}{c/2}^{2} \, \frac{\left(\left(\tfrac{c}{2}\right)!\right)^{2} \left(\tfrac{u - c}{2}\right)! \left(\tfrac{12n - u - c}{2}\right)!}{(6n)!}.$$

Let us explain this. If S is balanced and there are c matching edges with exactly one endpoint in S, then exactly c/2 of them must be incident on a vertex of S on each side, since the remaining vertices of S must have a perfect matching. Again, we pick the endpoints on each side, and on the complement of S, select a way to match them, select matchings on the remaining vertices and divide by the number of possible perfect matchings.

Using Stirling formulas, it is not hard to see that $\left(\left(\tfrac{n}{2}\right)!\right)^{2} = \Theta(n! \, 2^{-n} \sqrt{n})$. Also $n!! = \Theta\left(\left(\tfrac{n}{2}\right)! \, 2^{n/2}\right)$. It follows that P′ is roughly the same as P in this case, modulo some polynomial factors which are not significant since the probabilities we are calculating are exponentially small.

Let us now also show that if S is unbalanced, the probability that it is a minimal bad set is even smaller. First, observe that if S is a minimal bad set whose cut has c edges, we have c ≤ u/6. The reason for this is that since S is bad, c is smaller than the number of contacts in S minus the number of cycle edges cut. It is not hard to see that, in each fragment, that is, each subset of S made up of a contiguous part of the cycle, two cycle edges are cut. Thus, the extra edges we need for the contacts the fragment contains are at most 1/6 of its checkers.

Suppose now that S contains u/2 + k checkers on one side and u/2 − k checkers on the other. The probability that c matching edges have one endpoint in S is

$$P''(u, c, k) = \binom{\frac{u}{2} + k}{\frac{c}{2} + k} \binom{\frac{u}{2} - k}{\frac{c}{2} - k} \binom{\frac{12n - u}{2} - k}{\frac{c}{2} - k} \binom{\frac{12n - u}{2} + k}{\frac{c}{2} + k} \, \frac{\left(\tfrac{c}{2} + k\right)! \left(\tfrac{c}{2} - k\right)! \left(\tfrac{u - c}{2}\right)! \left(\tfrac{12n - u - c}{2}\right)!}{(6n)!}.$$

The reasoning is the same as before, except we observe that we need to select more endpoints on the side where S is larger, since after we remove checkers matched to outside vertices S must have a perfect matching. Observe that for k = 0 this gives P′. We will show that for the range of values we care about P′′ achieves a maximum for k = 0, and can thus be upper-bounded by (essentially) P, which is the probability that a set is bad in the standard amplifier. The rest of the proof follows from the argument given in [BK01].

In particular, we can assume that k ≤ c/2, since 2k edges are cut with probability 1. To show that the maximum is achieved for k = 0, we look at P′′(u, c, k + 1)/P′′(u, c, k). We will show that this is less than 1. Using the identity $\binom{n+1}{k+1} / \binom{n}{k} = \frac{n+1}{k+1}$, we get

$$\frac{P''(u, c, k + 1)}{P''(u, c, k)} = \left(1 + \frac{2k + 1}{\frac{u}{2} - k}\right) \left(1 + \frac{2k + 1}{\frac{12n - u}{2} - k}\right) \left(1 - \frac{2k + 1}{\frac{c}{2} + k + 1}\right).$$

Using the fact that $1 + x < e^{x}$, we end up needing to prove the following:

$$\frac{2k + 1}{\frac{c}{2} + k + 1} > \frac{2k + 1}{\frac{u}{2} - k} + \frac{2k + 1}{\frac{12n - u}{2} - k}. \tag{1}$$

Combining the fact that without loss of generality u ≤ 6n holds with the bounds on c and k we have already mentioned, the inequality (1) is straightforward to establish.

4 Hybrid Problem

By using the bi-wheel amplifier from the previous section, we are going to prove hardness of approximation for a bounded occurrence CSP with very special properties. This particular CSP will be well-suited for constructing a reduction to the TSP given in the next section.

As the starting point of our reduction, we make use of the inapproximability result due to Håstad [H01] for the MAX-E3LIN2 problem, which is defined as follows: Given a system I_1 of linear equations mod 2, in which each equation is of the form x_i ⊕ x_j ⊕ x_k = b_{ijk} with b_{ijk} ∈ {0, 1}, we want to find an assignment to the variables of I_1 that maximizes the number of satisfied equations.
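To make the MAX-E3LIN2 objective concrete, here is a minimal Python sketch that counts satisfied equations and finds the optimum of a tiny instance by exhaustive search. The encoding of an equation as a pair ((i, j, k), b) and the function names are assumptions chosen for illustration; exhaustive search is exponential and has nothing to do with the hardness reduction itself.

```python
from itertools import product


def satisfied(equations, assignment):
    """Count equations x_i ^ x_j ^ x_k = b satisfied by the assignment."""
    return sum(1 for (i, j, k), b in equations
               if assignment[i] ^ assignment[j] ^ assignment[k] == b)


def max_satisfied(equations, num_vars):
    """Optimum of a tiny instance by trying all 2^num_vars assignments."""
    return max(satisfied(equations, a)
               for a in product((0, 1), repeat=num_vars))


# Three equations over x_0..x_3; the first two contradict each other.
instance = [((0, 1, 2), 0), ((0, 1, 2), 1), ((1, 2, 3), 1)]
print(satisfied(instance, (0, 0, 0, 1)), max_satisfied(instance, 4))
```

In this example at most two of the three equations can be satisfied simultaneously, because the first two have the same left-hand side and different right-hand sides.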
Let I_1 be an instance of the MAX-E3LIN2 problem and {x_i}_{i=1}^{ν} the set of variables that appear in I_1. We denote by d(i) the number of appearances of x_i in I_1.

Theorem 2 (Håstad [H01]). For every ε > 0, there exists a constant B_ε such that given an instance I_1 of the MAX-E3LIN2 problem with m equations and max_{i∈[ν]} d(i) ≤ B_ε, it is NP-hard to decide whether there is an assignment that leaves at most ε · m equations unsatisfied, or every assignment leaves at least (0.5 − ε)m equations unsatisfied.

Similarly to the work by Berman and Karpinski [BK99] (see also [BK01] and [BK03]), we will reduce the number of occurrences of each variable to 3. For this, we will use our amplifier construction to create special instances of the Hybrid problem, which is defined as follows: Given a system I_2 of linear equations mod 2 with either three or two variables in each equation, we want to find an assignment that maximizes the number of satisfied equations.

In particular, we are going to prove the following theorem.

Theorem 3. For every constant ε > 0 and b ∈ {0, 1}, there exist instances of the Hybrid problem with 31m equations such that:
(i) Each variable occurs exactly three times.
(ii) 21m equations are of the form x ⊕ y = 0, 9m equations are of the form x ⊕ y = 1 and m equations are of the form x ⊕ y ⊕ z = b.
(iii) It is NP-hard to decide whether there is an assignment to the variables that leaves at most ε · m equations unsatisfied, or every assignment to the variables leaves at least (0.5 − ε)m equations unsatisfied.

Proof. Let ε > 0 be a constant and I_1 an instance of the MAX-E3LIN2 problem with max_{i∈[ν]} d(i) ≤ B_ε. For a fixed b ∈ {0, 1}, we can flip some of the literals such that all equations in the instance I_1 are of the form x ⊕ y ⊕ z = b, where x, y, z are variables or negations.
By constructing three more copies of each equation, in which all possible pairs of literals appear negated, we may assume that each variable occurs the same number of times negated as unnegated.

Let us fix a variable x_i in I_1. Then, we create 7 · d(i) = 2 · α new variables Var(i) = {x^{ui}_j, x^{ni}_j}_{j=1}^{α}. In addition, we construct a bi-wheel amplifier W_i on 2 · α vertices (that is, a bi-wheel with d(i) contact vertices) with the properties described in Theorem 1. Since d(i) ≤ B_ε is a constant, this can be accomplished in constant time. In the remainder, we refer to contact and checker variables as the elements in Var(i) whose corresponding index is a contact and checker vertex in W_i, respectively. We denote by M(W_i) ⊆ E(W_i) the associated perfect matching on the set of checker vertices of W_i. In addition, we denote by C_n(W_i) and C_u(W_i) the set of edges contained in the first and second cycle of W_i, respectively.

Let us now define the equations of the corresponding instance of the Hybrid problem. For each edge {j, k} ∈ M(W_i), we create the equation x^{ui}_j ⊕ x^{ni}_k = 1 and refer to equations of this form as matching equations. On the other hand, for each edge {l, t} in the cycle C_q(W_i) with q ∈ {u, n}, we introduce the equation x^{qi}_l ⊕ x^{qi}_t = 0. Equations of this form will be called cycle equations. Finally, we replace the j-th unnegated appearance of x_i in I_1 by the contact variable x^{ui}_λ with λ = 7 · j, whereas the j-th negated appearance is replaced by x^{ni}_λ. This construction yields m equations with three variables in the instance of the Hybrid problem, which we will denote by I_2. Notice that each variable appears in exactly 3 equations in I_2. Clearly, we have |I_2| = 31m equations, thereof 9m matching equations, 21m cycle equations and m equations of the form x ⊕ y ⊕ z = b.

A consistent assignment to Var(i) is an assignment with x^{ui}_j = b and x^{ni}_j = (1 − b) for all j ∈ [α], where b ∈ {0, 1}.
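The two-variable equations attached to one bi-wheel, and the fact that a consistent assignment to Var(i) satisfies all of them, can be illustrated with a small Python sketch. The function names, the encoding of variables as ('u', j) and ('n', j), and the `seed` parameter are assumptions introduced for this illustration.

```python
import random


def biwheel_equations(d, seed=0):
    """Cycle and matching equations for one variable with d occurrences.

    d is assumed even. Variables are ('u', j) and ('n', j) for j in
    1..alpha with alpha = 7d/2; each equation is (var, var, rhs).
    """
    alpha = 7 * d // 2
    # Cycle equations x_l ^ x_t = 0 along each of the two cycles.
    cycle = [((q, j), (q, j % alpha + 1), 0)
             for q in "un" for j in range(1, alpha + 1)]
    # Matching equations x^u_j ^ x^n_k = 1 between checkers.
    checkers = [j for j in range(1, alpha + 1) if j % 7 != 0]
    partners = checkers[:]
    random.Random(seed).shuffle(partners)
    matching = [(("u", j), ("n", k), 1) for j, k in zip(checkers, partners)]
    return cycle, matching


def all_satisfied(equations, value):
    """True if every two-variable equation holds under the assignment."""
    return all(value[a] ^ value[b] == rhs for a, b, rhs in equations)


cycle, matching = biwheel_equations(4)
consistent = {(q, j): (1 if q == "u" else 0)
              for q in "un" for j in range(1, 15)}
print(len(cycle), len(matching), all_satisfied(cycle + matching, consistent))
```

Each bi-wheel contributes 7 · d(i) cycle equations and 3 · d(i) matching equations. Assuming that d(i) counts occurrences in the m three-variable equations of I_2, so that the d(i) sum to 3m, this reproduces the stated totals of 21m cycle and 9m matching equations.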
A consistent assignment to the variables of I_2 is an assignment that is consistent for Var(i) for all i ∈ [ν]. By standard arguments using the amplifier constructed in Theorem 1, it is possible to convert an assignment to a consistent assignment without decreasing the number of satisfied equations, and the proof of Theorem 3 follows.

5 TSP

This section is devoted to the proof of the following theorem.

Theorem 4. It is NP-hard to approximate the TSP to within any constant approximation ratio less than 123/122.

Let us first sketch the high-level idea of the construction. Starting with an instance of the Hybrid problem, we will construct a graph where gadgets represent the equations. We will design gadgets for equations of size three (Figure 1) and for equations of size two corresponding to matching edges of the bi-wheel (Figure 2). We will not construct gadgets for the cycle edges of the bi-wheel; instead, the connections between the matching edge gadgets will be sufficient to encode these extra constraints. This may seem counterintuitive at first, but the idea here is that if the gadgets for the matching edges are used in a consistent way (that is, the tour enters and exits in the intended way) then it follows that the tour is using all edges corresponding to one wheel and none from the other. Thus, if we prove consistency for the matching edge gadgets, we implicitly get the cycle edges "for free". This observation, along with an improved gadget for size-three equations and the elimination of the variable part of the graph, are the main sources of improvement over the construction of [L12].

5.1 Construction

We are going to describe the construction that encodes an instance I_2 of the Hybrid problem into an instance of the TSP problem. Due to Theorem 3, we may assume that the equations with three variables in I_2 are all of the form x ⊕ y ⊕ z = 0.
In order to ensure that some edges are used at least once in any valid tour, we apply the following simple trick that was already used in the work by Lampis [L12]: Let e be an edge with weight w that we want to be traversed by every tour. We remove e and replace it with a path of L edges and L − 1 newly created vertices, each of degree two, where we think of L as a large constant. Each of the L edges has weight w/L, and any tour that leaves two or more of the newly created edges untraversed is not connected. On the other hand, a tour that traverses all but one of those edges can be extended by adding two copies of the unused edge, increasing the cost of the underlying tour by a negligible value. In summary, we may assume that our construction contains forced edges that need to be traversed at least once by any tour. If x and y are vertices which are connected by a forced edge e, we write {x, y}_F or simply x −_F y.

In the following, we refer to unforced edges e with w(e) = 1 as simple. All unforced edges in our construction will be simple.

Let us start with the description of the corresponding graph G_S: For each bi-wheel W_p, we will construct the subgraph G_p of G_S. For each vertex of the bi-wheel, we create a vertex in the graph and for each cycle equation x ⊕ y = 0, we create a simple edge {x, y}. Given a matching equation between two checkers x^u_i ⊕ x^n_j = 1, we connect the vertices x^u_i and x^n_j with two forced edges {x^u_i, x^n_j}^1_F and {x^u_i, x^n_j}^2_F. We have w({x^u_i, x^n_j}^t_F) = 2 for each t ∈ {1, 2}.

Additionally, we create a central vertex s that is connected to gadgets simulating equations with three variables. Let x ⊕ y ⊕ z = 0 be the j-th equation with three variables in I_2. We now create the graph G^{3S}_j displayed in Figure 1 (a), where the (contact) vertices for x, y, z have already been constructed in the cycles. The edges {γ^α, γ}_F with α ∈ {r, l} and γ ∈ {x, z, y} are all forced edges with w({γ^α, γ}_F) = 1.5.
Furthermore, we have w({e^α_j, s}_F) = 0.5 for all α ∈ {r, l}. {e^r_j, s}_F and {e^l_j, s}_F are both forced edges, whereas all remaining edges of G^{3S}_j are simple. This is the whole description of G_S.

Figure 1: Gadgets simulating equations with three variables in the symmetric case (a) and in the asymmetric case (b). Dotted and straight lines represent forced and simple edges, respectively.

5.2 Tour from Assignment

Given an instance I_2 of the Hybrid problem and an assignment φ to the variables in I_2, we are going to construct a tour in G_S according to φ and give the proof of one direction of the reduction. In particular, we are going to prove the following lemma.

Lemma 1. If there is an assignment to the variables of a given instance I_2 of the Hybrid problem with 31m equations and ν bi-wheels that leaves k equations unsatisfied, then there exists a tour in G_S with cost at most 61m + 2ν + k + 2.

Before we proceed, let us give a useful definition. Let G be an edge-weighted graph and E_T a multiset of edges of E(G) that defines a quasi-tour. Consider a set V′ ⊆ V(G). The local edge cost of the set V′ is then defined as

$$c_T(V') = \sum_{u \in V'} \; \sum_{e \in E_T,\, e = \{u, v\}} \frac{w(e)}{2}.$$

In words, for each vertex in V′, we count half the total weight of its incident edges used in the quasi-tour (including multiplicities). Observe that this sum contains half the weight of edges with one endpoint in V′ but the full weight for edges with both endpoints in V′ (since we count both endpoints in the sum). Also note that for two sets V_1, V_2, we have c_T(V_1 ∪ V_2) ≤ c_T(V_1) + c_T(V_2) (with equality for disjoint sets) and that c_T(V) = Σ_{e∈E_T} w(e).

Proof. First, note that it is sufficient to prove that we can construct a quasi-tour of the promised cost which uses all forced edges exactly once.
Since all unforced edges have cost 1, if we are given a quasi-tour we can connect two disconnected components by using an unforced edge that connects them twice (this is always possible since the underlying graph we constructed is connected). This does not increase the cost, since we added two unit-weight edges and decreased the number of components. Repeating this results in a connected tour.

Let {W_a}_{a=1}^{ν} be the associated set of bi-wheels of I_2. For a fixed bi-wheel W_p, let {x^u_i, x^n_i}_{i=1}^{z} be its associated set of variables. Due to the construction of instances of the Hybrid problem in Section 4, we may assume that all equations with two variables are satisfied by the given assignment. Thus, we have x^u_i ≠ x^n_j, x^u_i = x^u_j and x^n_i = x^n_j for all i, j ∈ [z]. Assuming x^α_1 = 1 for some α ∈ {u, n}, we use once all simple edges {x^α_i, x^α_{i+1}} with i ∈ [z − 1] and the edge {x^α_z, x^α_1}. We also use all forced edges corresponding to matching equations once. In other words, for each bi-wheel we select the cycle that corresponds to the assignment 1 and use all the simple edges from that cycle. This creates a component that contains all checker vertices from both cycles and all contacts from one cycle.

As for the next step, we are going to describe the tour traversing G^{3S}_j with j ∈ [m] given an assignment to contact variables. Let us assume that G^{3S}_j simulates x ⊕ y ⊕ z = 0. According to the assignment to x, y and z, we will traverse G^{3S}_j as follows. In all cases, we will use all forced edges once.

Case (x + y + z = 2): Then, we use {γ^l, γ^r} for all γ ∈ {x, y, z} with γ = 1. For δ ∈ {x, z, y} with δ = 0, we use {e^α_j, δ^α} for all α ∈ {r, l}.

Figure 2: Gadget simulating equation with two variables in the symmetric case (a) and in the asymmetric case (b). Dotted and straight lines represent forced and simple edges, respectively.
Case ($x + y + z = b$ with $b \in \{0, 1\}$): In both cases, we traverse $\{\gamma^\alpha, e_j^\alpha\}$ for all $\gamma \in \{x, y, z\}$ and $\alpha \in \{r, l\}$.

Case ($x + y + z = 3$): We use $\{\gamma^r, \gamma^l\}$ with $\gamma \in \{y, z\}$. Furthermore, we include $\{x^\alpha, e_j^\alpha\}$ for both $\alpha \in \{r, l\}$.

Let us now analyze the cost of the edges of our quasi-tour given an assignment. For each matching edge $\{x_i^u, x_j^n\}$ consider the set of vertices made up of its endpoints. Its local cost is 5: we pay 4 for the forced edges and there are two used simple edges with one endpoint in the set. Let us also consider the local cost for a size-three equation gadget, where we consider the set to contain the contact vertices $\{x, y, z\}$ as well as the other 8 vertices of the gadget. The local cost here is 9.5 for the forced edges. We also pay 6 more (for a total of 15.5) when the assignment satisfies the equation or 7 more when it does not.

Thus, we have given a covering of the vertices of the graph by $9m$ sets of size two, $m$ sets of size 11 and $\{s\}$. The total edge cost is thus at most $5 \cdot 9m + 15.5 \cdot m + 0.5 \cdot m + k = 61m + k$. To obtain an upper bound on the cost of the quasi-tour, we observe that the tour has at most $\nu + 1$ components (one for each bi-wheel and one containing $s$). Since each additional component adds 2 to the cost of the quasi-tour, the lemma follows.

5.3 Assignment from Tour

In this section, we are going to prove the other direction of our reduction. Given a tour in $G_S$, we are going to define an assignment to the variables of the associated instance of the Hybrid problem and give the proof of the following lemma.

Lemma 2. If there is a tour in $G_S$ with cost $61m + k - 2$, then there is an assignment to the variables of the corresponding instance of the Hybrid problem that leaves at most $k$ equations unsatisfied.

Again, let us give a useful definition. Consider a quasi-tour $E_T$ and a set $V' \subseteq V(G)$. Let $\mathrm{con}_T(V')$ be the number of connected components induced by $E_T$ which are fully contained in $V'$. Then, the full local cost of the set $V'$ is defined as $c_T^F(V') = c_T(V') + 2\,\mathrm{con}_T(V')$.
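The two cost functions defined so far can be illustrated with a small sketch. The Python code below is only an illustration under assumed conventions, not part of the reduction: a quasi-tour is given as a list of weighted edges `(u, v, w)` with one entry per used copy, and the names `local_edge_cost`, `tour_components` and `full_local_cost` are hypothetical. The example data model a matching gadget with two parallel forced edges of weight 2 between hypothetical vertices `a` and `b`, once traversed together with two simple edges leaving the set, and once as an isolated component.

```python
from collections import defaultdict

def local_edge_cost(tour_edges, subset):
    """c_T(V'): every endpoint of a used edge that lies in V' contributes w/2."""
    subset = set(subset)
    return sum((w / 2) * ((u in subset) + (v in subset)) for u, v, w in tour_edges)

def tour_components(tour_edges):
    """Connected components induced by the used edges (simple union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in tour_edges:
        parent[find(u)] = find(v)
    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return list(groups.values())

def full_local_cost(tour_edges, subset):
    """c_T^F(V') = c_T(V') + 2 * (number of components fully contained in V')."""
    subset = set(subset)
    contained = sum(1 for comp in tour_components(tour_edges) if comp <= subset)
    return local_edge_cost(tour_edges, subset) + 2 * contained

# Two forced edges (weight 2 each) plus two simple edges (weight 1) leaving {a, b}.
connected = [("a", "b", 2), ("a", "b", 2), ("p", "a", 1), ("a", "q", 1)]
# The same forced edges forming a component of their own.
isolated = [("a", "b", 2), ("a", "b", 2)]

cost_connected = full_local_cost(connected, {"a", "b"})
cost_isolated = full_local_cost(isolated, {"a", "b"})
```

In the first case the set pays 4 for the forced edges and 0.5 for each simple edge; in the second case it pays 4 for the edges and 2 for the contained component.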
By the definition, the full local cost of $V(G)$ is equal to the cost of the quasi-tour (plus 2). Intuitively, $c_T^F(V')$ captures the cost of the quasi-tour restricted to $V'$: it includes the cost of edges and the cost of added connected components. Note that now for two disjoint sets $V_1, V_2$ we have $c_T^F(V_1 \cup V_2) \ge c_T^F(V_1) + c_T^F(V_2)$, since $V_1 \cup V_2$ could contain more connected components than $V_1, V_2$ together. If we know that the total cost of the quasi-tour is small, then $c_T^F(V)$ is small (at most $61m + k$). We can use this to infer that the sum of the local full costs of all gadgets is small.

The high-level idea of the proof is the following: we will use roughly the same partition of $V(G)$ into sets as in the proof of Lemma 1. For each set, we will give a lower bound on its full local cost for any quasi-tour, which will be equal to what the tour we constructed in Lemma 1 pays. If a given quasi-tour behaves differently, its local cost will be higher. The difference between the actual local cost and the lower bound is called the credit of that part of the graph. We construct an assignment for $I_2$ and show that the total sum of credits is at least the number of unsatisfied equations. But using the reasoning of the previous paragraph, the total sum of credits will be at most $k$.

Proof. We are going to prove a slightly stronger statement and show that if there exists a quasi-tour in $G_S$ with cost $61m + k - 2$, then there exists an assignment leaving at most $k$ equations unsatisfied. Recall that the existence of a tour in $G_S$ with cost $C$ implies the existence of a quasi-tour in $G_S$ with cost at most $C$.

We may assume that simple edges are contained only once in $E_T$ due to the following preprocessing step: If $E_T$ contains two copies of the same simple edge, we remove them without increasing the cost, since the number of components can only increase by one.
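The preprocessing step can be sketched in a few lines. The following Python fragment is a minimal illustration under assumed conventions: edges are unordered pairs of hypothetical vertex names, the multi-set $E_T$ is a list with repetitions, and `drop_doubled_simple_edges` is a hypothetical helper name. It removes copies of simple edges in pairs, which preserves the parity of every vertex degree, and leaves forced edges untouched.

```python
from collections import Counter

def drop_doubled_simple_edges(tour_edges, simple_edges):
    """Remove copies of simple edges in pairs; forced edges keep their multiplicity."""
    simple = {frozenset(e) for e in simple_edges}
    counts = Counter(frozenset(e) for e in tour_edges)
    cleaned = []
    for edge, mult in counts.items():
        if edge in simple:
            mult %= 2  # two removed unit-weight copies save 2, one new component costs 2
        cleaned.extend([tuple(sorted(edge))] * mult)
    return sorted(cleaned)

# Assumed toy input: {a, b} and {b, c} are simple, {c, d} is forced and used twice.
tour = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("c", "d")]
cleaned_tour = drop_doubled_simple_edges(tour, [("a", "b"), ("b", "c")])
```

After the call, the doubled simple edge between `a` and `b` is gone, while the doubled forced edge between `c` and `d` is kept.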
In the following, given a quasi-tour $E_T$ in $G_S$, we are going to define an assignment $\phi_T$ and analyze the number of equations satisfied by $\phi_T$ compared to the cost of the quasi-tour. The general idea is that each vertex of $G_S$ that corresponds to a variable of $I_2$ has exactly two forced and exactly two simple edges incident to it. If the forced edges are used once each, the variable is called honest. We set it to 1 if the simple edges are both used once and to 0 otherwise. It is not hard to see that, because simple cycle edges connect vertices that represent the variables, this procedure will satisfy all cycle equations involving honest variables. We then argue that if other equations are unsatisfied the tour is also paying extra, and the same happens if a variable is dishonest.

Let us give more details. First, we concentrate on the assignment for checker variables.

Assignment for Checker Variables

Let us consider the following equations with two variables: $x_{i-1}^u \oplus x_i^u = 0$, $x_i^u \oplus x_{i+1}^u = 0$, $x_{j-1}^n \oplus x_j^n = 0$, $x_j^n \oplus x_{j+1}^n = 0$ and $x_i^u \oplus x_j^n = 1$. We are going to analyze the cost of a quasi-tour traversing the gadget displayed in Figure 2 (a) and define an assignment according to $E_T$. Let us first assume that our quasi-tour is honest, that is, the underlying quasi-tour traverses forced edges only once.

Honest tours: For $x \in \{x_i^u, x_j^n\}$, we set $x = 1$ if the quasi-tour traverses both simple edges incident on $x$ and $x = 0$ otherwise. Since we removed all copies of the same simple edge, we may assume that cycle equations are always satisfied. If the tour uses $x_{i-1}^u - x_i^u -_F x_j^n -_F x_i^u - x_{i+1}^u$, we get $x_{i-1}^u = x_{i+1}^u = 1$, $x_{j-1}^n = x_{j+1}^n = 0$ and 5 satisfied equations. Given $x_{j-1}^n - x_j^n -_F x_i^u -_F x_j^n - x_{j+1}^n$, we obtain 5 satisfied equations as well. Let us define $V_i^p := \{x_i^u, x_j^n\}$. Notice that in both cases, we have local cost $c_T^F(V_i^p) = 5$. We claim that $c_T^F(V_i^p) \ge 5$ for a valid quasi-tour.
In order to obtain a valid quasi-tour, we need to traverse both forced edges in $G_i^p$ and use at least two simple edges, as otherwise we get $c_T^F(G_i^p) \ge 6$. Given a quasi-tour $E_T$, we introduce a local credit function defined by $cr_T(V_i^p) = c_T^F(V_i^p) - 5$. If $x_i^u -_F x_j^n -_F x_i^u$ forms a connected component, we get 4 satisfied equations and $cr_T(V_i^p) = 1$, which is sufficient to pay for the unsatisfied equation $x_i^u \oplus x_j^n = 1$. On the other hand, assuming $x_{i-1}^u = x_{i+1}^u = 1$ and $x_{j-1}^n = x_{j+1}^n = 1$, we get $cr_T(V_i^p) = 1$ and 1 unsatisfied equation.

Dishonest tours: We are going to analyze quasi-tours which use one of the forced edges twice. By setting $x_i^u \neq x_j^n$, we are able to find an assignment that always satisfies $x_i^u \oplus x_j^n = 1$ and two other equations out of the five that involve these dishonest variables. The local cost in this case is at least 7. Hence, the credit $cr_T(V_i^p) = 2$ is sufficient to pay for the two unsatisfied equations.

Assignment for Contact Variables

Again, we will distinguish between honest tours (which use forced edges exactly once) and dishonest tours. This time we are interested in seven equations: the size-three equation $x \oplus y \oplus z = 0$ and the six cycle equations containing the three contacts.

Observe that the local cost of $V_j^{3S} := \{x^r, x^l, x, y^r, y^l, y, z^r, z^l, z, e_j^r, e_j^l\}$ is at least 15.5. The local edge cost of any quasi-tour is 9.5 for the forced edges. For each component $\{\gamma, \gamma^l, \gamma^r\}$ with $\gamma \in \{x, y, z\}$, we need to pay at least 2 more because there are two vertices with odd degree ($\gamma^l, \gamma^r$) and we also need to connect the component to the rest of the graph (otherwise the component already costs 2 more). Let us define the credit of $V_j^{3S}$ with respect to $E_T$ by $cr_T(V_j^{3S}) = c_T^F(V_j^{3S}) - 15.5$.

Honest tours: For each $\gamma \in \{x, y, z\}$, we set $\gamma = 1$ if the tour uses both simple edges incident on $\gamma$ and 0 otherwise.
Notice that in the case ($x + y + z = b$) with $b \in \{0, 2\}$, this satisfies all seven equations and the tour has local cost $c_T^F(V_j^{3S}) \ge 15.5$.

Case ($x = y = z = 1$): The assignment now fails to satisfy the size-three equation, so we need to prove that the quasi-tour has local cost at least 16.5. Since all vertices are balanced with respect to $E_T$, the quasi-tour has to use at least one edge incident on $e_j^r$ and $e_j^l$ besides $\{s, e_j^r\}_F$ and $\{s, e_j^l\}_F$. If the quasi-tour takes $\{e_j^\alpha, \gamma^\alpha\}$ for a $\gamma \in \{x, y, z\}$ and all $\alpha \in \{r, l\}$, since all simple edges incident on $x, y, z$ are used, we get a total cost of at least 16.5, which gives a credit of 1.

Case ($x + y + z = 1$): Without loss of generality, we assume that $x = y = 0 \neq z$ holds. Again, only the size-three equation is unsatisfied, so we must show that the local cost is at least 16.5. We will discuss two subcases. (i) There is a connected component $\delta -_F \delta^r - \delta^l -_F \delta$ for some $\delta \in \{x, y\}$. We obtain that $c_T^F(\{\delta, \delta^l, \delta^r\}) \ge 6$ and therefore a lower bound on the total cost of 16.5. (ii) Otherwise, since we may assume that $x^r, x^l, y^r$ and $y^l$ are balanced with respect to $E_T$, we have that $\{e_j^\alpha, \gamma^\alpha\} \in E_T$ for all $\alpha \in \{r, l\}$ and $\gamma \in \{x, y\}$. Because the vertices $e_j^\alpha$ are also balanced, we obtain $\{e_j^\alpha, z^\alpha\} \in E_T$ for all $\alpha \in \{r, l\}$, which implies a total cost of 16.5.

Dishonest tours: Let us assume that the quasi-tour uses both of the forced edges $\{\gamma^r, \gamma\}$ and $\{\gamma^l, \gamma\}$ for some $\gamma \in \{x, y, z\}$ twice. We delete both copies and add $\{\gamma^r, \gamma^l\}$ instead, which reduces the cost of the quasi-tour. Hence, we may assume that only one of the two incident forced edges is used twice. First, observe that if all forced edges were used once, then there would be eight vertices in the gadget with odd degree: $x^r, x^l, y^r, y^l, z^r, z^l, e_j^r, e_j^l$. If exactly one forced edge is used twice, then seven of these vertices have odd degree.
Thus, it is impossible for the tour to make the degrees of all seven even using only the simple edges that connect them. We can therefore assume that if a forced edge is used twice, there exists another forced edge used twice. We will now take cases, depending on how many of the vertices $x, y, z$ are incident on forced edges used twice. Note that if one of the forced edges incident on $x$ is used twice, then exactly one of the simple edges incident on $x$ is used once.

So, first suppose all three of $x, y, z$ have forced edges used twice. The local cost from forced edges is at least 14. Furthermore, there are three vertices of the form $\gamma^\alpha$, for $\gamma \in \{x, y, z\}$ and $\alpha \in \{l, r\}$, with odd degree. These have no simple edges connecting them, thus the quasi-tour will use three simple edges to balance their degrees. Finally, the used simple edges incident on $x, y, z$ each contribute 0.5 to the local cost. Thus, the total local cost is at least 18.5, giving us a credit of 3. It is not hard to see that there is always an assignment satisfying four out of the seven affected equations, so this case is done.

Second, suppose exactly two of $x, y, z$ have incident forced edges used twice, say, $x, y$. For $z$, we select the honest assignment (1 if the incident simple edges are used, 0 otherwise) and this satisfies the cycle equations for this variable. We can select assignments for $x, y$ that satisfy three of the remaining five equations, so we need to show that the cost in this case is at least 17.5. The cost of forced edges is at least 12.5, and the cost of simple edges incident on $x, y$ adds 1 to the local cost. One of the vertices $x^l, x^r$ and one of $y^l, y^r$ have odd degree, therefore the tour uses two simple edges to balance them. Finally, the vertices $z^l, z^r$ have odd degree. If two simple edges incident to them are used, we have a total local cost of 17.5. If the edge connecting them is used, then the two simple edges incident on $z$ must be used, again pushing the local cost to 17.5.
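As a reading aid, the accounting in the two cases just treated can be written out term by term. The grouping below only restates the figures given in the text (forced edges, balancing simple edges, and half-counted simple edges incident on the variable vertices), compared against the lower bound of 15.5:

```latex
\begin{align*}
\text{all of } x, y, z \text{ dishonest:} \quad
  & \underbrace{14}_{\text{forced}}
    + \underbrace{3 \cdot 1}_{\text{balancing}}
    + \underbrace{3 \cdot 0.5}_{\text{incident on } x, y, z}
    = 18.5 = 15.5 + 3, \\
\text{exactly } x, y \text{ dishonest:} \quad
  & \underbrace{12.5}_{\text{forced}}
    + \underbrace{2 \cdot 0.5}_{\text{incident on } x, y}
    + \underbrace{2 \cdot 1}_{\text{balancing } x^{\alpha},\, y^{\alpha}}
    + \underbrace{2}_{z^l,\, z^r}
    = 17.5 = 15.5 + 2.
\end{align*}
```

In both rows the surplus over 15.5 matches the number of equations that may remain unsatisfied (three of seven, respectively two of five).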
Finally, suppose only $x$ has an incident forced edge used twice. By the parity argument given above, this means that one of the forced edges incident on $s$ is used twice. We can satisfy the cycle equations for $y, z$ by giving them their honest assignment, and out of the three remaining equations some assignment to $x$ satisfies two. Therefore, we need to show that the cost is at least 16.5. The local cost from forced edges is 11.25 and the simple edge incident on $x$ contributes 0.5. Also, at least one simple edge incident on $x^l$ or $x^r$ is used, since one of them has odd degree. For $y^l, y^r$, either two simple edges are used, or if the edge connecting them is used the simple edges incident on $y$ contribute 1 more. With similar reasoning for $z^l, z^r$, we get that the total local cost is at least 16.75.

Let us now conclude our analysis. Consider the following partition of $V$: we have a singleton set $\{s\}$, $9m$ sets of size 2 containing the matching edge gadgets and $m$ sets of size 11 containing the gadgets for size-three equations (except $s$). The sum of their local costs is at most $c_T^F(V) \le 61m + k$. But the sum of their local costs is (using the preceding analysis) equal to $61m + \sum_i cr_T(V_i)$. Thus, the sum of all credits is at most $k$. Since we have already argued that the sum of all credits is enough to cover all equations unsatisfied by our assignment, this concludes the proof.

We are ready to give the proof of Theorem 4.

Proof of Theorem 4. We are given an instance $I_1$ of the MAX-E3LIN2 problem with $\nu$ variables and $m$ equations. For all $\delta > 0$, there exists a $k$ such that if we repeat each equation $k$ times we get an instance $I_1^{(k)}$ with $m' = km$ equations and $\nu$ variables such that $2(\nu + 1)/m' \le \delta$. Then, from $I_1^{(k)}$, we generate an instance $I_2$ of the Hybrid problem and the corresponding graph $G_S$.
Due to Lemmata 1, 2 and Theorem 3, we know that for all $\epsilon > 0$, it is NP-hard to tell whether there is a tour with cost at most $61m' + 2\nu + 2 + \epsilon \cdot m' \le 61 \cdot m' + (\delta + \epsilon) m'$ or all tours have cost at least $61m' + (0.5 - \epsilon) m' - 2 \ge 61.5 \cdot m' - \epsilon \cdot m' - \delta \cdot m'$. The ratio between these two cases can get arbitrarily close to $61.5/61 = 123/122$ by appropriate choices for $\epsilon, \delta$.

6 ATSP

In this section, we prove the following theorem.

Theorem 5. It is NP-hard to approximate the ATSP to within any constant approximation ratio less than 75/74.

6.1 Construction

Let us describe the construction that encodes an instance $I_2$ of the Hybrid problem into an instance of the ATSP. Again, it will be useful to have the ability to force some edges to be used, that is, we would like to have bidirected forced edges. A bidirected forced edge of weight $w$ between two vertices $x$ and $y$ will be created in a similar way as undirected forced edges in the previous section: construct $L - 1$ new vertices and connect $x$ to $y$ through these new vertices, making a bidirected path with all edges having weight $w/L$. It is not hard to see that without loss of generality we may assume that all edges of the path are used in at least one direction, though we should note that the direction is not prescribed. In the remainder, we denote a directed forced edge consisting of vertices $x$ and $y$ by $(x, y)_F$, or $x \to_F y$.

Let $I_2$ consist of the collection $\{W_i\}_{i=1}^{\nu}$ of bi-wheels. Recall that a bi-wheel consists of two cycles and a perfect matching between their checkers. Let $\{x_i^u, x_i^n\}_{i=1}^{z}$ be the associated set of variables of $W_p$. We write $u(i)$ to denote the function which, given the index of a checker variable $x_i^u$, returns the index $j$ of the checker variable $x_j^n$ to which it is matched (that is, the function $u$ is a permutation function encoding the matching). We write $n(i)$ to denote the inverse function $u^{-1}(i)$.

Now, for each bi-wheel $W_p$, we are going to construct the corresponding directed graph $G_A^p$ as follows.
First, construct a vertex for each checker variable of the wheel. For each matching equation $x_i^u \oplus x_j^n = 1$, we create a bidirected forced edge $\{x_i^u, x_j^n\}_F$ with $w(\{x_i^u, x_j^n\}_F) = 2$. For each contact variable $x_k$, we create two corresponding vertices $x_k^r$ and $x_k^l$, which are joined by the bidirected forced edge $\{x_k^r, x_k^l\}_F$ with $w(\{x_k^r, x_k^l\}_F) = 1$.

Next, we will construct two directed cycles $C_u^p$ and $C_n^p$. Note that we are doing arithmetic on the cycle indices here, so the index $z + 1$ should be read as equal to 1. For $C_u^p$, for any two consecutive checker vertices $x_i^u, x_{i+1}^u$ on the un-negated side of the bi-wheel, we add a simple directed edge $x_{u(i)}^n \to x_{i+1}^u$. If the checker $x_i^u$ is followed by a contact $x_{i+1}^u$ in the cycle, then we add two simple directed edges $x_{u(i)}^n \to x_{i+1}^{ur}$ and $x_{i+1}^{ul} \to x_{i+2}^u$. Observe that by traversing the simple edges we have just added, the forced matching edges in the direction $x_i^u \to_F x_{u(i)}^n$ and the forced contact edges for the un-negated part in the direction $x_i^{ur} \to_F x_i^{ul}$, we obtain a cycle that covers all checkers and all the contacts of the un-negated part.

We now add simple edges to create a second cycle $C_n^p$. This cycle will require using the forced matching edges in the opposite direction and, thus, truth assignments will be encoded by the direction of traversal of these edges. First, for any two consecutive checker vertices $x_i^n, x_{i+1}^n$ on the negated side of the bi-wheel, we add the simple directed edge $x_{n(i)}^u \to x_{i+1}^n$. Then, if the checker $x_i^n$ is followed by a contact $x_{i+1}^n$ in the cycle, we add the simple directed edges $x_{n(i)}^u \to x_{i+1}^{nr}$ and $x_{i+1}^{nl} \to x_{i+2}^n$. Now by traversing the edges we have just added, the forced matching edges in the direction $x_i^n \to_F x_{n(i)}^u$ and the forced contact edges for the negated part in the direction $x_i^{nr} \to_F x_i^{nl}$, we obtain a cycle that covers all checkers and all the contacts of the negated part, that is, a cycle of direction opposite to $C_u^p$.
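The interplay of the two cycles can be checked on a toy instance. The Python sketch below makes simplifying assumptions that are not in the construction itself: the bi-wheel has only checkers (no contacts), each bidirected forced edge is treated as a single step, vertices are encoded as hypothetical pairs `("u", i)` and `("n", j)`, and the matching `u` is an arbitrary example permutation. It builds the simple arcs of both cycles and walks each one by alternating a forced matching edge with a simple arc.

```python
def simple_cycle_arcs(z, u):
    """Simple arcs of C_u and C_n for z checkers per side; u maps i -> u(i)."""
    n = {j: i for i, j in u.items()}   # n is the inverse permutation of u
    succ = lambda i: i % z + 1         # the index z + 1 is read as 1
    arcs_u = {("n", u[i]): ("u", succ(i)) for i in range(1, z + 1)}
    arcs_n = {("u", n[i]): ("n", succ(i)) for i in range(1, z + 1)}
    return arcs_u, arcs_n

def walk(side, u, arcs):
    """Alternate forced matching edge and simple arc, starting at (side, 1)."""
    n = {j: i for i, j in u.items()}
    if side == "u":
        partner = lambda i: ("n", u[i])   # forced edge used as x_i^u -> x_{u(i)}^n
    else:
        partner = lambda i: ("u", n[i])   # forced edge used in the opposite direction
    visited, v = [], (side, 1)
    while v not in visited:
        w = partner(v[1])
        visited += [v, w]
        v = arcs[w]                       # simple arc to the next checker on this side
    return visited

z = 4
u = {1: 3, 2: 1, 3: 4, 4: 2}              # assumed example matching
arcs_u, arcs_n = simple_cycle_arcs(z, u)
walk_u = walk("u", u, arcs_u)
walk_n = walk("n", u, arcs_n)
```

Both walks visit all $2z$ checker vertices exactly once, and every matching edge is traversed from the un-negated to the negated endpoint in the first walk and in the reverse direction in the second.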
What is left is to encode the equations of size three. Again, we have a central vertex $s$ that is connected to gadgets simulating equations with three variables. For every equation with three variables, we create the gadget displayed in Figure 1 (b), which is a variant of the gadget used by Papadimitriou and Vempala [PV00]. Let us assume that the $j$-th equation with three variables in $I_2$ is of the form $x \oplus y \oplus z = 1$. This equation is simulated by $G_j^{3A}$. The vertices used are the contact vertices $\gamma^\alpha$, $\gamma \in \{x, y, z\}$, $\alpha \in \{r, l\}$, which we have already introduced, as well as the vertices $\{s_j, t_j, e_j^i \mid i \in [3]\}$. For notational simplicity, we define $V_j^{3A} = \{s_j, t_j, e_j^i, \gamma^\alpha \mid i \in [3], \gamma \in \{x, y, z\}, \alpha \in \{r, l\}\}$. All directed non-forced edges are simple. The vertices $s_j$ and $t_j$ are connected to $s$ by forced edges with $w((s, s_j)_F) = w((t_j, s)_F) = \lambda$, where $\lambda > 0$ is a small fixed constant. To simplify things, we also force them to be used in the displayed direction by deleting the edges that make up the path of the opposite direction. This is the whole description of the graph $G_A$.

6.2 Assignment to Tour

We are going to construct a tour in $G_A$ given an assignment to the variables of $I_2$ and prove the following lemma.

Lemma 3. Given an instance $I_2$ of the Hybrid problem with $\nu$ bi-wheels and an assignment that leaves $k$ equations in $I_2$ unsatisfied, there exists a tour in $G_A$ with cost at most $37m + 5\nu + 2m\lambda + 2\nu\lambda + k$.

Before we proceed, let us again give a definition for a local edge cost function. Let $G$ be an edge-weighted digraph and $E_T$ a multi-set of edges of $E(G)$ that defines a tour. Consider a set $V' \subseteq V(G)$. The local edge cost of the set $V'$ is then defined as

$$c_T(V') = \sum_{u \in V'} \; \sum_{(u, v) \in E_T} w\big((u, v)\big).$$

In words, for each vertex in $V'$ we count the total weight of its outgoing edges used in the quasi-tour (including multiplicities). Thus, this sum contains the full weight for edges with their source in $V'$, regardless of where their other endpoint is.
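The directed definition is simple enough to state in code. The Python sketch below is an illustration under stated assumptions: arcs are triples `(source, target, weight)`, each forced edge is collapsed into a single arc carrying its full weight, the vertex names are hypothetical shorthands for the gadget vertices, and `LAM = 0.25` is an arbitrary stand-in for the constant $\lambda$. The example arcs mirror the traversal described in the proof of Lemma 3 for the case $x + y + z = 1$ with $x = 1$.

```python
def directed_local_edge_cost(tour_arcs, subset):
    """c_T(V'): total weight of used arcs whose source lies in V'."""
    subset = set(subset)
    return sum(w for u, _, w in tour_arcs if u in subset)

LAM = 0.25  # assumed value for the small constant lambda

# s ->F sj -> e2 -> yl ->F yr -> e3 -> zl ->F zr -> e1 -> tj ->F s,
# plus the forced contact edge xr ->F xl and the simple arc leaving xl
# (both used by the bi-wheel cycle because x is set to 1).
arcs = [
    ("s", "sj", LAM), ("sj", "e2", 1), ("e2", "yl", 1), ("yl", "yr", 1),
    ("yr", "e3", 1), ("e3", "zl", 1), ("zl", "zr", 1), ("zr", "e1", 1),
    ("e1", "tj", 1), ("tj", "s", LAM),
    ("xr", "xl", 1), ("xl", "next_checker", 1),
]
gadget = {"sj", "tj", "e1", "e2", "e3", "xl", "xr", "yl", "yr", "zl", "zr"}
gadget_cost = directed_local_edge_cost(arcs, gadget)
```

The arc leaving $s$ is charged to $\{s\}$ rather than to the gadget, so the gadget pays $\lambda$ only once, for the arc from $t_j$ back to $s$.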
Also note that again for two sets $V_1, V_2$ we have $c_T(V_1 \cup V_2) \le c_T(V_1) + c_T(V_2)$ (with equality for disjoint sets) and that $c_T(V) = \sum_{e \in E_T} w(e)$.

Proof of Lemma 3. Let $W_p$ be a bi-wheel with variables $\{x_i^u, x_i^n\}_{i=1}^{z}$. Given an assignment to the variables of $I_2$, due to Theorem 3, we may assume that either $x_i^u = 1 \neq x_j^n$ for all $i, j \in [z]$ or $x_i^u = 0 \neq x_j^n$ for all $i, j \in [z]$. We traverse the cycle $C_u^p$ if $x_1^u = 1$ and the cycle $C_n^p$ otherwise. This creates $\nu$ strongly connected components. Each contains all the checkers of a bi-wheel and the contacts from one side.

For each matching edge gadget, the local edge cost is 3. We pay 2 for the forced edge and 1 for the outgoing simple edge. We will account for the cost of edges incident on contacts when we analyze the size-three equation gadget below.

Let us describe the part of the tour traversing the graph $G_j^{3A}$, which simulates $x \oplus y \oplus z = 1$. Recall that if $x$ is set to true in the assignment, we have traversed the bi-wheel gadgets in such a way that the forced edge $x^r \to_F x^l$ is used, and the simple edge coming out of $x^l$ is used. According to the assignment to $x$, $y$ and $z$, we traverse $G_j^{3A}$ as follows:

Case ($x + y + z = 1$): Let us assume that $z = y = 0 \neq x$ holds. Then, we use $s \to_F s_j \to e_j^2 \to y^l \to_F y^r \to e_j^3 \to z^l \to_F z^r \to e_j^1 \to t_j \to_F s$. The cost is $3 + \lambda$ for the forced edges, 6 for the simple edges inside the gadget, plus 1 for the simple edge going out of $x^l$. Total local edge cost: $c_T(V_j^{3A}) = 10 + \lambda$.

Case ($x + y + z = 3$): Then, we use $s \to_F s_j \to e_j^2 \to e_j^1 \to e_j^3 \to t_j \to_F s$. Again we pay $3 + \lambda$ for the forced edges, 4 for the simple edges inside the gadget and 3 for the outgoing edges incident on $x^l, y^l, z^l$. Total local edge cost: $c_T(V_j^{3A}) = 10 + \lambda$.

Case ($x + y + z = 2$): Let us assume that $x = y = 1 \neq z$ holds. Then, we use $s \to_F s_j \to e_j^3 \to z^l \to_F z^r \to e_j^1 \to e_j^3 \to e_j^2 \to t_j \to_F s$ with total local edge cost $c_T(V_j^{3A}) = 11 + \lambda$.
Case ($x + y + z = 0$): We use $s \to_F s_j \to e_j^2 \to y^l \to_F y^r \to e_j^3 \to z^l \to_F z^r \to e_j^1 \to x^l \to_F x^r \to e_j^2 \to t_j \to_F s$ with $c_T(V_j^{3A}) = 11 + \lambda$.

The total edge cost of the quasi-tour we constructed is $3 \cdot 9m + (10 + 2\lambda)m + k = 37m + 2\lambda m + k$. We have at most $\nu + 1$ strongly connected components: one for each bi-wheel and one containing $s$. A component representing a bi-wheel can be connected to $s$ as follows: let $x^l, x^r$ be two contact vertices in the component. Add one copy of each edge from the cycle $s \to_F s_j \to e_j^1 \to x^l \to_F x^r \to e_j^2 \to t_j \to_F s$. This increases the cost by $5 + 2\lambda$ but decreases the number of components by one.

6.3 Tour to Assignment

In this section, we are going to prove the other direction of the reduction.

Lemma 4. If there is a tour with cost $37 \cdot m + k + 2\lambda \cdot m$, then there is an assignment that leaves at most $k$ equations unsatisfied.

Proof. Given a tour $E_T$ in $G_A$, we are going to define an assignment to checker and contact variables. As in Lemma 2, we will show that any tour must locally spend on each gadget at least the same amount as the tour we constructed in Lemma 3. If the tour spends more, we use that credit to pay for possible unsatisfied equations.

Assignment for Checker Variables

Let us consider the following equations with two variables: $x_i^u \oplus x_{i+1}^u = 0$, $x_{i-1}^u \oplus x_i^u = 0$, $x_i^u \oplus x_j^n = 1$, $x_j^n \oplus x_{j+1}^n = 0$, $x_{j-1}^n \oplus x_j^n = 0$, and the corresponding situation displayed in Figure 2 (b). Since $E_T$ is a valid tour in $G_A$, we know that $\{x_i^u, x_j^n\}_F$ is traversed and, due to the degree condition, for each $x \in \{x_i^u, x_j^n\}$, the tour uses another edge $e$ incident on $x$ with $w(e) \ge 1$. Therefore, we have that $c_T(\{x_i^u, x_j^n\}) \ge 3$. The credit assigned to a gadget is defined as $cr_T(\{x_i^u, x_j^n\}) = c_T(\{x_i^u, x_j^n\}) - 3$.

Let us define the assignment for $x_i^u$ and $x_j^n$. A variable $x_i^u$ is honestly traversed if either both the simple edge going into $x_i^u$ is used and the simple edge coming out of $x_j^n$ is used, or neither of these two edges is used.
In the first case, we set $x_i^u$ to 1, otherwise to 0. Similarly, $x_j^n$ is honest if either both the edge going into $x_j^n$ and the edge out of $x_i^u$ are used, or neither of them is used, and we set it to 1 in the first case and 0 otherwise.

Honest tours: First, suppose that both $x_i^u$ and $x_j^n$ are honest. We need to show that the credit is at least as high as the number of unsatisfied equations out of the five equations that contain them. It is not hard to see that if we have set $x_i^u \neq x_j^n$, all equations are satisfied. If we have set both to 1, then the forced edge must be used twice, making the local edge cost at least 6, giving a credit of 3, which is more than sufficient.

Dishonest tours: If both $x_i^u$ and $x_j^n$ are dishonest, the tour must be using the forced edge in both directions. Thus, the local cost is 5 or more, giving a credit of 2. There is always an assignment that satisfies three out of the five equations, so this case is done. If one of them is dishonest, the other must be set to 1 to ensure strong connectivity. Thus, there are two simple edges used leaving the gadget, making the local cost 4 (perhaps the same edge is used twice). We can set the honest variable to 1 (satisfying its two cycle equations), and the other to 0, leaving at most one equation unsatisfied.

Assignment for Contact Variables

First, we note that for any valid tour, we have $c_T(V_j^{3A}) \ge 10 + \lambda$. This is because the two forced edges of weight $\lambda$ must be used, and there exist 10 vertices in the gadget for which all outgoing edges have weight 1. Let us define the credit $cr_T(V_j^{3A}) = c_T(V_j^{3A}) - (10 + \lambda)$.

Honest Traversals: We assume that the underlying tour is honest, that is, forced edges are traversed only in one direction. We set $x$ to 1 if the forced edge is used in the direction $x^r \to_F x^l$ and 0 otherwise. In the first case we know that the simple edges going into $x^r$ and out of $x^l$ are used. In the second, the edges $e_j^1 \to x^l$ and $x^r \to e_j^2$ are used. We do similarly for $y, z$.
We are interested in the equation $x \oplus y \oplus z = 1$ and the six cycle equations involving $x, y, z$. The assignment we pick for honest variables satisfies the cycle equations, so if it also satisfies the size-three equation we are done. If not, we have to prove that the tour pays at least $11 + \lambda$.

Case ($x = y = z = 0$): Due to our assumption, we know that $e_j^2 \to y^l \to_F y^r \to e_j^3 \to z^l \to_F z^r \to e_j^1 \to x^l \to_F x^r \to e_j^2$ is a part of the tour. Since $E_T$ is a tour, there exists a vertex in $V_j^{3A} \setminus \{s_j, t_j\}$ that is visited twice and we get $c_T(V_j^{3A}) \ge 11 + \lambda$. Thus, we can spend the credit $cr_T(V_j^{3A}) \ge 1$ on the unsatisfied equation $x \oplus y \oplus z = 1$.

Case ($x + y + z = 2$): Without loss of generality, let us assume that $x = y = 1 \neq z$ holds. Then, we know that $e_j^3 \to z^l \to_F z^r \to e_j^1$ is a part of the tour. But this implies that there is a vertex in $V(G_j^{3A})$ that is visited twice. Hence, we have that $cr_T(V_j^{3A}) \ge 1$.

Dishonest Traversals: Consider the situation in which some forced edges $\{\gamma^r, \gamma^l\}_F$ are traversed in both directions for some variables $\gamma \in \{x, y, z\}$. For the honest variables, we set them to the appropriate value as before, and this satisfies their cycle equations. Observe now that if a forced edge $\gamma^l \to_F \gamma^r$ is also used in the opposite direction, then there must be another edge used to leave the set $\{\gamma^l, \gamma^r\}$. Thus the local edge cost of this set is at least 3. It follows that the credit we have for the gadget is at least as large as the number of dishonest variables. We can give appropriate values to them so each satisfies one cycle equation and the size-three equation is satisfied. Thus, the number of unsatisfied equations is not larger than our credit.

In summary, for every tour $E_T$ in $G_A$, we can find an assignment to the variables of $I_2$ such that all unsatisfied equations are paid by the credit induced by $E_T$.

We are ready to give the proof of Theorem 5.

Proof of Theorem 5.
We are again given an instance $I_1$ of the MAX-E3LIN2 problem with $\nu$ variables and $m$ equations. For all $\delta > 0$, there exists a $k$ such that if we repeat each equation $k$ times we get an instance $I_1^{(k)}$ with $m' = km$ equations and $\nu$ variables such that $\nu/m' \le \delta$. Then, from $I_1^{(k)}$, we generate an instance $I_2$ of the Hybrid problem and the corresponding directed graph $G_A$. Due to Lemmata 3, 4 and Theorem 3, we know that for all $\epsilon > 0$, it is NP-hard to tell whether there is a tour with cost at most $37m' + 5\nu + 2\lambda(m' + \nu) + \epsilon \cdot m' \le 37 \cdot m' + \epsilon' m'$ or all tours have cost at least $37m' + (0.5 - \epsilon) m' \ge 37.5 \cdot m' - \epsilon' \cdot m'$, for some $\epsilon'$ depending only on $\epsilon, \delta, \lambda$. The ratio between these two cases can get arbitrarily close to $37.5/37 = 75/74$ by appropriate choices for $\epsilon, \delta, \lambda$.

7 Concluding Remarks

In this paper, we proved that it is hard to approximate the ATSP and the TSP within any constant factor less than 75/74 and 123/122, respectively. The proof method required ideas and constructions essentially different from the ones used before in that context. Since the best known upper bound on the approximability is $O(\log n / \log\log n)$ for ATSP and 3/2 for TSP, there is certainly room for improvements. Especially in the asymmetric version of the TSP, there is a large gap between the approximation lower and upper bound, and the existence of an efficient constant factor approximation algorithm for that problem remains a major open problem. Furthermore, it would be nice to investigate if some of the ideas of this paper, and in particular the bi-wheel amplifiers, can be used to offer improved hardness results for other optimization problems, such as the Steiner Tree problem.

References

[ALM+98] S. Arora, C. Lund, R. Motwani, M. Sudan and M. Szegedy, Proof Verification and the Hardness of Approximation Problems, J. ACM 45, pp. 501–555, 1998.

[AGM+10] A. Asadpour, M. Goemans, A. Madry, S. Oveis Gharan and A.
Saberi, An O(log n/ log log n)-Approximation Algorithm for the Asymmetric Traveling Salesman Problem, In Proc. 21st SODA (2010), pp. 379–389, 2010.

[BK99] P. Berman and M. Karpinski, On Some Tighter Inapproximability Results, In Proc. 26th ICALP (1999), Springer, LNCS 1644, pp. 200–209, 1999.

[BK01] P. Berman and M. Karpinski, Efficient Amplifiers and Bounded Degree Optimization, ECCC TR01-053, 2001.

[BK03] P. Berman and M. Karpinski, Improved Approximation Lower Bounds on Small Occurrence Optimization, ECCC TR03-008, 2003.

[BK06] P. Berman and M. Karpinski, 8/7-approximation algorithm for (1, 2)-TSP, In Proc. 17th SODA (2006), pp. 641–648, 2006.

[B04] M. Bläser, A 3/4-Approximation Algorithm for Maximum ATSP with Weights Zero and One, In Proc. 7th APPROX (2004), Springer, LNCS 3122, pp. 61–71, 2004.

[BS00] H.-J. Böckenhauer and S. Seibert, Improved Lower Bounds on the Approximability of the Traveling Salesman Problem, Theor. Inform. Appl. 34, pp. 213–255, 2000.

[C76] N. Christofides, Worst-Case Analysis of a New Heuristic for the Traveling Salesman Problem, Technical Report CS-93-13, Carnegie Mellon University, Pittsburgh, 1976.

[E03] L. Engebretsen, An Explicit Lower Bound for TSP with Distances One and Two, Algorithmica 35, pp. 301–318, 2003.

[EK06] L. Engebretsen and M. Karpinski, TSP with Bounded Metrics, J. Comput. Syst. Sci. 72, pp. 509–546, 2006.

[H01] J. Håstad, Some Optimal Inapproximability Results, J. ACM 48, pp. 798–859, 2001.

[KS12] M. Karpinski and R. Schmied, On Approximation Lower Bounds for TSP with Bounded Metrics, CoRR arXiv: abs/1201.5821, 2012.

[KS13] M. Karpinski and R. Schmied, On Improved Inapproximability Results for the Shortest Superstring and Related Problems, In Proc. 19th CATS (2013), CRPIT 141, pp. 27–36, 2013.

[L12] M. Lampis, Improved Inapproximability for TSP, In Proc. 15th APPROX (2012), Springer, LNCS 7408, pp. 243–253, 2012.

[MS11] T. Mömke and O. Svensson, Approximating Graphic TSP by Matchings, In Proc.
IEEE 52nd FOCS (2011), pp. 560–569.

[M12] M. Mucha, 13/9-Approximation for Graphic TSP, In Proc. STACS (2012), volume 14 of LIPIcs, pp. 30–41, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2012.

[OSS11] S. Oveis Gharan, A. Saberi and M. Singh, A Randomized Rounding Approach to the Traveling Salesman Problem, In Proc. IEEE 52nd FOCS (2011), pp. 550–559.

[PV00] C. Papadimitriou and S. Vempala, On the Approximability of the Traveling Salesman Problem, In Proc. 32nd ACM STOC (2000), pp. 126–133, 2000; see also a corrected version in Combinatorica 26, pp. 101–120, 2006.

[PY93] C. Papadimitriou and M. Yannakakis, The Traveling Salesman Problem with Distances One and Two, Math. Oper. Res. 18, pp. 1–11, 1993.

[SV12] A. Sebö and J. Vygen, Shorter Tours by Nicer Ears, CoRR arXiv: abs/1201.1870, 2012; to appear in Combinatorica.