0% found this document useful (0 votes)

34 views15 pages

Minimum Consistent Subset in Trees and Interval Graphs

Aritra Banik, Sayani Das , Anil Maheshwari, Bubai Manna, Subhas C. Nandy, Krishna Priya K. M., Bodhayan Roy, Sasanka Roy, Abhishek Sahu

Uploaded by

Sayani Das

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

34 views15 pages

Minimum Consistent Subset in Trees and Interval Graphs

Aritra Banik, Sayani Das , Anil Maheshwari, Bubai Manna, Subhas C. Nandy, Krishna Priya K. M., Bodhayan Roy, Sasanka Roy, Abhishek Sahu

Uploaded by

Sayani Das

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

Minimum Consistent Subset in Trees and Interval

Graphs
Aritra Banik # Sayani Das #
National Institute of Science, Education and Theoretical Computer Science,
Research, An OCC of Homi Bhabha National The Institute of Mathematical Sciences,
Institute, Bhubaneswar, India Chennai, India

Anil Maheshwari # Bubai Manna #

School of Computer Science, Carleton University, Department of Mathematics, Indian Institute of
Ottawa, Canada Technology Kharagpur, India
Subhas C. Nandy # Krishna Priya K. M. #
Advanced Computing and Microelectronics Unit, National Institute of Science, Education and
Indian Statistical Institute, Kolkata, India Research, An OCC of Homi Bhabha National
Institute, Bhubaneswar, India
Bodhayan Roy # Sasanka Roy #
Department of Mathematics, Indian Institute of Advanced Computing and Microelectronics Unit,
Technology Kharagpur, India Indian Statistical Institute, Kolkata, India
Abhishek Sahu #
National Institute of Science, Education and
Research, An OCC of Homi Bhabha National
Institute, Bhubaneswar, India

Abstract
In the Minimum Consistent Subset (MCS) problem, we are presented with a connected simple
undirected graph G, consisting of a vertex set V (G) of size n and an edge set E(G). Each vertex in
V (G) is assigned a color from the set {1, 2, . . . , c}. The objective is to determine a subset V ′ ⊆ V (G)
with minimum possible cardinality, such that for every vertex v ∈ V (G), at least one of its nearest
neighbors in V ′ (measured in terms of the hop distance) shares the same color as v. The decision
problem, indicating whether there exists a subset V ′ of cardinality at most l for some positive integer
l, is known to be NP-complete even for planar graphs.
In this paper, we establish that the MCS problem is NP-complete on trees. We also provide
a fixed-parameter tractable (FPT) algorithm for MCS on trees parameterized by the number of
colors (c) running in O(26c n6 ) time, significantly improving the currently best-known algorithm
whose running time is O(24c n2c+3 ). In an effort to comprehensively understand the computational
complexity of the MCS problem across different graph classes, we extend our investigation to interval
graphs. We show that it remains NP-complete for interval graphs, thus enriching graph classes where
MCS remains intractable.
2012 ACM Subject Classification Theory of computation → Fixed parameter tractability; Theory
of computation → Graph algorithms analysis; Theory of computation → Dynamic graph algorithms
Keywords and phrases Nearest-Neighbor Classification, Minimum Consistent Subset, Trees, Interval
Graphs, Parameterized complexity, NP-complete
Digital Object Identifier 10.4230/LIPIcs.FSTTCS.2024.7
Related Version Full Version: https://arxiv.org/abs/2404.15487

Funding Aritra Banik: Supported by the Science and Engineering Research Board (SERB) via the
project MTR/2022/000253.
Anil Maheshwari: Supported by Natural Sciences and Engineering Research Council of Canada
(NSERC)
Bodhayan Roy: Supported by the Science and Engineering Research Board (SERB) via the project
MTR/2021/000474.
© Aritra Banik, Sayani Das, Anil Maheshwari, Bubai Manna, Subhas C. Nandy, Krishna Priya K. M.,
Bodhayan Roy, Sasanka Roy, and Abhishek Sahu;
licensed under Creative Commons License CC-BY 4.0
44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science
(FSTTCS 2024).
Editors: Siddharth Barman and Sławomir Lasota; Article No. 7; pp. 7:1–7:15
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
7:2 Minimum Consistent Subset in Trees and Interval Graphs

1 Introduction
For many supervised learning algorithms, the input comprises a colored training dataset T
in a metric space (X , d) where each element t ∈ T is assigned a color C(t) from the set of
colors [c]. The objective is to preprocess T in a manner that enables rapid assignment of
color to any uncolored element in X , satisfying specific optimality criteria. One commonly
used optimality criterion is the nearest neighbor rule, where each uncolored element x is
assigned a color based on the colors of its k nearest neighbors in the training dataset T
(where k is a fixed integer). The efficiency of such an algorithm relies on the size of the
training dataset. Therefore, it is crucial to reduce the size of the training set while preserving
distance information. This concept was formalized by Hart [11] in 1968 as the minimum
consistent subset (MCS) problem. In this problem, given a colored training dataset T , the
objective is to find a subset S ⊆ T of minimum cardinality such that for every point t ∈ T ,
the color of t is the same as the color of one of its nearest neighbors in S. Since its inception,
the MCS problem has found numerous applications, as can be judged by over 2800 citations
to [11] in Google Scholar.
The MCS problem for points in ℜ2 under the Euclidean norm is shown to be NP-complete
if at least three colors color the input points. Furthermore, it remains NP-complete even for
two colors [15, 12]. Recently, it has been shown that the MCS problem is W[1]-hard when
parameterized by the output size [5]. For some algorithmic results on the MCS problem in
ℜ2 , see [3, 15].
In this paper, we explore the minimum consistent subset problem when (X , d) is a graph
metric. Without loss of generality, we will use [n] to denote the set of integers {1, . . . , n}.
For any graph G, we denote the set of vertices of G by V (G) and the set of edges by E(G).
Consider any graph G and an arbitrary vertex coloring function C : V (G) → [c]. For a subset
of vertices U , let C(U ) represent the set of colors of the vertices in U , formally denoted
as C(U ) = {C(u) : u ∈ U }. For any two vertices u, v ∈ V (G), the shortest path distance
between u and v in G is denoted by d(u, v). For a vertex v ∈ V (G) and any subset of
vertices U ⊆ V (G), let d(v, U ) = minu∈U d(v, u). The nearest neighbors of v in the set U
are denoted as NN(v, U ), formally defined as NN(v, U ) = {u ∈ U : d(v, u) = d(v, U )}. A
subset of vertices S ⊆ V (G), is called a consistent subset for (G, C) if for every v ∈ V (G),
C(v) ∈ C(NN(v, S)). The consistent subset problem on graphs is defined as follows:

Consistent Subset Problem on Graphs(CSPG)

Input: A graph G, a coloring function C : V (G) → [c], and an integer ℓ.

Question: Does there exist a consistent subset of size ≤ ℓ for (G, C)?

A consistent subset of minimum size is called a minimum consistent subset (MCS).

Banerjee et al. [2] proved that the CSPG is W[2]-hard [6] when parameterized by the size of
the minimum consistent set, even when limited to two colors, thus refuting the possibility of
an FPT algorithm parameterized by (c + ℓ) under standard complexity-theoretic assumptions
for general graphs. This naturally raises the question of determining the simplest graph
classes where the problem remains computationally intractable. Dey et al. [8] presented
a polynomial-time algorithm for CSPG on simple graph classes such as paths, spiders,
combs, and caterpillars. The CSPG has gained significant research attention in recent years
when the underlying graph is a tree. Dey et al. [7] presented a polynomial-time algorithm
for bi-colored trees, and Arimura et al. [1] presented an XP algorithm parameterized by
the number of colors c, with a running time of O(24c n2c+3 ). Very recently, the paper [4]
demonstrated a polynomial time algorithm for the minimum consistent spanning subset (a
variant of MCS) in trees.
A. Banik et al. 7:3

New Results. The most intriguing question yet to be answered is whether CSPG remains
NP-hard for trees [1, 7]. In this paper, we decisively answer this question in the affirmative.
This is particularly noteworthy given the scarcity of naturally occurring problems known to
be NP-hard on trees. Our contribution includes a reduction from the MAX-2SAT problem,
detailed in Section 3. Next, we show that CSPG is fixed-parameter tractable (FPT) for trees
on n vertices, significantly improving the results presented in Arimura et al. [1]. Our intricate
dynamic programming algorithm runs in O(26c n6 ) time, whereas [1] requires O(24c n2c+3 )
time; see Section 4.
Moreover, in Section 5, we show that CSPG on interval graphs is NP-hard. While interval
graphs are unrelated to trees, our hardness result for interval graphs raises new questions
about the fixed-parameter tractability of CSPG on this distinct graph class.

2 Notation and Preliminary Results

Notations. For any graph G and any vertex v ∈ V (G), let N (v) = {u : u ∈ V (G), (u, v) ∈
E(G)} denotes the set of neighbours of v and N [v] = {v} ∪ N (v). We denote the distance
between two subgraphs G1 and G2 in G by d(G1 , G2 ) = min{d(v1 , v2 ) : v1 ∈ V (G1 ), v2 ∈
V (G2 )}. For any subset of vertices U ⊆ V (G) in a graph G, G[U ] denotes the subgraph of
G induced on U . Most of the symbols and notations of graph theory used are standard and
taken from [9].
As an elementary result, we prove that MCS is log-APX-hard [14]. We reduce the
dominating set problem to the consistent subset problem (CSPG). In the dominating set
problem given a graph G and an integer k the objective is to find out whether there exists a
subset U ⊆ V (G) of size k such that for any vertex v ∈ V (G), N [v] ∩ U ̸= ∅. It is known that
the set cover problem is log-APX-hard, or in other words, it is NP-hard to approximate the
set cover problem within a c · log n factor [13]. As there exists an L-reduction from set cover
to dominating set problem, the dominating set problem is known to be log-APX-hard. Let
(G, k) be any arbitrary instance of the dominating set problem with a graph G and an integer
k. We construct an instance (H, C, k + 1) of CSPG as follows. V (H) = V (G) ∪ {x} and
E(H) = E(G) ∪ {(x, v) : v ∈ V (G)}. For all v ∈ V (G), we set C(v) = 1, and set C(x) = 2.
For the sake of completeness, we state the following lemma.

▶ Lemma 1. There exists a dominating set for G of size at most k if and only if there is a
consistent subset of size at most k + 1 for the graph H.

Proof. Let D be a dominating set of size k for the graph G. We claim that D′ = {x} ∪ D
is a consistent subset of H. If not, then there is a vertex vi ∈ V (H) \ D′ such that
d(vi , D) > d(vi , x) = 1. This contradicts the assumption that D is a dominating set for G
and the claim holds.
On the other hand, suppose D′ is a consistent subset of size k + 1 in the graph H.
Observe that x ∈ D′ as x is the only vertex with color 2. We claim that D = D′ \ {x} is
a dominating set of G. If not, then there is a vertex v ∈ V (G)\D such that N (v) ∩ D = ∅.
Thus d(v, D′ ) > 1 but d(v, x) = 1 and C(v) ̸= C(x). This contradicts the assumption that C
is a consistent subset and hence the claim holds. ◀

From Lemma 1, we have the following theorem.

▶ Theorem 2. There exists a constant c > 0 such that it is NP-hard to approximate the
Minimum Consistent Set problem within a factor of c · log n.

FSTTCS 2024
7:4 Minimum Consistent Subset in Trees and Interval Graphs

3 NP-hardness of MCS for Trees

In this section, we prove that CSPG is NP-hard even when the input graph is a tree. We
present a reduction from MAX-2SAT problem to CSPG. Let θ be a given MAX-2SAT
formula with n variables {x1 , . . . , xn } and m clauses {c1 , . . . , cm }, n, m ≥ 50. We construct
an instance (Tθ , Cθ ) of the MCS problem from θ as follows.

Interval Graph Construction.

Construction of (Tθ , Cθ ).
The constructed tree Tθ is composed of variable gadgets, clause gadgets, and central
vertex gadgets.
Variable Gadget.
A variable gadget Xi for the variable xi ∈ θ has two components where each component
has a literal path and M pairs of stabilizer vertices, as described below (see Figure 1),
where M is very large (we will define the exact value of M later).
Literal paths: The two literal paths of the variable gadget Xi are Piℓ = ⟨x1i , x2i , x3i , x4i ⟩
ℓ
and P i = ⟨x1i , x2i , x3i , x4i ⟩, each consisting of four vertices; these are referred to as
positive literal path, and negative literal path respectively. Here, by a path of k (> 2)
vertices, we mean a connected graph with k − 2 nodes having degree 2 and the
remaining two nodes having degree 1. All the vertices on the path Piℓ are of color cℓi
ℓ
and all the vertices on the path P i are of color cℓi .
M
Stabilizer vertices: M pairs of vertices {s1i , s1i }, . . . , {sM i , si }, where the color of
j j
each pair of vertices {si , si } is cs (i, j). We denote the set of vertices Si = {s1i , . . . , sM
i }
as positive stabilizer vertices and the set of vertices S i = {s1i , . . . , sM i } as negative
stabilizer vertices. Each vertex in Si is connected to x1i and each vertex in S i is
connected to x1i .
The intuition behind this set of stabilizer vertices is that by setting a large value of
1 M
M we ensure that either {s1i , . . . , sM i } or {si , . . . , si } is present in any “small” sized
solution.
Clause Gadget.
For each clause ci = (yi ∨ zi ), where yi and zi are two (positive/negative) literals,
we define the clause gadget Ci as follows. It consists of three paths, namely left
occurrence path PiY = ⟨yi1 , · · · yi7 ⟩, right occurrence path PiZ = ⟨zi1 , · · · zi7 ⟩, and clause
path PiW = ⟨wi1 , · · · wi7 ⟩ (see Figure 1). All the vertices in PiY (resp. PiZ ) have the
same color as that of the corresponding literal path in their respective variable gadgets,
i.e. for any literal, say yi in Ci , if yi = xi (resp. xi ) then we set the color of the
vertices of PiY as cxi (resp. cxi ). The color of the vertices on the path PiZ is set in the
same manner. We color the vertices in PiW by cw i , which is different from that of the
vertices in PiY and PiZ . We create the clause gadget Ci by connecting yi1 with wi2 and
zi1 with wi6 by an edge (see Figure 1).
Central Vertex Gadget.
We also have a central path P v = ⟨v1 , v2 , v3 ⟩. The color of all the vertices in P v is the
same, say cv , which is different from the color of all other vertices in the construction.
For each variable gadget Xi , x1i and x1i are connected with the vertex v1 (see Figure 1).
For each clause Ci , wi4 is connected with v1 . The color of the vertices of P v is cv .
A. Banik et al. 7:5

sM
1 sM
1 sM
3 sM
3
x1 = false x2 = false x3 = false
S1 S3
x41 x41 x42 x42 x43 x43
s11 s11 s13 s13
x11 x11 x12 x12 x13 x13

v3
v1
P1W P3W
w11 wi4 w17 w31 w34 w37

y17 z17 y37 z37

y11 z11 y31 z31

(x1 ∨ x2) (x1 ∨ x3) (x2 ∨ x3)

Figure 1 An example of construction of (Tθ , Cθ ) where θ = (x1 ∨ x2 ) ∧ (x1 ∨ x3 ) ∧ (x2 ∨ x3 ). For

the assignment x1 = x2 = x3 = false, we have shown the corresponding CS with a red box around
the vertices. For the assignment x1 = x2 = x3 = false, (x1 ∨ x2 ) is not satisfied whereas the rest of
the clauses are satisfied.

Our objective is to show that there exists an MCS of size at most N (k) = n(M + 2) +
2k + 3(m − k) + 1 in the tree Tθ if at least k clauses of θ are satisfied; otherwise, the size
is strictly greater than N (k). We now prove a set of auxiliary claims about a minimum
consistent subset for (Tθ , Cθ ).

▶ Lemma 3. For any consistent subset VC of size at most N (k) = n(M +2)+2k +3(m−k)+1
in the tree Tθ , the following are true.
For any variable xi , exactly one of the following is true.
Si ⊂ VC , S i ∩ VC = ∅, and x2i , x4i ∈ VC .
S i ⊂ VC , Si ∩ VC = ∅, and x2i , x4i ∈ VC
, and
v3 ∈ V C .

Lemma 3, states that by strategically choosing the vertices in a variable gadget, the
vertices of the tree corresponding to that variable gadget can be consistently covered by
choosing its only one set of stabilizer vertices.

̸ ∅. Let sji ∈ Si ∩VC . But then every vertex v ∈ Si \{sji }

Proof. For a variable xi , let Si ∩VC =
must have a vertex within distance 2 of its own color, since dist(sji , v) = 2. Hence Si ⊂ VC .
One can similarly prove that S i ∩ VC ̸= ∅ =⇒ S i ⊆ VC . Also, every variable gadget contains
M uniquely colored vertices and hence has at least M vertices in VC . So, if M >> n, we
have that N (k) < (n + 1)M , and there exists no variable gadget that contains vertices from
both Si and S i . In other words exactly one of the following holds for every variable gadget
corresponding to a variable xi :
Si ⊂ VC , S i ∩ VC = ∅
S i ⊂ VC , Si ∩ VC = ∅,

FSTTCS 2024
7:6 Minimum Consistent Subset in Trees and Interval Graphs

Below, we look into one of these cases, and a similar argument may be made for the other
case as well.

Case 1: Si ⊂ VC , S i ∩ VC = ∅. Notice that there must be a vertex in the literal path

{x1i , x2i , x3i , x4i } from {x1i , x2i } of the variable gadget Xi since dist(Si , x1i ) = 1 and the distance
to any other vertex of the same color (other than these two vertices) is more than 1. But
x1i ∈/ VC , as S i ∩ VC = ∅ and dist(x1i , S i ) < dist(Si , S i ). Hence x2i ∈ VC .
Similarly there must be a vertex in the literal path {x1i , x2i , x3i , x4i } of the variable gadget
Xi since dist(Si , x1i ) = 3 and the distance to any other vertex of the same color (other
than {x2i , x3i , x4i }) is more than 3. And the distance of 4 between Si and Si eliminates the
possibility of belonging any of x1i , x2i or x3i belonging in VC . Thus, x4i ∈ VC .

Case 2: S i ⊂ VC , Si ∩ VC = ∅. Case 2 may be argued in a manner similar to Case 1.

Moreover, VC must contain at least one veretx from the set {v1 , v2 , v3 }. However, the
distance of 4 between Si and Si rules out the possibility of either v1 or v2 being in VC .
Consequently, v3 ∈ VC . ◀

To satisfy the inequality in the above lemma, we now set the value of M as n3 . In the
next lemma, we present a bound on the vertices from each clause gadget that are contained
in a consistent subset of size at most N (k). For any clause Ci , denote the corresponding
clause gadget by TiC = G[{wia , yia , zia |1 ≤ a ≤ 7}].

▶ Lemma 4. In any consistent subset VC of the tree Tθ , for each clause Ci , 2 ≤ |V (TiC )∩VC |.

Proof. There needs to be a vertex among the vertices {wia | 1 ≤ a ≤ 7} since they are
distinctly colored from all other vertices. If this vertex belongs to {wia | 1 ≤ a ≤ 4}, then
there must also be a vertex in {yia | 1 ≤ a ≤ 7} since the nearest vertex of the same color
(any yia ) is farther away than the vertex with the color of any wia . Similarly, if this vertex
belongs to {wia | 4 ≤ a ≤ 7}, then there must be a vertex in {zia | 1 ≤ a ≤ 7}. Therefore,
2 ≤ |V (TiC ) ∩ VC |. ◀

▶ Theorem 5. There exists a truth assignment of the variables in θ which satisfies at least k
clauses if and only if there exists a consistent subset of size at most N (k) for (Tθ , Cθ ).

Proof. (⇒) For the forward direction, let there exist an assignment A to the variables of θ
that satisfies k clauses. Consider the following set of vertices VA . For each variable xi if xi is
true, include all the vertices of Si in VA . Also include x2i and x4i in VA . If xi = false then
include all the vertices of S i in VA . Also include x4i and x2i in VA .
For every satisfied clause Ci = (yi ∨ zi ) with respect to A we include the following vertices
in VA . Without loss of generality assume that yi = true. We include wi7 and zi1 in VA . For
every unsatisfied clause Ci = (yi ∨ zi ), we include wi1 , yi1 and zi7 in VA . We also include v3
in VA .
Observe that the cardinality of VA is N (k) = n(M + 2) + 2k + 3(m − k) + 1. Next, we
prove that VA is a consistent subset for Tθ . Observe that for any pair of vertices (sji , sji ),
exactly one of them is in VA . Without loss of generality assume that sji ∈ VA . Observe that
d(sji , sji ) = d(sji , VA ) = 4. If xi = true then d(xji , x2i ) ≤ d(xji , VA ) and d(xji , x4i ) ≤ d(xji , VA ).
The case when xi = false is symmetric.
For any clause gadget either wi1 or wi7 is in VA . Without loss of generality assume that
wi1 ∈ VA . Observe that for every vertex wij , d(wij , wi1 ) = d(wij , VA ). Let Ci = (yi ∨ zi ) be
a satisfied clause and without loss of generality assume that yi = xj = true. Observe
A. Banik et al. 7:7

that d(yi1 , x2j ) = 6, and d(yi3 , VA ) = 6, and d(yia , x2j ) = d(yia , VA ). For any unsatisfied
clause Ci = (yi ∨ zi ) observe that d(yia , x2j ) = d(yia , VA ). Also for any vj where 1 ≤ j ≤ 3
d(vj , v3 ) = d(vj , VA ). Therefore VA is a consistent subset for Tθ .
(⇐) In the backward direction, let there be a consistent subset VC of size at most N (k) for
(Tθ , Cθ ). We know from Lemma 3 that either Si ⊂ VC or S i ⊂ VC . From Lemma 3 any such
solution has at least n(M + 2) + 1 many vertices from VC outside the clause gadgets leaving
at most 2k + 3(m − k) that may be chosen from the clause gadgets.
Each clause gadget comprises of vertices of three distinct colors: one color exclusive to
the clause itself and two colors dedicated to literals. An essential insight is that if there
are no vertices in VC of colors specific to the literals from a clause in the variable gadgets,
then such a clause gadget must contain at least three vertices from VC . This assertion is
valid because the distance between two sets of vertices of the same color (corresponding to
the same literal in two clauses) across any two clauses is at least 8, while vertices in VC of
clause-specific colors are at a distance of at most 6.
This fact, coupled with Lemma 4 implies that there are at least k clauses for whom colors
specific to at least one of their literals have the vertices in VC of the same color from the
variable gadgets. Making the same literals true and setting other variables arbitrarily gives
us an assignment that satisfies at least k clauses. ◀

4 MCS for Trees: A Parameterized Algorithm

In this section, we consider the optimization version of the MCS problem for the trees.

Minimum Consistent Subset for Trees Parameter: c

Input: A rooted tree T , whose vertices V (T ) are coloured with a set C of c colours.
Question: Find the minimum possible size of a consistent subset (MCS) for T ?

We consider T as a rooted tree by taking an arbitrary vertex r as its root. We use V (T ′ )

to denote the vertices of a subtree T ′ of T , and C(U ) ⊆ C to denote the subset of colors
assigned to subset of vertices U ⊆ V , and C(u) to denote the color attached to the vertex
u ∈ V (T ). For any vertex v, let ηv denote the number of children of v and we denote the
children of v by v1 , v2 , · · · , vηv . We denote the subtree rooted at a vertex v by T (v). For
any vertex v and any integer i < ηv , we use Ti+ (v) to denote the union of subtrees rooted
at vi+1 to vηv , and Ti (v) to denote the subtree rooted at v and containing first i many
children of v. Thus, Ti+ (v) = ∪i+1≤j≤ηv T (vj ), which is a forest, and Ti (v) = T (v) \ Ti+ (v).
In Figure 2(a), the light yellow part is Ti (v), and the light sky-colored part is Ti+ (v). We
define T out (v) = T \ T (v).
For any positive integer d and for any vertex v ∈ V (T ), a set of vertices U ⊂ V (T ) is
called d-equidistant from v if d(ui , v) = d for all ui ∈ U . Any subset of vertices U spans
sib
a set of colors C ′ ⊆ C if C(U ) = C ′ . For any vertex v ∈ V (T ), we use Ei (v, d, C ′ ) (resp.
out
Ei (v, d, C ′ )) to denote the set of subsets of vertices in Ti+ (v) (resp. T \ T (v)), which are
d-equidistant from v and span the colors in C ′ . Next, we define a (partial) consistent subset
for a subtree Ti (v).

Intuition. Our dynamic programming (DP) routine exploits the key observation that a
(partial) consistent subset (formally defined below) for a subtree T (v) can be computed
in FPT time. This computation is possible given the distance to the closest vertex in the
consistent subset that lies outside T (v) and the colors of those vertices. The entries in our

FSTTCS 2024
7:8 Minimum Consistent Subset in Trees and Interval Graphs

Sv
out
N N (v, Sv ) out

v v
N N (v, Sv ) sib

N N (v, Sv )
in

v1 vi
Ti Ti(v)
v1 vi vη
Ti+

(a) Sv in
Svsib

(b)

Figure 2 Illustration of the bottom-up dynamic programming routine, ■ vertices denotes

consistent subset. δSin = 1, δSout = 2, δSsib = 1.

DP table store the minimum size of partial consistent subsets for all subtrees Ti (v). These
subsets are defined based on six parameters: distance to the closest vertex from v in the
consistent subset in Ti (v), T out (v) and Ti+ (v) and colors of these three set of closest vertices.

▶ Definition 6. Let din ∈ Z+ out sib +

0 and d , d ∈ Z , and let three subsets of colors C , C , C
in out sib
⊆
C. A (partial) consistent subset of the subtree Ti (v) with respect to the parameters d , dout , in

dsib , C in , C out , C sib is defined as a set of vertices W ⊆ V (Ti (v)) such that for any arbitrary
sib out
subset X ∈ Ei (v, dsib , C sib ) and Y ∈ Ei (v, dout , C out ) (assuming they exist), W satisfies the
following (see Figure 2(b)):

d(v, W ) = din . (i.e., the distance of v to its nearest member(s) in W is din )

C(NN(v, W )) = C in . (i.e., C in is the set of colors of the nearest members of v in the
set W )
For every vertex u ∈ Ti (v), C(u) ∈ C(NN(u, W ∪ X ∪ Y )).

Note that for some values of din , dout , dsib , C in , C out , C sib there may not exist any (partial)
consistent subset for Ti (v); in such a case we define it is undefined. Also note that, for
some values, the (partial) consistent subset can be empty as well, such as when din = ∞ and
C(u) ∈ C(NN(u, X ∪ Y )) for every vertex u ∈ Ti (v). For ease of notation, we will denote a
(partial) consistent subset for Ti (v) as a consistent subset with respect to the parameters
din , dout , dsib , C in , C out , C sib .
Consider an arbitrary consistent subset ST of T , an arbitrary vertex v ∈ V (T ) and
an integer i ∈ [ηv ] (see Figure 2(b)). For any vertex v ∈ V (T ) and 1 ≤ i ≤ ηv , define
Svin = ST ∩ V (Ti (v)), Svsib = ST ∩ V (Ti+ (v)), and Svout = ST ∩ V (T \ T (v)). Also define
δSin = d(v, Svin ), CSin = C(NN(v, Svin )), δSsib = d(v, Svsib ), CSsib = C(NN(v, Svsib )), δSout = d(v, Svout ),
CSout = C(NN(v, Svout )). Let W be any arbitrary (partial) consistent subset with respect to the
parameters δSin , δSout , δSsib , CSin , CSout , CSsib (see Definition 6). Next, we have the following lemma.

▶ Lemma 7. SW = (ST \ Svin ) ∪ W is a consistent subset for T .

Proof. Suppose that A = W ∪ NN(v, Svsib ) ∪ NN(v, Svout ) is the set of vertices which are
either in W or in the nearest neighbor of v outside Ti (v) in SW . We will show that for any
vertex u ∈ Ti (v), N N (u, SW ) ⊆ A and there is a vertex in N N (u, A) of the color same as
u. Similarly, let B = (SW \ W ) ∪ NN(v, W ). We show that for any vertex w outside Ti (v),
N N (w, SW ) ⊆ B and there is a vertex in N N (w, B) of the color same as w. Please note
that A and B are not necessarily disjoint.
A. Banik et al. 7:9

sib
Consider a vertex u ∈ Ti (v) and w ∈ T \ Ti (v). Since NN(v, Svsib ) ∈ Ei (v, δSsib , CSsib ),
out
NN(v, Svout ) ∈ Ei (v, δSout , CSout ), and W is a consistent subset with respect to the parame-
ters δSin , δSout , δSsib , CSin , CSout , CSsib , we have C(u) ∈ C(NN(u, W ∪ NN(v, Svsib ) ∪ NN(v, Svout ))) =
C(NN(u, A)). Also, as ST is a consistent subset, we have C(w) ∈ C(NN(w, ST \ Svin ) ∪ CSin ).
From the properties of W , we have C(NN(v, W )) = CSin . Hence, C(w) ∈ C(NN(w, B)). Thus,
to prove the result, it is enough to show that (i) no vertex from B \ A can be the closest to
the vertex u in the set SW , and (ii) no vertex from A \ B can be the closest vertex of w in
the set SW . We prove these two claims by contradiction.
Assume that Claim (i) is false. Then, there will be a vertex x ∈ B \ A, which is closest
to u. Note that, (B \ A) ∩ ((Ti (v) ∪ NN(v, Svsib ) ∪ NN(v, Svout )) = ∅. Hence we have d(u, x) =
d(u, v) + d(v, x), d(v, x) > min(δSsib , δSout ). This is a contradiction as the closest vertex from v
in SW ∪ (T \ Ti (v)) (if it exists) is at distance min(δSsib , δSout ) = d(v, NN(v, Svsib ) ∪ NN(v, Svout )).
Hence, x cannot be the closest vertex of u in SW . Thus, Claim (i) follows.
Now, assume that Claim (ii) is false. Then there is a vertex y ∈ A\B, which is closest to w.
As w ̸∈ Ti (v) and y ∈ Ti (v), we have d(w, y) = d(w, v) + d(v, y). Since d(v, NN(v, W )) = δSin
and w ∈ (A \ B = W \ NN(v, W )), we have d(v, y) > δSin . This contradicts the fact that the
in
closest vertex from v in SW ∪ Ti (v) (if it exists) is at distance δ from v. Hence, Claim (ii)
is true. ◀

Motivated by Lemma 7, we design the following algorithm based on the dynamic program-
ming technique. For each choice of v ∈ V (T ), i ∈ [ηv ], δvin ∈ [n] ∪ {0, ∞}, δvout ∈ [n] ∪ {∞},
δvsib ∈ [n]∪{∞}, and Cvin , Cvout , Cvsib ⊆ C, we define a subproblem which computes the cardinality
of a minimum sized (partial) consistent subset for the subtree Ti (v) with respect to the pa-
rameters ⟨δvin , δvout , δvsib , Cvin , Cvout , Cvsib ⟩, and denote its size by P (Ti (v), δvin , δvout , δvsib , Cvin , Cvout , Cvsib ).
Let us use δvmin = min(δvin , δvout , δvsib ).
( ( (
Cvin if δvin = δvmin Cvsib if δvsib = δvmin Cvout if δvout = δvmin
A= , B= , D=
∅ otherwise ∅ otherwise ∅ otherwise

We define Cvmin = A ∪ B ∪ D. Note that δvmin is the distance of the closest vertex to v in
sib out
any X ∈ Ei (v, δvsib , Cvsib ), any Y ∈ Ei (v, δvout , Cvout ) or the consistent subset, and that Cvmin
denotes the colors of all such vertices.
To compute any DP entry, we take into account the following six cases. The first two
cases are for checking whether a DP entry is valid. The third case considers the scenario
in which v is part of the solution; the fourth, fifth, and sixth cases collectively consider the
scenario in which v is not in the solution.

Case 1: If C(v) ∈ / Cvmin , return undefined.

Case 2: δv = 0 and Cvin ̸= {C(v)}. Here, return ∞.
in

Case 3: δvin = 0 and Cvin = {C(v)}. Here, return P (Ti (v), δvin , δvout , δvsib , Cvin , Cvout , Cvsib ) =
X
1+ min′ P Tηvj (vj ), δ, 1, ∞, C ′ , {C(v)}, ∅
δ,C
1≤j≤i

Explanation. Case 1 and Case 2 are self-explanatory. Case 3 implies that the vertex v
is included in the consistent subset. Consequently, for the optimal solution, we need to
determine a consistent subset for each tree rooted at a child vj of v, independently of each
other, assuming that v is part of the consistent subset (refer to Figure 2(a)). For every child vj
of v, we iterate through all possible choices of C ′ ⊆ C and δvinj = δ ∈ {1, . . . , h(T (vj )} ∪ {∞}

FSTTCS 2024
7:10 Minimum Consistent Subset in Trees and Interval Graphs

v1 vj
vi
Ti+ (v)

vj 1 vjηv
j

Figure 3 Illustration of the Case 3, where δvin = 0.

where h(T (vj )) is the height of the tree rooted at vj , to identify the minimum consistent
subset for Tηvj (vj ). This is done with the constraints that the closest vertex in the consistent
subset inside Tηvj (vj ) is at a distance of δ and spans C ′ . For any vertex in T (vj ), the path of
the closest vertex of its own color outside Ti (v) has to pass through v, which is considered to
be in the consistent subset and has color C(v). Thus, δvout = 1 is taken for the tree Tηvj (vj ).
Since we are solving for the complete tree rooted at vj with no siblings, we set the distance
to the closest sibling vertex as δvsib = ∞ and the corresponding color set as ∅.

Notations for Subsequent Cases. In the rest of the section, we consider three more cases
where δvin > 0, and hence δvin , δvout , δvsib > 0. Intuitively, while solving the problem recursively,
we will recursively solve MCS in Ti−1 (v) and Tηvi (vi ). We try all possible sets of choices
of Ca , Cb with Cvin = Ca ∪ Cb , and recursively solve for a solution assuming that the nodes
of colors in Ca are present in Ti−1 (v) at a distance of δvin (if Ca ̸= ∅ and such choices are
feasible) and nodes of colors in Cb ⊆ Cvin are present in Tηvi (vi ) at a distance of δvin − 1 from
vi (if Cb ̸= ∅ and such choices are feasible).

Ti−1 (v)
v1 v2 vi−1 vi Ti+ (v)

Ca Cb
Figure 4 Illustration of the Case 4.

Case 4: Ca , Cb ̸= ∅. In this case, the closest vertices in the consistent subset from v in both
Ti−1 (v) and Tη(vi ) (vi ) are located at a distance of precisely δvin . We start by defining the
following. Let δx = min(δvin , δvsib ) and recall δvmin = min(δvin , δvout , δvsib ).
Observe that both Ti+ (v) and Tηvi (vi ) contains siblings of Ti−1 (v). Thus, in a hypothetical
consistent subset, which is compatible with the current partial consistent subset, for the
tree T , the closest vertices from v in T(i−1)+ (v) are either in Ti+ (v) or Tηvi (vi ). Here, δx is
A. Banik et al. 7:11

trying to capture this distance information, and Ci−1 sib

represents the colors of such vertices.
Similarly, from vi in a hypothetical consistent subset CS for T , which is compatible with
the current partial consistent subset, δvmin + 1 denotes the distance to the vertices in CS
contained in T \ T (vi ). Note that these vertices can be either in T(i−1) (v) or in Ti+ (v) or
in T \ T (v). Ciout represents the colors of such vertices.

C b

 if δvin < δvsib and Cb ̸= ∅
Ci−1 = Cb ∪ Cvsib if δvin = δvsib
sib



C sib otherwise
v

Also, define Ciout = A ∪ B ∪ D, where

( ( (
Ca if δvin = δvmin Cvsib if δvsib = δvmin Cvout if δvout = δvmin
A= , B= , D=
∅ otherwise ∅ otherwise ∅ otherwise

Now we can safely assume that there is a δx -equidistant set from v contained in T(i−1)+ (v)
that spans Ci−1
sib
.

Return P (Ti (v), δvin , δvout , δvsib , Cvin , Cvout , Cvsib ) = min P Ti−1 (v), δvin , δvout , δx , Ca , Cvout , Ci−1
sib

Ca ,Cb

+ P Tηvi (vi ), δvin − 1, δvmin + 1, ∞, Cb , Ciout , ∅

Explanation. In this case we iterate over all possible choices of Ca and Cb , assuming
Ca , Cb ̸= ∅ and Ca ∪ Cb = Cvin . In the first part of the recursive formula, we recursively
solve the problem for the tree Ti−1 (v) with the restriction that we have to include a set of
vertices of color Ca in Ti−1 (v) at distance δvin from v (see Figure 2(b)). The restriction on
Cvout and δvout among the vertices in T \ T (v) remains the same as that of the parent problem.
Regarding Cvsib and δvsib , observe that vertices in T (vi ) and Ti+ are part of T(i−1)+ . Therefore
the parameters for the sibling depend on the value of δvin and δvsib , and accordingly, we have
defined Ci−1
sib
.
In the second part of the recursive formula, we are solving the problem recursively for
the tree Tηvi (vi ), with the restriction that, in the consistent set in the consistent set we have
to include a set of vertices from Tηvi (vi ) which are of colors Cb , and at distance δvin − 1 from
vi . Observe that the vertices in Ti−1 (v), Ti+ (v) and T \ T (v) are all outside T (vi ). Thus the
restriction on the distance to the vertices on the consistent subset outside T (vi ) and their
colors depend on the values of δvin , δvsib and δvout . Thus the distance δvout of this subproblem is
defined as δvmin = min(δvin , δvsib , δvout ), and the set of colors Ciout is defined accordingly. As we
are solving for the whole tree rooted at vi , there are no siblings; so δvsib = 0 and Cisib = ∅.

Case 5: Ca = ∅ and Cb = Cvin . Note that in this case, the closest vertices in the consistent
subset from v in Tη(vi ) (vi ) are located at a distance of δvin while in Ti−1 (v), they are
located at a distance of at least δ ≥ δvin + 1
We iterate over all values δ > δvin and all possible choices of colors to find the size of a
minimum consistent subset. We define δx and Ci−1 sib
the same as in Case 4. For any values
δ > δv and C ⊆ [c], we define δv (δ, C) = min(δ, δvsib , δvout ), and Ciout (δ, C) = A ∪ B ∪ D
in min

where
( ( (
C if δ = δvmin Cvsib if δvsib = δvmin Cvout if δvout = δvmin
A= , B= , D=
∅ otherwise ∅ otherwise ∅ otherwise

FSTTCS 2024
7:12 Minimum Consistent Subset in Trees and Interval Graphs

From vi in a hypothetical consistent subset CS for T , which is compatible with the

current partial consistent subset, δvmin (δ, C) denotes the distance to the vertices in CS
contained in T \ T (vi ). Note that these vertices can be either in T(i−1) (v) or in Ti+ (v) or
in T \ T (v). Ciout (δ, C) represents the colors of such vertices.

Return P (Ti (v), δvin , δvout , δvsib , Cvin , Cvout , Cvsib ) = min
in
P Ti−1 (v), δ, δvout , δx , C, Cvout , Ci−1
sib

δ>δv ,C⊆[c]

+ P Tηvi (vi ), δvin − 1, δvmin (δ, C) + 1, ∞, Cb , Ciout (δ, C), ∅

Explanation. The explanation for this case is the same as Case 4 except for the fact that
we have to make sure that the closest vertex chosen in the consistent subset from Ti−1 (v) is
at distance at least δvin + 1.

Case 6: Cb = ∅ and Ca = Cvin .

Here we consider the case when Cb = ∅ and Ca = Cvin . Note that in this case, the closest
vertices in the consistent subset from v in Ti−1 (v) are located at a distance of δvin while in
Tη(vi ) (vi ), they are located at a distance of at least δ ≥ δvin + 1
We define δvmin (δ, C) = min(δvin , δvsib , δvout ). We define Ciout (δ, C) = A ∪ B ∪ D where
( ( (
Ca if δvin = δvmin Cvsib if δvsib = δvmin Cvout if δvout = δvmin
A= , B= , D=
∅ otherwise ∅ otherwise ∅ otherwise

For any values δ > δvin and C ⊆ [c] we define δx (δ, C) = min(δvsib , δ) We define Ci−1
sib
(δ, C) =
E ∪ F where
( (
C if δ = δx (δ, C) Cvsib if δvsib = δx (δ, C)
E= , F =
∅ otherwise ∅ otherwise

The meaning and the reasoning behind the defining of δvmin (δ, C) and Ciout (δ, C) remains the
same as in previous cases. Thus, in a hypothetical consistent subset, which is compatible
with the current partial consistent subset, for the tree T , the closest vertices from v
in T(i−1)+ (v) are either in Ti+ (v) or Tηvi (vi ). Here, δx (δ, C) is trying to capture this
distance information, and Ci−1
sib
(δ, C) represents the colors of such vertices.
In this case we return:

Return P (Ti (v), δvin , δvout , δvsib , Cvin , Cvout , Cvsib )=
min P Ti−1 (v), δvin , δvout , δx (δ, C), Ca , Cvout , Ci−1
sib
(δ, C)
δ>δv , C⊆[c]
in

+P Tηvi (vi ), δ, δvmin + 1, ∞, Cb , C, ∅

Explanation. The explanation for this case is the same as the previous two cases.

Running time of the algorithm. The total number of choices of δvin , δvout , δvsib , Cvin , Cvout and
Cvsib is bounded by n3 23c . For each choice Ca and Cb , the algorithm takes at-most n2c time
to go through all possible entries of δ and C (in case 5 and case 6) and there are at-most
22c choices of Ca and Cb . The recursion runs for at most n2 times. Hence, the worst-case
running time of the algorithm is O(26c n6 ).
A. Banik et al. 7:13

5 NP-hardness of MCS for Interval Graphs

A graph H is said to be an interval graph if there exists an interval layout of the graph H,
or in other words, for each node, vi ∈ V (H) one can assign an interval αi on the real line
such that (vi , vj ) ∈ E(H) if and only if αi and αj (completely or partially) overlap in the
layout of those intervals.
We prove that the Minimum Consistent Subset problem is NP-complete even when the
input graph is an interval graph. We present a reduction from the Vertex Cover problem for
cubic graphs. It is known that Vertex Cover remains NP-complete even for cubic graphs [10].
For any set of intervals I, let G(I) be the interval graph corresponding to the set of intervals I.

Interval Graph Construction.

Let G be any cubic graph, where V (G) = {v1 , . . . , vn } is the set of vertices, and
E(G) = {e1 , . . . , em } is the set of edges in G. We create the set of intervals IG for G
on a real line L. The set of intervals in IG is represented by intervals of three different
sizes, medium, small and large, where each medium interval is of unit length, each
small interval is of length ϵ << 2n1 3 and the length of the large interval is ℓ >> 2n.
We define IG = I1 ∪ I2 ∪ I3 ∪ I4 where I1 contains 2m medium size intervals (two
intervals for each edge) and defined as I1 = {I(ei , vj ) : ei = (vj , y) ∈ E(G), y ∈ V (G)}.
We set color ci to the interval I(ei , vj ). I2 contain n · n3 small intervals of color cm+1 ,
and I3 contain n · n4 small intervals of color cm+1 . I4 contains one large interval Iℓ of
color c1 .
We create the following vertex gadget Xi for each vertex vi ∈ V (G). Xi contains the
following medium size intervals {I(e, vi ) : e = (vi , x) ∈ E(G)} corresponding to the
edges that are incident on vi . These intervals span the same region si of unit length on
the real line L; hence they are mutually completely overlapping. In the vertex gadget
Xi , we also include a total of n3 mutually non-overlapping small intervals in the set
I2 . Span of all the small intervals in Xi is contained in the span si of the medium
sized intervals in Xi (see Figure 5).
Each vertex gadget is placed one after another (in a non-overlapping manner) along
the line L in an arbitrary order, such that a total of n4 mutually non-overlapping small
intervals can be drawn between two consecutive vertex gadgets. Thus, I3 contains n
sets of n4 non-overlapping small intervals. Finally, I4 contains a single large interval
Iℓ that contains all the intervals in I1 ∪ I2 ∪ I3 . This completes the construction.

n3 small intervals medium intervals

v4 v2 G n4 small intervals

v3 X1 X2 Iℓ X3 X4
IG

Figure 5 An example reduction.

▶ Lemma 8. The graph G has a vertex cover of size at most k if and only if the corresponding
interval graph G(IG ) has a consistent subset of size at most K = k(3 + n3 ).

FSTTCS 2024
7:14 Minimum Consistent Subset in Trees and Interval Graphs

Proof. With a slight abuse of notations, we will denote the vertex in G(IG ) corresponding
to an interval I ∈ IG by I. (⇒) Let A ⊆ V (G) be a vertex cover of G. Consider the set of
S
intervals IA = vi ∈A Xi . We prove that IA is a consistent subset of G(IG ). Note that as
|Xi | = 3 + n3 , |IA | = k(3 + n3 ).
As the vertices in A cover all the edges in G, IA must contain at least one interval of each
color {c1 , · · · , cm }. As the unit intervals in Xi associated with a vertex vi ∈ A are of colors
different from the color of the small intervals in Xi , IA contains at least n3 small intervals.
Therefore IA contains at least one interval from each color in {c1 , . . . cm+1 }. Observe that,
(i) the interval Iℓ ∈ I4 of color c1 contains an interval of color c1 that corresponds to the
edge e1 , and (ii) the distance between any two nodes corresponding to medium intervals in
two different vertex gadgets of G(IG ) is 2 (via the node corresponding to the interval Iℓ in
G(IG )). Thus, IA is a consistent subset.
(⇐) Let IB ⊆ IG be any consistent subset of G(IG ) of cardinality (3 + n3 )k. Now, if IB
contains Iℓ then |IG | − 2 ≤ |IB | ≤ |IG | because if e1 = (vi , vj ) be the edge of color c1 , then
we can only do not take the medium intervals of color c1 from the vertex gadget Xi and
Xj in IB because they are covered by Iℓ , which contradicts the fact that |IB | = (3 + n3 )k.
Thus, we have Iℓ ∈ / IB .
By the definition, IB contains at least one color from {c1 , · · · , cm , cm+1 }. Also, if IB
contains one interval from the gadget Xi of any vertex vi , then it must contain all the
intervals from Xi ; otherwise, it can not be a consistent subset. Thus IB is the union X
of at most k sets from {Xi : i ∈ [n]}, and it contains at least one interval from each color
{c1 , · · · , cm }, and a few intervals of color cm+1 . Now, consider the set VB = {vi : Xi ∈ X },
which is a vertex cover for the graph G of size at most k.
Hence, the lemma is proved. ◀

6 Conclusion
We have shown that MCS is NP-complete for trees and interval graphs, and have given
an exact algorithm parameterized w.r.t. the number of colors for trees. As a direction for
future research, possibilities of approximating MCS can be explored for interval graphs and
related graph classes like circle graphs, circular arc graphs, etc. Parameterized algorithms
may also be explored where, in addition to a parameter for the number of colours, there is
also a parameter specifying the structural properties of the input graph.

References
1 Hiroki Arimura, Tatsuya Gima, Yasuaki Kobayashi, Hiroomi Nochide, and Yota Otachi.
Minimum consistent subset for trees revisited. CoRR, abs/2305.07259, 2023. doi:10.48550/
arXiv.2305.07259.
2 Sandip Banerjee, Sujoy Bhore, and Rajesh Chitnis. Algorithms and hardness results for
nearest neighbor problems in bicolored point sets. In Michael A. Bender, Martín Farach-
Colton, and Miguel A. Mosteiro, editors, LATIN 2018: Theoretical Informatics, pages 80–93,
2018. doi:10.1007/978-3-319-77404-6_7.
3 Ahmad Biniaz, Sergio Cabello, Paz Carmi, Jean-Lou De Carufel, Anil Maheshwari, Saeed
Mehrabi, and Michiel Smid. On the minimum consistent subset problem. Algorithmica,
83(7):2273–2302, 2021. doi:10.1007/S00453-021-00825-8.
4 Ahmad Biniaz and Parham Khamsepour. The minimum consistent spanning subset problem
on trees. Journal of Graph Algorithms and Applications, 28(1):81–93, 2024. doi:10.7155/
JGAA.V28I1.2929.
A. Banik et al. 7:15

5 Rajesh Chitnis. Refined lower bounds for nearest neighbor condensation. In Sanjoy Das-
gupta and Nika Haghtalab, editors, International Conference on Algorithmic Learning The-
ory, 29 March - 1 April 2022, Paris, France, volume 167 of Proceedings of Machine Learn-
ing Research, pages 262–281. PMLR, 2022. URL: https://proceedings.mlr.press/v167/
chitnis22a.html.
6 Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Daniel Marx, Marcin
Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer Publishing
Company, Incorporated, 1st edition, 2015.
7 Sanjana Dey, Anil Maheshwari, and Subhas C. Nandy. Minimum consistent subset problem
for trees. In Evripidis Bampis and Aris Pagourtzis, editors, Fundamentals of Computation
Theory - 23rd International Symposium, FCT 2021, Athens, Greece, September 12-15, 2021,
Proceedings, volume 12867 of Lecture Notes in Computer Science, pages 204–216. Springer,
2021. doi:10.1007/978-3-030-86593-1_14.
8 Sanjana Dey, Anil Maheshwari, and Subhas C. Nandy. Minimum consistent subset of simple
graph classes. Discret. Appl. Math., 338:255–277, 2023. doi:10.1016/J.DAM.2023.05.024.
9 Reinhard Diestel. Graph theory, volume 173 of. Graduate texts in mathematics, page 7, 2012.
10 Michael R. Garey, David S. Johnson, and Larry J. Stockmeyer. Some simplified NP-
complete graph problems. Theoretical Computer Science, 1(3):237–267, 1976. doi:10.1016/
0304-3975(76)90059-1.
11 Peter E. Hart. The condensed nearest neighbor rule (corresp.). IEEE Transactions on
Information Theory, 14(3):515–516, 1968. doi:10.1109/TIT.1968.1054155.
12 Kamyar Khodamoradi, Ramesh Krishnamurti, and Bodhayan Roy. Consistent subset problem
with two labels. In B.S. Panda and Partha P. Goswami, editors, Algorithms and Discrete
Applied Mathematics, pages 131–142, 2018. doi:10.1007/978-3-319-74180-2_11.
13 Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-
constant error-probability pcp characterization of np. In Proceedings of the Twenty-Ninth
Annual ACM Symposium on Theory of Computing, STOC ’97, pages 475–484, New York, NY,
USA, 1997. Association for Computing Machinery. doi:10.1145/258533.258641.
14 Vijay V. Vazirani. Approximation Algorithms. Springer Publishing Company, Incorporated,
2010.
15 Gordon Wilfong. Nearest neighbor problems. In Proceedings of the Seventh Annual Symposium
on Computational Geometry, SCG ’91, pages 224–233, 1991. doi:10.1145/109648.109673.

FSTTCS 2024

Exactly Hittable Interval Graphs: K.K. Nisha N.S. Narayanaswamy S.M. Dhannya
No ratings yet
Exactly Hittable Interval Graphs: K.K. Nisha N.S. Narayanaswamy S.M. Dhannya
22 pages
Optimal Algorithms for BIG Graphs
No ratings yet
Optimal Algorithms for BIG Graphs
20 pages
Lecture 3
No ratings yet
Lecture 3
57 pages
Graph Coloring: Branch-and-Cut Approach
No ratings yet
Graph Coloring: Branch-and-Cut Approach
16 pages
1996 - Zheng - Maximum Independent Sets of Circular Arc Graphs Simplified Algorithm and Proofs
No ratings yet
1996 - Zheng - Maximum Independent Sets of Circular Arc Graphs Simplified Algorithm and Proofs
5 pages
Lec 13
No ratings yet
Lec 13
6 pages
NIPS 2009 An Integer Projected Fixed Point Method For Graph Matching and Map Inference Paper
No ratings yet
NIPS 2009 An Integer Projected Fixed Point Method For Graph Matching and Map Inference Paper
9 pages
CS Algorithms Cheatsheet
No ratings yet
CS Algorithms Cheatsheet
3 pages
Minimum Spanning Trees (Ch. 23) ! Minimum Spanning Trees!
No ratings yet
Minimum Spanning Trees (Ch. 23) ! Minimum Spanning Trees!
5 pages
Minimum Spanning Trees
No ratings yet
Minimum Spanning Trees
51 pages
Unit IV
No ratings yet
Unit IV
86 pages
Review 4: CSCI 2720: Data Structures
No ratings yet
Review 4: CSCI 2720: Data Structures
33 pages
Problem Set 1
No ratings yet
Problem Set 1
6 pages
Solutions/Hints For The Problems: 1. Traffic Safe City (Points: 300)
No ratings yet
Solutions/Hints For The Problems: 1. Traffic Safe City (Points: 300)
6 pages
Minimum Spanning Tree
No ratings yet
Minimum Spanning Tree
7 pages
Book 3506
No ratings yet
Book 3506
57 pages
Conflict-Free Covering: Applied Math & Statistics, Stony Brook University Computer Science, Ben-Gurion University, Israel
No ratings yet
Conflict-Free Covering: Applied Math & Statistics, Stony Brook University Computer Science, Ben-Gurion University, Israel
6 pages
Lecture 3
No ratings yet
Lecture 3
62 pages
Barenboim JACM2016
No ratings yet
Barenboim JACM2016
45 pages
CS124 Spring 2011
No ratings yet
CS124 Spring 2011
6 pages
Conflict-Free Coloring: Graphs of Bounded Clique-Width and Intersection Graphs
No ratings yet
Conflict-Free Coloring: Graphs of Bounded Clique-Width and Intersection Graphs
41 pages
Graph Coloring Below Guarantees Via Co-Triangle Packing: Shyan Akmal Tomohiro Koana
No ratings yet
Graph Coloring Below Guarantees Via Co-Triangle Packing: Shyan Akmal Tomohiro Koana
16 pages
1 Greedy
No ratings yet
1 Greedy
116 pages
Linian SIAM1992
No ratings yet
Linian SIAM1992
9 pages
Aaim2024 016
No ratings yet
Aaim2024 016
11 pages
Slides 14
No ratings yet
Slides 14
212 pages
The Hitchhiker's Guide To The Programming Contests
100% (2)
The Hitchhiker's Guide To The Programming Contests
78 pages
Cmu850 f20
No ratings yet
Cmu850 f20
309 pages
Cs 180 Notes UCLA
No ratings yet
Cs 180 Notes UCLA
3 pages
Lecture 11 - Minimum Spanning Tree
No ratings yet
Lecture 11 - Minimum Spanning Tree
61 pages
V Adj U V Q
No ratings yet
V Adj U V Q
5 pages
Minimum Spanning Trees: - Given An Undirected Graph G (V, E)
No ratings yet
Minimum Spanning Trees: - Given An Undirected Graph G (V, E)
34 pages
FLSSS: R Package for Subset Sum Optimization
No ratings yet
FLSSS: R Package for Subset Sum Optimization
22 pages
GrahamHell HistoryMST
No ratings yet
GrahamHell HistoryMST
15 pages
Optimization Paradigms Lecture Notes
No ratings yet
Optimization Paradigms Lecture Notes
118 pages
Introduction To Approximation Algorithms
No ratings yet
Introduction To Approximation Algorithms
5 pages
Greedy Algorithms
No ratings yet
Greedy Algorithms
11 pages
B1-ICPC Programming Contest Guide. Dynamic Programming & Graphs
No ratings yet
B1-ICPC Programming Contest Guide. Dynamic Programming & Graphs
79 pages
Design and Analysis of Algorithms Course Notes
No ratings yet
Design and Analysis of Algorithms Course Notes
161 pages
Book Waterloo
No ratings yet
Book Waterloo
193 pages
Algorithms & Data Structures: Week 10
No ratings yet
Algorithms & Data Structures: Week 10
4 pages
BCS 042 RJT Notes
No ratings yet
BCS 042 RJT Notes
25 pages
Distributed Graph Coloring Insights
No ratings yet
Distributed Graph Coloring Insights
160 pages
Slide 2 Onenote
No ratings yet
Slide 2 Onenote
40 pages
14 MST
No ratings yet
14 MST
20 pages
The Physics of Quantum Information
No ratings yet
The Physics of Quantum Information
23 pages
Selection Problem
No ratings yet
Selection Problem
16 pages
QUESTION 1 (30 Marks)
No ratings yet
QUESTION 1 (30 Marks)
3 pages
15CS43 2017 Jul
No ratings yet
15CS43 2017 Jul
2 pages
AI Algasikhaled
No ratings yet
AI Algasikhaled
60 pages
Lecture 1
No ratings yet
Lecture 1
25 pages
Time and Space Complexity Basics
No ratings yet
Time and Space Complexity Basics
132 pages
Mod 2 (Aad)
No ratings yet
Mod 2 (Aad)
26 pages
Kruskal's Minimum Spanning Tree Algorithm
No ratings yet
Kruskal's Minimum Spanning Tree Algorithm
4 pages
Approximation Algorithms
No ratings yet
Approximation Algorithms
23 pages
8 PPT Complexity Classes P, NP, NP-hard and NP-complete
No ratings yet
8 PPT Complexity Classes P, NP, NP-hard and NP-complete
31 pages
Computer Science
No ratings yet
Computer Science
18 pages
Data Structure (Second Sem) NOTES
No ratings yet
Data Structure (Second Sem) NOTES
2 pages
Midsem 2 Important Questions
No ratings yet
Midsem 2 Important Questions
2 pages
DS Complexities
No ratings yet
DS Complexities
5 pages
Matching (Graph Theory)
No ratings yet
Matching (Graph Theory)
6 pages
Ch. 5 Ada PDF
No ratings yet
Ch. 5 Ada PDF
41 pages
Maximum Sum Subarray of Size K (Easy)
No ratings yet
Maximum Sum Subarray of Size K (Easy)
6 pages
Module 3 - DAA Vtu
No ratings yet
Module 3 - DAA Vtu
80 pages
Strongly Connected Components (Jacques Amsel)
No ratings yet
Strongly Connected Components (Jacques Amsel)
26 pages
The Theory of Computation
100% (21)
The Theory of Computation
471 pages
Chapter 3
No ratings yet
Chapter 3
81 pages
B.Tech CSE Algorithm Model Papers
No ratings yet
B.Tech CSE Algorithm Model Papers
12 pages
Computer Science Math Essentials
No ratings yet
Computer Science Math Essentials
41 pages
Lec1-Introduction To Data Structure and Algorithms
100% (3)
Lec1-Introduction To Data Structure and Algorithms
20 pages
Shortest Path
No ratings yet
Shortest Path
10 pages
Quiz 2
No ratings yet
Quiz 2
5 pages
Courses For Preparedness, Online MSCS
No ratings yet
Courses For Preparedness, Online MSCS
7 pages
Cse2012 PPS10 w2022
No ratings yet
Cse2012 PPS10 w2022
2 pages
Asymptotic Notations by Codewithharry
No ratings yet
Asymptotic Notations by Codewithharry
4 pages

Minimum Consistent Subset in Trees and Interval Graphs

Uploaded by

Minimum Consistent Subset in Trees and Interval Graphs

Uploaded by

Minimum Consistent Subset in Trees and Interval

Anil Maheshwari # Bubai Manna #

Consistent Subset Problem on Graphs(CSPG)

Input: A graph G, a coloring function C : V (G) → [c], and an integer ℓ.

A consistent subset of minimum size is called a minimum consistent subset (MCS).

2 Notation and Preliminary Results

From Lemma 1, we have the following theorem.

3 NP-hardness of MCS for Trees

Interval Graph Construction.

y17 z17 y37 z37

(x1 ∨ x2) (x1 ∨ x3) (x2 ∨ x3)

Figure 1 An example of construction of (Tθ , Cθ ) where θ = (x1 ∨ x2 ) ∧ (x1 ∨ x3 ) ∧ (x2 ∨ x3 ). For

̸ ∅. Let sji ∈ Si ∩VC . But then every vertex v ∈ Si \{sji }

Case 1: Si ⊂ VC , S i ∩ VC = ∅. Notice that there must be a vertex in the literal path

Case 2: S i ⊂ VC , Si ∩ VC = ∅. Case 2 may be argued in a manner similar to Case 1.

4 MCS for Trees: A Parameterized Algorithm

Minimum Consistent Subset for Trees Parameter: c

We consider T as a rooted tree by taking an arbitrary vertex r as its root. We use V (T ′ )

Figure 2 Illustration of the bottom-up dynamic programming routine, ■ vertices denotes

▶ Definition 6. Let din ∈ Z+ out sib +

d(v, W ) = din . (i.e., the distance of v to its nearest member(s) in W is din )

▶ Lemma 7. SW = (ST \ Svin ) ∪ W is a consistent subset for T .

Case 1: If C(v) ∈ / Cvmin , return undefined.

Figure 3 Illustration of the Case 3, where δvin = 0.

trying to capture this distance information, and Ci−1 sib

Also, define Ciout = A ∪ B ∪ D, where

From vi in a hypothetical consistent subset CS for T , which is compatible with the

Case 6: Cb = ∅ and Ca = Cvin .

5 NP-hardness of MCS for Interval Graphs

Interval Graph Construction.

Interval Graph Construction.

n3 small intervals medium intervals

Figure 5 An example reduction.

You might also like