Proving the set-covering problem is NP-complete (using reduction from the vertex cover problem)


In this post, we will prove that the decision version of the set-covering problem is NP-complete, using a reduction from the vertex covering problem (which is NP-complete). This is Exercise 35.3-2 in CLRS3.

We will follow the template given in an earlier post.

Problem Statement

We seek to answer whether a set cover C\mathcal{C} exists for XX (the set containing the elements to be covered), containing a maximum of kk sets.

Showing the problem is in NP

Let C={S1,S2,,Sn}\mathcal{C} = \{S_1, S_2, \ldots, S_n\} be the certificate. We assume that each SCS \in \mathcal{C} uniquely covers at least one xXx \in X, so:

CXSC:SX \begin{aligned} |\mathcal{C}| &\leq |X|\\ \forall S \in \mathcal{C} : |S| &\leq |X| \end{aligned}

This gives us that we can verify that nkn \leq k and that C\mathcal{C} covers XX in O(X2)\mathcal{O}(|X|^2), which proves the problem is in NP.

Showing the problem is NP-hard

We will show that the decision version of the set-covering problem is NP-hard by showing:

Vertex-CoverPSet-Cover \text{Vertex-Cover} \leq_\text{P} \text{Set-Cover}

Reduction

A vertex cover of an undirected graph G=(V,E)G = (V, E) is a subset VVV' \subseteq V, such that each eEe \in E is adjacent to at least one vVv' \in V. Its decision version is to determine whether a vertex cover of at most size kk exists for GG.

We can reduce any instance of decision-vertex-cover for G=(V,E)G = (V, E) to an instance of decision-set-cover by letting X=EX = E and letting the family of sets F=SvvV deg(v)>0\mathcal{F} = {S_v | v \in V\ \land \text{deg}(v) > 0}, such that SvS_v contains all edges in EE adjacent to vv. We can clearly perform this reduction in O(E+V)\mathcal{O}(|E| + |V|), which is in polynomial time.

We will now prove for any vertex cover VV' for GG and a set cover C\mathcal{C} for sets XX and F\mathcal{F} constructed in accordance with the reduction above:

VV:Vk    CF:Ck \exists V'\subseteq V : |V'| \leq k \iff \exists \mathcal{C} \subseteq \mathcal{F} : |\mathcal{C}| \leq k

Since X=EX = E, and we know VV' is a vertex cover for G=(V,E)G = (V, E), we can find C\mathcal{C} for (X,F)(X, \mathcal{F}) by selecting from F\mathcal{F} all sets corresponding to vVv' \in V'.

Similarily, if C\mathcal{C} exists, by the way we defined the reduction, we know it covers X=EX = E. This means We can find VV' by taking the vertices corresponding to SvCS_v \in \mathcal{C}.

Due to the 1:1 correspondence of vertices in VV' and sets in C\mathcal{C}: V=C |V'| = |\mathcal{C}|~\blacksquare

Example

An example is probably useful here. See the image below for an example of a vertex cover of size 4 (with shaded nodes being in VV').

Example of vertex cover (shaded nodes are in V')

The result of performing the reduction for this example is:

F={Sv1={e1,e3},Sv3={e2,e5},Sv4={e3,e6},Sv6={e4,e7,e8}} \begin{aligned} \mathcal{F} = \{S_{v_1} &= \{e_1, e_3\}, \\ S_{v_3} &= \{e_2, e_5\},\\ S_{v_4} &= \{e_3, e_6\},\\ S_{v_6} &= \{e_4, e_7, e_8\}\} \end{aligned}