An improved primal-dual approximation algorithm for the k-means problem with penalties
Abstract In the k-means problem with penalties, we are given a data set ${\cal D} \subseteq \mathbb{R}^\ell $ of n points where each point $j \in {\cal D}$ is associated with a penalty cost p j and an integer k. The goal is to choose a set ${\rm{C}}S \subseteq {{\cal R}^\ell }$ with |CS| ≤ k and a penalized subset ${{\cal D}_p} \subseteq {\cal D}$ to minimize the sum of the total squared distance from the points in D / D p to CS and the total penalty cost of points in D p , namely $\sum\nolimits_{j \in {\cal D}\backslash {{\cal D}_p}} {d^2}(j,{\rm{C}}S) + \sum\nolimits_{j \in {{\cal D}_p}} {p_j}$ . We employ the primal-dual technique to give a pseudo-polynomial time algorithm with an approximation ratio of (6.357+ε) for the k-means problem with penalties, improving the previous best approximation ratio 19.849+∊ for this problem given by Feng et al. in Proceedings of FAW (2019).