Operations Research Transactions >
2022 , Vol. 26 >Issue 1: 99 - 112
DOI: https://doi.org/10.15960/j.cnki.issn.1007-6093.2022.01.007
The seeding algorithm for $\mu$-similar Bregman divergences $k$-means problem with penalties
Received date: 2021-07-21
Online published: 2022-03-14
The $k$-means problem is a classic NP-hard problem in clustering analysis. If somedata points are allowed not to be clustered but to pay some penalty, the $k$-means problem with penalty is induced. In this paper, weextend the $k$-means problem with penalty from Euclidean distance tothe more general $\mu $-similar Bregman divergences, and study theinitialization algorithm for the $\mu $-similar Bregman divergences $k$-means problem with penalty. The seeding algorithm is given wherethe approximation ratio is only related to $\mu $ and the ratio ofthe maximum value to the minimum value of the data point penalty.
Wenjie LIU, Dongmei ZHANG, Peng ZHANG, Juan ZOU . The seeding algorithm for $\mu$-similar Bregman divergences $k$-means problem with penalties[J]. Operations Research Transactions, 2022 , 26(1) : 99 -112 . DOI: 10.15960/j.cnki.issn.1007-6093.2022.01.007
| 1 | Aloise D , Deshpande A , Hansen P , et al. NP-hardness of Euclidean sum-of-squares clustering[J]. Machine Learning, 2009, 75, 245- 248. |
| 2 | Drineas P , Frieze A , Kannan R , et al. Clustering large graphs via the singular value decomposition[J]. Machine Learning, 2004, 56, 9- 33. |
| 3 | Awasthi P, Charikar M, Krishnaswamy R, et al. The hardness of approximation of Euclidean k-means[C]//Proceedings of SoCG, 2015: 754-767. |
| 4 | Bachem O, Lucic M, Hassani S H, et al. Approximate k-means++ in sublinear time[C]//Proceedings of AAAI, 2016: 1459-1467. |
| 5 | Bl?mer J, Lammersen C, Schmidt M, et al. Theoretical analysis of the k-means algorithm|A survey[M]//Algorithm Engineering|Selected Results and Surveys, Cham: Springer, 2016: 81-116. |
| 6 | Kanungo T , Mount D M , Netanyahu N S , et al. A local search approximation algorithm for k-means clustering[J]. Computational Geometry: Theory and Applications, 2004, 28, 89- 112. |
| 7 | Lee E , Schmidt M , Wright J . Improved and simplified inapproximability for k-means[J]. Information Processing Letters, 2017, 120, 40- 43. |
| 8 | Xu D , Xu Y , Zhang D . A survey on algorithm for k-means problem and its variants[J]. Operations Research Transactions, 2017, 21, 101- 109. |
| 9 | Ahmadian S, Norouzi-Fard A, Svensson O, et al. Better guarantees for k-means and Euclidean k-median by primal-dual algorithms[C]//Proceedings of FOCS, 2017: 61-72. |
| 10 | Lloyd S . Least squares quantization in PCM[J]. IEEE Transactions on Information Theory, 1982, 28, 21- 33. |
| 11 | Arthur D, Vassilvitskii S. k-means++: The advantages of careful seeding[C]//Proceedings of SODA, 2007: 1027-1035. |
| 12 | Aggarwal A, Deshpande A, Kannan R. Adaptive sampling for k-means clustering[C]//Proceedings of APPROX and RANDOM, 2009: 15-28. |
| 13 | Bachem O, Lucic M, Hassani S H, et al. Fast and provably good seedings for k-means[C]//Proceedings of NIPS, 2016: 55-63. |
| 14 | Bachem O, Lucic M, Krause A. Distributed and provably good seedings for k-means in constant rounds[C]//Proceedings of ICML, 2017: 292-300. |
| 15 | Bahmani B, Moseley B, Vattani A, et al. Scalable k-means++[C]//Proceedings of the VLDB Endowment, 2012: 622-633. |
| 16 | Ostrovsky R , Rabani Y , Schulman L , et al. The effectiveness of Lloyd-type methods for the k-means problem[J]. Journal of the ACM, 2012, 59, 1- 22. |
| 17 | Tseng G C . Penalized and weighted k-means for clustering with scattered objects and prior information in high-throughput biological data[J]. Bioinformatics, 2007, 23, 2247- 2255. |
| 18 | Zhang D, Hao C, Wu C, et al. A local search approximation algorithm for the k-means problem with penalties[C]//Proceedings of COCOON, 2017: 568-574. |
| 19 | Chang X Y, Wang Y, Li R, et al. Sparse k-means with l∞/l0 penalty for high-dimensional data clustering[EB/OL]. (2014-03-31)[2021-07-08]. https://arxiv.org/pdf/1403.7890.pdf. |
| 20 | Ackermann M R. Algorithms for the Bregman k-Median problem[D]. Paderborn: University of Paderborn, 2009. |
| 21 | Banerjee A , Merugu S , Dhillon I S , et al. Clustering with Bregman divergences[J]. Journal of Machine Learning Research, 2005, 6, 1705- 1749. |
| 22 | Li M , Xu D , Yue J , et al. The seeding algorithm for k-means problem with penalties[J]. Journal of Combinatorial Optimization, 2020, 39, 15- 32. |
/
| 〈 |
|
〉 |