Operations Research Transactions >
2018 , Vol. 22 >Issue 2: 31 - 40
DOI: https://doi.org/10.15960/j.cnki.issn.1007-6093.2018.02.003
A survey on the initialization methods for the k-means algorithm
Received date: 2017-10-16
Online published: 2018-06-15
The k-means problem which is one of the classical NP-hard problems keeps attracting wide attention in combinatorial optimization and computer science area since been proposed. Given a set of observations consisting of N d-dimensional real vectors, the objective is to partition the N observations into k(\leqslant N) sets, so as to minimize the within-cluster sum of squares, where the center of a cluster is defined as the mean of all the observations in it. As one of the heuristic algorithms for solving k-means problem, k-means algorithm is quite popular in practice because of its excellent perform of convergence. The k-means algorithm can be described as: given initial solution for k-means problem, repeat assignment (assign observations to its nearest center) and update (compute the new centers in the new clusters) step until convergence to a solution. Although the algorithm is usually considered as linearly convergency, its shortcomings are obvious. The k-means algorithm is unable to ensure the global optimality and the output is rely badly on the choice of initial solutions. Therefore, variants of initialization methods are proposed to improve the performance of the k-means algorithm. In this paper we mainly filter and list some of them for readers.
Key words: k-means algorithm; initialization method
XU Dachuan, XU Yicheng, ZHANG Dongmei . A survey on the initialization methods for the k-means algorithm[J]. Operations Research Transactions, 2018 , 22(2) : 31 -40 . DOI: 10.15960/j.cnki.issn.1007-6093.2018.02.003
/
| 〈 |
|
〉 |