k-均值算法的初始化方法综述

doi:10.15960/j.cnki.issn.1007-6093.2018.02.003

Operations Research Transactions

Previous Articles Next Articles

A survey on the initialization methods for the k-means algorithm

XU Dachuan¹ XU Yicheng^1,2 ZHANG Dongmei^3,*

1. College of Applied Sciences, Beijing University of Technology, Beijing 100124, China; 2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China; 3. School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China

Received:2017-10-16 Online:2018-06-15 Published:2018-06-15

Abstract

Abstract:

The k-means problem which is one of the classical NP-hard problems keeps attracting wide attention in combinatorial optimization and computer science area since been proposed. Given a set of observations consisting of N d-dimensional real vectors, the objective is to partition the N observations into k(\leqslant N) sets, so as to minimize the within-cluster sum of squares, where the center of a cluster is defined as the mean of all the observations in it. As one of the heuristic algorithms for solving k-means problem, k-means algorithm is quite popular in practice because of its excellent perform of convergence. The k-means algorithm can be described as: given initial solution for k-means problem, repeat assignment (assign observations to its nearest center) and update (compute the new centers in the new clusters) step until convergence to a solution. Although the algorithm is usually considered as linearly convergency, its shortcomings are obvious. The k-means algorithm is unable to ensure the global optimality and the output is rely badly on the choice of initial solutions. Therefore, variants of initialization methods are proposed to improve the performance of the k-means algorithm. In this paper we mainly filter and list some of them for readers.

Key words: k-means algorithm, initialization method

XU Dachuan, XU Yicheng, ZHANG Dongmei. A survey on the initialization methods for the k-means algorithm[J]. Operations Research Transactions, doi: 10.15960/j.cnki.issn.1007-6093.2018.02.003.

A survey on the initialization methods for the k-means algorithm

PDF

Knowledge

Abstract

Cite this article

share this article

References

Related Articles 0

Recommended Articles

Metrics

Comments