A Modified Version of the K-means Clustering Algorithm

Authors

  • Juhi Katara

  • Naveen Choudhary

Keywords:

clustering, data mining, initial centroids, k-means clustering

Abstract

Clustering is a data mining technique that divides a given data set into small clusters based on similarity. The k-means clustering algorithm is a popular, unsupervised, iterative clustering algorithm that divides a given data set into k clusters. However, the traditional k-means algorithm has drawbacks. It is slow, because it must calculate the distance between each data object and every centroid in each iteration, and the accuracy of the final clustering result depends mainly on the correctness of the initial centroids, which are selected randomly. This paper proposes a methodology that finds better initial centroids. This method is then combined with an existing improved method for assigning data objects to clusters, which requires two simple data structures to store information from each iteration for use in the next iteration. The proposed algorithm is compared, in terms of time and accuracy, with the traditional k-means clustering algorithm as well as with a popular improved k-means clustering algorithm.
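
The abstract does not say what the two data structures hold, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes one list stores each data object's cluster label and the other stores its distance to the assigned centroid. In the next iteration, an object whose distance to its own updated centroid has not grown keeps its label without being compared against the other centroids. This shortcut is a heuristic and may occasionally keep an object in a cluster that is no longer the nearest one. The name kmeans_cached and the caller-supplied initial centroids are hypothetical; the paper's centroid selection method is not reproduced here.

```python
import math


def kmeans_cached(points, init_centroids, max_iter=100):
    """K-means that caches a label and a distance per point (illustrative sketch)."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    # Assumed structure 1: cluster label of each point.
    labels = [0] * len(points)
    # Assumed structure 2: distance from each point to its assigned centroid.
    # The initial value -1 forces a full search for every point on the first pass.
    dists = [-1.0] * len(points)

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            d_own = math.dist(p, centroids[labels[i]])
            if d_own <= dists[i]:
                # The point is no farther from its own centroid than before:
                # keep the label and skip the other k - 1 distance computations.
                dists[i] = d_own
                continue
            # Otherwise fall back to the full search over all centroids.
            best = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            if best != labels[i]:
                labels[i] = best
                changed = True
            dists[i] = math.dist(p, centroids[best])

        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

        if not changed:
            break
    return labels, centroids


# Tiny usage example with two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans_cached(points, init_centroids=[(0, 0), (10, 10)])
```

In this sketch the saving comes from the early continue: every point that takes the shortcut costs one distance computation instead of k.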

How to Cite

Juhi Katara, & Naveen Choudhary. (2015). A Modified Version of the K-means Clustering Algorithm. Global Journal of Computer Science and Technology, 15(C7), 1–6. Retrieved from https://computerresearch.org/index.php/computer/article/view/1301

Published

2015-05-15