A Modified Version of the K-means Clustering Algorithm
Keywords:
clustering, data mining, initial centroids, k-means clustering
Abstract
Clustering is a data mining technique that divides a given data set into small clusters based on similarity. The k-means clustering algorithm is a popular, unsupervised, iterative clustering algorithm that divides a given data set into k clusters. The traditional k-means algorithm has some drawbacks, however. It is slow, because it must calculate the distance between each data object and all centroids in every iteration. In addition, the accuracy of the final clustering result depends mainly on the correctness of the initial centroids, which are selected randomly. This paper proposes a methodology that finds better initial centroids. This method is then combined with an existing improved method for assigning data objects to clusters, which requires two simple data structures to store information about each iteration for use in the next iteration. The proposed algorithm is compared in terms of time and accuracy with the traditional k-means clustering algorithm as well as with a popular improved k-means clustering algorithm.
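The abstract does not spell out the assignment step, so the following is only a minimal sketch of one plausible reading of "two simple data structures": a list of each object's last cluster label and a list of its last distance to that cluster's centroid. The names `kmeans_cached`, `labels`, and `dists` are hypothetical, and the initial centroids are simply passed in, since the abstract does not describe how the proposed method selects them.

```python
import math


def kmeans_cached(points, centroids, max_iter=100):
    """K-means sketch that reuses per-object information between iterations.

    labels[i] : cluster index object i was last assigned to (assumed structure)
    dists[i]  : distance from object i to that cluster's centroid (assumed structure)
    """
    k = len(centroids)
    centroids = [list(c) for c in centroids]
    labels = [-1] * len(points)
    dists = [math.inf] * len(points)

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if labels[i] >= 0:
                # One distance computation: object to its current cluster's centroid.
                d = math.dist(p, centroids[labels[i]])
                if d <= dists[i]:
                    # Not farther than before: keep the label, skip the full search.
                    dists[i] = d
                    continue
            # Otherwise fall back to comparing against all k centroids.
            best = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            if best != labels[i]:
                changed = True
            labels[i] = best
            dists[i] = math.dist(p, centroids[best])

        if not changed:
            break

        # Recompute each centroid as the mean of its members (empty clusters keep theirs).
        for j in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


# Usage on a tiny, made-up data set with hand-picked initial centroids.
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, final_centroids = kmeans_cached(data, [(0, 0), (10, 10)])
print(labels, final_centroids)
```

The shortcut is a heuristic: an object that has not moved farther from its own centroid is assumed to stay put, which saves k - 1 distance computations for that object but can differ from the exact nearest-centroid assignment of traditional k-means.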
Published
2015-05-15
License
Copyright (c) 2015 Authors and Global Journals Private Limited
This work is licensed under a Creative Commons Attribution 4.0 International License.