Abstract

Many of the difficulties in clustering stem from the size of the data sets involved. Clustering computation becomes expensive when the data set is large, and conventional methods work efficiently only when the number of clusters is limited and the data set is relatively small. This paper presents a new clustering technique for large data sets that works equally efficiently on large and small data sets. The main idea is to divide the process into two steps. The first step uses a cheap, approximate distance measure to divide the data into overlapping subsets, which we call stubs. The second step performs the clustering, measuring exact distances only between points that occur in a common stub. The stub-based approach reduces computation time compared with traditional clustering and thereby increases its efficiency.
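
The two-step idea can be illustrated with a minimal Python sketch. It is not the implementation from the paper. The abstract does not specify the cheap measure, the exact measure, or the clustering rule, so the choices here are assumptions made for illustration: the cheap measure compares only the first coordinate, the exact measure is Euclidean distance, and the second step links points whose exact distance is within a threshold. All function and parameter names (build_stubs, stub_radius, link_threshold, and so on) are hypothetical.

```python
import math
from itertools import combinations


def cheap_distance(p, q):
    # Assumed cheap measure: difference along the first coordinate only.
    return abs(p[0] - q[0])


def exact_distance(p, q):
    # Assumed exact measure: Euclidean distance.
    return math.dist(p, q)


def build_stubs(points, stub_radius):
    """Step 1: form overlapping stubs using only the cheap measure."""
    stubs = []
    uncovered = set(range(len(points)))
    while uncovered:
        center = min(uncovered)  # deterministic choice of the next stub center
        # Members are drawn from ALL points, so stubs may overlap.
        members = {
            i for i in range(len(points))
            if cheap_distance(points[center], points[i]) <= stub_radius
        }
        stubs.append(members)
        uncovered -= members  # the center is always a member, so this shrinks
    return stubs


def candidate_pairs(stubs):
    """Pairs of point indices that share at least one stub."""
    pairs = set()
    for members in stubs:
        pairs.update(combinations(sorted(members), 2))
    return pairs


def cluster(points, stub_radius, link_threshold):
    """Step 2: measure exact distances only within stubs, then link close points."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    exact_calls = 0
    for i, j in candidate_pairs(build_stubs(points, stub_radius)):
        exact_calls += 1
        if exact_distance(points[i], points[j]) <= link_threshold:
            parent[find(i)] = find(j)

    labels = [find(i) for i in range(len(points))]
    return labels, exact_calls


points = [(0, 0), (0.5, 0), (1, 0.2), (10, 10), (10.5, 10), (11, 10.3)]
labels, exact_calls = cluster(points, stub_radius=2.0, link_threshold=1.0)
all_pairs = len(points) * (len(points) - 1) // 2
print(labels, exact_calls, all_pairs)
```

On this toy input the exact distance is evaluated for 6 pairs instead of all 15, because points in different stubs are never compared. In this sketch, two points that are close under the exact measure but fall in different stubs are not linked, so the result depends on choosing a stub radius that is loose enough for the data.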

How to Cite
BAGGA, DR. G.N. SINGH, Simmi. Clustering Method for categorical and Numeric Data sets. Global Journal of Computer Science and Technology, [S.l.], dec. 1969. ISSN 0975-4172. Available at: <https://computerresearch.org/index.php/computer/article/view/834>. Date accessed: 25 jan. 2021.