# Introduction

Clustering is a type of categorization imposed by rules on a group of data points or objects. A broad definition of clustering could be "the process of categorizing a finite number of data points into groups where all members in the group are similar in some manner". A cluster is therefore an aggregation of objects: all data points in the same cluster share common properties (e.g. distance) that distinguish them from the data points lying in other clusters. Cluster analysis is an iterated process of knowledge discovery and a multivariate statistical technique that identifies groupings of the data objects based on inter-object similarities computed by a chosen distance metric.

Clustering algorithms can be classified into two categories: hierarchical clustering and partitional clustering [1]. Partitional clustering algorithms, unlike hierarchical ones, usually create an initial set of clusters and then partition the data into similar groups at each iteration. Partitional clustering is used more often than hierarchical clustering because the dataset can be divided into more than two subgroups in a single step, whereas a hierarchical method always merges or divides exactly two subgroups at a time; partitional methods also do not need to build a complete dendrogram [2].

Cluster analysis of data is an important task in knowledge discovery and data mining. It aims to group data on the basis of similarities and dissimilarities among the data elements, and the process can be performed in a supervised, semi-supervised, or unsupervised manner. Different algorithms have been proposed which take into account the nature of the data and the input parameters in order to partition the data. Data vectors are clustered around centroid vectors, and the cluster a data vector belongs to is determined by its distance to the centroid vector. Depending on the nature of the algorithm, the number of centroids is either defined in advance by the user or determined automatically by the algorithm. Discovering the optimum number of clusters, or natural groups, in the data is not a trivial task. The popular clustering techniques suggested so far are either partition based or hierarchy based, but both approaches have their own advantages and limitations in terms of the number of clusters, the shape of clusters, and cluster overlapping [3]. Some other approaches are designed using different clustering techniques and involve optimization in the process. The involvement of intelligent optimization techniques has been found effective in enhancing the complex, real-time, and costly data mining process.

# II. K-means algorithm

The conventional K-means algorithm is based on decomposition and is among the most popular techniques in the data mining field. The K-means algorithm uses K as a parameter and divides n objects into K clusters so that similarity within a cluster is relatively high while similarity between clusters is relatively low, minimizing the total distance between the values in each cluster and the cluster center. The center of each cluster is the mean value of the objects in the cluster, and similarity is calculated with respect to this mean value. The similarity measure used by the algorithm is the reciprocal of the Euclidean distance: the closer the distance, the greater the similarity of two objects, and vice versa.
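As a small illustration of the similarity measure just described, the sketch below (a hypothetical Python example; the function names and the small epsilon guarding against division by zero are assumptions, not part of the original description) computes the Euclidean distance between two data objects and uses its reciprocal as the similarity score.

```python
import numpy as np

def euclidean_distance(p, q):
    """Euclidean distance between two multidimensional data objects."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((p - q) ** 2)))

def similarity(p, q, eps=1e-12):
    """Similarity as the reciprocal of the Euclidean distance:
    the closer two objects are, the larger their similarity."""
    return 1.0 / (euclidean_distance(p, q) + eps)

# The pair (a, b) is closer than (a, c), so its similarity is larger.
a, b, c = [1.0, 2.0], [1.5, 2.5], [8.0, 9.0]
print(similarity(a, b) > similarity(a, c))  # True
```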
a) Procedure of the K-means Algorithm

The K-means algorithm first distributes all objects into K clusters at random, and then:

1) Calculates the mean value of each cluster and uses this mean value to represent the cluster;
2) Re-distributes each object to the closest cluster according to its distance to the cluster centers;
3) Updates the mean value of each cluster, that is, recalculates the mean value of the objects in each cluster;
4) Computes the criterion function E, repeating steps 2)-4) until the criterion function converges.

Usually, the K-means criterion function adopts the square error criterion, defined as:

$$E = \sum_{i=1}^{k} \sum_{p \in C_i} \lvert p - m_i \rvert^{2}$$

in which E is the total square error of all the objects in the data set, p is a given data object, and m_i is the mean value of cluster C_i (both p and m_i are multidimensional). The purpose of this criterion is to make the generated clusters as compact and independent as possible [4].

b) Analysis of the Performance of the K-means Algorithm

i. Advantages

1) It is a classic algorithm for solving clustering problems; it is simple and fast;
2) For large data collections, the algorithm is relatively flexible and highly efficient, because its complexity is O(nkt), where n is the number of objects, k is the number of clusters, and t is the number of iterations. Usually, k << n and t << n.
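A minimal sketch of the procedure and criterion function above is given below, assuming NumPy. The function name `kmeans`, the tolerance `tol`, the iteration cap `max_iter`, and the re-seeding of empty clusters are illustrative assumptions rather than part of the original description.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Minimal K-means sketch: random initial assignment, then alternate
    mean update and re-assignment until the square-error criterion E converges."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Distribute all objects into k clusters at random.
    labels = rng.integers(0, k, size=n)
    prev_E = np.inf
    for _ in range(max_iter):
        # The mean value of each cluster represents the cluster
        # (an empty cluster is re-seeded with a random object).
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                            else X[rng.integers(0, n)] for i in range(k)])
        # Re-distribute each object to the closest cluster center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Criterion function E: total square error over all clusters.
        E = float(np.sum((X - centers[labels]) ** 2))
        if abs(prev_E - E) < tol:  # stop when E converges
            break
        prev_E = E
    return labels, centers, E
```

For instance, `labels, centers, E = kmeans(data, k=3)` would partition an array `data` of shape (n, d) into three clusters and return the converged square error E.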