# I. Introduction

The volume of digital images produced worldwide has increased dramatically in recent decades, and the World Wide Web plays a vital role in this upsurge. This has led to the availability of huge digital image databases and libraries. Handling and accessing these database images through human annotation is impractical, which has motivated automatic search mechanisms and created a demand for content based image retrieval (CBIR) models. CBIR is defined as a process that searches and retrieves images from a large database, where the retrieval operation is performed on the basis of derived image features such as color, texture and shape. Literature surveys on CBIR are available in [1][2][3][4].

Color is one of the significant features for CBIR, and one of the simplest color based CBIR methods is the color histogram [5]. Its retrieval performance is generally limited by its low discrimination power, mainly on very large data sets. To improve on this, various color descriptors have been proposed in the literature using neural networks [6], DCT-domain vector quantization [7], supervised learning [8] and color edge co-occurrence histograms [9].

Natural images are characterized by their rich content of texture and color. Texture descriptors are based on grey scale variation, and they can also be integrated with the color component for image retrieval (IR). It is very difficult to give a unique definition of texture, yet it is one of the significant and salient features for CBIR. Texture based image retrieval is reported in the literature based on the characteristics of images in different orientations [10,11,12,13,14,15]. Wavelet transform based texture features [16] and correlograms [17] have also been proposed for efficient IR. The performance of the correlograms [17] is further improved using genetic algorithms (GA) [18].
Integrated methods that combine color histograms with texture features [19,20], and correlograms with rotated wavelets [21], attained a good IR rate. Recent research focuses on CBIR systems that fetch the exact cluster of relevant images while reducing the elapsed time of the system. For this purpose, various data mining techniques have been developed to improve the performance of CBIR systems.

Clustering is one of the vital data mining techniques for quick retrieval of information from large data repositories. Clustering is an unsupervised process, so the development of clustering algorithms is important for extracting hidden patterns [22,23]. Clustering has many real-world applications, such as credit card and mark analysis, web data categorization, image analysis, text mining, pattern recognition, market data analysis and weather report analysis [24]. Data clustering explicitly divides the data into a user specified number k of groups by trying to minimize intra-cluster variance and maximize inter-cluster variance in an iterative manner [25,26]. Various methods have been proposed in the literature to improve the performance of data clustering [27,28,29] in various applications. K-means [30] is one of the most popular and efficient clustering algorithms, and several variations of it were later proposed to improve its efficiency [31,32,33].

A content-based image retrieval method using adaptive classification and cluster-merging has been proposed to find multiple clusters of a complex image query [34]. This method [34] achieves the same retrieval quality under linear transformations, regardless of the shapes of the clusters of a query. A cluster-based image retrieval system by unsupervised learning (CLUE) has been proposed to improve user interaction with image retrieval systems by fully exploiting the similarity information [35].
CLUE retrieves image clusters, instead of a set of ordered images, by applying a graph-theoretic clustering algorithm, and it is dynamic in nature. The principle of unsupervised hierarchical clustering is also used in CBIR [36]. The modified fuzzy c-means (MFCM) clustering scheme introduced fuzzy weights, reduced the clustering time and has also been used for image retrieval [37,66]. A content-based parallel image retrieval system based on cluster architectures has been proposed to achieve a high response capability [38]; it uses several retrieval servers to supply the content-based image retrieval service. Many researchers have used K-means clustering with variations and achieved a good image retrieval rate [39,40,41,42]. The K-means clustering technique helps to reduce the elapsed time of the system.

The rest of the paper is organized as follows. The proposed method, including the local directional pattern (LDP), K-means and query matching, is given in Section 2. Experimental results and discussions are summarized in Section 3. Conclusions are drawn in Section 4.

# II. Proposed Method

The present paper initially converts the color image into a grey level image using HSV quantization. It then derives integrated features that capture edge, shape and texture information: edge responses are obtained first, shape features in the form of textons are then evaluated, and finally grey level co-occurrence matrix (GLCM) features are computed. Images are clustered using the two point perimeter K-means (TPP-KM) clustering scheme. A similarity measure in the form of Euclidean distance is used to retrieve the most similar images.

# a) Algorithm for feature extraction

The features are extracted in the following steps.

Step 1: The color image is converted into a grey level image using the HSV color space.

Step 2: Conversion of the edge response image into a ternary pattern image. This is carried out in two sub-steps, 2(a) and 2(b).
Step 2(a): Local features in the form of edge responses in eight directions are obtained on the grey level image through the local directional pattern (LDP) coded image. The formation of the LDP is explained below. The LDP is an eight bit binary code that describes the relative edge value of a pixel in different directions [43]. The present paper evaluates edge responses in eight directions at the central pixel of a 3 x 3 neighborhood using Kirsch masks [68]. Of the eight responses ($m_i$, $i = 0, 1, \ldots, 7$), only the k most significant are given the value 1 and the remaining are set to zero. The three greatest responses, i.e. k = 3, are considered in the present paper. The reason is that the presence of a corner or an edge produces a large edge response value in a particular direction. The LDP code generation on a 3 x 3 neighborhood is shown in Figure 1.

Step 2(b): Conversion of the LDP coded image into ternary form, based on a threshold. This simplifies the extraction of textons, which represent the shape of the texture, in the next step. It also makes the process resistant to lighting effects and noise. The neighborhood values are assigned one of the ternary values $T_i$ (Equation 1):

$$T_i = \begin{cases} 2, & p_i \ge (p_c + l) \\ 1, & |p_i - p_c| < l \\ 0, & p_i \le (p_c - l) \end{cases} \tag{1}$$

where $p_c$ is the value of the central pixel, $p_i$ the value of its i-th neighbor and $l$ the threshold. The generation process is illustrated in Figure 3 with l = 3.

The edge responses generate a total of 0 to K*(P-1) codes, which is generally considered the main disadvantage. Here K is the number of greatest edge responses considered and P is the number of neighboring pixels. This is not a disadvantage in the present paper, since the LDP coded image is not used directly and only ternary patterns are derived from it. Further, it is more convenient to derive the shape features (in the next step) on local ternary patterns (0, 1 or 2) derived from edge responses.
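As an illustrative sketch of Step 2 (not the authors' implementation), the following pure-Python code computes the eight Kirsch edge responses for a single 3 x 3 neighborhood, forms the LDP code from the k = 3 strongest responses, and applies the ternary rule of Equation 1 with threshold l. Several details are assumptions made for the example: responses are ranked by absolute value, bit i corresponds to direction i starting at east and moving counter-clockwise, all function names are hypothetical, and for brevity the ternary rule is demonstrated on the same block rather than on a full LDP coded image.

```python
# Ring of the 8 neighbours of the centre pixel, starting at East and
# moving counter-clockwise (assumed ordering): E, NE, N, NW, W, SW, S, SE.
RING = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]


def kirsch_responses(block):
    """Return the eight Kirsch edge responses m_0..m_7 of a 3x3 block."""
    ring = [block[r][c] for r, c in RING]
    total = sum(ring)
    responses = []
    for i in range(8):
        # Kirsch mask i weights three adjacent neighbours by 5
        # and the remaining five neighbours by -3.
        three = ring[(i - 1) % 8] + ring[i] + ring[(i + 1) % 8]
        responses.append(5 * three - 3 * (total - three))
    return responses


def ldp_code(block, k=3):
    """Set bit i for the k responses of largest magnitude (assumption)."""
    m = kirsch_responses(block)
    strongest = sorted(range(8), key=lambda i: abs(m[i]), reverse=True)[:k]
    return sum(1 << i for i in strongest)


def ternary_value(p_i, p_c, l=3):
    """Ternary rule of Equation 1 for neighbour value p_i and centre p_c."""
    if p_i >= p_c + l:
        return 2
    if p_i <= p_c - l:
        return 0
    return 1


def ternary_pattern(block, l=3):
    """Apply the ternary rule to every cell of a 3x3 block against its centre."""
    centre = block[1][1]
    return [[ternary_value(v, centre, l) for v in row] for row in block]


# Example 3x3 neighbourhood (made-up grey values).
block = [[85, 32, 26],
         [53, 50, 10],
         [60, 38, 45]]
print(kirsch_responses(block))
print(ldp_code(block))
print(ternary_pattern(block, l=3))
```

Each Kirsch response is computed directly from the ring of neighbours rather than from eight explicit 3 x 3 masks, which keeps the sketch short; an implementation over whole images would typically use convolution instead.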
Step 3: Derivation of local shape features in the form of textons on the ternary image. The method of deriving textons on the ternary image is given in Figure 4.

The basic unit of an image is the pixel and its intensity, and experiments based on this alone have not produced satisfactory results. To improve performance, pattern and shape based methods are employed. A pattern or shape consists of a group of neighboring pixels with similar intensity levels. One such popular measure is the "texton" proposed by Julesz [44]. Textons are defined as emergent patterns or blobs that share a common property all over the image. Methods based on LBP and textons are very useful for texture analysis and classification [45,46,47], face recognition [48], age classification [49,50,51,52], image retrieval [15], etc. Various array grammar models have been proposed in the literature to represent patterns and shapes [53,54]. Based on textons one can say whether a texture is fine, coarse or of any other form. Textons can be derived on a 2 x 2, a 3 x 3 or any other neighborhood window. The present paper uses all texton patterns that are formed with exactly two or four pixels on a 2 x 2 grid, which gives seven textons. The derivation of the texton image with these seven local shape features (textons) is shown in Figure 4.

Step 4: Derivation of GLCM features. GLCM features are computed on the derived texton matrix. The present paper evaluates four Haralick features [55] for effective image retrieval, listed below. The features homogeneity, energy, contrast and correlation are evaluated at angles of 0°, 45°, 90° and 135°, and their average value is considered as the texture feature. In the following, $P(i, j)$ denotes the $(i, j)$-th GLCM entry and $G$ the number of grey levels.

Homogeneity or Angular Second Moment (ASM):

$$\text{ASM} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \{P(i, j)\}^2 \tag{2}$$

ASM is a measure of the homogeneity of an image. A homogeneous scene will contain only a few grey levels, giving a GLCM with only a few but relatively high values of $P(i, j)$. Thus, the sum of squares will be high.

Energy:

$$\text{Energy} = \sum_{i,j} P(i, j)^2 \tag{3}$$

Contrast:

$$\text{Contrast} = \sum_{n=0}^{G-1} n^2 \left\{ \sum_{i=1}^{G} \sum_{j=1}^{G} P(i, j) \right\}, \quad |i - j| = n \tag{4}$$

This measure of contrast, or local intensity variation, favors contributions from $P(i, j)$ away from the diagonal, i.e. $i \ne j$.

Correlation:

$$\text{Correlation} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \frac{\{i \times j\} \times P(i, j) - \{\mu_x \times \mu_y\}}{\sigma_x \times \sigma_y} \tag{5}$$

Correlation is a measure of grey level linear dependence between the pixels at the specified positions relative to each other.

# b) Clustering method

One of the most commonly used and simplest clustering algorithms is K-means. It is one of the fundamental algorithms of clustering and employs the square error criterion. The number of partitions must be defined in advance, and the cluster centers are randomly initialized for the predefined number of clusters. If the initial number of clusters is not properly chosen, the algorithm may converge to false cluster locations and give a completely different clustering result [58,59]. This measure is often called the squared-error distortion [60, 61], and this type of clustering falls into the general category of variance-based clustering [62, 63].

The present paper outlines a new variation of the existing K-means algorithm to reduce the number of iterations and to increase the overall retrieval rate. This variation is denoted as the two point perimeter K-means (TPP-KM) clustering scheme. The scheme selects two points instead of the single point used in K-means, evaluates a perimeter, and measures similarity using Euclidean distance.

# c) Query matching and performance measure

The present retrieval model selects the top 20 images from the database that match the query image. This is accomplished by measuring the distance between the query image and the database images. The present paper uses Euclidean distance as the distance measure, as given below:

$$\text{dist}(T_n, I_n) = \left[ \sum_{i,j=1} \left| f_i(T_n) - f_j(I_n) \right|^2 \right]^{1/2} \tag{6}$$

where $T_n$ is the query image and $I_n$ is an image in the database. A database image is used as the query image in our experiments.
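The distance-based ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: each image is assumed to be represented by a flat feature vector, the `database` dictionary layout and the function names are hypothetical, and the toy feature values are made up.

```python
import math


def euclidean_distance(query_features, image_features):
    """Euclidean distance between two equal-length feature vectors."""
    return math.dist(query_features, image_features)


def retrieve_top(query_features, database, top_n=20):
    """Return the names of the top_n database images closest to the query.

    `database` maps an image name to its feature vector (hypothetical layout).
    """
    ranked = sorted(
        database,
        key=lambda name: euclidean_distance(query_features, database[name]),
    )
    return ranked[:top_n]


# Toy feature vectors (made-up values, for illustration only).
database = {
    "img_a": [0.10, 0.80, 0.30, 0.50],
    "img_b": [0.90, 0.20, 0.70, 0.10],
    "img_c": [0.12, 0.79, 0.33, 0.48],
}
query = [0.10, 0.80, 0.30, 0.50]
print(retrieve_top(query, database, top_n=2))
```

In a clustered setting, the same ranking would presumably be applied only to the images of the cluster nearest to the query, which is where the reduction in elapsed time comes from.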
If a retrieved image belongs to the same category as the query image, we say that the system has correctly identified the expected image; otherwise the system has failed to find the image. The performance of the present model is evaluated in terms of precision, recall and F-measure, as given in Equations 7, 8 and 9.

$$\text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Number of retrieved images}} \tag{7}$$

$$\text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images in the database}} \tag{8}$$

Algorithms that improve precision may degrade recall and vice versa. The present paper therefore also evaluates the F-measure, which is based on both precision and recall.

$$\text{F-measure} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \tag{9}$$

# III. Results and Discussion

To investigate the performance of the present retrieval model, we have considered the Wang database [64]. Wang is a subset of the Corel stock photo database containing 1000 images. These images are grouped into 10 classes, each containing 100 images. Within this database, it is known whether any two images are of the same class, and this classification into 10 classes makes the evaluation of the system easy. The large size of each class and the heterogeneous content of the image classes have made the Wang database one of the popular databases for image retrieval. The present paper considers 7 classes of images with 100 images per class. For a query image, the relevant images are assumed to be the remaining 99 images of the same class. The images from all other classes are treated as irrelevant. The retrieval performance of the proposed method is judged in terms of precision, recall and F-measure.
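As a small worked sketch of Equations 7 to 9 in this setting (20 retrieved images and 99 relevant images per query), the following code computes the three scores for one query. The function name, the class labels and the outcome of 15 correct images are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def retrieval_scores(retrieved_labels, query_label, total_relevant):
    """Return (precision, recall, F-measure) for one query."""
    relevant_retrieved = sum(
        1 for label in retrieved_labels if label == query_label
    )
    precision = (
        relevant_retrieved / len(retrieved_labels) if retrieved_labels else 0.0
    )
    recall = relevant_retrieved / total_relevant if total_relevant else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical outcome: 15 of the top 20 retrieved images share the query's class.
retrieved = ["bus"] * 15 + ["beach"] * 5
precision, recall, f_measure = retrieval_scores(
    retrieved, "bus", total_relevant=99
)
print(precision, recall, f_measure)
```

With only 20 images retrieved out of 99 relevant ones, recall is bounded above by 20/99, so precision and F-measure are the more discriminating scores in this protocol.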
The proposed method derives integrated novel features from edge responses, shapes in the form of textons, and statistical parameters in the form of texture features (GLCM features). The average retrieval performance of the proposed method with TPP-KM clustering is compared with CBIR methods that use data mining techniques [65, 66, 67] and with the proposed features combined with standard K-means clustering. The proposed method outperforms all the other methods in terms of precision, recall and F-measure, as shown in Figures 5, 6 and 7. In the method of [65], the features are extracted as GLCM features. The existing method [67] uses a fuzzy C-means clustering scheme with GLCM features, and the method [66] uses both color and statistical features with a partitioned clustering scheme. The advantage of the proposed method is the derivation of significant and powerful local features. Figure 8 shows seven examples of retrieved images, i.e. one query image from each class, with the top 20 images retrieved by the proposed method.

# IV. Conclusion

The present paper proposed a CBIR method using a data mining algorithm. The proposed method uses a simple clustering scheme and achieves a high retrieval rate compared with the other existing methods, because it extracts powerful and significant local features derived from edge responses, shape and textural properties. As with many other clustering algorithms, a limitation of our algorithm is that it requires the number of clusters to be known a priori. The advantage of edge responses is that they can withstand non-monotonic illumination variation and random noise. The shape features derived from textons are rotationally invariant. The texture features in the form of GLCM features, together with the clustering scheme, retrieve the images accurately.
The proposed method was evaluated on the popular and heterogeneous "Wang" dataset, and the experimental results indicate the superiority of the present method over the other existing methods.

![The advantage of LDP over the local binary pattern (LBP) is that LDP can withstand noise, as shown in Figure 2. Figure 2(b) corresponds to the noisy or fluctuated neighborhood of Figure 2(a). In this case the LBP code changes drastically whereas the LDP retains the same value.](image-2.png "")

![Figure 2: Stability of LDP vs. LBP (a) Original image (b) Image with noise](image-3.png "Figure 2")

![Figure 3: Transformation of local edge responses image into ternary pattern image](image-4.png "Figure 3")

![Figure 4: Transformation of Texton process: (a) Original image (b) Textons identification (c) Texton image](image-5.png "Figure 4")

![Figure 5: Average precision graph](image-7.png "Figure 5")

![Figure 7: Average F-Measure graph](image-8.png "Figure 7")

© 2016 Global Journals Inc. (US)

# References

* Z. Lei, L. Fuzong, Z. Bo. A CBIR method based on color-spatial feature. In: TENCON, Proceedings of the IEEE Region 10 Conference, 1999.
* H. B. Kekre, S. D. Thepade. Color traits transfer to grayscale images. IEEE Int. Conference on Emerging Trends in Engineering and Technology (ICETET), 2008.
* C. M. Pun, C. F. Wong. Fast and robust color feature extraction for content-based image retrieval. Int. J. Adv. Comput. Technol. 3(6), 2011.
* A. Vadivel, S. Sural, A. K. Majumdar. Human color perception in the HSV space and its application in histogram generation for image retrieval. Proc. SPIE, Color Imaging X: Processing, Hardcopy, and Applications, 2005.
* L. Cinque, G. Ciocca, S. Levialdi, A. Pellicano, R. Schettini. Colour-based image retrieval using spatial-chromatic histogram. Image Vis. Comput. 19, 2001.
* S. Jeong, C. S. Won, R. M. Gray. Image retrieval using color histograms generated by Gauss mixture vector quantization. Computer Vision and Image Understanding 94(1-3), June 2004.
* J. Yue, Z. Li, L. Liu, Z. Fu. Content-based image retrieval using color and texture fused features. Math. Comput. Modelling 54, 2011.
* C. H. Lin, R. T. Chen, Y. K. Chan. A smart content-based image retrieval system based on color and texture feature. Image Vis. Comput. 27(6), 2009.
* J. Luo, D. Crandall. Color object detection using spatial-color joint probability function. IEEE Transactions on Image Processing 15, 2006.
* A. Pentland, R. W. Picard, S. Sclaroff. Photobook: Content-based manipulation of image databases. Int. J. Comput. Vis. 18, 1996.
* J. Wu, Z. Wei, Y. Chang. Color and texture feature for content based image retrieval. Int. J. Digit. Content Technol. Appl. 4(3), 2010.
* B. S. Manjunath, W. Y. Ma. Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Anal. Mach. Intell. 18(8), 1996.
* V. Gorti S. Murty, A. Kumar Obulesu. Age classification based on simple LBP transitions. International Journal of Computer Science and Engineering (IJCSE) 5(10), Oct 2013.
* A. Obulesu, J. S. Kiran Kumar. Facial image retrieval based on local and regional features. IEEE International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), Oct 2015.
* V. Kumar, A. Srinivasa Rao, Y. K. Sundara Krishna. Dual Transition Uniform LBP Matrix for Efficient Image Retrieval. I.J. Image, Graphics and Signal Processing 8, 2015.
* M. Kokare, P. Biswas, B. Chatterji. Texture image retrieval using rotated wavelet filters. J. Pattern Recognition Lett. 28, 2007.
* H. A. Moghaddam, T. Hajoie, A. H. Rouhi. A new algorithm for image indexing and retrieval using wavelet correlograms. Int. Conf. Image Process., N. Toosi Univ. of Technol., Tehran, Iran, vol. 2, 2003.
* M. T. Saadatmand, H. A. Moghaddam. Enhanced wavelet correlogram methods for image indexing and retrieval. IEEE Int. Conf. Image Process., N. Toosi Univ. of Technol., Tehran, Iran, 2005.
* L. Birgale, M. Kokare, D. Doye. Color and texture features for content based image retrieval. Int. Conf. Comput. Graphics, Image Visual., Wash., USA, 2006.
* M. Subramanyam, A. B. Gonde, R. P. Maheshwari. Color and texture features for image indexing and retrieval. IEEE Int. Adv. Comput. Conf., Patiala, India, 2009.
* M. Subramanyam, R. P. Maheshwari, R. Balasubramanian. A correlogram algorithm for image indexing and retrieval using wavelet and rotated wavelet filters. Int. J. Signal Imag. Syst. Eng. 4(1), 2011.
* K. Alsabti, S. Ranka, V. Singh. An Efficient K-means Clustering Algorithm. Proc. First Workshop High Performance Data Mining, March 1998.
* W. B. A. Karaa, A. S. Ashour, D. B. Sassi, P. Roy, N. Kausar, N. Dey. MEDLINE Text Mining: An Enhancement Genetic Algorithm Based Approach for Document Clustering. Applications of Intelligent Optimization in Biology and Medicine, Springer International Publishing, 2016.
* W. B. A. Karâa. Biomedical Image Analysis and Mining Techniques for Improved Health Outcomes. IGI Global, 2015.
* A. Jain, M. Murty, P. Flynn. Data clustering: A review. ACM Comput. Surv. 31(3), 1999.
* A. Jain, R. Dubes. Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, NJ, 1988.
* Ch. Swetha Swapna, V. Vijaya Kumar, J. V. R. Murthy. A New Approach to Cluster Datasets without Prior Knowledge of Number of Clusters. JSIR 74(05), May 2015.
* Rishi Sayal, G. V. S. R. Prasad, V. Vijaya Kumar. A Novel Hybrid Clustering Algorithm: Integrated Partitional and Hierarchical clustering algorithm for categorical data. International Journal of Computer Science and Emerging Technologies (IJCSET).
* G. V. S. N. R. V. Prasad, V. Venkata Krishna, V. Vijaya Kumar. Automatic Clustering Approaches Based On Initial Seed Points. International Journal on Computer Science and Engineering 3(12), Dec 2011.
* J. MacQueen. Some methods for classification and analysis of multivariate observations. Proc. of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967.
* S. A. Berrani. Recherche approximative de plus proches voisins avec contrôle probabiliste de la précision; application à la recherche d'images par le contenu. PhD thesis, 210 pages, 2004.
* Rishi Sayal, V. Vijaya Kumar. Innovative Modified K-Mode Clustering algorithm. International Journal of Engineering Research and Applications (IJERA) 2, July 2012.
* Ch. Swapna, V. Kumar, J. V. R. Murthy. Improving Efficiency of K-Means Algorithm for Large Datasets. International Journal of Rough Sets and Data Analysis (IJRSDA) 3(2).
* Deok-Hwan Kim, Chin-Wan Chung. Qcluster: Relevance Feedback Using Adaptive Clustering for Content-Based Image Retrieval. SIGMOD 2003, June 9-12.
* Y. Chen, J. Z. Wang, R. Krovetz. CLUE: Cluster-Based Retrieval of Images by Unsupervised Learning. IEEE Trans. Image Processing 14(8), August 2005.
* K. Jarrah, Sri Krishnan, Ling Guan. Automatic Content-Based Image Retrieval Using Hierarchical Clustering Algorithms. Int. Joint Conf. on Neural Networks, July 16-21, 2006.
* Liu Pengyu, Lv Zhuoyi, Jia Kebin. An Effective and Fast Retrieval Algorithm for Content-based Image Retrieval. Congress on Image and Signal Processing, 2008.
* Zhou Bing, Yang Xin-Xin. A Content-based Parallel Image Retrieval System. Int. Conf. on Computer Design and Applications, 2010.
* Akash Saxena, Sandeep Saxena, Akanksha Saxena. Image Retrieval using Clustering Based Algorithm. International Journal of Latest Trends in Engineering and Technology, 2012.
* A. Abduljawad Amory. A Content Based Image Retrieval Using K-means Algorithm. 2012.
* Deepika Nagthane. Content Based Image Retrieval Using K-means clustering technique. Int. Jour. of CAIT 3, June-July 2013.
* D. Liu, K. A. Hua, K. Vu, N. Yu. Fast Query Point Movement Techniques for Large CBIR Systems. IEEE Trans. on Knowledge and Data Engg. 21(5), May 2009.
* T. Jabid, M. H. Kabir, O. Chae. Robust facial expression recognition based on local directional pattern. ETRI Journal 32(5), 2010.
* B. Julesz. Textons, the elements of texture perception, and their interactions. Nature 290(5802), 1981.
* U. S. N. Raju, K. Chandra Sekharan, V. V. Krishna. A new method of texture classification using various wavelet transforms based on primitive patterns. ICGST-Graphics, Vision and Image Processing (ICGST-GVIP) 8, July 2008.
* V. Vijaya Kumar, U. S. N. Raju, Chandra Sekaran, V. V. Krishna. Employing long linear patterns for texture classification relying on wavelets. ICGST-Graphics, Vision and Image Processing (ICGST-GVIP) 8, Jan 2009.
* P. Kiran Kumar Reddy, V. Kumar, B. Eswar Reddy. Texture classification based on binary cross diagonal shape descriptor texture matrix (BCDSDTM). ICGST-Graphics, Vision and Image Processing (ICGST-GVIP) 14, Aug 2014.
* K. Reddy, V. Krishna, V. Vijaya Kumar. A Method for Facial Recognition Based on Local Features. International Journal of Mathematics and Computation, ISSN 0974-570X, 27, 2016.
* P. Chandra Sekhar Reddy, B. Eswara Reddy, Vijaya Kumar. New method for classification of age groups based on texture shape features. International Journal of Imaging and Robotics, ISSN 0974-0637, 15(1), 2015.
* P. J. S. Kumar, V. Krishna, V. Vijaya Kumar. A dynamic transform noise Resistant uniform Local Binary Pattern (DTNR-ULBP) for Age Classification. International Journal of Applied Engineering Research, ISSN 0973-4562, 11(1), 2016.
* Jangala Vijaya Kumar, V. V. Sasikiran, Hari Chandana. An effective age classification using topological features based on compressed and reduced grey level model of the facial skin. International Journal of Image, Graphics and Signal Processing (IJIGSP) 6, Nov 2013.
* P. J. S. Kumar, V. Vijaya Kumar, Pullela S. V. V. S. R. Kumar. Age classification of facial images using third order neighbourhood Local Binary Pattern. International Journal of Applied Engineering Research, ISSN 0973-4562, 10, 2015.
* Vishnu G. Murthy, V. Vijaya Kumar. Overwriting grammar model to represent 2D image patterns. ICGST-Graphics, Vision and Image Processing (ICGST-GVIP) 14, Dec 2014.
* Vishnu G. Murthy, V. Kumar, B. V. Reddy. Employing simple connected pattern array grammar for generation and recognition of connected patterns on an image neighborhood. ICGST-Graphics, Vision and Image Processing (ICGST-GVIP) 14, Aug 2014.
* R. M. Haralick, K. Shanmugam, I. Dinstein. Textural features for image classification. IEEE Trans. Syst., Man, Cybern. 3(6), 1973.
* J. M. Coggins, A. K. Jain. A spatial filtering approach to texture analysis. Pattern Recognition Letters 3, 1985.
* Wang Juntao, Su Xiaolong. An improved K-Means clustering algorithm. School of Computer Science and Technology, China University of Mining and Technology, 2011.