# I. INTRODUCTION

Microarrays, widely recognized as the next revolution in molecular biology, enable scientists to analyze genes, proteins and other biological molecules on a genomic scale [1]. A microarray is a collection of spots containing DNA deposited on the solid surface of a glass slide. Each spot contains multiple copies of a single DNA sequence [2].

Microarray expression technology helps in monitoring the expression of tens of thousands of genes in parallel. During the biological experiment, the mRNA of two biological tissues of interest is extracted and purified. Each mRNA sample is reverse transcribed into a complementary DNA (cDNA) copy and labeled with one of two different fluorescent dyes, resulting in two fluorescence-tagged cDNAs (red Cy5 and green Cy3). The tagged cDNA copies, called the sample probe, are hybridized with the slide's DNA spots. The hybridized glass slides are fluorescently scanned at different wavelengths (corresponding to the different dyes used), and two digital images are produced, one for each population of mRNA. Each digital image contains a number of spots of various fluorescence intensities. The intensity of each spot is proportional to the hybridization level of the cDNAs and the DNA spots; the gene expression information is obtained by analyzing the digital images [3].

Author affiliation: BVC Engineering College, Odalarevu.

The processing of microarray images usually consists of the following three steps: (i) gridding, which is the process of assigning the location of each spot in the image; (ii) segmentation, which is the process of grouping the pixels with similar features; and (iii) intensity extraction, which calculates the red and green foreground intensity pairs and the background intensities.
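As a minimal illustration of step (iii), the sketch below computes foreground and background intensities for the red and green channels of one spot region, given a segmentation mask. The function name `extract_intensities`, the use of the mean as the summary statistic, and the nested-list image representation are assumptions made for this example; the paper does not specify them.

```python
from statistics import mean


def extract_intensities(red, green, mask):
    """Return mean foreground/background intensities for both channels.

    red, green: 2-D lists of pixel intensities for one spot region.
    mask: 2-D list of booleans, True where the pixel belongs to the spot.
    Assumes the region contains at least one spot and one background pixel.
    """
    fg_r, fg_g, bg_r, bg_g = [], [], [], []
    for row_r, row_g, row_m in zip(red, green, mask):
        for r, g, is_spot in zip(row_r, row_g, row_m):
            if is_spot:
                fg_r.append(r)
                fg_g.append(g)
            else:
                bg_r.append(r)
                bg_g.append(g)
    return {
        "red_fg": mean(fg_r), "green_fg": mean(fg_g),
        "red_bg": mean(bg_r), "green_bg": mean(bg_g),
    }


# Hypothetical 2x2 region: the right column is the spot.
region = extract_intensities(
    red=[[10, 200], [12, 220]],
    green=[[8, 100], [6, 120]],
    mask=[[False, True], [False, True]],
)
```

The mask would normally come from the segmentation step (ii); here it is written by hand only to keep the example self-contained.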

Segmentation algorithms such as K-means and Fuzzy C-Means have been used for the segmentation of spots in microarray images. In this paper, we present a histogram clustering algorithm for the segmentation of spots in a microarray image. The proposed algorithm is based on the minimization of the mutual information loss, where the input variable represents the histogram bins and the output variable is given by the set of regions obtained from the split-and-merge algorithm. The rest of the paper is organized as follows.

Section II presents the K-means algorithm, Section III presents the Fuzzy C-Means algorithm, Section IV presents the histogram clustering algorithm for the segmentation of spots in a microarray image, Section V presents the experimental results, and Section VI reports the conclusion.


# II. K-MEANS CLUSTERING ALGORITHM

K-means is one of the basic clustering methods, introduced by Hartigan et al. in 1979 [3]. In recent years it has been applied to microarray image segmentation [21]. The K-means clustering algorithm implemented in this paper aims to group the pixels into two clusters. Given $x = \{x_1, x_2, \dots, x_N\}$ and $c = \{c_1, \dots, c_C\}$, representing the pixels of the microarray image and the clusters respectively, the objective is to minimize the sum of squares of the distances $d_{ij} = \|x_i - c_j\|$:

$$\arg\min \sum_{j=1}^{C} \sum_{i=1}^{N} d_{ij}^{2} \tag{1}$$

where each pixel $x_i$ contributes only the distance to the cluster it is assigned to.
First, the two cluster centers $c_1$ and $c_2$, the centroids of the spots and of the background, have to be initialized. Then, iteratively, each pixel is assigned to the closest cluster and the new centroid of each cluster is calculated as the mean of the pixels assigned to it. The K-means algorithm used to segment the microarray image is summarized in Algorithm KM.
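The two-cluster K-means procedure described above can be sketched as follows for a flat list of grayscale pixel intensities. This is an illustrative sketch, not the authors' implementation: the function name `kmeans_two_clusters`, the initialization of the centroids at the minimum and maximum intensity, and the stopping rule (centroids unchanged) are assumptions.

```python
def kmeans_two_clusters(pixels, max_iter=100):
    """Split 1-D pixel intensities into background (0) and foreground (1)."""
    # Assumed initialization: darkest pixel for background, brightest for spots.
    c_bg, c_fg = float(min(pixels)), float(max(pixels))
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Assign every pixel to the closest centroid.
        labels = [1 if abs(p - c_fg) < abs(p - c_bg) else 0 for p in pixels]
        fg = [p for p, lab in zip(pixels, labels) if lab == 1]
        bg = [p for p, lab in zip(pixels, labels) if lab == 0]
        # Recompute each centroid as the mean of its assigned pixels.
        new_fg = sum(fg) / len(fg) if fg else c_fg
        new_bg = sum(bg) / len(bg) if bg else c_bg
        if new_fg == c_fg and new_bg == c_bg:
            break  # centroids stopped moving, so the assignment is stable
        c_bg, c_fg = new_bg, new_fg
    return labels, (c_bg, c_fg)


labels, centroids = kmeans_two_clusters([10, 12, 11, 200, 210, 205, 9])
# labels    -> [0, 0, 0, 1, 1, 1, 0]
# centroids -> (10.5, 205.0)
```

Working on scalar intensities keeps the distance $d_{ij}$ a simple absolute difference; a color image would need a vector distance instead.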
# III. FUZZY C-MEANS CLUSTERING

In Fuzzy C-Means, each pixel has a degree of membership in each of the designated clusters, so the goal is to find the membership values of the pixels belonging to each cluster. The membership values of each pixel must satisfy the constraint

$$\sum_{j=1}^{c} u_{ij} = 1 \tag{2}$$

for all $i = 1, 2, \dots, N$, where $c$ is the number of clusters and $N$ is the number of pixels in the microarray image.

The algorithm is an iterative optimization that minimizes the cost function defined as follows:

$$F = \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{m} \, \|x_i - c_j\|^{2} \tag{3}$$

where $u_{ij}$ represents the membership of pixel $x_i$ in the $j$-th cluster and $m$ is the fuzziness parameter.

In each iteration, the centroid value is computed for each cluster $c_j$, and the membership values $u_{ij}$ of each pixel and the cluster centroids are then updated according to the given formula. The procedure is summarized in Algorithm Fuzzy C-Means.
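The Fuzzy C-Means iteration can be sketched as follows for a flat list of grayscale intensities. Because the update formula itself is not reproduced in the text, the sketch assumes the textbook FCM updates, $c_j = \sum_i u_{ij}^m x_i / \sum_i u_{ij}^m$ and $u_{ij} = 1 / \sum_k (d_{ij}/d_{ik})^{2/(m-1)}$; these may differ in detail from the formula the authors used. The function name, the seeded random initialization, and the stopping rule (largest membership change below `tol`) are likewise assumptions.

```python
import random


def fuzzy_c_means(pixels, c=2, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Cluster 1-D intensities; returns (memberships u[i][j], centroids)."""
    rng = random.Random(seed)
    n = len(pixels)
    # Initialize memberships at random, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centroids = [0.0] * c
    power = 2.0 / (m - 1.0)  # requires m > 1
    for _ in range(max_iter):
        # Centroid update: membership-weighted mean of the pixels.
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            if total > 0:
                centroids[j] = sum(w * x for w, x in zip(weights, pixels)) / total
        # Membership update from the distances to the new centroids.
        max_change = 0.0
        for i, x in enumerate(pixels):
            d = [abs(x - cj) for cj in centroids]
            if min(d) == 0.0:
                # Pixel coincides with a centroid: give it full membership.
                hit = d.index(0.0)
                new_row = [1.0 if j == hit else 0.0 for j in range(c)]
            else:
                new_row = [1.0 / sum((d[j] / d[k]) ** power for k in range(c))
                           for j in range(c)]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row
        if max_change < tol:
            break  # memberships have stabilized
    return u, centroids


u, centroids = fuzzy_c_means([10, 12, 11, 200, 210, 205])
# A hard segmentation follows by taking the cluster with the largest membership.
hard_labels = [row.index(max(row)) for row in u]
```

With `m = 2`, as in the paper's algorithm listing, the exponent `2 / (m - 1)` equals 2, so the memberships depend on ratios of squared distances.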


# IV. HISTOGRAM CLUSTERING ALGORITHM

We present a greedy histogram clustering algorithm that takes a partitioned image as input and obtains a clustering of the histogram bins by minimizing the loss of mutual information (MI). The mutual information between two random variables X and Y is defined by


$$I(X,Y) = H(X) - H(X|Y)$$

where

$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$

and

$$H(X|Y) = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) \tag{5}$$
That is, we group the bins of the histogram so that the mutual information is maximally preserved. From the perspective of the information bottleneck method, the binning process is controlled by a given partition of the image. The histogram clustering algorithm is presented in [9].
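To make the definition concrete, the following sketch evaluates $I(X,Y) = H(X) - H(X|Y)$ from a joint probability table. The function name `mutual_information`, the nested-list representation `joint[x][y] = p(x, y)`, and the choice of base-2 logarithms (bits) are assumptions for illustration; the text does not fix the logarithm base.

```python
from math import log2


def mutual_information(joint):
    """I(X,Y) in bits, computed as H(X) - H(X|Y) from joint[x][y] = p(x, y)."""
    p_x = [sum(row) for row in joint]          # marginal p(x)
    p_y = [sum(col) for col in zip(*joint)]    # marginal p(y)
    h_x = -sum(p * log2(p) for p in p_x if p > 0)
    # H(X|Y) = -sum_y p(y) sum_x p(x|y) log p(x|y), with p(x|y) = p(x,y)/p(y)
    h_x_given_y = -sum(
        p_xy * log2(p_xy / p_y[y])
        for row in joint
        for y, p_xy in enumerate(row)
        if p_xy > 0
    )
    return h_x - h_x_given_y


independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])  # no shared information
dependent = mutual_information([[0.5, 0.0], [0.0, 0.5]])        # X determines Y
```

For independent variables the result is 0, and for two perfectly coupled binary variables it is 1 bit, which brackets the values that occur during histogram clustering.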

Our clustering algorithm is based on the channel $G \to R$, which is defined by the conditional probability matrix $p(R|G)$ expressing how the pixels corresponding to each histogram bin are distributed into the regions of the image. Bayes' theorem, expressed by $p(g)p(r|g) = p(r)p(g|r)$, establishes the relationship between the conditional probabilities of the two channels $G \to R$ and $R \to G$. The basic idea underlying our histogram clustering algorithm is to capture the maximum information of the image with the minimum number of histogram bins. In general, if two bins are very similar, the channel can be simplified by substituting their cluster for the two bins without a significant loss of information. At each step, the algorithm merges the two bins whose merging causes the minimum loss of information. During the clustering process, $H(R) = H(R|G) + I(G,R)$, where $H(R)$ is the entropy of $p(R)$, and $H(R|G)$ and $I(G,R)$ represent, respectively, the successive values of the conditional entropy and the MI obtained after successive clusterings. Observe also that $H(R|G)$ is the average entropy of the bins and increases at each iteration; since $H(R)$ is fixed by the image partition, $I(G,R)$ decreases by the same amount.
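The greedy merging step described above can be sketched as follows. The input is a table of pixel counts per histogram bin and image region, from which $p(g, r)$ is estimated. This is a simplified sketch under stated assumptions, not the algorithm of [9]: the function names are hypothetical, only adjacent bins are considered for merging (so that each cluster stays a contiguous intensity range), and the process stops at a requested number of clusters.

```python
from math import log2


def mutual_info(joint):
    """I(G,R) in bits for joint[g][r] = p(g, r)."""
    p_g = [sum(row) for row in joint]
    p_r = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (p_g[g] * p_r[r]))
        for g, row in enumerate(joint)
        for r, p in enumerate(row)
        if p > 0
    )


def cluster_histogram(counts, n_clusters):
    """Greedily merge adjacent bins, each time losing the least MI.

    counts[g][r]: number of pixels of histogram bin g inside region r.
    Returns (groups of original bin indices, remaining I(G,R)).
    """
    total = sum(sum(row) for row in counts)
    joint = [[n / total for n in row] for row in counts]
    groups = [[g] for g in range(len(counts))]
    while len(joint) > n_clusters:
        current = mutual_info(joint)
        best_loss, best_k = None, None
        for k in range(len(joint) - 1):
            merged = [a + b for a, b in zip(joint[k], joint[k + 1])]
            candidate = joint[:k] + [merged] + joint[k + 2:]
            loss = current - mutual_info(candidate)  # MI lost by this merge
            if best_loss is None or loss < best_loss:
                best_loss, best_k = loss, k
        k = best_k
        joint[k:k + 2] = [[a + b for a, b in zip(joint[k], joint[k + 1])]]
        groups[k:k + 2] = [groups[k] + groups[k + 1]]
    return groups, mutual_info(joint)


# Hypothetical counts: bins 0-1 lie mostly in region 0, bins 2-3 in region 1.
groups, remaining_mi = cluster_histogram([[9, 1], [8, 2], [1, 9], [2, 8]], 2)
# groups -> [[0, 1], [2, 3]]
```

Recomputing the MI for every candidate merge is quadratic in the number of bins and is chosen here for clarity; an implementation meant for 256-bin histograms would more likely compute the loss of each merge incrementally.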


# V. EXPERIMENTAL RESULTS

The segmentation step of microarray image processing is performed on a sample microarray slide that has 48 blocks, each block consisting of 110 spots. A sample block has been chosen, and 108 spots of the block have been cropped for simplicity. The sample image is a 154 × 200 pixel image, that is, a total of 30,800 pixels. The RGB color microarray image has been converted to a grayscale image, which specifies for each pixel a single intensity value varying from the darkest (0) to the brightest (255), as shown in Figure 1.
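The grayscale conversion mentioned above can be sketched as follows. The paper does not state which conversion was used; the sketch assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114), and the function name `rgb_to_gray` and the nested-list image layout are hypothetical.

```python
def rgb_to_gray(image):
    """Convert rows of (r, g, b) tuples (0-255) to rows of gray levels (0-255)."""
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in image
    ]


# A synthetic 154 x 200 image with the same pixel count as the sample image.
sample = [[(255, 0, 0)] * 200 for _ in range(154)]
gray = rgb_to_gray(sample)
n_pixels = sum(len(row) for row in gray)  # 30800
```

Each output value is a single intensity between 0 (darkest) and 255 (brightest), which is the form the clustering sketches in the previous sections operate on once the rows are flattened into one list.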


# VI. CONCLUSION

The histogram clustering algorithm constitutes a valid tool for segmenting the spots of a microarray image. Even though the mathematical bases of these techniques are complex, their implementation is simple and quick, and the result is easy to use. The proposed segmentation algorithm has the advantage of processing spots of variable shapes and of being insensitive to variations. In order to process images of low intensity, background correction is necessary. The proposed algorithm provides a more efficient way of segmenting the microarray image than the segmentation achieved by K-means and Fuzzy C-Means.
Global Journals Inc. (US), Global Journal of Computer Science and Technology, Volume XI, Issue XIX, Version I, November 2011.

Algorithm KM(x,n,c)

Input: N = number of pixels to be clustered; $x = \{x_1, x_2, \dots, x_N\}$: pixels of the microarray image; c = 2: foreground and background clusters.

Output: cl: cluster of pixels.

Begin

Step_1: Cluster centroids are initialized.

Step_2: Compute the closest cluster for each pixel and classify it to that cluster.

Step_3: Compute new centroids after all the pixels are clustered.

Step_4: Repeat Steps 2-3 until the sum of squares given in Equation (1) is minimized.

End.

Algorithm Fuzzy C-Means(x,n,c,m)

Input: N = number of pixels to be clustered; $x = \{x_1, x_2, \dots, x_N\}$: pixels of the microarray image; c = 2: foreground and background clusters; m = 2: the fuzziness parameter.

Output: u: membership values of pixels.

Begin

Step_1: Initialize the membership matrix, where each $u_{ij}$ is a value in (0,1), and the fuzziness parameter m. The sum of all membership values of a pixel belonging to the clusters should satisfy the constraint expressed in Equation (2).

Step_2: Compute the centroid values for each cluster $c_j$.

Step_3: Compute the updated membership values $u_{ij}$ belonging to the clusters for each pixel and the cluster centroids according to the given formula.

Step_4: Repeat Steps 2-3 until the cost function is minimized.

End.

![Fig. 1: (a) RGB color microarray image; (b) grayscale image.](image-4.png "Fig1")

The segmented microarray image using three different segmentation algorithms (K-means, Fuzzy C-Means and the histogram clustering algorithm) is shown in Figure 2.

![Fig. 2: (a) K-means; (b) Fuzzy C-Means; (c) histogram clustering algorithm.](image-5.png "Fig2")

The histogram gives the distribution of intensity values for each cluster. The K-means algorithm calculated the mean of the spots as 25.32 and the mean of the
# REFERENCES

1. M. Schena, D. Shalon, Ronald W. Davis, and Patrick O. Brown, "Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray," Science, 270, 199.
2. Wei-Bang Chen, Chengcui Zhang, and Wen-Lin Liu, "An Automated Gridding and Segmentation Method for cDNA Microarray Image Analysis," 19th IEEE Symposium on Computer-Based Medical Systems.
3. Tsung-Han Tsai, Chein-Po Yang, Wei-Chi Tsai, and Pin-Hua Chen, "Error Reduction on Automatic Segmentation in Microarray Image," IEEE, 2007.
4. E. Erguit, Y. Yardimci, E. Mumcuoglu, and O. Konu, "Analysis of Microarray Images Using FCM and K-means Clustering Algorithm," Proc. IJCI, 2003.
5. Volkan Uslan and Ihsan Omur Bucak, "Clustering Based Spot Segmentation of cDNA Microarray Images," IEEE, 2010.
6. Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, Third Edition, Pearson Education.
7. T. Deng and H. Heijmans, "Grey-Scale Morphology Based on Fuzzy Logic," Journal of Mathematical Imaging and Vision, 16(2), Springer, 2002.
8. M. A. Wirth and D. Nikitenko, "Application of Fuzzy Morphology to Contrast Enhancement," IEEE, 2005.
9. J. Rigau, M. Feixas, and M. Sbert, "An Information Theoretic Framework for Image Segmentation," IEEE, 2004.