1. Introduction

The rapidly decreasing price of storage, processing and bandwidth has made digital media increasingly popular compared with conventional analog media. The number of internet users grows every day, and a large amount of information (text, images, video, audio, etc.) travels between people each day. People want the fastest and easiest ways of sharing information, but often do not know how to achieve this [1].

Although the growth in internet users is very fruitful for the spread of information, it also brings difficulties. The daily growth in users is a hurdle to the fast retrieval of information. In this paper a new technique is proposed for fast image retrieval. The proposed technique has three stages, each containing tasks to be performed by the system. The first stage concerns image acquisition. The second stage concerns extraction of the global features and the histograms of the query image and the database images. In the third and last stage, the histograms and global features of the images are compared.

Image acquisition refers to how a computer treats an image. A computer only understands binary data, so it first converts an image into a binary representation. Image acquisition is a combination of mathematical operations used to digitize the image in order to create an enhanced image that is more functional or pleasing to a human spectator, or to support analysis, segmentation, detection, and recognition tasks.

2. Previous Work

3. b) Image Features Extraction

In pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction.

4. i. Feature Extraction via Boundary Detection

This method uses the K-nearest neighbor technique to find the boundary. The binary matrix (image) is scanned until the boundary is obtained [2]. The method first finds the foreground pixels P and the set of connected foreground pixels. A feature vector, also called the Fourier descriptor, is then computed. The Fourier descriptor yields the Fourier coefficients, which are used to verify whether the boundary is fully covered. This is done by checking whether the coordinates of the first and last positions are equal.

5. ii. Feature Extraction based on Color

Retrieving images on the basis of color similarity is a very common technique. Many researchers have worked on this technique, but most of their methods are variations on the same basic idea.

Many color models exist, but the two most commonly used are RGB and HSV. The methods which extract features on the basis of color work with the following descriptors. Comparing all colors between two images is very complex and time consuming. The color histogram is the most helpful technique for reducing this cost.

In this method the color histogram of each image is computed and then stored in the database. Each color axis is divided into a number of "bins".

A bin is a range of values along a color axis, and the histogram records how many pixels fall into each bin. A three-dimensional RGB (8*8*8) histogram contains a total of 512 bins. At the time of image indexing, the color of each pixel is determined and the count of its corresponding bin is incremented by one [3].
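The indexing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the image is an (H, W, 3) array of 8-bit RGB values, and the function name `rgb_histogram` and the demo image are hypothetical.

```python
import numpy as np

def rgb_histogram(image, bins_per_channel=8):
    """Count pixels in a bins_per_channel**3 RGB histogram.

    Assumes `image` is an (H, W, 3) array of 8-bit RGB values.
    """
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3).astype(np.int64)
    # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
    idx = pixels * bins_per_channel // 256
    # Flatten the (r, g, b) bin triple into a single bin number.
    flat = (idx[:, 0] * bins_per_channel + idx[:, 1]) * bins_per_channel + idx[:, 2]
    # Increment the count of the corresponding bin for every pixel.
    hist = np.bincount(flat, minlength=bins_per_channel ** 3)
    return hist.reshape(bins_per_channel, bins_per_channel, bins_per_channel)

# Tiny synthetic 2x2 image: two red pixels, one blue, one near-black.
demo_image = np.array([[[255, 0, 0], [255, 0, 0]],
                       [[0, 0, 255], [10, 10, 10]]], dtype=np.uint8)
demo_hist = rgb_histogram(demo_image)
```

With 8 bins per axis the result has 8 * 8 * 8 = 512 bins, and the bin counts sum to the number of pixels in the image.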

In this method the user specifies the desired proportion of each color (red, green, blue), and the color histogram is calculated from these proportions. The system then searches the database for those images whose color histograms closely match the color histogram of the desired (query) image.

The number of bits in each pixel of an image determines the number of elements in a histogram. For example, if a pixel has n bits, then

6. Total elements of histogram = $2^n$

7. Pixel values = 0 to $2^n - 1$
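As a worked instance of the two relations above, assuming n = 8 bits per pixel (a value chosen here for illustration, not stated in the text):

```latex
\begin{align*}
n &= 8 \\
\text{Total elements of histogram} &= 2^{8} = 256 \\
\text{Pixel values} &= 0 \text{ to } 2^{8} - 1 = 255
\end{align*}
```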

The color histogram is mostly used for large data sets as follows.

8. h = Histogram

(1)

9. b. Color Coherent Vector

One drawback of the color histogram is that it does not take the spatial information of an image into account. The color coherent vector resolves this problem by partitioning the pixels of the color histogram into two types: coherent and incoherent. Pixel values belonging to a small color region fall into the incoherent type, and pixel values belonging to a large color region fall into the coherent type [4]. This classification is done for each color in the image.
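The coherent/incoherent split can be sketched as below. Several details are assumptions made for illustration, since the text does not fix them: the image is already quantized to color indices, regions are 4-connected, and a region counts as "large" when it has at least `tau` pixels. The names `color_coherence_vector` and `tau` are hypothetical.

```python
import numpy as np
from collections import deque

def color_coherence_vector(labels, n_colors, tau):
    """Return (coherent, incoherent) pixel counts per quantized color.

    `labels` is a 2-D array of color indices in 0..n_colors-1.
    A connected region with at least `tau` pixels is treated as coherent.
    """
    labels = np.asarray(labels)
    h, w = labels.shape
    seen = np.zeros((h, w), dtype=bool)
    coherent = np.zeros(n_colors, dtype=np.int64)
    incoherent = np.zeros(n_colors, dtype=np.int64)
    for sy in range(h):
        for sx in range(w):
            if seen[sy, sx]:
                continue
            color = labels[sy, sx]
            # Flood-fill the 4-connected region that shares this color.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny, nx] and labels[ny, nx] == color):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            # Large regions are coherent, small ones incoherent.
            if size >= tau:
                coherent[color] += size
            else:
                incoherent[color] += size
    return coherent, incoherent

demo_labels = np.array([[0, 0, 1],
                        [0, 0, 1],
                        [1, 0, 0]])
ccv_coherent, ccv_incoherent = color_coherence_vector(demo_labels, n_colors=2, tau=3)
```

In the demo, color 0 forms one region of six pixels (coherent), while color 1 forms two small regions of two and one pixels (both incoherent).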

10. c. Color Moments

The term color moments refers to statistical measures of an image's color distribution, namely the mean, the standard deviation and the variance. These measures are widely used in image processing and are mostly computed on the matrix form of the image.
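A minimal sketch of these measures, computed per color channel on the matrix form of an image. It assumes an (H, W, 3) array and uses the population variance; the function name `color_moments` is hypothetical.

```python
import numpy as np

def color_moments(image):
    """Return per-channel mean, variance and standard deviation.

    Assumes `image` is an (H, W, 3) array; variance is the population variance.
    """
    pixels = np.asarray(image, dtype=np.float64).reshape(-1, 3)
    mean = pixels.mean(axis=0)
    variance = pixels.var(axis=0)
    std = np.sqrt(variance)
    return mean, variance, std

# Two-pixel demo image with channels (R, G, B).
demo_pixels = np.array([[[0, 10, 20], [2, 10, 40]]], dtype=np.uint8)
cm_mean, cm_var, cm_std = color_moments(demo_pixels)
```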

11. iii. Feature Extraction via PCA Algorithm

In this method the rows of each image are concatenated into one long thin vector, reducing the two-dimensional (2D) data to a one-dimensional format. From the test data the common part of the images (their mean) is calculated, and the unique part of each image is obtained by subtracting this common part from the original image. The method then computes the covariance matrix. In this system the feature vectors are also called Eigenfaces [6].
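The steps above can be sketched with NumPy as follows. This is an illustrative outline under stated assumptions, not the cited implementation: images are grayscale and equally sized, the "common part" is taken to be the mean image, and the name `eigen_features` and the random demo data are hypothetical.

```python
import numpy as np

def eigen_features(images, n_components):
    """Project flattened images onto the leading covariance eigenvectors.

    Assumes `images` is an (N, H, W) stack of equally sized grayscale images.
    """
    data = np.asarray(images, dtype=np.float64)
    flat = data.reshape(data.shape[0], -1)    # each image becomes one long vector
    mean_image = flat.mean(axis=0)            # the common part of all images
    centered = flat - mean_image              # the unique part of each image
    cov = np.cov(centered, rowvar=False)      # covariance matrix of the pixels
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]            # leading eigenvectors as columns
    features = centered @ components          # one feature vector per image
    return mean_image, components, features

rng = np.random.default_rng(0)
demo_stack = rng.random((5, 4, 4))            # synthetic data for illustration
pca_mean, pca_components, pca_features = eigen_features(demo_stack, n_components=2)
```

For realistic image sizes the pixel covariance matrix becomes very large, so this direct form is only practical for small images such as the demo.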

12. iv. Feature Extraction via Slope Magnitude Method

The slope magnitude technique is also widely used for shape feature extraction. It works on the connections between edges, because connected edges are very important for representing the boundary of an object's shape. A gradient operator is used to find the connected boundaries of the object. The slope magnitude method is then applied to the gradients of the image in both the horizontal and vertical directions to obtain the feature vector. One drawback of this method is that the query image and all other images must have equal dimensions [7].
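A minimal sketch of the slope magnitude computation. The text does not name the gradient operator, so the Sobel operator is assumed here, with edge-replicating padding; the function name `slope_magnitude` and the demo image are hypothetical.

```python
import numpy as np

def slope_magnitude(gray):
    """Combine horizontal and vertical Sobel gradients into one magnitude map.

    Assumes `gray` is a 2-D grayscale array; borders are padded by replication.
    """
    g = np.asarray(gray, dtype=np.float64)
    p = np.pad(g, 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    return np.sqrt(gx ** 2 + gy ** 2)

# A 4x4 image with a vertical step edge between columns 1 and 2.
demo_step = np.tile(np.array([0.0, 0.0, 10.0, 10.0]), (4, 1))
demo_slope = slope_magnitude(demo_step)
```

The output has the same shape as the input, which is one reason the query image and the database images need equal dimensions before their magnitude maps can be compared element by element.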

13. v. Feature Extraction using Transforms

Many transforms are widely used for feature detection. The most commonly used transforms are given below.

14. a. Feature Extraction using Fast Fourier Transforms

The Fast Fourier Transform (FFT) works in the frequency domain. The method uses the mean values of the real and imaginary parts of the complex FFT coefficients, expressed in polar coordinates, to build the feature vector. To generate the feature vector, the real and imaginary parts of the red, green and blue planes are all considered. Each complex number is represented as a point in the complex plane, and the components of the feature vector are generated from this plane [8].
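One way to turn this description into code is sketched below. The grouping of coefficients into angular sectors of the complex plane is an assumption made here, not something the text specifies: a plain mean over all coefficients of a plane would reduce to the value of a single pixel, so some partition of the plane is needed for the means to be informative. The names `fft_sector_features` and `n_sectors` are hypothetical.

```python
import numpy as np

def fft_sector_features(image, n_sectors=4):
    """Mean real and imaginary FFT parts per angular sector, per color plane.

    Assumes `image` is an (H, W, 3) RGB array. The sector partition is an
    illustrative assumption.
    """
    img = np.asarray(image, dtype=np.float64)
    features = []
    for channel in range(3):
        coeffs = np.fft.fft2(img[:, :, channel]).ravel()
        # Polar angle of each coefficient, mapped to the range [0, 2*pi).
        angles = np.mod(np.angle(coeffs), 2 * np.pi)
        sector = np.minimum((angles / (2 * np.pi) * n_sectors).astype(int),
                            n_sectors - 1)
        for s in range(n_sectors):
            members = coeffs[sector == s]
            if members.size:
                features.extend([members.real.mean(), members.imag.mean()])
            else:
                features.extend([0.0, 0.0])   # empty sector contributes zeros
    return np.array(features)

rng = np.random.default_rng(1)
demo_rgb = rng.integers(0, 256, size=(8, 8, 3))   # synthetic data for illustration
fft_features = fft_sector_features(demo_rgb)
```

With three color planes, four sectors and two means per sector, the feature vector has 24 components.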

15. b. Feature Extraction using Discrete Cosine Transforms

This method combines consecutive odd and even coefficients of each column to form the feature vector, placing the odd coefficients on the y-axis and the even coefficients on the x-axis.

16. c) Similarity Measurement

Similarity measurement represents the degree of similarity between two images. This part of CBIR depends upon the previous part (feature extraction): if the features of the images are extracted accurately, the similarity measurements will also be accurate.

There are many methods for measuring the similarity between two images. Some of the most widely used methods are discussed below:

$\sqrt{(x - v)^2} = |x - v|$ (2)

The Euclidean distance method is reportedly faster than the other distance measurement methods. One drawback of this method is that it does not work well on very noisy signals.
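A minimal sketch of the Euclidean distance between two feature vectors. In one dimension it reduces to the absolute difference of equation (2). The function name `euclidean_distance` is hypothetical.

```python
import numpy as np

def euclidean_distance(x, v):
    """Euclidean distance between two equally sized feature vectors."""
    x = np.asarray(x, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    return float(np.sqrt(np.sum((x - v) ** 2)))

one_dim = euclidean_distance([3], [7])          # |3 - 7|
two_dim = euclidean_distance([0, 0], [3, 4])    # 3-4-5 triangle
```

A smaller value indicates a closer match between the two images whose features are compared.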

17. Proposed Work a) First part

In the first part of the proposed method, a technique is chosen which automatically performs image querying and retrieval, and which also selects the database (location) for the images and histograms that are compared with the query image and its histogram. Once a database (location) is decided, images are collected to fill it. For this step, different websites were used to collect different types of images and build a large collection. The images were then reduced to 16 * 16 pixels to lower the processing time of the proposed method.

18. b) Image Quantization

This part of the proposed method concerns the quantization of the color distribution into a histogram, which reduces the time and complexity of comparing all colors between two images.

In this part the proposed method divides the color axes into "bins" (ranges of color values). The total number of bins and their width are very important for achieving the right output. The total number of bins depends upon the size of the three-dimensional RGB model. For example, a three-dimensional 8 * 8 * 8 histogram contains 512 bins. The color of each pixel is then determined and the count of its corresponding bin is incremented by one.

In this part the histograms of all database images and of the query image were computed and saved.

19. c) Global Features Extraction

After taking the histograms of all images, the proposed method finds the global features of all images, including the query image, in this part.

Global features are also known as texture features: cross correlation and color properties (color skewness, color variance, color expectancy).

Table 2 shows the global features of the query image and the database images.

20. d) Comparison Between Global Features Of Query Image And Database Images

In the previous part the global features of all images were extracted and stored in a table. In this part the proposed method finds the matched images by comparing the global features of the database images with those of the query image. On its own this comparison does not give an accurate result; it is one step toward accuracy.
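The comparison can be illustrated with a few rows taken from Table 2. The text does not state how the feature values are compared, so the sum of absolute differences used here is an assumption, and the function name `feature_difference` is hypothetical.

```python
# Feature order: (color variance, color expectancy, color skewness, color correlation).
# Values are copied from the Query Image, Image 1, Image 7 and Image 29 rows of Table 2.
query_features = (23, 45, 88, 25)
database_features = {
    "Image 1": (33, 49, 84, 20),
    "Image 7": (21, 45, 90, 24),
    "Image 29": (23, 45, 95, 20),
}

def feature_difference(a, b):
    """Sum of absolute differences between two feature tuples (assumed measure)."""
    return sum(abs(p - q) for p, q in zip(a, b))

differences = {name: feature_difference(query_features, feats)
               for name, feats in database_features.items()}
# Smallest difference first: the closest match to the query image.
ranking = sorted(differences, key=differences.get)
```

Under this assumed measure, Image 7 is the closest of the three sampled images to the query image and Image 1 the farthest.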

Table 2 shows this difference on the basis of global features. To find the match percentile, two things are needed: the total number of images and the rank of the returned image.

$\text{Match Percentile} = \frac{\text{Total number of images} - \text{Rank of returned image}}{\text{Total number of images} - 1}$ (3)

The percentile is applied to each image of the database. Table 3 shows the percentile results of each image.

The proposed method considers only those images whose percentile is greater than or equal to 75%.
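Equation (3) and the 75% cut-off can be sketched as follows. The sketch assumes that ranks start at 1 for the best match and that the percentile is expressed as a percentage; the function names `match_percentile` and `select_matches` are hypothetical.

```python
def match_percentile(total_images, rank):
    """Match percentile of equation (3), scaled to a percentage.

    Assumes rank 1 is the best match and rank == total_images the worst.
    """
    if total_images < 2:
        raise ValueError("at least two images are needed")
    return 100.0 * (total_images - rank) / (total_images - 1)

def select_matches(ranked_names, threshold=75.0):
    """Keep the images whose match percentile reaches the threshold."""
    total = len(ranked_names)
    return [name for rank, name in enumerate(ranked_names, start=1)
            if match_percentile(total, rank) >= threshold]

# Hypothetical ranking of 30 database images, best match first.
demo_ranking = [f"Image {i}" for i in range(1, 31)]
selected = select_matches(demo_ranking)
```

With 30 images, the best-ranked image scores 100% and the worst 0%, and the 75% threshold keeps the eight best-ranked images.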

21. Global Journal of Computer Science and Technology

Volume XV Issue I Version I Year ( )

22. Conclusion

The color histogram is a very helpful technique which works on the basis of the color distribution of an image. One drawback of the color histogram is that it considers only color and not the locations of objects. Therefore, it does not give good results for images which have the same color distribution but a different appearance (object locations).

Hence, to enhance efficiency, the proposed method combines other measurements of the image (global features) with the color histogram. Using this idea, a higher success rate is obtained.

The drawback of the color histogram is minimized by using the global features of images.

Today many companies work on CBIR to achieve fast and accurate results. Each CBIR system consists of three parts: a) image acquisition, b) image feature extraction, and c) similarity matching.

a) Image Acquisition


i. The Euclidean distance method is used to find distance measures, which indicate similarity. A low distance value represents a close (good) similarity relation, whereas a high distance value represents a distant (bad) similarity relation between two images. The metric used is called the Euclidean metric. In one dimension, the Euclidean distance is given by the following formula [9].

ii. Neural networks use the concept of classifiers for similarity measurement. Classifiers use a set of statistical data to find the closest match. Neural networks work like a brain: the brain is a combination of neurons that memorize the different activities of our life. In a neural network these neurons are called nodes, and for pattern matching the sequence of traversal through the nodes is very important. A neural network has three kinds of nodes that perform different functions (input nodes, hidden nodes and output nodes) [10], as shown in Figure 1.

Figure 1: Neural Network System

iii. Mahalanobis distance is a statistical distance metric. This method analyzes patterns on the basis of the correlation between variables. It works on a known and an unknown sample set to find similarities: the known set is the image database and the unknown set is the query image. The method works with observations of more than one variable and the strength of their respective relationships [10] [11].

An 8 * 8 * 8 histogram contains 512 bins, a 16 * 16 * 16 histogram contains 4,096 bins and a 256 * 256 * 256 histogram contains 16,777,216 bins. Figure 3 shows a three-dimensional 8 * 8 * 8 RGB histogram.

Figure 3: Three Dimensional RGB Histogram (8*8*8)

Table 1: Histograms of all Query Images and all Database Images (columns: Images, Image Histogram in R space, Mode)

Figure 4: Block Diagram of Proposed CBIR System

Table 2: Global features of the query image and database images

Images	Color Variance	Color Expectancy	Color Skewness	Color Correlation
Query Image	23	45	88	25
Image 1	33	49	84	20
Image 2	45	45	82	26
Image 3	23	69	78	35
Image 4	23	65	88	34
Image 5	25	56	81	21
Image 6	27	65	83	23
Image 7	21	45	90	24
Image 8	18	45	92	26
Image 9	34	45	84	20
Image 10	76	34	82	26
Image 11	23	21	78	35
Image 12	34	34	88	34
Image 13	45	47	81	21
Image 14	67	78	83	23
Image 15	12	45	90	24
Image 16	23	46	92	26
Image 17	56	46	84	20
Image 18	56	45	82	26
Image 19	23	67	78	35
Image 20	23	56	89	34
Image 21	78	65	81	21
Image 22	34	45	83	23
Image 23	31	45	97	24
Image 24	33	45	76	26
Image 25	21	34	89	21
Image 26	22	21	85	23
Image 27	34	21	67	24
Image 28	22	34	89	26
Image 29	23	45	95	20
Image 30	23	45	71	26

e) Percentile Matching

After comparing the global features of the images, the proposed method applies the match percentile to improve the accuracy of the output. Through the match percentile, the proposed method selects those images which are nearest to the query image.

Table 3:

Images	Color Variance (CV)	Color Expectancy (CE)	Color Skewness (CS)	Color Correlation (CC)	Total Percentile (TP)
Query Image	23	45	88	25	20
Image 1	33	49	84	20	22
Image 2	45	45	82	26	28
Image 3	23	69	78	35	30
Image 4	23	65	88	34	32
Image 5	25	56	81	21	34
Image 6	27	65	83	23	45
Image 7	21	45	90	24	78
Image 8	18	45	92	26	44
Image 9	34	45	84	20	67
Image 10	76	34	82	26	76
Image 11	23	21	78	35	66
Image 12	34	34	88	34	67
Image 13	45	47	81	21	22
Image 14	67	78	83	23	34
Image 15	12	45	90	24	54
Image 16	23	46	92	26	33
Image 17	56	46	84	20	35
Image 18	56	45	82	26	56
Image 19	23	67	78	35	25
Image 20	23	56	89	34	45
Image 21	78	65	81	21	23
Image 22	34	45	83	23	33
Image 23	31	45	97	24	45
Image 24	33	45	76	26	41
Image 25	21	34	89	21	46
Image 26	22	21	85	23	22
Image 27	34	21	67	24	23
Image 28	22	34	89	26	34
Image 29	23	45	95	20	44
Image 30	23	45	71	26	29

f) Comparison between Query image histogram and matched images histograms

Table 4:

Images matched with Query image	Difference according to Histogram
Image	88
Image	76
Image	78
Image	80
Image	84

Table 5:

Images matched with Query image	Mean difference
Image	35
Image	12
Image	34
Image	23
Image	14

Content base Image Retrival using Color Histogram and Global Features
