Content Based Image Retrieval using Color Histogram and Global Features

1. Introduction

The rapidly decreasing cost of storage, processing, and bandwidth has made digital media increasingly popular compared with conventional analog media. The number of internet users grows rapidly every day, and a large amount of information (text, images, video, audio, etc.) travels between people. People want the fastest and easiest ways to share information, but they do not always know how to achieve this [1].

Although the growth in internet users is very fruitful for the spread of information, it also brings difficulties: the day-by-day growth in users is a hurdle to fast retrieval of information. In this paper, a new technique is proposed for fast image retrieval. The proposed technique has three stages, each containing tasks to be performed by the system. The first stage is image acquisition. The second stage extracts the global features and the histogram of the query image and of each database image. The third and last stage compares the histograms and global features of the images.

Image acquisition refers to how a computer treats an image. A computer only understands binary data, so it first converts an image into a binary (digital) representation. Image acquisition combines mathematical operations that digitize the image in order to create an enhanced image that is more functional or pleasing to a human spectator, or to support analysis, segmentation, detection, and recognition tasks.

2. Previous Work

3. b) Image Features Extraction

In pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction.

4. i. Feature Extraction via Boundary Detection

This method uses the K-nearest neighbor technique to find the boundary. The binary matrix (image) is scanned until the boundary is obtained [2]. The method first finds the foreground pixels P and the set of connected foreground pixels. It then computes the feature vector, which is also called the Fourier descriptor. The Fourier descriptor yields the Fourier coefficients, which are used to verify whether the boundary is fully covered, by checking whether the first and last position coordinates are equal.

5. ii. Feature Extraction based on Color

Retrieving images on the basis of color similarity is a very common technique. Many researchers have worked on this technique, but most of their methods are variations on the same basic idea.

Many color models exist, but the two most commonly used are RGB and HSV. The methods that extract features on the basis of color work with the following descriptors. Comparing all colors between two images directly is very complex and time consuming; the color histogram is the most helpful technique for reducing this cost.

In this method, the color histogram of each image is computed and stored in the database. Each color axis is divided into a number of "bins".

Each bin covers a range of values along its axis, as in a bar plot. A three-dimensional RGB (8*8*8) histogram contains a total of 512 bins. At indexing time, the color of each pixel is determined and the count of its corresponding bin is incremented by one [3].
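The indexing step described above can be sketched in a few lines. This is only an illustration, assuming an 8-bit RGB image stored as a NumPy array of shape height x width x 3; the function name `rgb_histogram` and the random test image are hypothetical and not part of the original method.

```python
import numpy as np

def rgb_histogram(image, bins_per_axis=8):
    """Count pixels in a bins_per_axis**3 RGB histogram.

    Assumption: `image` is an 8-bit RGB array of shape (H, W, 3).
    """
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3)
    # Map each 0..255 channel value to a bin index 0..bins_per_axis-1.
    idx = (pixels.astype(np.int64) * bins_per_axis) // 256
    # Combine the three per-channel indices into one flat bin number.
    flat = (idx[:, 0] * bins_per_axis + idx[:, 1]) * bins_per_axis + idx[:, 2]
    # Increment the count of the corresponding bin once per pixel.
    return np.bincount(flat, minlength=bins_per_axis ** 3)

# Hypothetical 16 x 16 image filled with random colors.
rng = np.random.default_rng(0)
demo_image = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
demo_hist = rgb_histogram(demo_image)
```

With the default of 8 bins per axis the result has 512 entries, and the counts always sum to the number of pixels in the image.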

In this method, the user specifies the desired proportion of each color (red, green, blue), and the color histogram is calculated from these proportions. The system then searches the database for the images whose color histograms most closely match the histogram of the desired or query image.
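A minimal sketch of this matching step follows. The text does not state which difference measure is used, so the sum of absolute differences between normalized histograms is an assumption here, and the names `histogram_difference` and `rank_by_histogram` are hypothetical.

```python
import numpy as np

def histogram_difference(h1, h2):
    """Sum of absolute differences between two normalized histograms.

    Assumption: this L1 measure stands in for the unspecified comparison;
    both histograms must contain at least one counted pixel.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    # Normalize so that images with different pixel counts are comparable.
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    return float(np.abs(h1 - h2).sum())

def rank_by_histogram(query_hist, database_hists):
    """Return database image names ordered from closest to farthest match."""
    scores = {name: histogram_difference(query_hist, hist)
              for name, hist in database_hists.items()}
    return sorted(scores, key=scores.get)
```

Under this measure, identical color distributions give a difference of 0 and completely disjoint distributions give 2.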

The number of bits in each pixel of an image determines the number of elements in its histogram. For example, if each pixel has $n$ bits, then

Total elements of histogram = $2^n$

Pixel values = $0$ to $2^n - 1$

The color histogram is mostly used for large data sets as follows.

8. h = Histogram

(1)

9. b. Color Coherent Vector

One drawback of the color histogram is that it does not consider the spatial information of an image. The color coherent vector addresses this problem by partitioning each histogram bin into two types, coherent and incoherent: pixels belonging to a small color region fall into the incoherent type, and pixels belonging to a large color region fall into the coherent type [4]. This classification is done for each color in the image.
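The coherent/incoherent split can be illustrated with a small sketch. It assumes the image has already been quantized to integer color indices, uses 4-connected regions, and takes a size threshold `tau` as a parameter; all three choices, and the function name, are assumptions made for illustration rather than details given in the text.

```python
from collections import deque
import numpy as np

def color_coherence_vector(labels, num_colors, tau):
    """Return (coherent, incoherent) pixel counts per quantized color.

    Assumptions: `labels` is a 2-D array of color indices in
    0..num_colors-1, regions are 4-connected, and a region with at
    least `tau` pixels counts as coherent.
    """
    labels = np.asarray(labels)
    height, width = labels.shape
    seen = np.zeros((height, width), dtype=bool)
    coherent = np.zeros(num_colors, dtype=int)
    incoherent = np.zeros(num_colors, dtype=int)
    for r in range(height):
        for c in range(width):
            if seen[r, c]:
                continue
            color = labels[r, c]
            # Breadth-first search over the connected region of this color.
            queue = deque([(r, c)])
            seen[r, c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny, nx] and labels[ny, nx] == color):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            # Large regions are coherent, small regions are incoherent.
            if size >= tau:
                coherent[color] += size
            else:
                incoherent[color] += size
    return coherent, incoherent
```

For each color, the coherent and incoherent counts add up to the ordinary histogram count, so the vector refines the histogram rather than replacing it.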

10. c. Color Moments

The term color moments refers to statistics such as the mean, standard deviation, and variance of the pixel values of an image. These measures are widely used in image processing and are usually computed on the matrix form of the image.

11. iii. Feature Extraction via PCA Algorithm

In this method, the rows of each image are concatenated into one long, thin vector, reducing the 2D (two-dimensional) data to a one-dimensional format. In the test data, the part common to all images (that is, the mean image) is calculated, and the unique part of each image is obtained by subtracting this common part from the original image. The method then computes the covariance matrix. In this approach, the feature vectors are also called eigenfaces [6].
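The steps in this paragraph can be sketched as follows, assuming small grayscale images of equal size and at least two images. The function name `pca_features`, the use of `numpy.linalg.eigh`, and the random demo images are illustrative choices, not details from the cited work.

```python
import numpy as np

def pca_features(images, num_components):
    """Project flattened images onto the leading principal components.

    Assumptions: all images have the same shape and there are at least
    two of them; the images are small enough for a dense covariance matrix.
    """
    # Concatenate the rows of each image into one long vector.
    data = np.stack([np.asarray(img, dtype=float).ravel() for img in images])
    # The common part of all images is their mean.
    mean = data.mean(axis=0)
    # The unique part of each image is what remains after subtracting it.
    centered = data - mean
    # Covariance matrix of the pixel positions (d x d).
    cov = np.cov(centered, rowvar=False)
    # Eigenvectors of the symmetric covariance matrix, largest variance first.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:num_components]
    basis = eigvecs[:, order]
    return centered @ basis, mean, basis

# Hypothetical data: six random 4 x 4 grayscale images.
rng = np.random.default_rng(0)
demo_images = [rng.random((4, 4)) for _ in range(6)]
demo_features, demo_mean, demo_basis = pca_features(demo_images, 3)
```

A query image would be projected with the same `mean` and `basis` before being compared with the stored feature vectors.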

12. iv. Feature Extraction via Slope Magnitude Method

The slope magnitude technique is also widely used for shape feature extraction. This method works on the connectivity between edges, because connected edges are very important for representing the boundary of an object's shape. A gradient operator is used to find the connected boundaries of the object. The slope magnitude method is then applied to the image gradients in both the horizontal and vertical directions to obtain the feature vector. One drawback of this method is that the dimensions of the query image and of all other images must be equal [7].
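As a rough illustration, the slope magnitude at each pixel can be taken as the square root of the sum of the squared horizontal and vertical gradients. The sketch below assumes a grayscale image and uses the finite differences of `numpy.gradient` as a stand-in for whichever gradient operator is actually applied; the function name is hypothetical.

```python
import numpy as np

def slope_magnitude(gray):
    """Per-pixel slope magnitude sqrt(Gx^2 + Gy^2) of a grayscale image.

    Assumption: numpy.gradient (finite differences) replaces the
    gradient operator of the original method.
    """
    gray = np.asarray(gray, dtype=float)
    # np.gradient returns the derivative along rows first, then columns.
    grad_rows, grad_cols = np.gradient(gray)
    return np.sqrt(grad_rows ** 2 + grad_cols ** 2)

# Hypothetical image whose intensity rises by one per column.
ramp = np.tile(np.arange(5, dtype=float), (4, 1))
ramp_magnitude = slope_magnitude(ramp)
```

Because the output has the same shape as the input, feature vectors built from it are only comparable between images of equal dimensions, which matches the drawback noted above.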

13. v. Feature Extraction using Transforms

Many transforms are widely used for feature detection. The most commonly used transforms are described below.

14. a. Feature Extraction using Fast Fourier Transforms

The Fast Fourier Transform (FFT) works in the frequency domain. The method uses the mean values of the real and imaginary parts of the complex FFT coefficients, expressed in polar coordinates, to build the feature vector. To generate the feature vector, it considers the real and imaginary parts for the red, green, and blue planes. In the complex plane, each complex number is represented as a point, which helps to generate the components of the feature vector [8].

15. b. Feature Extraction using Discrete Cosine Transforms

This method combines consecutive odd and even coefficients of each column to build the feature vector, placing the odd coefficients on the y-axis and the even coefficients on the x-axis.

16. c) Similarity Measurement

Similarity measurement quantifies the degree of similarity between two images. This part of CBIR depends on the previous part (feature extraction): the more accurately the image features are extracted, the more reliable the output of the similarity measurement will be.

Many methods exist for measuring the similarity between two images. Some of the most widely used methods are discussed below:

$$\sqrt{(x - v)^2} = |x - v| \qquad (2)$$

The Euclidean distance method is reportedly faster than other distance measurement methods. One drawback of this method is that it does not work well on very noisy signals.
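For feature vectors with more than one component, the same distance is the square root of the sum of squared differences. The sketch below is illustrative only: the function name is hypothetical, and applying it to the four global features of Table 2 is an assumption about how the features could be compared.

```python
import numpy as np

def euclidean_distance(x, v):
    """Euclidean distance between two feature vectors of equal length."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.sqrt(np.sum((x - v) ** 2)))

# Feature rows taken from Table 2: variance, expectancy, skewness, correlation.
query_features = [23, 45, 88, 25]
image_7_features = [21, 45, 90, 24]
image_1_features = [33, 49, 84, 20]

distance_to_image_7 = euclidean_distance(query_features, image_7_features)
distance_to_image_1 = euclidean_distance(query_features, image_1_features)
```

On these rows, Image 7 lies at distance 3 from the query while Image 1 lies at about 12.5, so Image 7 would be the closer match under this measure.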

17. Proposed Work a) First part

In the first part of the proposed method, a technique is chosen that performs image querying and retrieval automatically and selects the database (location) for the images and histograms that are compared with the query image and its histogram. Once the database location is decided, images are collected to fill the database. For this step, different websites were used to collect different types of images and build a large collection. The images were then reduced to 16 * 16 pixels to lower the processing time of the proposed method.

18. b) Image Quantization

This part of the proposed method quantizes the color distribution into a histogram, to reduce the time and complexity of comparing all colors between two images.

In this part, the proposed method divides the color axes into "bins" (ranges of color values). The total number of bins and their width are very important for obtaining the right output. The total number of bins depends on the size of the three-dimensional RGB model; for example, a three-dimensional 8 * 8 * 8 histogram contains 512 bins. The color of each pixel is then determined, and the count of its corresponding bin is incremented by one.

In this part, the histograms of all database images and of the query image are computed and saved.

19. c) Global Features Extraction

After computing the histograms of all images, the proposed method extracts the global features of all images, including the query image.

Global features are also known as texture features: cross correlation and color properties (color skewness, color variance, color expectancy).
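The three color properties can be sketched as standard statistical moments of a color channel. The text does not give their formulas, so the definitions below (expectancy as the mean, population variance, and skewness as the third central moment divided by the cubed standard deviation) are assumptions, as is the function name.

```python
import numpy as np

def color_moments(channel):
    """Return (expectancy, variance, skewness) of one color channel.

    Assumption: standard moment definitions are used because the
    original text does not specify them.
    """
    values = np.asarray(channel, dtype=float).ravel()
    expectancy = float(values.mean())
    variance = float(values.var())
    std = variance ** 0.5
    if std == 0:
        # A constant channel has no asymmetry.
        skewness = 0.0
    else:
        skewness = float(np.mean((values - expectancy) ** 3) / std ** 3)
    return expectancy, variance, skewness
```

Calling the function once per RGB plane gives nine numbers per image, which could then be stored next to the histogram.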

Table 2 shows the global features of the query image and the database images.

20. d) Comparison Between Global Features Of Query Image And Database Images

In the previous part, the global features of all images were extracted and stored in a table. In this part, the proposed method finds the matching images by comparing the global features of the database images with those of the query image. This comparison alone does not give an accurate result; it is only one step toward accuracy.

Table 2 shows this difference on the basis of global features. Two things are needed to compute the match percentile: the total number of images and the rank of the returned image.

$$\text{Match Percentile} = \frac{\text{Total number of images} - \text{Rank of returned image}}{\text{Total number of images} - 1} \qquad (3)$$

The percentile is computed for each image of the database. The table shows the percentile results of each image.

The proposed method considers only those images whose percentile is greater than or equal to 75%.
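Equation (3) and the 75% cut-off can be sketched as below. Scaling the result by 100 to obtain a percentage and treating rank 1 as the best match are assumptions, and the function names are hypothetical.

```python
def match_percentile(rank, total_images):
    """Match percentile of Equation (3), expressed as a percentage.

    Assumptions: rank 1 is the best match and total_images is at least 2.
    """
    return 100.0 * (total_images - rank) / (total_images - 1)

def select_matches(ranks, total_images, threshold=75.0):
    """Keep the images whose match percentile reaches the threshold."""
    return [name for name, rank in ranks.items()
            if match_percentile(rank, total_images) >= threshold]
```

With 30 database images, an image ranked 8th scores about 75.9% and is kept, while an image ranked 9th scores about 72.4% and is dropped.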

21. Global Journal of Computer Science and Technology

Volume XV Issue I Version I Year ( )

22. Conclusion

The color histogram is a very helpful technique that works on the basis of the color distribution of an image. One drawback is that it considers only color and not the locations of objects. Therefore, it does not give good results for images that have the same color distribution but a different appearance (object locations).

To improve accuracy, the proposed method combines the color histogram with other measurements of the image (global features). With this approach, a higher success rate is obtained.

The drawback of the color histogram is minimized by using the global features of images.

Today many companies work on CBIR to achieve fast and accurate results. Each CBIR system consists of three parts: a) image acquisition, b) image feature extraction, and c) similarity matching.

a) Image Acquisition
i. Euclidean Distance: this method is used to compute distance measures, which indicate similarity. A low distance value represents a close (good) similarity relation, whereas a high distance value represents a poor similarity relation between two images. The metric used by this method is called the Euclidean metric. In one dimension, the Euclidean distance is given by Equation (2) [9].
ii. Neural Network: this approach uses classifiers for similarity measurement. Classifiers use a set of statistical data to find the closest match. Neural networks work like a brain: the brain is a combination of neurons that memorize the different activities of our life. In a neural network these neurons are called nodes, and for pattern matching the sequence of traversal through the nodes is very important. A neural network has three kinds of nodes that perform different functions (input, hidden, and output nodes) [10], as shown in Figure 1.
Figure 1: Neural Network System

iii. Mahalanobis Distance: this is a statistical distance metric. It is used to analyze patterns on the basis of the correlation between variables. The method works on a known and an unknown sample set to find similarities: the known set is the image database and the unknown set is the query image. It works with observations of more than one variable and the strength of their respective relationships [10] [11].
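A minimal sketch of this distance, assuming each database image is summarized by one feature vector (one row of `samples`) and the query by a single vector. The function name and the use of a pseudo-inverse for the covariance matrix are illustrative choices, not details from the cited works.

```python
import numpy as np

def mahalanobis_distance(query, samples):
    """Mahalanobis distance from a query vector to a set of known samples.

    Assumptions: `samples` has one feature vector per row and at least
    two rows; a pseudo-inverse guards against a singular covariance.
    """
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    mean = samples.mean(axis=0)
    # Covariance captures the correlation between the feature variables.
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    diff = query - mean
    # Clamp tiny negative rounding errors before taking the square root.
    return float(np.sqrt(max(diff @ inv_cov @ diff, 0.0)))
```

Unlike the Euclidean distance, this measure rescales each direction by the spread of the known set, so features with large variance do not dominate the result.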
For example, a three-dimensional 8 * 8 * 8 histogram contains 512 bins, a 16 * 16 * 16 histogram contains 4096 bins, and a 256 * 256 * 256 histogram contains 16777216 bins. Figure 3 shows a three-dimensional 8 * 8 * 8 RGB histogram.
Figure 3: Three Dimensional RGB Histogram (8*8*8)

Table 1: Histograms of all Query Images and all Database Images (columns: Images, Image, Histogram in R space, Mode)

Figure 4: Block Diagram of Proposed CBIR System
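Table 2:

| Images | Color Variance | Color Expectancy | Color Skewness | Color Correlation |
|---|---|---|---|---|
| Query Image | 23 | 45 | 88 | 25 |
| Image 1 | 33 | 49 | 84 | 20 |
| Image 2 | 45 | 45 | 82 | 26 |
| Image 3 | 23 | 69 | 78 | 35 |
| Image 4 | 23 | 65 | 88 | 34 |
| Image 5 | 25 | 56 | 81 | 21 |
| Image 6 | 27 | 65 | 83 | 23 |
| Image 7 | 21 | 45 | 90 | 24 |
| Image 8 | 18 | 45 | 92 | 26 |
| Image 9 | 34 | 45 | 84 | 20 |
| Image 10 | 76 | 34 | 82 | 26 |
| Image 11 | 23 | 21 | 78 | 35 |
| Image 12 | 34 | 34 | 88 | 34 |
| Image 13 | 45 | 47 | 81 | 21 |
| Image 14 | 67 | 78 | 83 | 23 |
| Image 15 | 12 | 45 | 90 | 24 |
| Image 16 | 23 | 46 | 92 | 26 |
| Image 17 | 56 | 46 | 84 | 20 |
| Image 18 | 56 | 45 | 82 | 26 |
| Image 19 | 23 | 67 | 78 | 35 |
| Image 20 | 23 | 56 | 89 | 34 |
| Image 21 | 78 | 65 | 81 | 21 |
| Image 22 | 34 | 45 | 83 | 23 |
| Image 23 | 31 | 45 | 97 | 24 |
| Image 24 | 33 | 45 | 76 | 26 |
| Image 25 | 21 | 34 | 89 | 21 |
| Image 26 | 22 | 21 | 85 | 23 |
| Image 27 | 34 | 21 | 67 | 24 |
| Image 28 | 22 | 34 | 89 | 26 |
| Image 29 | 23 | 45 | 95 | 20 |
| Image 30 | 23 | 45 | 71 | 26 |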
e) Percentile Matching

After comparing the global features of the images, the proposed method applies the match percentile to improve the accuracy of the output. Through the match percentile, the proposed method selects the images that are nearest to the query image.
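Table 3:

| Images | Color Variance (CV) | Color Expectancy (CE) | Color Skewness (CS) | Color Correlation (CC) | Total Percentile (TP) |
|---|---|---|---|---|---|
| Query Image | 23 | 45 | 88 | 25 | 20 |
| Image 1 | 33 | 49 | 84 | 20 | 22 |
| Image 2 | 45 | 45 | 82 | 26 | 28 |
| Image 3 | 23 | 69 | 78 | 35 | 30 |
| Image 4 | 23 | 65 | 88 | 34 | 32 |
| Image 5 | 25 | 56 | 81 | 21 | 34 |
| Image 6 | 27 | 65 | 83 | 23 | 45 |
| Image 7 | 21 | 45 | 90 | 24 | 78 |
| Image 8 | 18 | 45 | 92 | 26 | 44 |
| Image 9 | 34 | 45 | 84 | 20 | 67 |
| Image 10 | 76 | 34 | 82 | 26 | 76 |
| Image 11 | 23 | 21 | 78 | 35 | 66 |
| Image 12 | 34 | 34 | 88 | 34 | 67 |
| Image 13 | 45 | 47 | 81 | 21 | 22 |
| Image 14 | 67 | 78 | 83 | 23 | 34 |
| Image 15 | 12 | 45 | 90 | 24 | 54 |
| Image 16 | 23 | 46 | 92 | 26 | 33 |
| Image 17 | 56 | 46 | 84 | 20 | 35 |
| Image 18 | 56 | 45 | 82 | 26 | 56 |
| Image 19 | 23 | 67 | 78 | 35 | 25 |
| Image 20 | 23 | 56 | 89 | 34 | 45 |
| Image 21 | 78 | 65 | 81 | 21 | 23 |
| Image 22 | 34 | 45 | 83 | 23 | 33 |
| Image 23 | 31 | 45 | 97 | 24 | 45 |
| Image 24 | 33 | 45 | 76 | 26 | 41 |
| Image 25 | 21 | 34 | 89 | 21 | 46 |
| Image 26 | 22 | 21 | 85 | 23 | 22 |
| Image 27 | 34 | 21 | 67 | 24 | 23 |
| Image 28 | 22 | 34 | 89 | 26 | 34 |
| Image 29 | 23 | 45 | 95 | 20 | 44 |
| Image 30 | 23 | 45 | 71 | 26 | 29 |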
f) Comparison between Query Image Histogram and Matched Images Histograms
Table 4:

| Images matched with Query image | Difference according to Histogram |
|---|---|
| Image | 88 |
| Image | 76 |
| Image | 78 |
| Image | 80 |
| Image | 84 |
Table 5:

| Images matched with Query image | Mean difference |
|---|---|
| Image | 35 |
| Image | 12 |
| Image | 34 |
| Image | 23 |
| Image | 14 |

Appendix A

  1. Image retrieval based on color coherence. B. Y. Kim, H. J. Kim, S. J. Jang. TENCON 99. Proceedings of the IEEE Region 10 Conference, 1999. Vol. 1.
  2. Four Walsh transform sectors Features vectors for image retrieval from image databases. H. B. Kekre, D. Mishra. International Journal of Engineering and Technology 2010. 1 (2).
  3. CBI using upper six FFT sectors of color Images for Feature Vector Generation. H. B. Kekre, D. Mishra. International Journal of Engineering and Technology 2010. 2 (2).
  4. CBIR using density distribution and mean of binary patterns of Walsh transformed Color images. H. B. Kekre, D. Mishra. International Journal of Engineering and Technology 2011. 3 (2).
  5. Image Retrieval with Shape Features Extracted using Gradient Operators and Slope Magnitude Technique with BTC. H. B. Kekre. International Journal of Computer Applications 2010. (8).
  6. Comparing Images Using Color Coherence Vectors. Greg Pass, Ramin Zabih, Justin Miller. Computer Science Department, Cornell University.
  7. Facial Expression Recognition for Neonatal Pain Assessment. Guanming Lu, Xiaonan Li, Haibo Li. IEEE Int. Conference Neural Networks & Signal Processing (Zhenjiang, China) 2008.
  8. Frame work for content based image retrieval (Text Based) system. Jalil Abbad, Salman Qadri, Muhammad Idrees. Journal of American Science 2010. 6 (9).
  9. Efficient Color Histogram Indexing for Quadratic Form Distance Functions. James Hafner, Harpreet S. Sawhney, Will Equitz, Myron Flickner, Wayne Niblack. IEEE Trans. on Pattern Analysis and Machine Intelligence July 1995. 17 (7).
  10. Object detection using image reconstruction with PCA. L. Malagón-Borja, O. Fuentes. Image and Vision Computing 2009. 27.
  11. Image Retrieval: Ideas, influences, and trends of the new age. R. Datta. 2008. 40. The Pennsylvania State University.
  12. Content based Image Retrieval for large biomedical Image Archives. Sameer Antani, L. Rodney Long, George R. Thoma. MEDINFO 2004, M. Fieschi et al. (ed.) 2004. Amsterdam: IOS Press.
  13. Facial Gesture recognition using Correlation and Mahalanobis Distance. Supriya Kapoor, Shruti Khanna, Rahul Bhatia. International Journal of Computer Science and Information Security 2010. 7 (2).
© 2015 Global Journals Inc. (US)
Date: 2015-01-15