Crop mapping is widely used in agriculture and remote sensing science. Crop mapping using classification methodologies serves various applications in agricultural science, such as gauging water and soil demand. For such applications, information on the spatial distribution of classification error is of particular interest [1].
Recent progress in information technology systems has led to dynamic communication among people of every profession and has changed the way people meet and communicate. Professionals and experts in the agriculture sector increasingly communicate best practices in the field over the internet. Farmers who use the internet benefit from the forums in which advanced crop yield technologies are discussed. Crop mapping can also help farmers plan their crop management in advance, and they do not see the internet and modern technologies as a hurdle [2].
Data is everywhere: abundant, continuous, increasing and heterogeneous. Extracting meaningful information from that data is useful but very difficult; rich data but poor information is a common phenomenon. Data mining (DM) refers to extracting or mining useful knowledge from large amounts of data. One of the phases of data mining is classification.
Classification is the process in which available data items are assigned to two or more categories according to given criteria. Methodologies in which the class labels are known a priori are called supervised classification, and those in which the class labels are not known a priori are called unsupervised classification or clustering [3]. Supervised classification can be further divided into parametric and nonparametric approaches, based on whether or not the approach relies on probability distributions or density functions [4].
A well-known statistical learning method, formulated as an optimization problem, is the Support Vector Machine (SVM). The methodology proposed here uses SVM. Data items are represented as feature vectors in a plane, and a line passing through the plane can be used to separate the data items into two categories. Such a line can be considered a naïve form of SVM [6] [7]. An advantage of SVM as a classification method is that it has feature extraction built into its architecture. According to previous research, SVM compares favourably with other classification methodologies such as Naïve Bayes, artificial neural networks and decision-tree-based classification [8] [9].
SVM is inherently linear in nature. However, by using a kernel function it can be extended to non-linear spaces as well. In either case, SVM can take a long time to classify the data items. In this research work, the SVM approach is used to solve a multi-class classification problem. It finds a separating line that is as far as possible from the nearest data items of each class [10][11][12][13][14][15][16].
SVM has numerous applications, such as land analysis [10], species mapping [11], medicine [9], error identification [12], text and speech analysis [5, 13] and signal analysis [14]. SVM is used in this research to classify raster TIFF datasets. The next section presents a literature survey on SVM. The proposed methodology is then explained, followed by the result analysis. The final section presents the conclusion, followed by the references.
Literature Survey
a) Introduction to SVM
SVM is a promising methodology that is used in various applications. It solves both two-class and multi-class problems [15] [16]. Problems in which input data items need to be categorized into two categories are called two-class problems, and those in which data items need to be categorized into more than two classes are called multi-class problems [17]. A multi-class classification problem can be solved using a divide-and-conquer approach: the problem is divided into many two-class problems, and their results are then merged to arrive at the final solution.
The classification boundary functions of SVMs maximize the margins, which leads to maximizing generalization performance [18].
SVM can be categorized as linear or nonlinear, as shown in Fig. 1. In linear SVM, the data items in the plane are separated into two class labels by a line passing through the plane [18][19] [20].
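To make the linear case concrete, the following is a minimal sketch of how a separating line assigns class labels and how wide its margin is. It assumes a two-dimensional feature space; the slope vector `m`, the intercept `b` and the sample points are hypothetical values chosen only for illustration and are not taken from this study.

```python
import math

def sign(c):
    """Sign function: +1 for positive input, -1 for negative input, 0 at zero."""
    return (c > 0) - (c < 0)

def decision(m, b, x):
    """Label x by the side of the line m.x + b = 0 on which it falls."""
    return sign(sum(mi * xi for mi, xi in zip(m, x)) + b)

def margin_width(m):
    """Distance between the margin lines m.x + b = +1 and m.x + b = -1."""
    return 2.0 / math.sqrt(sum(mi * mi for mi in m))

# Hypothetical separating line x1 + x2 - 3 = 0 (illustrative values only)
m, b = (1.0, 1.0), -3.0
points = [(3.0, 2.0), (0.5, 1.0), (1.5, 1.5)]
labels = [decision(m, b, x) for x in points]
width = margin_width(m)
```

A point that lies exactly on the line receives the label 0, which is why the third sample point is neither class. A smaller norm of `m` gives a wider margin, which is the quantity an SVM seeks to maximize.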
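The line representing the SVM can be denoted by (1) [21]:

$$m^{t} x_i + b \ge +1, \qquad m^{t} x_i + b \le -1 \tag{1}$$

Data items can be classified by (2) [22]:

$$f(x) = \operatorname{sign}(m^{t} x + b) \tag{2}$$

where sign() is the sign function, m denotes the slope, x is a data item and b is the intercept. The sign function is given by:

$$\operatorname{sign}(c) = \begin{cases} 1 & \text{if } c > 0 \\ 0 & \text{if } c = 0 \\ -1 & \text{if } c < 0 \end{cases} \tag{3}$$

Numerous lines might be able to split the plane into different categories, but SVM chooses the one that maximizes the distance between itself and the data items in the two categories; the data items closest to this line are known as the support vectors, as shown in Figure 2. This distance can be denoted as:

$$M = \frac{(x^{+} - x^{-})^{t} m}{|m|} = \frac{2}{|m|} \tag{4}$$

where $x^{+}$ and $x^{-}$ are data items lying on the two margin lines of (1). Maximizing M is equivalent to minimizing

$$h(m) = \frac{1}{2} m^{t} m \tag{5}$$

subject to $y_i (m^{t} x_i + b) \ge 1, \ \forall i$.

The solution can be expressed with the help of Lagrange multipliers $\alpha_i$ as:

$$m = \sum_i \alpha_i y_i x_i, \qquad b = y_k - m^{t} x_k \ \text{ for any } x_k \text{ such that } \alpha_k \ne 0 \tag{6}$$

Classifier representation [16]:

$$f(x) = \operatorname{sign}\Big(\sum_i \alpha_i y_i x_i^{t} x + b\Big) \tag{7}$$

b) Systematic nonlinear classification via kernel tricks
SVMs effectively handle non-linear classification using kernel tricks.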
To improve the efficiency of the solution, the input data item space can be mapped to a higher-dimensional feature space, denoted by [18]:

$$K(x_i, x_j) = \phi(x_i)^{t} \phi(x_j) \tag{8}$$

The above representation is also known as a kernel function and can be denoted as [23]:

$$\begin{aligned}
\text{Linear kernel:} \quad & K(x_i, x_j) = x_i^{t} x_j \\
\text{Polynomial kernel:} \quad & K(x_i, x_j) = (1 + x_i^{t} x_j)^{p} \\
\text{Gaussian kernel:} \quad & K(x_i, x_j) = \exp\left(-\frac{|x_i - x_j|^{2}}{2\sigma^{2}}\right) \\
\text{Sigmoid kernel:} \quad & K(x_i, x_j) = \tanh(\beta_0 x_i^{t} x_j + \beta_1)
\end{aligned} \tag{9}$$

c) Multi-class SVM
Multi-class SVM can be categorized as one-versus-all, one-versus-one, and k-class SVMs [18].
In the one-versus-all approach, SVM classifiers are constructed that each separate one class from all remaining patterns [18].
In the one-versus-one approach, an SVM classifier is constructed for every pair of classes, giving k(k - 1)/2 classifiers for k classes [18].
In the k-class approach, k binary classifiers are built concurrently [18].
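The way binary classifiers are merged into a multi-class decision can be sketched as follows. This is an illustration only: it assumes the binary classifiers are already trained and linear, and the model parameters, the crop labels used as keys and the query point are hypothetical values, not results from this study.

```python
from collections import Counter

def linear_score(m, b, x):
    """Raw score m.x + b of a linear binary classifier."""
    return sum(mi * xi for mi, xi in zip(m, x)) + b

def predict_one_vs_all(models, x):
    """models maps class label -> (m, b); the highest-scoring class wins."""
    return max(models, key=lambda label: linear_score(*models[label], x))

def predict_one_vs_one(models, x):
    """models maps (class_a, class_b) -> (m, b); a positive score votes for class_a."""
    votes = Counter()
    for (class_a, class_b), (m, b) in models.items():
        votes[class_a if linear_score(m, b, x) > 0 else class_b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical pre-trained models for three classes in a 2-D feature space
one_vs_all = {
    "rice": ((-1.0, -1.0), 2.0),
    "millets": ((1.0, -1.0), -2.0),
    "cotton": ((-1.0, 1.0), -2.0),
}
one_vs_one = {
    ("rice", "millets"): ((-1.0, 0.0), 2.0),
    ("rice", "cotton"): ((0.0, -1.0), 2.0),
    ("millets", "cotton"): ((1.0, -1.0), 0.0),
}
x = (3.5, 0.5)
label_ova = predict_one_vs_all(one_vs_all, x)
label_ovo = predict_one_vs_one(one_vs_one, x)
```

With three classes, one-versus-all needs three classifiers and one-versus-one needs one classifier per pair, which is also three here; the pairwise count grows faster as the number of classes increases.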
iii. Step 3: If a data item is not assigned to any of the regions mentioned, then add it to the set of support vectors V
Step 4: end loop
Finally, the built model is validated against the test data set. Here, the test data set is the crop coverage area that is not covered by the selected training data set sample. One of the key steps involved in the classification process is feature extraction, using the features described below:
Energy (E): It helps measure homogeneity in the data set and is denoted by:

$$E = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \big(p(i, j)\big)^{2} \tag{9}$$

Contrast (C): Contrast helps identify local data set variation and is denoted as:

$$C = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} (i - j)^{2} \, p(i, j) \tag{10}$$

Inverse difference moment (IDM): Local texture alterations can be located using:

$$IDM = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \frac{1}{1 + (i - j)^{2}} \, p(i, j) \tag{11}$$

Entropy (S): The data set complexity can be computed by:

$$S = -\sum_{i=1}^{m-1} \sum_{j=1}^{n-1} p(i, j) \log\big(p(i, j)\big) \tag{12}$$

where $\mu_k$ and $m \times n$ are the mean and size of the block $B_k$.

Spatial Frequency (SF): Frequency changes in the data set can be computed using:

$$SF = \sqrt{(RF)^{2} + (CF)^{2}} \tag{13}$$

where

$$RF = \sqrt{\frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=2}^{n} \big[I(i, j) - I(i, j-1)\big]^{2}} \tag{14}$$

and

$$CF = \sqrt{\frac{1}{m \times n} \sum_{i=2}^{m} \sum_{j=1}^{n} \big[I(i, j) - I(i-1, j)\big]^{2}} \tag{15}$$

Variance (V): The level of focus in a data set can be computed using:

$$V = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \big(I(i, j) - \mu\big)^{2} \tag{16}$$

where $\mu$ is the mean value of the block image and $m \times n$ is the image size.

Energy of Gradient (EOG): A measure of focus can also be computed using:

$$EOG = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \big(f_i^{2} + f_j^{2}\big) \tag{17}$$

where $f_i = f(i+1, j) - f(i, j)$ and $f_j = f(i, j+1) - f(i, j)$.
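The feature definitions above can be sketched in plain Python as follows. This is an illustrative sketch rather than the implementation used in the study: it assumes that `p` is a normalised co-occurrence matrix given as a list of rows, that the sums run over every entry of `p`, that the natural logarithm is used for entropy, and that `img` is a grey-level block given as a list of rows. The sample inputs are hypothetical.

```python
import math

def texture_features(p):
    """Energy, contrast, IDM and entropy of a normalised co-occurrence matrix p."""
    energy = contrast = idm = entropy = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            energy += pij ** 2
            contrast += (i - j) ** 2 * pij
            idm += pij / (1 + (i - j) ** 2)
            if pij > 0:  # treat 0 * log(0) as 0
                entropy -= pij * math.log(pij)
    return energy, contrast, idm, entropy

def focus_measures(img):
    """Spatial frequency, variance and energy of gradient of a grey-level block."""
    m, n = len(img), len(img[0])
    # Mean squared differences along rows (RF^2) and along columns (CF^2)
    rf2 = sum((img[i][j] - img[i][j - 1]) ** 2
              for i in range(m) for j in range(1, n)) / (m * n)
    cf2 = sum((img[i][j] - img[i - 1][j]) ** 2
              for i in range(1, m) for j in range(n)) / (m * n)
    sf = math.sqrt(rf2 + cf2)
    mean = sum(map(sum, img)) / (m * n)
    variance = sum((v - mean) ** 2 for row in img for v in row) / (m * n)
    eog = sum((img[i + 1][j] - img[i][j]) ** 2 + (img[i][j + 1] - img[i][j]) ** 2
              for i in range(m - 1) for j in range(n - 1))
    return sf, variance, eog

# Hypothetical inputs for illustration only
p = [[0.5, 0.0], [0.0, 0.5]]
img = [[1, 2], [3, 4]]
energy, contrast, idm, entropy = texture_features(p)
sf, variance, eog = focus_measures(img)
```

For real imagery the co-occurrence matrix would first have to be built from the pixel values for a chosen offset, a step this sketch leaves out.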
An agricultural map of Gujarat was used as the dataset for the classification. A region of interest (ROI) extracted from the map acted as the training data, and the model was validated against the complete data segment pertaining to a particular crop in the map. The environment in which the research was undertaken is shown in Table 1 [27].
The numbers of correctly and incorrectly classified data items can be represented using a confusion matrix, as shown in Table 2. It helps measure the efficacy of the performed classification. Classification results are given in Figure 4.
The confusion matrix obtained in this research is given in Table 3.
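As a cross-check on how such figures are derived, the sketch below computes overall accuracy and per-class sensitivity and specificity from the counts reported in Table 3, with rows taken as predictions and columns as reference labels. It also computes Cohen's kappa, the standard chance-corrected agreement measure; that formula is an assumption added here for comparison and is not the same expression as the sensitivity + specificity - 1 form given in equation (19). The function names are illustrative.

```python
def overall_accuracy(cm):
    """Fraction of items on the diagonal of the confusion matrix."""
    total = sum(map(sum, cm))
    return sum(cm[k][k] for k in range(len(cm))) / total

def sensitivity_specificity(cm, k):
    """Per-class rates; rows of cm are predictions, columns are references."""
    total = sum(map(sum, cm))
    tp = cm[k][k]
    fn = sum(row[k] for row in cm) - tp  # reference k, predicted as another class
    fp = sum(cm[k]) - tp                 # predicted k, reference is another class
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)

def cohen_kappa(cm):
    """Chance-corrected agreement (standard Cohen's kappa, an assumption here)."""
    total = sum(map(sum, cm))
    p_o = overall_accuracy(cm)
    p_e = sum(sum(cm[k]) * sum(row[k] for row in cm)
              for k in range(len(cm))) / total ** 2
    return (p_o - p_e) / (1 - p_e)

# Counts from Table 3 (Rice, Millets, Cotton)
cm = [[14, 0, 0], [0, 16, 0], [0, 0, 11]]
accuracy = overall_accuracy(cm)
kappa = cohen_kappa(cm)
per_class = [sensitivity_specificity(cm, k) for k in range(3)]
eq19 = [sens + spec - 1 for sens, spec in per_class]
```

Because the matrix is purely diagonal, every measure evaluates to 1, which corresponds to the 100 reported for accuracy and kappa.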
Table 1: Environment Setting
Item | Capacity |
CPU | Intel CPU @ 2 GHz processor |
Memory/OS | 4 GB / Windows 7 |
Applications | Monteverdi |
Global Journal of Computer Science and Technology, Volume XVI Issue III Version I, Year 2016
Table 3: Confusion Matrix
Prediction \ Reference | Rice | Millets | Cotton |
Rice | 14 | 0 | 0 |
Millets | 0 | 16 | 0 |
Cotton | 0 | 0 | 11 |
Data set type | Accuracy | Kappa Statistics |
Raster TIFF datasets | 100 | 100 |
$$\text{Kappa statistics} = \text{Sensitivity} + \text{Specificity} - 1 \tag{19}$$

References
Class-specific GMM based intermediate matching kernel for classification of varying length patterns of long duration speech using support vector machines. Speech Communication, vol. 57, February 2014. ISSN 0167-6393. doi:10.1016/j.specom.2013.09.010.
Forest classification trees and forest support vector machines algorithms: Demonstration using microarray data. Computers in Biology and Medicine, vol. 40, May 2010.
Study on Recognition of Bird Species in Minjiang River Estuary Wetland. Procedia Environmental Sciences, vol. 10, 2011. ISSN 1878-0296. doi:10.1016/j.proenv.2011.09.386.
Protein subcellular localization of fluorescence microscopy images: Employing new statistical and Texton based image features and SVM based ensemble classification. Information Sciences, vol. 345, pp. 65-80, 1 June 2016. ISSN 0020-0255. doi:10.1016/j.ins.2016.01.064.
CEAP: SVM-based intelligent detection model for clustered vehicular ad hoc networks. Expert Systems with Applications, vol. 50, 15 May 2016. ISSN 0957-4174. doi:10.1016/j.eswa.2015.12.006.
Comparison of classifiers for lip reading with CUAVE and TULIPS database. Optik - International Journal for Light and Electron Optics, vol. 126, pp. 5753-5761, December 2015. ISSN 0030-4026. doi:10.1016/j.ijleo.2015.08.192.
Drainage water level classification using support vector machines. Nirma University International Conference on Engineering (NUiCONE), 28-30 Nov. 2013. doi:10.1109/NUiCONE.2013.6780068.
Comparison of support vector machine, neural network, and CART algorithms for the land-cover classification using limited training data points. ISPRS Journal of Photogrammetry and Remote Sensing, vol. 70, June 2012. ISSN 0924-2716. doi:10.1016/j.isprsjprs.2012.04.001.
An enhanced M-ary SVM algorithm for multi-category classification and its application. Neurocomputing, vol. 187, pp. 119-125, 26 April 2016. ISSN 0925-2312. doi:10.1016/j.neucom.2015.08.101.