# Introduction

Machine learning techniques are being adopted by applications from many domains to build predictive models. These techniques are broadly classified into supervised learning and unsupervised learning based on the availability of class labels for building the model. Supervised learning methods require labeled data to build a classifier model that predicts the class labels of unknown examples based on the information contained in those labels. However, collecting labeled data is usually an expensive and time-consuming process (Han et al., 2011). Even in domains with an abundance of unlabeled data, labeled data are usually scarce and require some effort to collect. Moreover, building a classifier with good generalization accuracy requires a large number of labeled examples, more so for datasets with high dimensionality, which is one of the problems associated with the curse of dimensionality (Ramona et al., 2012). Accordingly, it is believed that with a fixed number of labeled examples, the predictive power of a classifier decreases as the number of dimensions increases, thus requiring a larger number of labeled examples to build the classifier (Advani, 2011).

In unsupervised learning methods such as clustering, unlabeled data, if available in abundance, suffice to extract hidden patterns of knowledge from a given dataset. Traditional clustering algorithms take the entire feature space into account to partition the dataset into clusters such that there is homogeneity among the instances within a cluster, with the proximity between instances measured by a distance function. However, as the number of dimensions increases, the distance measures employed by the clustering algorithm become insignificant and the clusters so produced become meaningless. Hence clustering with the full feature space, especially when the number of dimensions is large, may not produce good clusters. Finding the subset of the feature space that produces meaningful clusters comes under the purview of subspace clustering. Subspace clustering focuses on finding a subset of features, or a smaller set of transformed features, with the aim of defining cluster-able object spaces (Han et al., 2011; Sim et al., 2013). In high dimensional datasets, owing to the exponentially large number of subsets of the feature set, subspace clustering techniques have to eliminate an enormous number of possibilities before identifying the feature space that contains intrinsically significant clusters (Han et al., 2011).

Basic research in subspace clustering falls under unsupervised learning, as it tries to identify clusters based on the distribution of objects in various feature subspaces irrespective of the class labels of the objects. The clusters thus formed may be meaningful but may not be relevant to the intended purpose or context. For instance, census data are described in terms of different features such as social, economic, education and health attributes, yet they need to be clustered into groups depending on the purpose of the data analysis. Features corresponding to social backwardness and economic status are used to identify the welfare schemes to be adopted, whereas features corresponding to place of living, commutability, etc., are used to decide the location of new amenity centers.
In both cases, the features used and their relative significance vary with the context or purpose, requiring the clustering algorithm to give proper emphasis to the appropriate features in accordance with the context for which the clusters are intended. Context-aware subspace clustering aims to find an appropriate feature subspace for a given context, represented in the form of class labels of a few labeled examples that are consistent with a large collection of unlabeled examples belonging to the same dataset. To the best of our knowledge, little research has been published on feature selection algorithms that make use of a combination of labeled and unlabeled examples. Hence, semi supervised feature selection algorithms need to be developed for the formation of context-aware clusters in domains where only a limited number of examples are labeled and the rest are left unlabeled.

Semi supervised learning, which is an integration of supervised and unsupervised learning, makes use of both labeled and unlabeled examples to build a model (Zhu and Goldberg, 2009). It has two forms, namely semi supervised classification and semi supervised clustering. Semi supervised classification uses both labeled and unlabeled data to build the classifier: using the limited number of labeled data, probable class labels for the unlabeled data are derived and added to the pool of labeled data, thus increasing the number of labeled examples (Han et al., 2011). The basic assumption in this technique is that similar data will have the same class labels (the cluster assumption) (Chapelle et al., 2006; Wang et al., 2012). Different methods such as self training, co-training, generative probabilistic models, graph based methods and support vector machines are used for semi supervised classification (Zhu, 2008). In semi supervised clustering, a large set of unlabeled data is accompanied by a small amount of domain knowledge in the form of either class labels or pairwise constraints (must-link and cannot-link) (Grira et al., 2004; Ding et al., 2012). This domain knowledge is used to guide the clustering of the unlabeled data so that intra-cluster similarities are maximized, inter-cluster similarities are minimized, and there is consistency between the partition and the available knowledge (Gao et al., 2006).

Based on the above arguments, the authors propose a context-aware semi supervised subspace clustering framework which leverages the domain knowledge available in the form of class labels for at least some of the examples (when labeled examples are expensive) in order to estimate the suitability of features to the intended cluster solution. Proper selection of features and their relative significance is essential in producing context-aware clusters, which are ideally uni-class clusters. A uni-class cluster contains all or a majority of elements belonging to the same class label, which is reflected in its cluster purity. The clustering framework is further extended to build a classifier, referred to as a semi supervised classifier, that requires minimum information for prediction. The authors also propose the 'Semi Supervised Feature Relevance Estimation' (SFRE) algorithm to estimate the relevant features and their relative significance in terms of weights that define appropriate subspaces for different targets/contexts. The framework was tested on a few benchmark datasets from the UCI repository and has given promising results.

# II. Related Work

Researchers in the past have come up with different methods for semi supervised learning.
One popular approach is constraint based clustering. Constraint based methods use pairwise constraints in the form of must-link and cannot-link relations that guide the clustering process to partition the data in a way that does not violate these constraints (Wagstaff et al., 2001; Basu et al., 2004; Lu and Leen, 2004). Recently, Xiong et al. (2014) proposed an iterative active learning approach to select pairwise constraints for semi supervised clustering. It uses the concept of a neighbourhood that contains labeled examples of different clusters based on pairwise constraints, and the uncertainty associated with each point's neighbourhood is resolved through queries. However, repeated clustering is required as the list of constraints grows. Another popular approach for semi supervised clustering is distance based techniques, which rely on the cluster assumption. Yin and Hu (2011) proposed a semi supervised clustering algorithm using adaptive distance metric learning, where clustering and distance metric learning are performed simultaneously: the clustering results are used to learn the distance metric, and the data is projected into a low dimensional space such that data separability is maximized. Gao et al. (2006) focused on semi supervised clustering in terms of features rather than examples; it addresses the problem where the labeled and unlabeled datasets have different feature sets with few common features. In terms of feature selection, Padmaja et al. (2010) proposed a dimensionality reduction approach that estimates the significance of features based on their fractal dimensions and accordingly selects a subset of features that are essential to capture the characteristics of the dataset. The algorithm detects all types of correlations among features to identify the essential features after eliminating redundant and irrelevant ones. Kernel based feature selection has also been explored by a few researchers (Wang, 2008; Ramona et al., 2012). Clustering based feature selection for classification was proposed by Song et al. (2013), where features are clustered using a graph theoretic clustering method. Research on feature weighting and ranking has concentrated more on supervised learning (Eick et al., 2006; Al-Harbi and Rayward-Smith, 2006; Zhao and Qu, 2009). Most of these studies initially weigh the features using a random guess or equal weights, and these initial weights are then adjusted. Such an approach may take much time to arrive at the final optimum weights if the initial guess is not appropriate. This paper deals with semi supervised learning methods with a wrapper based feature selection method that uses discriminant analysis results to initialize the weights. These weights are adjusted in a stepwise refinement process using both labeled and unlabeled examples. The proposed framework is used to develop a classifier and a pertinent cluster solution.

# III. Context-aware Semi Supervised Subspace Clustering Framework

A dataset may be clustered in multiple ways by appropriately selecting a subset of features/attributes depending on the purpose. Hence, to produce clusters conforming to a particular purpose or context, weights must be given to features that depict their importance. Researchers in the past have initially started with guessed/random weights or equal weights for the features and proceeded further to determine more acceptable weights.
Instead of starting with some arbitrary values, it is proposed to use the information available in the labeled data to initialize the weights, which can be adjusted later. The authors therefore propose the use of discriminant analysis, which finds the relationship between the independent features (predictors) and the dependent feature (class label), to initialize the feature weights. Discriminant analysis is a method used to predict a categorical value from a given set of independent features; it assumes the independent features to be normally distributed. The linear discriminant function is (Equation 1)

$$D = V_1 X_1 + V_2 X_2 + V_3 X_3 + \cdots + V_i X_i + a \qquad (1)$$

where $D$ is the discriminant score, $V_i$ is the discriminant coefficient or weight of the $i$-th feature, $X_i$ is the value of the $i$-th feature, and $a$ is a constant. Discriminant analysis thus identifies the relevant features, and its coefficients reflect the relevance of each feature. The coefficients obtained from the discriminant analysis are normalized and used as initial weights for developing a binary-class cluster solution, whereas the development of a multi-class cluster solution involves integrating the results of multiple discriminant functions. For multi-class datasets, the proposed framework uses the potency index, following the approach of Dharmavaram and Mogalla (2013), to determine the initial weights of the features from the labeled examples.

# b) Clustering Algorithm

The initial weight vector is used to form the initial cluster solution using any partitional clustering algorithm. The authors have chosen the K-means algorithm for its simplicity and computational efficiency in dealing with numerical features. For datasets described in terms of numerical attributes, K-means generally employs the Euclidean distance to compute the distance from each data point to the cluster centroid. The Euclidean distance assumes that all features are equally important while forming the clusters. However, as discussed previously, the weight of a feature determines its relevance in forming the desired cluster solution, and accordingly a weighted Euclidean distance metric is used for the distance calculation (Equation 2):

$$d_w(x_i, x_j) = \sqrt{\sum_{m=1}^{M} w_m \,(x_{im} - x_{jm})^2}, \qquad \text{where } \sum_{m=1}^{M} w_m = 1 \qquad (2)$$

Here $w_m$ indicates the weight of the $m$-th feature. The more significant a feature, the larger its weight; the weight of an irrelevant feature can be set to zero.
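As an illustration of how the weight vector enters the clustering step, the following Python/NumPy sketch computes the weighted Euclidean distance of Equation 2 and performs one K-means assignment step with it. The array shapes, function names and numeric values are assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance of Equation 2; w is normalized to sum to 1."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

def assign_clusters(X, centroids, w):
    """One K-means assignment step: each example goes to the nearest
    centroid under the weighted metric."""
    labels = np.empty(len(X), dtype=int)
    for i, x in enumerate(X):
        dists = [weighted_euclidean(x, c, w) for c in centroids]
        labels[i] = int(np.argmin(dists))
    return labels

# Toy usage: 3 features, initial weights taken from normalized
# discriminant coefficients; the values here are purely illustrative.
X = np.array([[1.0, 2.0, 0.5],
              [1.1, 1.9, 0.4],
              [5.0, 0.2, 3.0]])
w = np.array([0.6, 0.3, 0.1])   # hypothetical normalized weight vector
centroids = X[[0, 2]]           # K = 2, seeded from two examples
print(assign_clusters(X, centroids, w))   # e.g. [0 0 1]
```

Weighting the squared differences leaves the rest of K-means unchanged, which is what later allows the refinement algorithm to be wrapped around any partitional clustering routine.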
For clustering, the number of clusters, K, is taken to be larger than the number of classes. Larger values of K result in the formation of many small uni-class clusters, so multiple clusters may be associated with a single class. The cluster concurrence is estimated for each cluster based on the agreement of the members of the cluster towards a particular class label and hence reflects the uni-class property of the cluster. In order to estimate the cluster concurrence of the $k$-th cluster, the support, $S_{kj}$, available for each class $j$ in that cluster is aggregated as shown in Equation 3:

$$S_{kj} = \frac{1}{|C_k|} \sum_{n \in C_k} M \times P_j(n) \qquad (3)$$

where $P_j(n)$ indicates the probability of the example $n$ belonging to the class $j$, $|C_k|$ is the cardinality of the cluster $k$, i.e., the number of examples assigned to cluster $C_k$, and

$$M = \begin{cases} 1 & \text{if } j = \operatorname{argmax}_{j'} \{ P_{j'}(n) \} \\ 0 & \text{otherwise} \end{cases}$$

The binary term $M$ acts as a deciding factor indicating whether the example contributes to the support of class $j$ or not. It may be noted that each example, whether labeled or unlabeled, contributes to the support of only one class: an unlabeled example supports the class with the maximum probability, while a labeled example naturally supports its one and only true class label. $P_j(n)$ is calculated as per Equation 4, where $d(i,n)$ is the weighted Euclidean distance between $i$ and $n$:

$$P_j(n) = \begin{cases} 1 & \text{if } n \in L \text{ and } y_n = j \\ \dfrac{\sum_{i \in L,\, y_i = j} 1/d(i,n)^2}{\sum_{i \in L} 1/d(i,n)^2} & \text{if } n \in U \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

The predicted label of an unlabeled example, $t$, is the label for which this probability is maximum. The cluster concurrence of the $k$-th cluster is estimated as

$$CC_k = \max_j \{ S_{kj} \}$$

The overall cluster purity of the cluster solution is taken as the weighted sum of the individual cluster concurrences (Equation 5):

$$CP = \sum_{k=1}^{K} \frac{|C_k|}{|D|}\, CC_k \qquad (5)$$
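The support, concurrence and purity computations of Equations 3 to 5 can be sketched as below. This is an illustrative fragment under our own assumptions: `labels` is a NumPy integer array holding the true class index for labeled examples and -1 for unlabeled ones, `d` is a precomputed matrix of pairwise weighted Euclidean distances, and the helper names are hypothetical rather than the authors' code.

```python
import numpy as np

def class_probabilities(n, labels, d, q):
    """P_j(n) as in Equation 4: a labeled example supports only its true
    class; an unlabeled example is scored by inverse-squared distances
    to the labeled examples of each class."""
    p = np.zeros(q)
    if labels[n] >= 0:                      # labeled example
        p[labels[n]] = 1.0
        return p
    lab = np.where(labels >= 0)[0]          # indices of labeled examples
    inv = 1.0 / (d[n, lab] ** 2 + 1e-12)
    for j in range(q):
        p[j] = inv[labels[lab] == j].sum()
    return p / inv.sum()

def overall_cluster_purity(assign, labels, d, q, K):
    """Cluster concurrence CC_k (Equation 3) and overall purity CP (Equation 5)."""
    N, cp = len(assign), 0.0
    for k in range(K):
        members = np.where(assign == k)[0]
        if members.size == 0:
            continue
        S = np.zeros(q)                     # support S_kj for each class j
        for n in members:
            p = class_probabilities(n, labels, d, q)
            S[np.argmax(p)] += p.max()      # each example supports one class (the M term)
        S /= members.size
        cp += (members.size / N) * S.max()  # CC_k = max_j S_kj, weighted by cluster size
    return cp
```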
The new algorithm, SFRE, is guided by the cluster purity estimated in terms of both labeled and unlabeled examples belonging to various feature subspaces. The algorithm accepts as input the dataset D, which includes L and U, the initial cluster purity, and the outcome of the discriminant analysis used as the initial weight vector. The output of the algorithm is a refined set of relevance estimates for the features, referred to as the weight vector, that defines the feature subspace for the purpose indicated through the class labels.

The cluster purity obtained with the initial weights is assigned to the current cluster purity as an initialization step, after which the algorithm executes the following three steps iteratively:

Step 1: Finding relevant features
Step 2: Updating weights
Step 3: Checking for convergence

In the first step, each feature in the feature set is checked for its relevance. Taking one feature at a time, clusters are formed without that feature and the cluster purity is estimated. If there is a decrease in cluster purity compared to the current cluster purity, the absence of the feature has resulted in a loss of purity; the feature is therefore marked as relevant and its relevance increment is calculated as the proportionate difference between the cluster purity estimated with and without the feature. If there is an increase in cluster purity compared to the current cluster purity, the absence of the feature has resulted in a gain in purity, and the feature is marked as irrelevant. The outcome of this step is a relevance marking for each feature and a relevance increment for the relevant features. In the second step, based on the relevance marking, the weights of the relevant features are incremented in accordance with the relevance increments calculated in Step 1, the weights of the features marked irrelevant are set to zero, and the weight vector is normalized to sum to 1. In the final step, clusters are formed with the adjusted weights to judge the new solution: the cluster purity obtained with the updated weights and features is compared with the current cluster purity, and if there is an improvement, the new weights are accepted and the new cluster purity becomes the current cluster purity for comparison in the next iteration.

The steps are repeated until there is no significant improvement in the cluster purity. To change the order in which the features are examined in subsequent iterations, features are selected randomly without replacement. This helps to avoid any bias from overlapping or correlated features and to avoid local maxima.

# e) Formal listing of Proposed Algorithm (SFRE)

Let CP_curr be the cluster purity estimated for the initial cluster solution; the stepwise refinement of the weights then proceeds as follows:

Step 1: For each feature x, randomly selected without replacement from the feature set F:
- Perform K-means without the feature x, appropriately normalizing the weight vector
- Estimate the cluster purity CP_F-x
- If CP_F-x < CP_curr then x is relevant; calculate its relevance increment Rel_x = (CP_curr - CP_F-x) / CP_curr
- Else x is not relevant

Step 2:
- Increase the weight of each relevant feature x: W_x = W_x (1 + Rel_x)
- For each irrelevant feature x: W_x = 0
- Normalize the weight vector

Step 3:
- Perform K-means with the adjusted weights
- Estimate the cluster purity CP_new
- If CP_new > CP_curr, accept the new weights and set CP_curr = CP_new

The above steps are performed until there is no improvement in the cluster purity. The final cluster solution thus formed consists of context-aware clusters with the final set of relevant features and weights.
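A compact sketch of the SFRE refinement loop listed above is given below. It assumes a helper `purity_fn(X, w, labels, K)` that runs weighted K-means with weight vector `w` and returns the overall cluster purity of Equation 5; the helper, the argument names and the iteration cap are assumptions for this sketch, not details given in the paper.

```python
import numpy as np

def sfre(X, labels, w_init, K, purity_fn, max_iter=50, seed=0):
    """Stepwise Semi Supervised Feature Relevance Estimation (sketch).

    purity_fn(X, w, labels, K) -> overall cluster purity obtained by
    weighted K-means with weight vector w (hypothetical helper).
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w_init, dtype=float)
    w = w / w.sum()
    cp_curr = purity_fn(X, w, labels, K)
    for _ in range(max_iter):
        w_cand = w.copy()
        rel = np.zeros(len(w))
        # Step 1: test every feature, in random order, by removing it.
        for x in rng.permutation(len(w)):
            w_wo = w_cand.copy()
            w_wo[x] = 0.0
            if w_wo.sum() == 0.0:
                continue
            cp_wo = purity_fn(X, w_wo / w_wo.sum(), labels, K)
            if cp_wo < cp_curr:
                # Purity drops without the feature: it is relevant.
                rel[x] = (cp_curr - cp_wo) / cp_curr
            else:
                # Purity does not drop: mark the feature irrelevant.
                w_cand[x] = 0.0
        # Step 2: boost relevant features and renormalize.
        w_cand = w_cand * (1.0 + rel)
        if w_cand.sum() == 0.0:
            break
        w_cand = w_cand / w_cand.sum()
        # Step 3: accept the adjusted weights only if purity improves.
        cp_new = purity_fn(X, w_cand, labels, K)
        if cp_new <= cp_curr:
            break
        w, cp_curr = w_cand, cp_new
    return w, cp_curr
```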
# IV. Semi Supervised Classification Framework

The final cluster solution formed in the training phase contains K clusters, with each cluster containing examples belonging to one or more classes. The support of a class in a cluster, $S_{kj}$, is estimated in terms of the true class labels of the labeled examples and the predicted (probabilistic) class labels of the unlabeled examples in the $k$-th cluster. However, in the presence of overlapping examples or outliers, the examples in a cluster may not strongly agree on a particular class; such a cluster is not considered a uni-class / decisive cluster and is left unlabeled, being treated as an indecisive cluster. In a given cluster, the difference between the support available for the majority class and that of its competing class reflects the decisiveness of the cluster in concurrence with the majority class. For this purpose, the authors propose a metric referred to as the 'Purity Margin', which is measured for each cluster and compared against a purity threshold as detailed below.

# b) Purity Threshold of the cluster

The 'Purity Threshold', PT, of a cluster $C_k$, denoted $PT_k$, is set as the minimum difference, or margin, to be imposed between two competing classes in a cluster for it to be considered a decisive cluster. The purity threshold is estimated as a pre-defined fraction ($\alpha$) of the product of the cluster concurrence $CC_k$ and the number of classes in the dataset. In a dataset with q classes, the purity threshold $PT_k$ for a cluster $C_k$ is calculated as (Equation 6)

$$PT_k = \alpha \cdot CC_k \cdot q \qquad (6)$$

Experiments conducted on the value of $\alpha$ show that 0.1, i.e., 10% of the support value, gives a good purity threshold.

# c) Purity Margin of the cluster

The purity margin measures the difference between the maximum support of a class in a cluster and the support of its immediate competitor class. The larger the margin, the purer the cluster; intuitively, it should be greater than or equal to the purity threshold. For a cluster $C_k$, the purity margin $PM(C_k)$ is calculated as (Equation 7)

$$PM(C_k) = CC_k - S_{kp} \qquad (7)$$

where p is the competing class.

# d) Decisive cluster

A cluster $C_k$ is considered to be a uni-class or decisive cluster if $PM(C_k) \ge PT_k$; otherwise it is considered an indecisive cluster. A decisive cluster is labeled with the majority class label, i.e., the class label that has the maximum support of the examples in the cluster over all classes. An indecisive cluster is left unlabeled, and the details of the cluster, including the predicted labels of its unlabeled examples, are stored so that weighted nearest neighbour classification can be applied when classifying an unknown / test example.

# e) Hybrid Model for Classification

The authors propose a hybridization of model-based classification and instance-based classification for classifying an unknown / test example, based on whether it is compatible with a decisive cluster or an indecisive cluster. Let the cluster $C_k$ be the most compatible cluster for an unknown example x:

- If the cluster $C_k$ is decisive, assign the cluster's class label to the example x.
- If the cluster is indecisive, apply weighted nearest neighbour classification to predict the class label of x.

# f) Finding the most compatible cluster for unknown / test example

Consider a set of clusters C = {C_1, C_2, ..., C_K} with centroids c = {c_1, c_2, ..., c_K}. Weighted Euclidean distances are calculated between the unknown / test example, t, and each centroid, c_i. The cluster $C_k$ with the minimum distance among all the clusters is taken as the most compatible cluster for the example t. Mathematically, it may be expressed as (Equation 8)

$$k = \operatorname{argmin}_{i} \{\, d(t, c_i) \,\} \qquad (8)$$

The hybrid model for classification is then applied based on the value of k, as discussed earlier. When the most compatible cluster is indecisive, the proximity of the unknown / test example t to each class must be measured. The closer the example t is to a neighbourhood dominated by a particular class label, the more likely it is to share the class label of its neighbours (cluster assumption). Accordingly, all the members of the most compatible cluster $C_k$ are considered as neighbours, with weights assigned in inverse proportion to their squared distance to the test example. The proximity of the example t to a class label p, denoted by $W_{tp}$, is estimated by aggregating the weights of the members belonging to that class. Mathematically, it may be expressed as (Equation 9)

$$W_{tp} = \sum_{i \in C_k,\; y_i = p} \frac{1}{d(t,i)^2} \qquad (9)$$

where d(t,i) is the Euclidean distance between t and i. This proximity estimate ensures that examples far from the test example (possibly outliers) have less impact on the prediction than those that are closer. The unknown / test example is assigned the class label for which the proximity is maximum (Equation 10):

$$\text{label}(t) = \operatorname{argmax}_{p} \{\, W_{tp} \,\} \qquad (10)$$
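To make the hybrid model concrete, the sketch below selects the most compatible cluster (Equation 8), returns its label when the cluster is decisive, and otherwise applies the weighted nearest-neighbour vote of Equations 9 and 10. The cluster representation, a list of dicts with 'centroid', 'members', 'decisive' and 'label' keys, is a hypothetical structure chosen for illustration, not the authors' data structure; the weighted metric is reused here for the vote as well.

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance used throughout the framework."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

def classify(t, clusters, X, example_labels, w):
    """Hybrid classification of a test example t.

    clusters       : list of dicts with keys 'centroid', 'members',
                     'decisive', 'label' (hypothetical structure)
    example_labels : true label for labeled training examples,
                     predicted label for unlabeled ones
    """
    # Equation 8: the most compatible cluster is the nearest centroid.
    k = int(np.argmin([weighted_euclidean(t, c["centroid"], w) for c in clusters]))
    ck = clusters[k]
    if ck["decisive"]:
        return ck["label"]                       # model-based prediction
    # Indecisive cluster: weighted nearest-neighbour vote (Equations 9 and 10).
    proximity = {}
    for i in ck["members"]:
        d2 = weighted_euclidean(t, X[i], w) ** 2 + 1e-12
        p = example_labels[i]
        proximity[p] = proximity.get(p, 0.0) + 1.0 / d2
    return max(proximity, key=proximity.get)     # label with maximum proximity
```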
# V. Experiments and Results

# a) Experimental Setup

The proposed model was implemented on an Intel Pentium dual core processor with 3 GB of DDR2 667 MHz memory and coded using the .NET framework. The SPSS statistics tool was used for performing the discriminant analysis. Experiments were conducted on benchmark datasets obtained from the UCI repository and on one dataset from SPSS Inc. to test the performance of the proposed framework. Five binary datasets and six multi-class datasets were used in the experimentation, as shown in Table 1. The labels of some of the examples were withheld so as to obtain training sets with different proportions of labeled data. For binary class datasets, experiments were conducted with 100% labeled examples to assess the performance of the framework when all the examples in the dataset are labeled. However, the availability of labeled examples up to 100% does not call for semi supervised learning; the case with 100% labeled examples was included only to show that the proposed method handles datasets with few labeled examples in the same way as datasets with 100% labeled examples, maintaining consistently high performance. The complexity of cluster regularization and the estimation of cluster concurrence and purity margin for the hybrid classifier are not required for datasets with nearly 100% labeled examples; such datasets may be better processed by an appropriate supervised learning algorithm. The performance of the model for multi-class datasets was analysed starting from 75% labeled examples. In both the clustering and classification cases, discriminant analysis is performed on the labeled examples of each dataset using the SPSS statistics tool to produce the discriminant function(s). For binary class datasets the discriminant coefficients, and for multi-class datasets the potency index values, are used to obtain the initial weights of the features, referred to as the initial weight vector.

# b) Results

In the case of semi supervised subspace clustering, the cluster purity estimated from the cluster concurrence and the number of relevant features identified for the benchmark datasets are tabulated in Tables 2 and 3. From Fig. 2 and Fig. 3, it is observed that the proposed model performs consistently in terms of cluster purity, with little change as the percentage of labeled examples varies. Only in the case of the Zoo dataset is there a large decline in cluster purity when there are few labeled examples. This is attributed to the fact that the Zoo dataset contains only 101 examples, so 15% labeled data is very small compared to the number of class labels and may not capture representatives of all 7 classes.

In the case of semi supervised classification, the training sets of the benchmark datasets are used to build the classifier, and the accuracy of the classifier is tested on the test set, where the predicted class labels are compared with the true class labels of the test examples. These test results, given in terms of accuracy, are compared with proven classifier models. The models considered for comparison are the Weka implementation of C4.5 and an ensemble method, Bagging. Only one ensemble method is considered for comparison, as the other ensemble methods have similar performance on most of the datasets (Tan et al., 2006: Table 5.5). The results are tabulated in Tables 4 and 5, and sample comparison graphs for one binary and one multi-class dataset are shown in Fig. 4 and Fig. 5. Experiments on the benchmark datasets show that the proposed framework for both clustering and classification performs consistently well when building models on training sets with a varied range (75% to 15%) of labeled examples. When compared to other proven techniques, the proposed framework sustained its performance even when the number of labeled examples was reduced to 15%, thus establishing its validity as a semi supervised learning model. The proposed framework was able to identify the relevant features along with their weights, thus reducing the information requirement for handling unknown situations, be it classification or clustering.

# VI. Conclusion

In this paper, the authors proposed a framework for context-aware semi supervised learning in terms of both clustering and classification.
The proposed framework is useful in domains where labeled data is either scarce or difficult/expensive to obtain. The framework, with its wrapper based feature selection, is very useful in handling high dimensional datasets: with the dimensionality reduced, the cluster and classification solutions are defined with a smaller number of features, which is valuable where there are time and space constraints. The proposed framework not only identifies the relevant features but also estimates the importance of each feature in terms of weights, so that cluster solutions are formed as per the intended purpose. Though the framework has used K-means for the formation of the cluster solution, the proposed SFRE algorithm can be wrapped around any partitional clustering algorithm with equal ease for producing context-aware semi supervised subspace clusters, leveraging a few labeled examples to define the context. Since the model uses discriminant analysis for identifying attributes, it is limited to numerical data. However, in reality, many applications contain mixed data, a combination of numeric and categorical attributes. This opens an avenue for further research to extend the model to work with categorical data.

[Figure: Given a dataset with 'l' number of labeled and 'u' number of unlabeled examples such that l < ...]