Semantic learning is an important mechanism for document classification, but most classification approaches consider only document content and word distribution. Traditional classification algorithms cannot accurately represent the meaning of a document because they do not take into account the semantic relations between words. In this paper, we present an approach to document classification that incorporates two similarity scoring methods: first, a semantic similarity method that computes the probable similarity based on Bayes' method, and second, a similarity score over n-gram pairs based on the probability of frequent terms. Since semantic similarity and n-gram pairs can each play an important role, from separate views, in classifying a document, we design a semantic similarity learning (SSL) algorithm to improve the performance of document classification for a large quantity of unclassified documents. The experimental evaluation shows an improvement in the accuracy and effectiveness of the proposal on unclassified documents.
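
The idea of combining the two scores can be illustrated with a minimal sketch. The abstract does not give the SSL algorithm itself, so this is not the authors' method. It assumes a multinomial naive Bayes posterior as the Bayes-based score, the share of a document's bigrams found among a class's most frequent bigrams as the n-gram pair score, and a linear mix of the two. All names (`train`, `bayes_posteriors`, `ngram_score`, `classify`), the parameters `alpha` and `top_k`, and the toy training data are hypothetical.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def bigrams(words):
    """Adjacent word pairs, standing in for the abstract's n-gram pairs."""
    return list(zip(words, words[1:]))


def train(labeled_docs):
    """Collect per-class word counts, bigram counts, and document counts."""
    model = {}
    for text, label in labeled_docs:
        words = tokens(text)
        entry = model.setdefault(
            label, {"words": Counter(), "bigrams": Counter(), "docs": 0}
        )
        entry["words"].update(words)
        entry["bigrams"].update(bigrams(words))
        entry["docs"] += 1
    return model


def bayes_posteriors(model, words):
    """Naive Bayes class posteriors with add-one (Laplace) smoothing."""
    vocab = set().union(*(m["words"] for m in model.values()))
    total_docs = sum(m["docs"] for m in model.values())
    log_scores = {}
    for label, m in model.items():
        n = sum(m["words"].values())
        log_p = math.log(m["docs"] / total_docs)  # class prior
        for w in words:
            log_p += math.log((m["words"][w] + 1) / (n + len(vocab)))
        log_scores[label] = log_p
    # Normalize in log space to avoid underflow on long documents.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: v / z for label, v in unnorm.items()}


def ngram_score(model, words, top_k=50):
    """Share of the document's bigrams found among each class's top_k bigrams."""
    doc_bigrams = bigrams(words)
    scores = {}
    for label, m in model.items():
        frequent = {bg for bg, _ in m["bigrams"].most_common(top_k)}
        hits = sum(1 for bg in doc_bigrams if bg in frequent)
        scores[label] = hits / len(doc_bigrams) if doc_bigrams else 0.0
    return scores


def classify(model, text, alpha=0.5):
    """Mix both scores; alpha weights the Bayes-based score (assumed value)."""
    words = tokens(text)
    posterior = bayes_posteriors(model, words)
    overlap = ngram_score(model, words)
    combined = {
        label: alpha * posterior[label] + (1 - alpha) * overlap[label]
        for label in model
    }
    return max(combined, key=combined.get), combined


# Toy data invented purely for illustration.
TRAINING_DOCS = [
    ("the team won the football match", "sports"),
    ("the player scored a goal in the match", "sports"),
    ("the new phone has a fast processor", "tech"),
    ("software update improves the phone battery", "tech"),
]

model = train(TRAINING_DOCS)
label, scores = classify(model, "the team scored a goal")
print(label, scores)
```

Keeping the two scores separate until the final mix reflects the abstract's point that the semantic and n-gram views contribute independently. A real system would tune `alpha` and `top_k` on held-out data and would likely replace the whitespace tokenizer.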

How to Cite
VINEETH KUMAR, DR. N SATYANARAYANA., V. Probability of Semantic Similarity and N-grams Pattern Learning for Data Classification. Global Journal of Computer Science and Technology, [S.l.], may 2017. ISSN 0975-4172. Available at: <https://computerresearch.org/index.php/computer/article/view/1532>. Date accessed: 19 aug. 2019.