Probability of Semantic Similarity and N-grams Pattern Learning for Data Classification
Keywords:
semantic similarity, classification, naive bayes, n-grams pattern
Abstract
Semantic learning is an important mechanism for document classification, but most classification approaches consider only content and word distribution. Traditional classification algorithms cannot accurately represent the meaning of a document because they do not take into account semantic relations between words. In this paper, we present an approach to document classification that incorporates two similarity scoring methods: first, a semantic similarity method that computes a probabilistic similarity based on Bayes' method, and second, an n-gram pair method based on a probability similarity score over frequent terms. Since semantic similarity and n-gram pairs each play an important role as separate views for classifying a document, we design a semantic similarity learning (SSL) algorithm to improve the performance of document classification for a large quantity of unclassified documents. The experimental evaluation shows an improvement in the accuracy and effectiveness of the proposed approach on unclassified documents.
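The abstract does not give the details of the SSL algorithm, so the following is only a minimal sketch of the general idea of combining a Bayes-style word probability score with an n-gram overlap score. All function names, the whitespace tokenizer, the Laplace smoothing, the use of bigrams, and the equal `weight` are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n=2):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def train(docs_by_class, n=2):
    """Collect per-class word counts and n-gram counts from labelled documents."""
    model = {}
    for label, docs in docs_by_class.items():
        words, grams = Counter(), Counter()
        for doc in docs:
            tokens = doc.lower().split()  # assumed tokenizer: lowercase + whitespace
            words.update(tokens)
            grams.update(ngrams(tokens, n))
        model[label] = (words, grams)
    return model

def bayes_score(tokens, words, vocab_size):
    """Average per-token log-likelihood under the class, with Laplace smoothing."""
    total = sum(words.values())
    logp = sum(math.log((words[t] + 1) / (total + vocab_size)) for t in tokens)
    return logp / max(len(tokens), 1)

def ngram_score(tokens, grams, n=2):
    """Fraction of the document's n-grams that also occur in the class."""
    doc_grams = ngrams(tokens, n)
    if not doc_grams:
        return 0.0
    return sum(1 for g in doc_grams if g in grams) / len(doc_grams)

def classify(doc, model, weight=0.5, n=2):
    """Pick the class with the highest weighted sum of the two scores."""
    tokens = doc.lower().split()
    vocab = set()
    for words, _ in model.values():
        vocab.update(words)
    scores = {}
    for label, (words, grams) in model.items():
        # exp() maps the average log-likelihood into (0, 1], the same range
        # as the n-gram overlap fraction, so the two terms are comparable.
        word_term = math.exp(bayes_score(tokens, words, len(vocab)))
        gram_term = ngram_score(tokens, grams, n)
        scores[label] = weight * word_term + (1 - weight) * gram_term
    return max(scores, key=scores.get)

# Toy training data, invented for this example only.
model = train({
    "sports": ["the team won the match", "the player scored a goal"],
    "finance": ["the bank raised interest rates", "stock market prices fell"],
})
label = classify("the team scored a goal", model)
```

In this sketch the word-level term and the n-gram term act as two separate views of the same document, and `weight` controls how much each view contributes. A real system would need a proper tokenizer, class priors, and a tuned weight.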
Published
2017-05-15
License
Copyright (c) 2017 Authors and Global Journals Private Limited
This work is licensed under a Creative Commons Attribution 4.0 International License.