Domain Based Ontology and Automated Text Categorization Based on Improved Term Frequency – Inverse Document Frequency

Full Text (PDF, 499KB), PP.28-35


Author(s)

Sukanya Ray 1,*, Nidhi Chandra 1

1. Amity School of Engineering & Technology, Amity University, Noida (U.P.), India

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2012.04.04

Received: 6 Jan. 2012 / Revised: 25 Feb. 2012 / Accepted: 10 Mar. 2012 / Published: 8 Apr. 2012

Index Terms

Term Frequency – Inverse Document Frequency, Ontology, Dependency Graph, Text Categorization

Abstract

In recent years there has been massive growth in textual information, especially on the internet. People now tend to read e-books more than hard copies. When searching the internet for a topic, especially a new one, it is easier if the reader knows the prerequisites and post-requisites of that topic. Documents are often found without a proper title, and it later becomes difficult to tell which document covers which topic. A text categorization method can solve this problem. In this paper, a domain based ontology is created so that users can relate the different topics of a domain, and an automated text categorization technique is proposed to categorize uncategorized documents. The proposed technique is based on the Term Frequency – Inverse Document Frequency (tf-idf) method, and the domain based ontology also provides a dependency graph so that users can visualize the relations among the terms.
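
As an illustration of the general idea, the sketch below categorizes a document with the standard tf-idf weighting and cosine similarity. It is not the improved weighting proposed in the paper, which the abstract does not spell out. The function names (`tokenize`, `cosine`, `categorize`) and the sample category texts are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(text, category_texts):
    # category_texts: hypothetical mapping of category name -> sample text.
    cat_tokens = {c: tokenize(t) for c, t in category_texts.items()}
    n = len(cat_tokens)

    # Document frequency: in how many categories each term occurs.
    df = Counter()
    for tokens in cat_tokens.values():
        df.update(set(tokens))
    idf = {t: math.log(n / d) for t, d in df.items()}

    def weigh(tokens):
        # Standard tf-idf: relative term frequency times idf.
        # Terms unseen in every category get weight 0.
        counts = Counter(tokens)
        return {t: (c / len(tokens)) * idf.get(t, 0.0)
                for t, c in counts.items()}

    cat_vecs = {c: weigh(tokens) for c, tokens in cat_tokens.items()}
    doc_vec = weigh(tokenize(text))
    # Pick the category whose vector is most similar to the document.
    return max(cat_vecs, key=lambda c: cosine(doc_vec, cat_vecs[c]))


# Made-up sample data, for illustration only.
categories = {
    "networks": "router packet protocol packet network routing",
    "databases": "query index table transaction query database",
}
label = categorize("the packet reached the router", categories)
```

With this sample data, `label` is `"networks"`, because only that category shares weighted terms with the document. A term that occurs in every category receives an idf of zero and so does not influence the result.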

Cite This Paper

Sukanya Ray, Nidhi Chandra, "Domain Based Ontology and Automated Text Categorization Based on Improved Term Frequency – Inverse Document Frequency", IJMECS, vol.4, no.4, pp.28-35, 2012. DOI:10.5815/ijmecs.2012.04.04
