Masoumeh Zareapoor

Work place: Department of Computer Science, Jamia Hamdard, New Delhi, India

E-mail: mzarea@jamiahamdard.ac.in

Biography

Masoumeh Zareapoor is a Ph.D. student at Jamia Hamdard University, New Delhi, India. She received her Master's degree in computer science from Jamia Hamdard University in 2010.

Author Articles
Feature Extraction or Feature Selection for Text Classification: A Case Study on Phishing Email Detection

By Masoumeh Zareapoor, Seeja K. R

DOI: https://doi.org/10.5815/ijieeb.2015.02.08, Pub. Date: 8 Mar. 2015

Dimensionality reduction is generally performed when high-dimensional data such as text are classified. It can be done with either feature extraction techniques or feature selection techniques. This paper analyses which of the two approaches is better for classifying text data such as emails. Email classification is difficult because its high-dimensional, sparse features degrade the generalization performance of classifiers. In phishing email detection, dimensionality reduction techniques are used to keep the most informative and discriminative features from a collection of emails, consisting of both phishing and legitimate messages, for better detection. Two feature selection techniques (Chi-Square and Information Gain Ratio) and two feature extraction techniques (Principal Component Analysis and Latent Semantic Analysis) are used for the analysis. It is found that feature extraction techniques offer better classification performance, give stable results as the number of chosen features varies, and maintain that performance robustly over time.

Other Articles