Empirical Analysis of Cervical and Breast Cancer Prediction Systems using Classification

Author(s)

Prabhjot Kaur 1,* Yashita Pruthi 1 Vidushi Bhatia 1 Janmjay Singh 1

1. Department of Information Technology, MSIT, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2019.03.01

Received: 30 Jan. 2019 / Revised: 21 Mar. 2019 / Accepted: 25 Apr. 2019 / Published: 8 May 2019

Index Terms

Cancer Prediction Systems, Cervical Cancer, Breast Cancer

Abstract

Cancer is a life-threatening disease with a high mortality rate. In the Indian subcontinent, women are more likely than men to be diagnosed with cancer. The most common cancers among Indian women are breast cancer and cervical cancer, and both have high survival rates when detected early. This paper reviews the attributes used in existing datasets for predicting these two cancers. It also proposes new attributes that address the limitations of the existing ones, with the aim of making cancer prediction systems more effective. The efficiency of the existing and proposed attributes is compared by running the datasets through data mining algorithms in the WEKA tool: J48 (Decision Tree), Naïve Bayes, Random Forest, Random Tree, KStar, and Bagging. The empirical analysis reports an improvement in the efficiency of cancer prediction over existing prediction systems.
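The comparison described in the abstract, one classifier evaluated once on the existing attributes and once with proposed attributes added, can be illustrated with a minimal sketch. This is not the authors' WEKA workflow: it is a small pure-Python categorical Naive Bayes, and the use of 3-fold cross-validation, the function names, and the data are all assumptions made for illustration. The toy rows are synthetic, with two placeholder "existing" attributes that carry no signal and one placeholder "proposed" attribute that fully determines the class.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def train_nb(rows, labels):
    """Count class and per-attribute value frequencies for categorical Naive Bayes."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value -> count
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict_nb(model, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):  # fixed order, so ties resolve deterministically
        n = class_counts[label]
        score = math.log(n / total)
        for i, value in enumerate(row):
            smoothed = (value_counts[(i, label)][value] + 1) / (n + len(values_seen[i]))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_val_accuracy(rows, labels, folds=3):
    """Estimate accuracy with k-fold cross-validation (row i goes to fold i % folds)."""
    correct = 0
    for fold in range(folds):
        train = [i for i in range(len(rows)) if i % folds != fold]
        test = [i for i in range(len(rows)) if i % folds == fold]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_nb(model, rows[i]) == labels[i] for i in test)
    return correct / len(rows)


def select(rows, columns):
    """Keep only the given attribute columns."""
    return [tuple(row[c] for c in columns) for row in rows]


# Synthetic toy data, NOT taken from the paper: columns 0 and 1 stand in for
# existing attributes, column 2 for a hypothetical proposed attribute.
rows = list(product("01", repeat=3)) * 3
labels = ["positive" if row[2] == "1" else "negative" for row in rows]

baseline_acc = cross_val_accuracy(select(rows, [0, 1]), labels)
extended_acc = cross_val_accuracy(select(rows, [0, 1, 2]), labels)
print(f"existing attributes only: {baseline_acc:.2f}")
print(f"existing + proposed:      {extended_acc:.2f}")
```

On this deliberately contrived data the script prints 0.50 for the existing attributes and 1.00 once the placeholder proposed attribute is added. The numbers say nothing about real cancer datasets; they only show the shape of the attribute-set comparison.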

Cite This Paper

Prabhjot Kaur, Yashita Pruthi, Vidushi Bhatia, Janmjay Singh, "Empirical Analysis of Cervical and Breast Cancer Prediction Systems using Classification", International Journal of Education and Management Engineering (IJEME), Vol. 9, No. 3, pp. 1-15, 2019. DOI: 10.5815/ijeme.2019.03.01
