IJIEEB Vol. 9, No. 5, 8 Sep. 2017
Classification, diabetes diagnosis, stability selection
Diabetes is a chronic metabolic disease characterized by elevated blood glucose levels. According to current data from the World Health Organization, 422 million adults worldwide have diabetes, and its prevalence is 13.2%. Neglecting diagnosis and treatment leads to major problems affecting the kidneys, heart and blood vessels, eyes, nerves, pregnancy, and wound healing. Type 2 diabetes, the most common type and usually seen in adults, occurs when the body becomes resistant to insulin or does not make enough insulin. The main objective of this study is to make the diagnosis of this disease more successful by assessing the importance of attributes with the Stability Selection method. The proposed method might be a powerful tool for assessing the importance of attributes and for effective diagnosis of this disease, with a classification accuracy of 78.57% and an ROC value of 0.75.
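As a rough illustration of the idea behind stability selection, the sketch below repeatedly draws random subsamples of a dataset, selects the top-ranked attributes on each subsample, and reports how often each attribute was selected. The abstract does not state which base selector the study uses, so a simple absolute-correlation ranking is swapped in here as an assumption. The function names (`abs_correlation`, `stability_scores`), the parameters (`top_k`, `n_rounds`, `fraction`), and the synthetic data are all hypothetical and are not taken from the paper or from the Pima Indians dataset.

```python
import random
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no ranking information
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def stability_scores(X, y, top_k=2, n_rounds=200, fraction=0.5, seed=0):
    """Fraction of subsampling rounds in which each attribute ranks in the top_k.

    Assumption: attributes are ranked by absolute correlation with the label,
    a stand-in for whatever base selector the study actually uses.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    counts = [0] * n_features
    for _ in range(n_rounds):
        # Draw a random subsample of rows without replacement.
        idx = rng.sample(range(n_samples), int(fraction * n_samples))
        y_sub = [y[i] for i in idx]
        scores = [
            abs_correlation([X[i][j] for i in idx], y_sub)
            for j in range(n_features)
        ]
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    return [c / n_rounds for c in counts]


# Synthetic data (hypothetical): attributes 0 and 1 depend on the label,
# attributes 2 and 3 are pure noise.
data_rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = data_rng.randint(0, 1)
    X.append([
        data_rng.gauss(2.0 * label, 1.0),  # informative
        data_rng.gauss(1.5 * label, 1.0),  # informative
        data_rng.gauss(0.0, 1.0),          # noise
        data_rng.gauss(0.0, 1.0),          # noise
    ])
    y.append(label)

scores = stability_scores(X, y)
for j, s in enumerate(scores):
    print(f"attribute {j}: selected in {s:.0%} of rounds")
```

Attributes that are selected in most rounds are the ones a stability-based analysis would treat as important. Attributes selected only occasionally are likely artifacts of a particular subsample.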
Kemal Akyol, "Assessing the Importance of Attributes for Diagnosis of Diabetes Disease", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.9, No.5, pp.1-9, 2017. DOI:10.5815/ijieeb.2017.05.01