NegMiner: An Automated Tool for Mining Negations from Electronic Narrative Medical Documents

Author(s)

Hanan Elazhary 1,2,*

1. Computer Science Department, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

2. Computers and Systems Department, Electronics Research Institute, Cairo, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2017.04.02

Received: 6 Jul. 2016 / Revised: 1 Oct. 2016 / Accepted: 2 Dec. 2016 / Published: 8 Apr. 2017

Index Terms

Data Mining, Medical Documents, Natural Language Processing, Negations, NegEx, NLP

Abstract

Mining negations from electronic narrative medical documents is a prominent data mining application. Because medical documents are written as free text, it is impossible to anticipate every possible sentence structure in advance, so mining algorithms inevitably need frequent updates. Unfortunately, most of the algorithms proposed in the literature are too complex to be updated easily, and most of them cannot be easily ported to other natural languages. The simple NegEx algorithm uses only two regular expressions and sets of terms to mine negations from narrative medical documents, and so does not suffer from these shortcomings. It has also shown impressive mining results, and so it is the most widely adopted algorithm. This paper proposes the Negation Mining (NegMiner) tool to address some of the shortcomings of the NegEx algorithm. NegMiner exploits basic syntactic and semantic information to deal with contiguous and multiple negations. It is a user-friendly tool that facilitates knowledge base updates and document analysis through the use of PDF files. This also enables it to deal with a medical finding that occurs several times in a single sentence. Experimental results show that the mining results of NegMiner are superior to those of the simulated NegEx algorithm.
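To make the "two regular expressions and sets of terms" idea concrete, the following is a minimal Python sketch of a NegEx-style check. It is not the NegEx or NegMiner implementation: the term lists, the function names `is_negated` and `_alternation`, and the five-token window are assumptions chosen for illustration only.

```python
import re

# Hypothetical, tiny term sets for illustration; real lexicons are far larger.
PRE_NEGATION = ["no", "denies", "without", "no evidence of"]
POST_NEGATION = ["is ruled out", "was ruled out", "unlikely"]


def _alternation(phrases):
    # Put longer phrases first so "no evidence of" is tried before "no".
    ordered = sorted(phrases, key=len, reverse=True)
    return "|".join(re.escape(p) for p in ordered)


def is_negated(sentence, finding, window=5):
    """Return True if any occurrence of `finding` looks negated.

    Two patterns are tried (window size is an assumed parameter):
      1. <negation phrase> <up to `window` tokens> <finding>
      2. <finding> <up to `window` tokens> <negation phrase>
    """
    # Up to `window` intervening word tokens, then a separator.
    gap = r"(?:\W+\w+){0,%d}?\W+" % window
    term = re.escape(finding)
    pre = re.compile(
        r"\b(?:%s)%s%s\b" % (_alternation(PRE_NEGATION), gap, term),
        re.IGNORECASE,
    )
    post = re.compile(
        r"\b%s%s(?:%s)\b" % (term, gap, _alternation(POST_NEGATION)),
        re.IGNORECASE,
    )
    return bool(pre.search(sentence) or post.search(sentence))


if __name__ == "__main__":
    for text, finding in [
        ("The patient denies chest pain.", "chest pain"),
        ("Pneumonia was ruled out.", "pneumonia"),
        ("Chest pain was reported.", "chest pain"),
    ]:
        print(finding, "|", text, "|", is_negated(text, finding))
```

Because the sketch reports a single yes/no answer per sentence, it cannot distinguish between several occurrences of the same finding, which is one of the limitations the abstract says NegMiner addresses.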

Cite This Paper

Hanan Elazhary, "NegMiner: An Automated Tool for Mining Negations from Electronic Narrative Medical Documents", International Journal of Intelligent Systems and Applications (IJISA), Vol. 9, No. 4, pp. 14-22, 2017. DOI: 10.5815/ijisa.2017.04.02
