Sanjay B. Ankali

Workplace: KLE College of Engineering and Technology, Chikodi, India-591201

E-mail: sanjayankali123@gmail.com

Research Interests: Software Engineering

Biography

Sanjay Ankali is a research scholar at VTU-RRC, Belagavi-590018, and works as an Assistant Professor in the Department of CSE at KLECET, Chikodi, India-591201. His research interests are software engineering, software clone detection, and code plagiarism detection.

Author Articles
Detection and Classification of Cross-language Code Clone Types by Filtering the Nodes of ANTLR-generated Parse Tree

By Sanjay B. Ankali, Latha Parthiban

DOI: https://doi.org/10.5815/ijisa.2021.03.05, Pub. Date: 8 Jun. 2021

A complete and accurate cross-language clone detection tool can support the software forking process, which reuses the more reliable algorithms of legacy systems from one language's code base in another. Cross-language clone detection also helps in building code recommendation systems. This paper proposes a new technique to detect and classify cross-language clones of C and C++ programs by filtering the nodes of an ANTLR-generated parse tree using a common grammar file, CPP14.g4. Parsing the input files using CPP14.g4 provides all the lexical and semantic information of the input source code. Selective filtering of nodes serializes the two parse trees. A vector representation of the resulting tree, built with term frequency-inverse document frequency (TF-IDF), is given as input to cosine similarity to classify the clone types. Filtering the parse trees of C and C++ increases precision from 51% to 61%, and matching based on renaming the input/output expressions provides average precision of 91.97% and 95.37% for small-scale and large-scale repositories, respectively. The proposed cross-language clone detection exhibits its highest precision of 95.37% in finding all types of clones (1, 2, 3 and 4) for 16,032 semantically similar clone pairs of C and C++ code.
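
The TF-IDF and cosine similarity step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothed IDF formula, and the node labels in the example are assumptions made for illustration, and the similarity thresholds that map a score to a clone type are not given here, so none are shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists (e.g. serialized parse-tree node labels) into
    sparse TF-IDF vectors, returned as dicts mapping term -> weight.

    Assumption: a smoothed IDF, log((1 + n) / (1 + df)) + 1, so that
    terms present in every document keep a non-zero weight.
    """
    n = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical node labels standing in for two filtered, serialized parse trees.
c_tree = ["functiondefinition", "iterationstatement",
          "additiveexpression", "jumpstatement"]
cpp_tree = ["functiondefinition", "iterationstatement",
            "additiveexpression", "postfixexpression", "jumpstatement"]

vec_c, vec_cpp = tfidf_vectors([c_tree, cpp_tree])
score = cosine_similarity(vec_c, vec_cpp)  # value between 0 and 1
```

A score near 1 indicates that the two serialized trees share most of their node labels; how scores are bucketed into clone types 1 to 4 is left to the paper.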

A Methodology for Reliable Code Plagiarism Detection Using Complete and Language Agnostic Code Clone Classification

By Sanjay B. Ankali, Latha Parthiban

DOI: https://doi.org/10.5815/ijmecs.2021.03.04, Pub. Date: 8 Jun. 2021

Code clone detection plays a vital role in both industry and academia. The last three decades have seen more than 250 clone detection techniques, yet no single framework detects and classifies all 4 basic types of code clones with high precision. This gap in clone classification largely affects universities and online learning platforms, which fail to validate the projects or coding assignments submitted online. In this paper, we propose a complete and language-agnostic technique to detect and classify all 4 clone types in C, C++, and Java programs. The method first generates the parse tree and then extracts the functional tree, eliminating the need for the preprocessing stage employed by previous clone detection techniques. The generated parse tree contains all the information necessary for detecting code clones. We employ TF-IDF cosine similarity to classify the clone types. The proposed technique achieves a precision of 100% in detecting the first two types of clones and 98% precision in detecting type-3 and type-4 clones for small C, C++, and Java programs with an average line count of 5. The proposed technique outperforms the existing tree-based clone detection tools, providing an average precision of 98.07% on C, C++, and Java programs crawled from GitHub with an average line count of 15. This indicates that the cosine similarity measure on the ANTLR functional tree accurately detects all 4 types of small clones and can act as a validation tool for identifying the learning level in submitted programming assignments.

Other Articles