International Journal of Information Technology and Computer Science (IJITCS)

IJITCS Vol. 13, No. 4, Aug. 2021


Table Of Contents

REGULAR PAPERS

Combining Fuzzy Logic and k-Nearest Neighbor Algorithm for Recommendation Systems

By Paul Dayang, Cyrille Sepele Petsou, Damien Wohwe Sambo

DOI: https://doi.org/10.5815/ijitcs.2021.04.01, Pub. Date: 8 Aug. 2021

Recommendation systems help users find relevant, personalized content among a wide variety of possibilities. Several approaches are in common use today, such as the content-based approach, the collaborative filtering approach and the hybrid recommendation approach. However, these approaches are sometimes inappropriate for use cases in which no large prior dataset of user feedback or ratings is available for training machine learning models. In this work, we therefore propose a novel approach that combines Fuzzy Logic with the k-Nearest Neighbor algorithm (KNN). The proposed approach can be applied without any previously collected user feedback and still produces good recommendations. Fuzzy Logic is used to infer values from the inputs and a set of rules; the KNN then uses the output values of the Fuzzy Logic system to perform retrieval based on existing distance measures. To evaluate the approach, we considered an expert system for food recommendation for people suffering from the two deadliest diseases in Cameroon: HIV/AIDS and Malaria. The results obtained are close to the recommendations made by nutritionists, which demonstrates how effectively the approach can address a real nutrition problem for people suffering from Malaria or HIV/AIDS. Furthermore, the approach can be extended to other fields and used as a framework for any recommendation task in which no user feedback or ratings have been collected beforehand.
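The two-stage idea in this abstract (fuzzy inference first, then nearest-neighbor retrieval on the inferred values) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the membership functions, the three-rule base in `fuzzy_need`, the `foods` profiles and the use of Euclidean distance are all hypothetical.

```python
import math

def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_need(score):
    """Map a 0-10 input score to a need level in [0, 1].

    Hypothetical rule base (not the paper's): each rule fires with the
    membership degree of its antecedent and votes for a constant output;
    the result is the firing-strength-weighted average of those outputs.
    """
    rules = [
        (triangular(score, -1, 0, 5), 0.2),   # low score    -> low need
        (triangular(score, 0, 5, 10), 0.5),   # medium score -> medium need
        (triangular(score, 5, 10, 11), 0.9),  # high score   -> high need
    ]
    total = sum(strength for strength, _ in rules)
    return sum(strength * level for strength, level in rules) / total

def nearest_items(target, items, k):
    """Return the k item names whose profiles are closest to target."""
    return sorted(items, key=lambda name: math.dist(target, items[name]))[:k]

# Hypothetical food profiles: (protein content, energy content), scaled to [0, 1].
foods = {
    "beans": (0.9, 0.3),
    "rice": (0.2, 0.8),
    "eggs": (0.8, 0.4),
    "cassava": (0.1, 0.9),
}

# The fuzzy outputs form the query vector; no user ratings are involved.
target = (fuzzy_need(10), fuzzy_need(0))  # high protein need, low energy need
recommended = nearest_items(target, foods, k=2)
```

Because the query vector comes from rules rather than from learned preferences, a sketch like this needs no training data, which is the property the abstract emphasizes.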

Fish Image Classification by XgBoost Based on Gist and GLCM Features

By Prashengit Dhar, Sunanda Guha

DOI: https://doi.org/10.5815/ijitcs.2021.04.02, Pub. Date: 8 Aug. 2021

Classifying fish images is a complex problem in pattern recognition: variation in physical shape, size and orientation makes the task difficult, and selecting appropriate features is a further challenge. Fish image classification is nevertheless important in fishing services, agriculture, the fish industry, fishery surveys and related areas, and it saves time in the assessment and counting of fish. This paper presents a fish image classification method that uses the robust Gist feature together with the Gray Level Co-occurrence Matrix (GLCM) feature. Noise removal and image resizing are applied as pre-processing steps. The Gist and GLCM features are combined into a single feature matrix; the features are also tested separately, but the combined feature vector performs better than either feature alone. Classification is performed on raw images of ten types of fish from two datasets, QUT and F4K. The feature set is used to train different machine learning models. Among them, XgBoost achieves 90.2% and 98.08% accuracy on the QUT and F4K datasets respectively.
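As a rough illustration of the GLCM part of this pipeline, the sketch below builds a co-occurrence matrix for a tiny image and derives three common texture statistics from it. The choice of statistics, the single horizontal offset, the toy `image` and the `gist_placeholder` vector are assumptions for illustration; the abstract does not say which GLCM properties or offsets the authors used, and the Gist descriptor itself is not computed here.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix.

    Counts pairs (image[r][c], image[r + dy][c + dx]); dx and dy are
    assumed non-negative here to keep the sketch short.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows - dy):
        for c in range(cols - dx):
            counts[image[r][c]][image[r + dy][c + dx]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM.

    Energy is taken here as the sum of squared entries (also called the
    angular second moment); some libraries use its square root instead.
    """
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    return [contrast, energy, homogeneity]

# Toy 3x3 image quantized to three gray levels (hypothetical data).
image = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
texture = glcm_features(glcm(image, levels=3))

# Stand-in for a Gist descriptor; a real one would come from a filter bank.
gist_placeholder = [0.12, 0.40, 0.33]

# Combining the two descriptors is plain concatenation into one feature vector.
combined = gist_placeholder + texture
```

A combined vector of this kind, one per image, is what a classifier such as a gradient-boosted tree model would be trained on.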

Credit Card Fraud Detection System Using Machine Learning

By Angela Makolo, Tayo Adeboye

DOI: https://doi.org/10.5815/ijitcs.2021.04.03, Pub. Date: 8 Aug. 2021

The security of any system is a key factor in its acceptance by the general public. We propose an intuitive machine learning approach to fraud detection in financial institutions: a Hybrid Credit Card Fraud Detection (HCCFD) system that performs anomaly detection, applying a genetic algorithm and a multivariate normal distribution to identify fraudulent credit card transactions. The HCCFD was trained on an imbalanced dataset of credit card transactions with a target variable indicating whether a transaction is fraudulent. With the F-score as the performance metric, the model was tested and achieved a prediction accuracy of 93.5%, against the artificial neural network, decision tree and support vector machine, which scored 84.2%, 80.0% and 68.5% respectively when trained on the same dataset. These results show a significant improvement over the other widely used algorithms.
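The anomaly-detection step mentioned in this abstract can be sketched as density estimation with a multivariate normal distribution: fit the distribution to ordinary transactions, then flag any transaction whose density falls below a threshold. This is only a sketch under stated assumptions. It uses two hypothetical features so the 2x2 covariance inverse can be written in closed form, the threshold `EPSILON` is fixed by hand, and the genetic-algorithm component of the HCCFD is not shown because the abstract does not describe how it is used.

```python
import math

def fit_gaussian_2d(points):
    """Estimate the mean and covariance of 2-D points (maximum likelihood)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def density(point, mean, cov):
    """Bivariate normal density at point for the given mean and covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T * inverse(cov) * d, using the closed-form 2x2 inverse.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

def is_anomalous(point, mean, cov, epsilon):
    """Flag a point whose density under the fitted model is below epsilon."""
    return density(point, mean, cov) < epsilon

# Hypothetical standardized features of ordinary transactions.
normal_transactions = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
mean, cov = fit_gaussian_2d(normal_transactions)

EPSILON = 1e-3  # hand-picked threshold (assumption); in practice it is tuned
typical_flag = is_anomalous((0, 0), mean, cov, EPSILON)
outlier_flag = is_anomalous((5, 5), mean, cov, EPSILON)
```

On an imbalanced dataset the threshold would normally be tuned against a metric such as the F-score on labeled validation data, rather than fixed by hand as it is here.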

An Optimization of Feature Selection for Classification using Modified Bat Algorithm

By V. Yasaswini, Santhi Baskaran

DOI: https://doi.org/10.5815/ijitcs.2021.04.04, Pub. Date: 8 Aug. 2021

Data mining is the process of searching large existing databases to extract new and useful information. It now plays a vital role in many fields, such as medicine, engineering, banking, education and fraud detection. In this paper, feature selection, which is a part of data mining, is performed in support of classification. The role of feature selection is considered in the context of deep learning and in relation to feature engineering. Feature selection is a preprocessing technique that selects the appropriate features from a dataset so that classification yields accurate results. Nature-inspired optimization algorithms such as Ant Colony, Firefly, Cuckoo Search and Harmony Search have shown better performance, giving the best accuracy rates with fewer selected features and good F-measure values. These algorithms are used to perform classification that accurately predicts the target class for each case in the dataset. We propose a technique that optimizes feature selection for classification using metaheuristic algorithms. We applied a recent optimization algorithm, the Modified Bat algorithm, to University of California Irvine (UCI) datasets; it produced results comparable to those of the best-performing existing algorithm, Firefly, but with fewer selected features. The work is implemented in Java and uses medical datasets, which were chosen for their nominal class features. The numbers of attributes, instances and classes vary across the chosen datasets to represent different combinations. Classification is done using the J48 classifier in the WEKA tool. We thoroughly compare the results of the algorithm used here with those of the existing algorithms.
The significance of this research is its impact on selecting the best features from all the existing features, which yields the best accuracy rates and helps in extracting information from raw data in the data mining domain. Its value lies in serving major fields such as medicine and banking, which need exact and proper results. The main contribution of the research is to optimize feature selection to achieve maximum predictive accuracy on the datasets, handling both single-variable and multi-variable functions by generating a binary structuring of the features in the dataset, and to improve classification performance by using nature-inspired and metaheuristic algorithms.

SBIoT: Scalable Broker Design for Real Time Streaming Big Data in the Internet of Things Environment

By Halil ARSLAN, Mustafa YALCIN, Yasin SAHAN

DOI: https://doi.org/10.5815/ijitcs.2021.04.05, Pub. Date: 8 Aug. 2021

Thanks to recent developments in technology, the number of IoT devices has increased dramatically, and industries have started to use IoT devices in their business processes. Many tasks can be automated with them. For this purpose, a server processes the sensor data, and transferring these data to the server without loss is crucial for the accuracy of IoT applications. Therefore, in this thesis, a scalable broker for real-time streaming data is proposed. Open source technologies, namely NoSQL and in-memory databases, queueing, full-text index search, virtualization and container management orchestration algorithms, are used to increase the efficiency of the broker. It is initially planned to be used at the biggest airport in Turkey to determine staff locations. According to the experimental analysis, the proposed system is capable of transferring the data produced by the devices in that airport. In addition, the system can adapt to growth in the number of devices: if the number of devices increases over time, the number of nodes can be increased to capture more data.
