IJEM Vol. 14, No. 6, Dec. 2024
REGULAR PAPERS
Indonesian Sign Language (BISINDO) is a visual language used by people with hearing impairments. Sign language gestures can represent hundreds of thousands of Indonesian vocabulary words. However, because deaf people in Indonesia number only about seven million, or roughly 3% of the population, sign language remains unfamiliar and challenging for many hearing people or laypeople to understand. This study aims to classify and detect sign language vocabulary gestures in real time on mobile devices. Recognizing variations in gestures requires classification techniques such as supervised machine learning. This research uses the convolutional neural network method, with the Single Shot Detector (SSD) architecture for object detection and the MobileNet architecture for classification. The objects are 32 gesture vocabulary words from the lyrics of the song 'Bidadari Tak Bersayap', with a dataset of 17,600 images. The images are then divided into two models according to whether the data are biased or non-biased, comprising 8 and 24 classes, respectively. The biased model produced 15 correct predictions out of 16 and the non-biased model 36 out of 48, giving real-time mobile testing accuracies of 93.75% and 75%, respectively.
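The abstract does not include an implementation, but the classification half of such a pipeline can be illustrated with a minimal, hypothetical Keras sketch that uses the stock MobileNet backbone over the 24 non-biased gesture classes. The class count, image size, and dataset layout here are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch: MobileNet-based gesture classifier (not the authors' code).
# Assumes gesture crops are stored as data/<class_name>/<image>.jpg.
import tensorflow as tf

NUM_CLASSES = 24          # the paper's non-biased model; 8 for the biased model
IMG_SIZE = (224, 224)     # MobileNet's default input resolution

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.MobileNet(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False    # freeze the ImageNet backbone for transfer learning

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNet expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```

For on-device use, a trained Keras model of this kind is typically converted with tf.lite.TFLiteConverter.from_keras_model before deployment to Android; the abstract does not state which deployment route the authors took.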
Malaria remains a pervasive global health challenge, affecting millions of lives daily. Traditional diagnostic methods, which involve manual blood smear examination, are time-consuming and prone to errors, especially in large-scale testing. Although promising, automated detection techniques often fail to capture the intricate spatial features of malaria parasites, leading to inconsistent performance. To close these gaps, this work proposes an improved technique that combines a Self-Attention Mechanism with a Dilated Convolutional Neural Network (D-CNN), allowing the model to effectively and precisely classify blood cell images as infected or uninfected. Dilated convolutions capture both local and global spatial information, and the attention mechanism prioritizes crucial features for accurate detection in complex images. We also examine batch size variation and find that it plays a crucial role in maximizing generalization, accuracy, and resource efficiency. Of six batch sizes tested, a batch size of 64 produced superior results, yielding an AUC of 99.12%, F1-score of 96%, precision of 97.63%, recall of 93.99%, and accuracy of 96.08%. This batch size balances efficient gradient updates and stabilization, reducing overfitting and improving generalization, especially on complex medical datasets. Our approach was benchmarked against existing competitors on the same publicly available malaria dataset, demonstrating a 2-3% improvement in AUC and precision over state-of-the-art models such as traditional CNNs and machine learning methods. This highlights its superior ability to minimize false positives and negatives, particularly in complex diagnostic cases. These advancements enhance the reliability of large-scale diagnostic systems, improve clinical decision-making, and address key challenges in automated malaria detection.
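As a rough illustration of how dilated convolutions and a self-attention block can be combined for binary parasite classification, a minimal Keras sketch follows. The layer sizes, dilation rates, input resolution, and attention head count are assumptions for illustration, not the authors' configuration.

```python
# Illustrative dilated-CNN + self-attention binary classifier
# (assumed architecture; not the paper's exact model).
import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input(shape=(128, 128, 3))            # assumed cell-image size
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
# Dilated convolutions enlarge the receptive field without extra pooling,
# so the model sees both local texture and wider spatial context.
x = layers.Conv2D(64, 3, padding="same", dilation_rate=2, activation="relu")(x)
x = layers.Conv2D(64, 3, padding="same", dilation_rate=4, activation="relu")(x)
x = layers.MaxPooling2D()(x)

# Self-attention over spatial positions: flatten the H*W grid into a sequence
# so each location can attend to every other location.
h, w, c = x.shape[1], x.shape[2], x.shape[3]
seq = layers.Reshape((h * w, c))(x)
attn = layers.MultiHeadAttention(num_heads=4, key_dim=c // 4)(seq, seq)
seq = layers.Add()([seq, attn])                        # residual connection
x = layers.GlobalAveragePooling1D()(seq)

x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)     # infected vs. uninfected

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc"), "accuracy"])
# The abstract reports that a batch size of 64 gave the best trade-off, e.g.:
# model.fit(x_train, y_train, validation_split=0.1, epochs=20, batch_size=64)
```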
Multi-view CNNs have recently gained popularity in image classification applications. In particular, computer vision has attracted considerable attention due to its numerous potential uses in food quality management. Among the dry fruits grown in India, the cashew nut is a significant crop, and high-quality cashew nuts are especially popular on the worldwide market. Although there are a variety of approaches for automatically identifying cashew nuts, the majority concentrate on a single-view image of the cashew nut. The fundamental issue with current methods for recognizing whole and split cashew nuts is that a single-view image cannot capture the entire nut, resulting in low classification accuracy. We propose a multi-view CNN as a novel framework for classifying three types of cashew nuts. Images of each sample cashew nut are taken from three distinct angles (top, left, and right) and fed into the proposed modified CNN architecture. For categorization, the modified CNN extracts and combines features from these three images, achieving an accuracy of 98.87%.
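A minimal sketch of the multi-view idea is shown below: three view images share one CNN backbone, and their feature vectors are concatenated before the final softmax. The backbone, input size, and class labels are assumptions; the authors' modified CNN architecture is not specified in the abstract.

```python
# Illustrative multi-view CNN: three views of one cashew nut, one shared backbone,
# concatenated features, and a 3-class softmax head (assumed configuration).
import tensorflow as tf
from tensorflow.keras import layers

def make_backbone(input_shape=(128, 128, 3)):
    """A small shared CNN that maps one view to a feature vector."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    return tf.keras.Model(inp, x, name="shared_view_encoder")

backbone = make_backbone()

# One input per camera angle; the same encoder (shared weights) processes each view.
top_view = layers.Input(shape=(128, 128, 3), name="top")
left_view = layers.Input(shape=(128, 128, 3), name="left")
right_view = layers.Input(shape=(128, 128, 3), name="right")

features = layers.Concatenate()(
    [backbone(top_view), backbone(left_view), backbone(right_view)])
x = layers.Dense(128, activation="relu")(features)
outputs = layers.Dense(3, activation="softmax")(x)   # three cashew nut classes

model = tf.keras.Model([top_view, left_view, right_view], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Sharing one encoder across the three angles keeps the parameter count close to that of a single-view model while still letting the classifier see the whole nut.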
Diabetic Retinopathy (DR) is a severe eye condition arising from long-term diabetes mellitus. Timely detection is essential to prevent it from progressing to more advanced stages. Manual detection of DR is labor-intensive and time-consuming, requiring expertise and extensive image analysis. Our research aims to develop a robust, automated deep learning model to assist healthcare professionals by streamlining the detection process and improving diagnostic accuracy. This research proposes a multi-classification framework using transfer learning for diabetic retinopathy grading among diabetic patients. An image-based dataset, APTOS 2019 Blindness Detection, is utilized for model training and testing. Our methodology involves three key preprocessing steps: 1) cropping to remove extraneous background regions, 2) contrast enhancement using CLAHE (Contrast Limited Adaptive Histogram Equalization), and 3) resizing to a consistent dimension of 224x224x3. To address class imbalance, we applied SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset. Data augmentation techniques such as rotation, zooming, shifting, and brightness adjustment are used to further enhance the model's generalization. The dataset is split into a 70:10:20 ratio for training, validation, and testing. For classification, two transfer learning models, EfficientNetB3 and Xception, are used after fine-tuning, which includes the addition of dense, dropout, and fully connected layers. Hyperparameters such as batch size, number of epochs, and optimizer were adjusted prior to model training. The performance of our model is evaluated using various metrics, including accuracy, specificity, and sensitivity. Results reveal the highest test accuracy of 95.16% on the APTOS dataset for grading diabetic retinopathy into five classes using the EfficientNetB3 model, followed by a test accuracy of 92.66% using the Xception model. Our top-performing model, EfficientNetB3, was compared against various state-of-the-art approaches, including DenseNet-169, hybrid models, and ResNet-50, and outperformed all of them.
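A condensed sketch of the described preprocessing and fine-tuning steps is given below, using OpenCV's CLAHE and Keras' EfficientNetB3. The cropping heuristic, CLAHE parameters, and added head layers are illustrative assumptions rather than the paper's exact settings; SMOTE and the augmentation pipeline are omitted for brevity.

```python
# Illustrative preprocessing + EfficientNetB3 fine-tuning sketch (assumed settings).
import cv2
import tensorflow as tf
import numpy as np

def preprocess_fundus(path, size=224):
    """Crop the dark background, apply CLAHE to the L channel, resize to 224x224x3."""
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ys, xs = np.where(gray > 10)                 # crude crop of the black border
    img = img[ys.min():ys.max(), xs.min():xs.max()]
    l, a, b = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2LAB))
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    img = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
    return cv2.resize(img, (size, size))

# Fine-tuned EfficientNetB3 with added pooling, dropout, and dense layers.
base = tf.keras.applications.EfficientNetB3(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),   # five DR severity grades
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```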
This paper investigates the application of EfficientNetV2, an advanced variant of EfficientNet, to diabetic retinopathy (DR) detection, a critical area in medical image analysis. Despite the extensive use of deep learning models in this domain, EfficientNetV2’s potential remains largely unexplored. The study conducts comprehensive experiments comparing EfficientNetV2 with established models such as AlexNet, GoogleNet, and various ResNet architectures. A dataset of 3,662 images was used to train the models. Results indicate that EfficientNetV2 achieves competitive performance, particularly excelling in sensitivity, a crucial metric in medical image classification. With a high area under the curve (AUC) value of 98.16%, EfficientNetV2 demonstrates robust discriminatory ability. These findings underscore its potential as an effective tool for DR diagnosis and suggest broader applicability in medical image analysis. Moreover, EfficientNetV2 contains more layers than the AlexNet, GoogleNet, and ResNet architectures, which makes EfficientNetV2 the superior deep learning model for DR detection. Future research could focus on optimizing the model for specific clinical contexts and validating its real-world effectiveness through large-scale clinical trials.
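For context, a minimal transfer-learning sketch with Keras' EfficientNetV2 family is shown below, reporting AUC and recall (sensitivity), the metrics the study emphasizes. The specific variant (B0 here), input size, and training hyperparameters are assumptions; the abstract does not state which configuration the authors used.

```python
# Illustrative EfficientNetV2 fine-tuning sketch for 5-class DR grading
# (assumed variant and hyperparameters; not the study's exact setup).
import tensorflow as tf

base = tf.keras.applications.EfficientNetV2B0(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False                      # warm-up phase: train only the head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",        # expects one-hot DR grade labels
    metrics=[
        tf.keras.metrics.AUC(name="auc"),              # micro-averaged over classes
        tf.keras.metrics.Recall(name="sensitivity"),   # micro-averaged sensitivity
        "accuracy",
    ],
)
# After the head converges, the backbone can be unfrozen and fine-tuned at a
# much lower learning rate:
# base.trainable = True
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
#               loss="categorical_crossentropy", metrics=["accuracy"])
```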