IJEM Vol. 13, No. 6, Dec. 2023
Cover page and Table of Contents: PDF (size: 469KB)
REGULAR PAPERS
Optical Character Recognition (OCR) is a technology that lets computers read text from images of documents, so that machines can interpret the words without a person reading them aloud. It simplifies the digitization of historical documents, archival material, and medical records, thereby reducing their retrieval times. However, the accuracy of OCR systems relies heavily on the quality of the input images. To reduce this dependence, in this paper we propose an image pre-processing pipeline, integrated with the OCR system, that enhances the quality of input images for efficient image-to-text conversion. This method produces easily understandable text output with a lower Character Error Rate (CER) than current methods. In addition, we explore a technique for converting text from a document or image into machine-readable form and then into audio output using gTTS, a Python library that interfaces with Google Translate's text-to-speech API. We assess the effectiveness of this approach and show that it substantially improves OCR accuracy compared with existing methods. The paper also presents an overview of the development phases and significant obstacles, along with comparisons of the results achieved by various methods.
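The abstract above reports results as Character Error Rate but does not state the formula. A minimal sketch, assuming the common definition (character-level edit distance between the reference text and the OCR output, divided by the reference length), might look like this. The function names `edit_distance` and `character_error_rate` are illustrative and not taken from the paper.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum single-character edits between two strings."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1   # extra character in the hypothesis
            deletion = previous[j] + 1       # reference character missing
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition a lower value is better, and the value can exceed 1.0 when the OCR output is much longer than the reference.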
Potatoes play a vital role as a staple crop worldwide, making a significant contribution to global food security. However, the susceptibility of potato plants to various leaf diseases threatens crop yield and quality. Detecting these diseases accurately and at an early stage is crucial for the effective management and protection of crops. Recent advances in Convolutional Neural Networks (CNNs) have demonstrated potential in image classification applications. Therefore, the goal of this work is to investigate the potential of CNNs in detecting potato leaf diseases. As neural networks have become part of agriculture, numerous researchers have worked on improving the early detection of potato blight using different machine learning and deep learning methods. However, problems persist with the accuracy of these methods and the time they take to run. In response to these challenges, we tailored a CNN to enhance accuracy while reducing the number of trainable parameters, the computational time, and information loss. To conduct this research, we compiled a diverse dataset of potato leaf images. The dataset encompassed both healthy leaves and leaves infected with common diseases such as late blight and early blight. We took great care in curating and preprocessing the dataset to ensure its quality and consistency. Our focus was to develop a CNN architecture tailored specifically for disease detection. To improve the performance of the network, we employed techniques such as data augmentation and transfer learning during the training phase. The experimental outcomes demonstrate the efficacy of our proposed customized CNN model in accurately identifying and classifying potato leaf diseases. Our model achieved an overall accuracy of 99.22%, surpassing the performance of existing methods by a significant margin.
Furthermore, we computed precision, recall, and F1-score to assess the model's effectiveness on individual disease classes. To give additional insight into the model's behavior and its capacity to distinguish between disease types, we used visualization techniques such as confusion matrices and sample output images. The results of this study have implications for managing potato diseases by offering an automated and reliable solution for early detection and diagnosis. Future research directions may include expanding the dataset, exploring different CNN architectures, and investigating the generalizability of the model across different potato varieties and growing conditions.
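The per-class precision, recall, and F1-score mentioned above can all be read off a confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes. The function name `per_class_metrics` is hypothetical, and it shows the standard definitions rather than the authors' evaluation code.

```python
def per_class_metrics(confusion, labels):
    """Compute precision, recall, and F1 for each class.

    confusion[r][c] is the number of samples whose true class is labels[r]
    and whose predicted class is labels[c] (an assumed layout).
    """
    metrics = {}
    num_classes = len(labels)
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        false_negative = sum(confusion[k]) - true_positive
        false_positive = sum(confusion[r][k] for r in range(num_classes)) - true_positive

        predicted_total = true_positive + false_positive
        actual_total = true_positive + false_negative
        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / actual_total if actual_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0

        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

A call such as `per_class_metrics(matrix, ["healthy", "early_blight", "late_blight"])` returns one dictionary per class. The class names follow the dataset description, but any counts passed in would be the user's own and not figures from the paper.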
In today’s world, ensuring security has become a major challenge. With increasing urbanization and the growth of big cities, crime rates are also rising. To ensure the security and safety of our homes while we are away, we propose using a Raspberry Pi to implement an IoT-based burglar detection and alert system. IoT uses networked devices to efficiently acquire and analyze data from different sensors and actuators, then sends the data over a Wi-Fi connection to a personal smartphone or laptop. Anti-theft devices have been around for decades, but most are only CCTVs, IP cameras, or magnetic doorbells. Only a limited amount of work has been devoted to combining face recognition with weapon detection. The design of anti-theft protection devices relies primarily on face recognition and remote tracking. Our objective is to improve on such systems by adding a weapon detection feature based on image processing. The system uses a Raspberry Pi and permits a person to enter the house only if his/her face is recognized by the proposed system and he/she is not carrying any weapons. From the standpoint of security, this system is more reliable and efficient. The proposed system is intended as a secure access control application based on face recognition combined with weapon detection. Using the Telegram app, the owner can monitor the digital camera mounted on the door frame. To improve the accuracy and efficiency of our system, we use the Python language and the OpenCV library.
This work focused on the development of a scissor elevator platform (SEP) with a 120 kg load-lifting capacity and a horizontally positioned rack-and-pinion actuating mechanism driven by a DC motor. The lift time to an elevated height of 0.9 m is 30 s. A typical SEP structure was simulated in the 3D workspace of a Computer Aided Design (CAD) software package to investigate the balance of the structure, the stresses experienced, the efficiency, and the safety of operations. A prototype was also fabricated for the physical demonstration of the SEP. The SEP can be used for a range of engineering applications, such as an adjustable workbench for workshop use, a height-adjustable table for height-challenged personnel, or, if made mobile, a load-transferring device for moving loads between two or more elevated locations during construction or maintenance work. Calculated results give the platform weight as 136.693 N, the scissor arms weight as 188.205 N, the total structure weight as 1502.098 N, the stress in the scissor arm at maximum platform elevation as 1.702 MPa, the stress in the scissor arm at minimum platform elevation as 4.928 MPa, the maximum actuation force as 4126.980 N, and the power required to drive the mechanism as 26.963 W. Autodesk Inventor Pro simulation results show that a wide range of data can be obtained when the real-time behavior of the SEP is considered. The results also indicate the values of the reaction forces, reaction moments, stresses, strains, and displacements developed at every joint, link, hinged support, and every other point in the 3D workspace.
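As a rough consistency check on the figures above, the reported total structure weight appears to equal the platform weight plus the scissor arms weight plus the weight of the 120 kg rated load. The sketch below assumes g = 9.81 m/s², which the abstract does not state. The function names are illustrative, and this grouping of terms is an interpretation, not the authors' stated calculation.

```python
G = 9.81  # m/s^2, assumed gravitational acceleration (not given in the abstract)

# Values quoted in the abstract
LOAD_MASS_KG = 120.0
PLATFORM_WEIGHT_N = 136.693
SCISSOR_ARMS_WEIGHT_N = 188.205
LIFT_HEIGHT_M = 0.9
LIFT_TIME_S = 30.0


def payload_weight(mass_kg: float, g: float = G) -> float:
    """Weight in newtons of a payload of the given mass."""
    return mass_kg * g


def total_weight(platform_n: float, arms_n: float, payload_n: float) -> float:
    """Sum of platform, scissor arms, and payload weights (assumed breakdown)."""
    return platform_n + arms_n + payload_n


def average_lift_speed(height_m: float, time_s: float) -> float:
    """Mean vertical speed of the platform over the full lift."""
    return height_m / time_s


total_n = total_weight(
    PLATFORM_WEIGHT_N, SCISSOR_ARMS_WEIGHT_N, payload_weight(LOAD_MASS_KG)
)
speed_m_per_s = average_lift_speed(LIFT_HEIGHT_M, LIFT_TIME_S)
print(f"total weight = {total_n:.3f} N, average lift speed = {speed_m_per_s:.3f} m/s")
```

Under these assumptions the sum reproduces the quoted 1502.098 N, and the average lift speed is 0.03 m/s. The quoted actuation force and drive power depend on the scissor geometry and drive-train details, which the abstract does not give, so they are not reproduced here.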
Semantic segmentation is an essential tool for autonomous vehicles to comprehend their surroundings. Because it must be both accurate and efficient, semantic segmentation for autonomous driving is a difficult task. The strong performance of present-day models typically comes at the cost of extensive computation, which is unacceptable for self-driving vehicles. Deep learning has recently demonstrated significant improvements in accuracy. Hence, this work compares the U-Net architectures UNet-VGG19, UNet-ResNet101, and UNet-EfficientNetb7, which use VGG19, ResNet101, and the compound-scaled EfficientNetb7, respectively, as encoders for feature extraction. A U-Net decoder is then used to regenerate the fine-grained segmentation map. Combining low-level spatial information with high-level feature information allows for precise segmentation. Our research involves extensive experimentation on diverse datasets, including CamVid (Cambridge-driving Labeled Video Database) and Cityscapes (a comprehensive road scene understanding dataset). With the UNet-EfficientNetb7 architecture, we achieved mean Intersection over Union (mIoU) values of 0.8128 and 0.8659 on the CamVid and Cityscapes datasets, respectively. These results surpass those of alternative contemporary techniques, underscoring the precision and effectiveness of the UNet-EfficientNetb7 model. This study contributes to the field by addressing the crucial challenge of efficient yet accurate semantic segmentation for autonomous driving, offering insights into a model that balances performance and computational demands.
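The mIoU metric quoted above averages, over classes, the ratio of pixels where prediction and ground truth agree on a class to pixels where either assigns that class. The following is a minimal sketch of that standard definition on small integer label maps. The function name `mean_iou` and the choice to skip classes absent from both maps are assumptions, and benchmark protocols for CamVid or Cityscapes may differ in details such as ignored labels.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean Intersection over Union for two equally sized 2D label maps.

    y_true and y_pred are lists of rows of integer class ids in
    range(num_classes). Classes absent from both maps are skipped.
    """
    intersections = [0] * num_classes
    unions = [0] * num_classes
    for true_row, pred_row in zip(y_true, y_pred):
        for true_label, pred_label in zip(true_row, pred_row):
            if true_label == pred_label:
                # The pixel is in both the intersection and the union of its class.
                intersections[true_label] += 1
                unions[true_label] += 1
            else:
                # The pixel extends the union of both classes involved.
                unions[true_label] += 1
                unions[pred_label] += 1

    ious = [i / u for i, u in zip(intersections, unions) if u > 0]
    if not ious:
        raise ValueError("label maps contain no pixels")
    return sum(ious) / len(ious)
```

For example, with ground truth `[[0, 0], [1, 1]]` and prediction `[[0, 1], [1, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.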