IJIGSP Vol. 16, No. 5, Oct. 2024
Cover page and Table of Contents: PDF (size: 664KB)
REGULAR PAPERS
Low-light scenes are characterized by loss of illumination, noise, color distortion, and serious information degradation. Low-light image enhancement is a significant part of computer vision technology. Low-light image enhancement methods aim to recover a normal-light image from a dark one, a noise-free image from a noisy one, and a clear image from a distorted one. In this paper, a low-light image enhancement technology based on a Retinex-based deep network combined with an image-processing-based module is proposed. The proposed technology combines traditional and deep learning methodologies within a simple yet efficient architectural framework that focuses on essential feature extraction. The proposed preprocessing module for low-light image enhancement is centered on the unique knowledge and features of an image. The choice of color model and image transformation technique depends on the dynamic range of the image, to ensure good results in terms of color transfer, detail integrity, and overall visual quality. The proposed Retinex-based deep network has been trained and tested on images transformed by the preprocessing module, which leads to an effective supervised approach to low-light image enhancement and provides superior performance. The proposed preprocessing module is implemented both as an independent image enhancement module in a computer system for image analysis and as a component module in a neural network system for image analysis. Experimental results on the low-light paired dataset show that the proposed method can reduce noise and artifacts in low-light images and can improve contrast and brightness, demonstrating its advantages. The proposed approach injects new ideas into low-light image enhancement, providing practical applications in challenging low-light scenarios.
Deep learning based speech enhancement approaches provide better perceptual quality and better intelligibility. However, most speech enhancement methods available in the literature estimate the enhanced speech using processed amplitude, energy, MFCC spectrum, etc., along with the noisy phase. Because it is difficult to estimate the clean speech phase from noisy speech, the noisy phase is still used in reconstructing the enhanced speech. Some methods have been developed for estimating the clean speech phase, but this estimation is observed to be complex. To avoid this difficulty, convolutional neural networks based on the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST), rather than the Discrete Fourier Transform (DFT), are proposed for better intelligibility and improved performance. However, these algorithms work on either time-domain features or frequency-domain features. To gain the advantages of both the time domain and the frequency domain, a fusion of the DCT and time-domain approaches is proposed here. In this work, the DCT Dense Convolutional Recurrent Network (DCTDCRN), DST Convolutional Gated Recurrent Neural Network (DSTCGRU), DST Convolution Long Short-Term Memory (DSTCLSTM), and DST Dense Convolutional Recurrent Network (DSTDCRN) are proposed for speech enhancement. These methods provide superior performance and lower processing difficulty compared with state-of-the-art methods. The proposed DCT-based methods are further used in developing a joint time and magnitude based speech enhancement method. Simulation results show superior performance over baseline methods for joint time and frequency based processing. The results are also analyzed using objective performance measures such as Signal to Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI).
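The motivation for replacing the DFT with the DCT can be illustrated with a small sketch: the DCT of a real frame is itself real-valued, so there is no separate phase spectrum to estimate, and the inverse transform recovers the frame directly. The code below is only an illustration under assumptions (an orthonormal DCT-II on a 16-sample synthetic frame, with hypothetical helper names `dct2` and `idct2`); it is not the networks proposed in the paper.

```python
import math


def dct2(frame):
    """Orthonormal DCT-II of a real-valued frame (list of floats)."""
    n = len(frame)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(frame))
        coeffs.append(scale * acc)
    return coeffs


def idct2(coeffs):
    """Inverse of dct2 (an orthonormal DCT-III)."""
    n = len(coeffs)
    frame = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        frame.append(acc)
    return frame


# Synthetic 16-sample frame (an assumption made for illustration only).
frame = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]
coeffs = dct2(frame)      # real coefficients only, no phase term
restored = idct2(coeffs)  # reconstruction needs no phase estimate
```

In a DCT-domain enhancement model, a network would modify `coeffs` (for example by predicting a real-valued mask) before `idct2` is applied; that step is omitted here.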
This research article presents a robust solution for detecting surgical masks using a combination of deep learning techniques. The proposed method uses SAM to detect the presence of masks in images, while EfficientNet is employed for feature extraction and classification of the mask type. The compound scaling method is used to distinguish between surgical and normal masks in a data set of 2000 facial photos, divided into 60% training, 20% validation, and 20% testing sets. The model is trained on the data set to learn the discriminative characteristics of each class and achieve high accuracy in mask detection. To handle the variability of mask types, the study applies several versions of EfficientNet; the highest accuracy, 97.5%, is achieved using EfficientNetV2L, demonstrating the effectiveness of the proposed method in detecting masks of different complexities and designs.
The article is devoted to the development of a modified multidimensional Kalman filter with Chebyshev points to solve the task of diagnosing and parrying failures in the measurement channels of the automatic control systems of complex dynamic objects, which provides a more accurate and reliable assessment of the system state in the presence of outliers in the data. The proposed filter is implemented in the form of a modified recurrent neural network containing a failure diagnostics layer, a failure parrying layer, a filtering and smoothing layer, and a results aggregation layer. This structure of the modified recurrent neural network made it possible to solve the main problems of the method: diagnosing failures with an accuracy of 0.99802, parrying failures with an accuracy of 0.99796, and assessing the state of the system with an accuracy of 0.99798. It is proposed to use a modified loss function of the recurrent neural network as a general loss function for diagnostics, fault restoration, and system state assessment, which makes it possible to avoid overtraining when there are a large number of parameters or insufficient data. It has been experimentally shown that the loss function remains stable on both the training and validation data sets for 1000 training epochs and stays within the range of –2.5 % to +2.5 %, which indicates a low risk of overtraining or undertraining of the model.
It has been experimentally confirmed that using the modified recurrent neural network to diagnose and parry failures of the measuring channels of the automatic control systems of complex dynamic objects is appropriate in comparison with a radial basis function neural network and a multidimensional Kalman filter without a neural network implementation, based on metrics such as the root mean square deviation, mean absolute error, mean absolute percentage error, coefficient of determination for the accuracy of reproducing previous data, and coefficient of determination for the accuracy of predicting future values. For example, the standard deviation of the modified recurrent neural network is 0.00226, which is 1.65 times less than that of the radial basis function neural network and 2.20 times less than that of the multidimensional Kalman filter without a neural network implementation.
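The comparison metrics named above have standard definitions, which can be sketched as follows. The function names and the sample values are assumptions for illustration; the abstract does not give the authors' data or implementation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square deviation between reference and estimated values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (reference values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values, used only to exercise the functions.
reference = [1.0, 2.0, 3.0, 4.0]
estimate = [1.1, 1.9, 3.2, 3.8]
scores = {
    "rmse": rmse(reference, estimate),
    "mae": mae(reference, estimate),
    "mape": mape(reference, estimate),
    "r2": r2(reference, estimate),
}
```

The two coefficients of determination in the abstract would use the same `r2` formula, applied once to previously seen data and once to predicted future values.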
Image enhancement in the pre-processing stage of biometric systems is a crucial task in image analysis. Image degradation, which occurs during biometric image capture, significantly impacts the biometric system’s performance and demands an appropriate enhancement technique. Generally, biometric images are full of noise and deformation due to the image capturing process, pressure against the sensor surface, and photometric transformations. These systems demand pure discriminative features for identification, and the system’s performance depends heavily on the quality of such features. Hence, enhancement techniques are typically applied to captured images before they go into the feature extraction stage of any biometric recognition pipeline. In palmprint biometrics, contact-based palmprints consist of several ridges, creases, skin wrinkles, and palm lines, leading to several spurious minutiae during feature extraction. Therefore, selecting an appropriate enhancement technique to smooth them becomes a significant task. The feature extraction process requires a fully pre-processed image to locate key features, which significantly influences identification performance. Thus, the palmprint system’s performance can be enhanced by exploiting competent enhancement filters. Palmprint research has shown a lack of novelty in enhancement techniques, centering instead on feature encoding and matching techniques. Some fingerprint enhancement techniques were adopted for palmprints in the past. However, there is no clear evidence of their impact on image quality, or of the extent to which they affect quality in specific applications. Further, frequency-level filters such as the Gabor and Fourier transforms exploited in fingerprints would not be practically feasible for palmprints due to the computational cost for a larger surface area.
This opens an investigation, in a different direction, into utilising enhancement techniques for degraded palmprints. This work presents a preliminary investigation of existing enhancement techniques used for pre-processing contact fingerprint images and biomedical images. Several enhancement filters were tested on severely degraded palmprints, and the image quality was measured using image quality metrics. The High-boost filter achieved a comparatively better peak signal-to-noise ratio, while the other filters degraded the image quality. The experiment is further extended to compare the identification performance of degraded palmprints with and without enhancement. The results reveal that only the images enhanced with the filter that has the highest peak signal-to-noise ratio (the High-boost filter) show an increased genuine accept rate compared to the ground truth value. The High-boost filter slightly decreases the system’s equal error rate, indicating the potential of applying a pre-enhancement technique to degraded prints with an appropriate filter without compromising the raw image quality. Optimised enhancement techniques could be another initiative for addressing the severity of image degradation in contact handprints. In doing so, they could be successfully exploited in civilian applications such as access control, along with other applications. Further, utilising appropriate enhancement filters for degraded palmprints can enhance the existing palmprint system’s performance in forensics and make it more reliable for legal outcomes.
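For readers unfamiliar with high-boost filtering, the textbook form adds a scaled copy of the detail layer (the image minus a blurred version of itself) back to the image. The sketch below assumes a 3x3 mean blur, edge replication, a boost factor `k` of 1.5, and 8-bit clipping; the abstract does not state which kernel or factor the authors used, so all of these are illustrative choices.

```python
def box_blur(img):
    """3x3 mean blur with edge replication; img is a list of rows of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index at the border
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index at the border
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def high_boost(img, k=1.5):
    """High-boost filter: img + k * (img - blurred), clipped to [0, 255]."""
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + k * (p - b))) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(img, blurred)
    ]


# A tiny step edge (made-up data): flat regions stay unchanged, the edge is steepened.
step_edge = [[50.0, 50.0, 200.0, 200.0] for _ in range(3)]
boosted = high_boost(step_edge)
```

With `k = 1` this reduces to ordinary unsharp masking; larger `k` emphasises ridges and lines more strongly but also amplifies noise.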
Cyclones, with their high-speed winds and enormous quantities of rainfall, represent severe threats to coastal regions worldwide. The ability to quickly and accurately identify cyclonic cloud formations is critical for the effective deployment of disaster preparedness measures. Our study focuses on a technique for precise delineation of cyclonic cloud regions in satellite imagery, concentrating on images from the Indian weather satellite INSAT-3D. This approach achieves considerable improvements in cyclone monitoring by leveraging the image capture capabilities of INSAT-3D. It introduces a refined image processing pipeline that extracts cloud attributes from infrared imaging in a comprehensive manner, including transformation and normalization techniques that further improve accuracy. A key feature of the methodology is the use of an adaptive threshold to correct complications related to luminosity and contrast; this substantially enhances the detection accuracy of cyclonic cloud formations. The study further improves the precision of cloud detection by employing a modified contour detection algorithm that operates on predefined criteria. The methodology has been designed to be flexible and adaptable, making it effective across a wide array of environmental conditions. The use of INSAT-3D satellite images maximizes the performance of the technique in various situational contexts.
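The role of an adaptive threshold in handling luminosity variation can be shown with a minimal local-mean sketch: each pixel is compared against the mean of its own neighbourhood rather than against one global value, so a uniform brightness shift leaves the result unchanged. The window radius, the offset, and the function name are assumptions; the abstract does not specify the authors' thresholding rule.

```python
def adaptive_threshold(img, radius=1, offset=0.0):
    """Return a 0/1 mask: 1 where a pixel exceeds its local mean plus offset."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, truncated at the image border.
            window = [
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            mask[y][x] = 1 if img[y][x] > local_mean + offset else 0
    return mask


# Made-up 3x3 "infrared" patch with one bright pixel, and a uniformly brighter copy.
patch = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
brighter = [[v + 100 for v in row] for row in patch]
mask_a = adaptive_threshold(patch)
mask_b = adaptive_threshold(brighter)
```

Both masks mark only the centre pixel, whereas a single global threshold tuned for `patch` would mark every pixel of `brighter`.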
Static weather conditions such as fog, haze, and mist in hilly and urban areas reduce road visibility. Under such conditions, autonomous vehicles cannot identify objects, traffic signs, and signals. This leads to many accidents and endangers lives. The significance of this work lies in its aim to develop a model that can provide clear visibility for autonomous vehicles during bad weather conditions. Image restoration is an important issue in the image processing field, as images may be of low contrast and quality due to restricted visibility. The development of a model that reduces the halos and artifacts produced in the image using the Median Channel based Image Restoration (MCIR) technique therefore has significant research value. In this technique, image restoration is done by calculating the atmospheric light and the transmission map using the MCIR technique and patching the pixels for different patch sizes. The Dark Channel Prior (DCP) method and the MCIR technique are compared for different patch sizes by evaluating the output images using the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and Mean Square Error (MSE) metrics. The results show that the MCIR technique provides better PSNR, MSE, and SSIM values than the DCP method, with reduced halos and artifacts. This result highlights the effectiveness of the MCIR technique for image restoration. The software model developed can be applied to autonomous vehicles and surveillance cameras for the restoration of images, which can improve their performance and safety.
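Two of the evaluation metrics mentioned above, MSE and PSNR, follow directly from their standard definitions and can be sketched in a few lines. The sketch assumes 8-bit grayscale images stored as lists of rows and uses made-up pixel values; SSIM is omitted because it needs local statistics that would not fit a short example.

```python
import math


def mse(reference, restored):
    """Mean square error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, s in zip(ref_row, res_row):
            total += (r - s) ** 2
            count += 1
    return total / count


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(reference, restored)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Made-up 2x2 images: the "restored" image is off by 5 gray levels everywhere.
clear = [[100, 120], [140, 160]]
dehazed = [[105, 125], [145, 165]]
error = mse(clear, dehazed)
quality_db = psnr(clear, dehazed)
```

A lower MSE and a higher PSNR both indicate that the restored image is closer to the haze-free reference.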
In this paper, we developed a new approach to the problem of infective endocarditis (IE) diagnostics based on intelligent analysis of patients’ echocardiography images. The approach is based on echocardiography segmentation results and detection of valvular anomalies (namely vegetations). This article investigates, for the first time, CNN-based and Visual Transformer (ViT)-based segmentation methods within the framework of the vegetation segmentation task on echocardiography images. Additionally, ensemble methods for combining segmentation models using a new method of model competition for data points are proposed. Furthermore, we investigated methods for aggregating the results of the ensemble based on a new meta-model, pointwise weighted aggregation, which weighs the results of each model pixel by pixel. The last proposed step is to automatically calculate the volume of segmented vegetation to determine the degree of disease and the need for urgent surgical intervention. For the studied and proposed methods, the following ensemble segmentation accuracy was achieved on the test dataset: IoU 0.7822, Dice score 0.886. The proposed empirical algorithm for calculating the volume of vegetations provides a basis for further improvements of the studied approach. The results obtained indicate the great potential of the developed approaches in clinical practice.
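The reported IoU and Dice scores follow standard definitions, and the idea of weighing each model's output pixel by pixel can be sketched as a per-pixel weighted average. In the sketch below the masks are flattened to lists, and the weight maps, the 0.5 threshold, and the function names are hypothetical; the abstract does not describe how the meta-model produces its weights.

```python
def iou(pred, target):
    """Intersection over union of two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0


def dice(pred, target):
    """Dice score: 2 * |A and B| / (|A| + |B|)."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 2.0 * inter / total if total else 1.0


def pointwise_weighted_aggregate(prob_maps, weight_maps, threshold=0.5):
    """Combine per-model probability maps with per-pixel weights (hypothetical scheme)."""
    fused = []
    for pixel in range(len(prob_maps[0])):
        weight_sum = sum(w[pixel] for w in weight_maps)
        if weight_sum == 0:
            fused.append(0)  # no model is trusted at this pixel
            continue
        score = sum(p[pixel] * w[pixel] for p, w in zip(prob_maps, weight_maps)) / weight_sum
        fused.append(1 if score > threshold else 0)
    return fused


# Made-up two-model, two-pixel example.
probs = [[0.9, 0.2], [0.1, 0.6]]
weights = [[3.0, 1.0], [1.0, 1.0]]
fused_mask = pointwise_weighted_aggregate(probs, weights)
overlap = iou([1, 1, 0, 0], [1, 0, 1, 0])
dice_score = dice([1, 1, 0, 0], [1, 0, 1, 0])
```

Unlike a single weight per model, per-pixel weights let the ensemble trust different models in different image regions.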