V. Jayachandra Naidu

Work place: Department of ECE, Sri Venkateswara College of Engineering & Technology, Chittoor, India

E-mail: jayachandra.velur@gmail.com

Website: https://orcid.org/0009-0006-4965-7793

Research Interests: Signal Processing, Computer Vision, Machine Learning

Biography

V. Jayachandra Naidu received his M.Tech in Communication Engineering from Vellore Institute of Technology, Vellore, Tamil Nadu, India in 2004. Currently, he is working as an Associate Professor in the Department of ECE at Sri Venkateswara College of Engineering and Technology (Autonomous), Chittoor, Andhra Pradesh, India. His research interests include Computer Vision, Machine Learning, Signal Processing, Embedded Systems and IoT.

Author Articles
Speech Enhancement Using Joint Time and DCT Processing for Real Time Applications

By Ravi Kumar Kandagatla, V. Jayachandra Naidu, P. S. Sreenivasa Reddy, Sivaprasad Nandyala

DOI: https://doi.org/10.5815/ijigsp.2024.05.02, Pub. Date: 8 Oct. 2024

Deep learning based speech enhancement approaches provide better perceptual quality and intelligibility. However, most speech enhancement methods in the literature estimate the enhanced speech from the processed amplitude, energy, or MFCC spectrum combined with the noisy phase. Because estimating the clean speech phase from noisy speech is difficult, the noisy phase is still used to reconstruct the enhanced speech. Some methods have been developed to estimate the clean speech phase, but they are computationally complex. To avoid this difficulty and obtain better performance, Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) based convolutional neural networks are proposed in place of the Discrete Fourier Transform (DFT), giving better intelligibility and improved performance. However, such algorithms work either on time-domain features or on frequency-domain features. To gain the advantages of both domains, a fusion of DCT-domain and time-domain processing is proposed here. In this work, the DCT Dense Convolutional Recurrent Network (DCTDCRN), DST Convolutional Gated Recurrent Unit network (DSTCGRU), DST Convolutional Long Short-Term Memory network (DSTCLSTM), and DST Dense Convolutional Recurrent Network (DSTDCRN) are proposed for speech enhancement. These methods provide superior performance and lower processing complexity compared to state-of-the-art methods. The proposed DCT-based methods are then used to develop a joint time- and magnitude-domain speech enhancement method. Simulation results show superior performance over baseline methods for joint time and frequency domain processing. The results are also analyzed using objective performance measures such as Signal to Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI).
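The sketch below illustrates the general idea of the DCT-domain branch described above: frame a noisy waveform, take the DCT of each frame, and let a small convolutional recurrent network estimate a mask over the DCT coefficients. It is not the authors' DCTDCRN implementation; the frame length, hop size, layer sizes, and class/function names (frame_dct, ConvGRUMask) are illustrative assumptions.

```python
# Minimal sketch of DCT-domain masking with a convolutional recurrent network.
# Assumed, not from the paper: frame length 512, hop 256, and the toy layer sizes.
import numpy as np
import torch
import torch.nn as nn
from scipy.fft import dct

def frame_dct(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames and apply an orthonormal DCT-II per frame."""
    frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]
    return np.stack([dct(f, type=2, norm='ortho') for f in frames]).astype(np.float32)  # (T, frame_len)

class ConvGRUMask(nn.Module):
    """Toy DCT-domain enhancer: Conv1d across frequency bins, GRU across time, sigmoid mask."""
    def __init__(self, n_bins=512, hidden=128):
        super().__init__()
        self.conv = nn.Conv1d(1, 16, kernel_size=5, padding=2)
        self.gru = nn.GRU(16 * n_bins, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_bins)

    def forward(self, dct_frames):             # (B, T, n_bins)
        B, T, F = dct_frames.shape
        z = self.conv(dct_frames.reshape(B * T, 1, F)).reshape(B, T, -1)
        h, _ = self.gru(z)
        mask = torch.sigmoid(self.out(h))      # (B, T, n_bins), values in (0, 1)
        return mask * dct_frames               # masked DCT coefficients

# Usage: estimate enhanced DCT frames for a random "noisy" signal (1 s at 16 kHz).
noisy = np.random.randn(16000).astype(np.float32)
feats = torch.from_numpy(frame_dct(noisy)).unsqueeze(0)   # (1, T, 512)
enhanced_dct = ConvGRUMask()(feats)
# The enhanced waveform would be recovered by an inverse DCT per frame plus overlap-add.
```

A joint time and DCT approach, as proposed in the article, would combine such a transform-domain branch with a branch operating directly on the waveform; the sketch only shows the transform-domain half.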

Other Articles