P. S. Sreenivasa Reddy

Work place: Department of ECE, Nalla Narasimha Reddy Education Society’s Group of Institutions, Telangana, India

E-mail: sreenivaspanati@gmail.com

Website: https://orcid.org/0009-0006-6607-2041

Research Interests: Nanomaterials, Microelectronics, VLSI Design

Biography

P. S. Sreenivas Reddy is an Associate Professor of Electronics and Communication Engineering at Nalla Narasimha Reddy Education Society's Group of Institutions, Telangana. He received his M.E. in Applied Electronics in 2000 and his B.E. in Electronics and Communication Engineering in 1998. He has twenty-one years of academic experience at the undergraduate and postgraduate levels. His research interests include nanomaterials, microelectronics, and VLSI design. He is presently pursuing a Ph.D. in Electronics and Communication Engineering at Dr. M.G.R. Educational and Research Institute, University.

Author Articles
Speech Enhancement Using Joint Time and DCT Processing for Real Time Applications

By Ravi Kumar Kandagatla, V. Jayachandra Naidu, P. S. Sreenivasa Reddy, Sivaprasad Nandyala

DOI: https://doi.org/10.5815/ijigsp.2024.05.02, Pub. Date: 8 Oct. 2024

Deep learning based speech enhancement approaches provide better perceptual quality and intelligibility. However, most speech enhancement methods in the literature estimate the enhanced speech from processed amplitude, energy, MFCC spectrum, etc., combined with the noisy phase. Because the clean speech phase is difficult to estimate from noisy speech, the noisy phase is still used to reconstruct the enhanced speech. Some methods have been developed to estimate the clean speech phase, but this estimation has been observed to be complex. To avoid this difficulty and improve performance, convolutional neural networks based on the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST), rather than the Discrete Fourier Transform (DFT), are proposed for better intelligibility. However, these algorithms operate on either time-domain features or frequency-domain features. To gain the advantages of both domains, a fusion of the DCT and time-domain approaches is proposed. In this work, the DCT Dense Convolutional Recurrent Network (DCTDCRN), DST Convolutional Gated Recurrent Neural Network (DSTCGRU), DST Convolutional Long Short-Term Memory (DSTCLSTM), and DST Dense Convolutional Recurrent Network (DSTDCRN) are proposed for speech enhancement. These methods provide superior performance and lower processing difficulty compared with state-of-the-art methods. The proposed DCT-based methods are then used to develop a joint time and magnitude based speech enhancement method. Simulation results show that joint time and frequency based processing outperforms the baseline methods. The results are also analyzed using objective performance measures such as Signal-to-Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI).
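
The motivation for replacing the DFT with the DCT can be illustrated with a small sketch. This is not the authors' network. It is a toy example, under stated assumptions, showing that DCT coefficients of a real frame are real-valued (so there is no separate phase to estimate), while DFT coefficients are complex. The helper names (dct2, idct2, dft, snr_db), the 16-sample frame, and the oracle mask that stands in for a learned mask are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def dct2(x):
    """Orthonormal DCT-II of a real sequence; every coefficient is real."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = math.sqrt(1.0 / n) * c[0]
        s += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def dft(x):
    """Plain DFT, used here only to show that its output is complex."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * math.pi * i * k / n) for i in range(n))
        for k in range(n)
    ]


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against the clean signal."""
    signal = sum(s * s for s in clean)
    noise = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal / noise)


# Toy frame (assumption): the clean signal is a single DCT basis function.
N, K = 16, 3
clean = [math.cos(math.pi * (i + 0.5) * K / N) for i in range(N)]
rng = random.Random(0)
noisy = [s + 0.3 * rng.gauss(0.0, 1.0) for s in clean]

coeffs = dct2(noisy)      # real-valued: a real mask can act on them directly
spectrum = dft(noisy)     # complex-valued: magnitude and phase are separate

# Oracle mask (stand-in for a learned mask): keep only the clean signal's bin.
mask = [1.0 if k == K else 0.0 for k in range(N)]
enhanced = idct2([m * c for m, c in zip(mask, coeffs)])

print("noisy SNR (dB):   ", snr_db(clean, noisy))
print("enhanced SNR (dB):", snr_db(clean, enhanced))
```

Because the transform is real and orthonormal, the masked coefficients invert directly to a time-domain frame without reusing the noisy phase. A practical system would replace the oracle mask with a network's output and would report PESQ and STOI alongside SNR.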
