A. Pramod Reddy

Work place: TKR College of Engineering and Technology, Hyderabad, 500097, India

E-mail: pramod@tkrcet.com

Website: https://orcid.org/0000-0002-3912-3302

Research Interests: Computer systems and computational processes, Image Compression, Image Manipulation, Image Processing, Speech Recognition, Speech Synthesis, Data Structures and Algorithms

Biography

A. Pramod Reddy received his Ph.D. from the School of Computer Science and Engineering at VIT Vellore, TN, India, in 2022. He received his master's and bachelor's degrees from JNTU, Hyderabad campus. His areas of interest are speech processing, image processing, and machine learning, and he is currently working as an Associate Professor at TKR College of Engineering and Technology, TS, India.

Author Articles
Estimating the Effects of Voice Quality and Speech Intelligibility of Audio Compression in Automatic Emotion Recognition

By A. Pramod Reddy, Dileep kumar Ravikanti, Rakesh Betala, K. Venkatesh Sharma, K. Shirisha Reddy

DOI: https://doi.org/10.5815/ijigsp.2023.03.06, Pub. Date: 8 Jun. 2023

This paper examines the impact of speech compression on the accuracy of automatic emotion recognition (AER) systems. The effects of various codecs, such as MP3, Speex, and Adaptive Multi-Rate (NB and WB), are compared with the uncompressed speech signal. Loudness recruitment, a steeper-than-normal increase in perceived loudness with presentation level, is associated with sensorineural hearing loss. Amplitude compression is frequently used to compensate for this abnormality, for example in a hearing aid. Alternatively, because speech intelligibility has been linked to the perception of rapid energy changes, enhancing these changes through expansion may make communication more understandable. However, even if these signal-processing methods improve speech understanding, their design and implementation may be constrained by insufficient sound quality. Therefore, syllabic compression and temporal envelope expansion were assessed for speech intelligibility and sound quality. Speech intelligibility was assessed with an adaptive technique based on brief, commonplace words presented either in noise or with a competing speaker. Speech intelligibility was tested in steady-state noise with a single competing speaker using everyday sentences. The sound quality of four artistic excerpts and quiet speech was evaluated using a rating scale. The experiments consider spectral error, compression error ratio, and human labeling effects, and are carried out using the Telugu dataset and the well-known EMO-DB. The results showed that all speech compression techniques reduced emotion recognition accuracy. It is observed that human labeling has better recognition accuracy. For high compression, based on the overall mean of the unweighted average recall, the AMR-WB and Speex codecs at a bit rate of 6.6 kbit/s are advised to provide the optimum quality for data storage.
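The abstract reports results as unweighted average recall (UAR). As a minimal sketch of how that metric is usually defined (the mean of the per-class recalls, so that every emotion class counts equally regardless of how many samples it has), the following may help. The function name and the labels are hypothetical and for illustration only; they are not the authors' code or data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels, not taken from the paper.
y_true = ["anger", "anger", "anger", "joy", "neutral", "neutral"]
y_pred = ["anger", "anger", "joy", "joy", "anger", "neutral"]
uar = unweighted_average_recall(y_true, y_pred)
print(f"UAR = {uar:.4f}")
```

Unlike plain accuracy, this measure is not dominated by the most frequent class, which is presumably why it is commonly reported for emotion corpora with uneven class sizes.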
