Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: An Integrative Review

Full Text (PDF, 329 KB), pp. 13-22

Author(s)

Navneet Upadhyay 1,*, Abhijit Karmakar 2

1. Department of Electrical & Electronics Engineering, Birla Institute of Technology and Science, Pilani 333031, India

2. Integrated Circuit Design Group, CSIR - Central Electronics Engineering Research Institute, Pilani 333031, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2013.11.02

Received: 23 May 2013 / Revised: 4 Jul. 2013 / Accepted: 31 Jul. 2013 / Published: 8 Sep. 2013

Index Terms

Speech enhancement, additive background noise, noise estimation, spectral subtractive-type algorithms, remnant musical noise

Abstract

Spectral subtraction is a classical method for enhancing speech degraded by additive background noise. Its basic principle is to estimate the short-time spectral magnitude of the clean speech by subtracting an estimate of the noise spectrum from the noisy speech spectrum. Equivalently, the noisy speech spectrum can be multiplied by a gain function; in either case the resulting magnitude is combined with the phase of the noisy speech. Although the method reduces the background noise, it also introduces an annoying, perceptible tonal artifact in the enhanced speech, known as remnant musical noise, which degrades listening quality. Over the past decades, several variations and implementations have been proposed to address these limitations. Together they constitute a family of subtractive-type algorithms that operate in the frequency domain. The objective of this paper is to provide an extensive overview of spectral subtractive-type algorithms for the enhancement of noisy speech. The paper concludes by outlining a future direction for speech enhancement research from the spectral subtraction perspective.
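The principle described in the abstract can be illustrated for a single frame with the short Python sketch below. It is a minimal illustration under stated assumptions, not the formulation of any specific algorithm reviewed in the paper: the function names, the toy spectrum, and the flat noise estimate are hypothetical, and negative differences are simply set to zero (half-wave rectification).

```python
import cmath

def subtract_magnitude(noisy_spectrum, noise_magnitude):
    """Subtract the noise magnitude from each bin and reuse the noisy phase."""
    enhanced = []
    for y, n in zip(noisy_spectrum, noise_magnitude):
        # Negative differences are clipped to zero (assumed rectification rule).
        magnitude = max(abs(y) - n, 0.0)
        # The phase of the noisy speech is kept unchanged.
        phase = cmath.phase(y)
        enhanced.append(cmath.rect(magnitude, phase))
    return enhanced

def apply_gain(noisy_spectrum, noise_magnitude):
    """Equivalent form: multiply each noisy bin by a real-valued gain."""
    enhanced = []
    for y, n in zip(noisy_spectrum, noise_magnitude):
        mag = abs(y)
        # A real gain scales the magnitude and leaves the noisy phase untouched.
        gain = max(mag - n, 0.0) / mag if mag > 0 else 0.0
        enhanced.append(gain * y)
    return enhanced

# Hypothetical three-bin frame and a flat noise magnitude estimate.
noisy = [3 + 4j, 0.5j, -2 + 0j]
noise = [1.0, 1.0, 1.0]

by_subtraction = subtract_magnitude(noisy, noise)
by_gain = apply_gain(noisy, noise)
```

Both functions give the same enhanced spectrum up to rounding, which is why the method can be stated either as a subtraction or as a gain function. The second bin, whose magnitude lies below the noise estimate, is zeroed out. When bins are zeroed or retained irregularly from frame to frame in this way, the isolated surviving peaks are commonly regarded as the source of the musical noise mentioned above.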

Cite This Paper

Navneet Upadhyay, Abhijit Karmakar, "Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: An Integrative Review", IJIGSP, vol. 5, no. 11, pp. 13-22, 2013. DOI: 10.5815/ijigsp.2013.11.02
