Bandwidth Extension of Speech Signals: A Comprehensive Review

Pages: 45-52

Author(s)

N. Prasad 1,*, T. Kishore Kumar 1

1. National Institute of Technology Warangal, Warangal-506004, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2016.02.06

Received: 10 Jun. 2015 / Revised: 20 Sep. 2015 / Accepted: 1 Dec. 2015 / Published: 8 Feb. 2016

Index Terms

Speech bandwidth extension, Data hiding, Model-based techniques, Non-model-based techniques, Speech quality, Speech intelligibility, Wideband speech coding

Abstract

Telephone systems commonly transmit narrowband (NB) speech, with the audio bandwidth limited to the traditional telephone band of 300-3400 Hz. To improve the quality and intelligibility of speech degraded by this narrow bandwidth, researchers have worked to standardize wideband (WB, 50-7000 Hz) speech codecs for telephone networks. WB speech transmission, however, requires the transmission network and the terminal devices at both ends to be upgraded to wideband, which is time-consuming. In the meantime, novel bandwidth extension (BWE) techniques have been developed to overcome the limitations of NB speech. This paper discusses the basic principles, realization, and applications of BWE, and also addresses its challenges and limitations.

Cite This Paper

N. Prasad, T. Kishore Kumar, "Bandwidth Extension of Speech Signals: A Comprehensive Review", International Journal of Intelligent Systems and Applications (IJISA), Vol. 8, No. 2, pp. 45-52, 2016. DOI: 10.5815/ijisa.2016.02.06
