Journal of Shandong University (Natural Science), 2017, Vol. 52, Issue (3): 82-90. doi: 10.6040/j.issn.1671-9352.2.2016.209
XIANG Kai (项慨)1,2, CHEN Shi-hong (陈世鸿)2*
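Abstract: To address bursts of many consecutive lost frames in harsh mobile audio transmission environments, this paper proposes an HMM-based frame loss concealment method. The method selects a suitable concealment strategy by analyzing statistical changes in the speech signal over a wider context. When a packet is lost, the HMM-based recovery method uses state and density function information to compute estimates of the lost frame's parameters. Experimental results show that, compared with the original method of the AVS-P10 standard speech codec, the proposed method raises the average objective PESQ score by about 0.33 and the average subjective CMOS score by about 0.05.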
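The abstract does not spell out the estimation formula, so the following is only a minimal sketch of one common way to use HMM state and density information for concealment, not the authors' implementation. It assumes that each state's density is summarized by its mean parameter vector, that the state probabilities at the last received frame are known, and that the estimate for the k-th lost frame is the mean of the state means weighted by the state probabilities propagated k steps through the transition matrix. All names and numbers (`TRANSITION`, `STATE_MEANS`, `LAST_PROBS`, the two-state model) are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def predict_state_probs(last_probs: Sequence[float],
                        transition: Sequence[Sequence[float]],
                        steps: int) -> Vector:
    """Propagate state probabilities `steps` frames past the last received frame."""
    probs = list(last_probs)
    n = len(probs)
    for _ in range(steps):
        # One Markov step: new_probs[j] = sum_i probs[i] * transition[i][j]
        probs = [sum(probs[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return probs


def estimate_lost_frame(last_probs: Sequence[float],
                        transition: Sequence[Sequence[float]],
                        state_means: Sequence[Sequence[float]],
                        steps: int) -> Vector:
    """Estimate lost-frame parameters as the probability-weighted state means."""
    probs = predict_state_probs(last_probs, transition, steps)
    dim = len(state_means[0])
    return [sum(p * mean[d] for p, mean in zip(probs, state_means))
            for d in range(dim)]


# Hypothetical two-state model (assumption, not taken from the paper).
TRANSITION = [[0.9, 0.1],
              [0.2, 0.8]]
STATE_MEANS = [[1.0, 0.0],   # mean parameter vector of state 0
               [0.0, 1.0]]   # mean parameter vector of state 1
LAST_PROBS = [1.0, 0.0]      # decoder was in state 0 at the last good frame

first_lost = estimate_lost_frame(LAST_PROBS, TRANSITION, STATE_MEANS, steps=1)
second_lost = estimate_lost_frame(LAST_PROBS, TRANSITION, STATE_MEANS, steps=2)
print(first_lost, second_lost)
```

In this sketch the estimate drifts from the last known state toward the model's long-run average as the burst grows longer, which is one reason a statistical model may suit long consecutive losses better than simple frame repetition.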