
Journal of Shandong University (Natural Science) ›› 2017, Vol. 52 ›› Issue (3): 82-90. doi: 10.6040/j.issn.1671-9352.2.2016.209


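HMM-based packet loss concealment method for mobile audio coding

XIANG Kai1,2, CHEN Shi-hong2*   

  1. School of Information Management and Statistics, Hubei University of Economics, Wuhan 430205, Hubei, China; 2. School of Computing, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2016-08-18 Online: 2017-03-20 Published: 2017-03-20
  • Corresponding author: CHEN Shi-hong (b. 1949), male, professor, doctoral supervisor; research interests: multimedia communication and information processing, software engineering. E-mail: xksuckx@gmail.com E-mail: xk@hbue.edu.cn
  • About the author: XIANG Kai (b. 1977), male, Ph.D. candidate, associate professor; research interests: multimedia communication and information processing, recommender systems. E-mail: xk@hbue.edu.cn
  • Funding:
    Key Project of the Hubei Provincial Department of Education (D20162201); Ministry of Education Humanities and Social Sciences Youth Fund Project (16YJC630067)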

A packet loss concealment scheme based on HMM for mobile audio coding

XIANG Kai1,2, CHEN Shi-hong2*   

  1. School of Information Management and Statistics, Hubei University of Economics, Wuhan 430205, Hubei, China;
    2. School of Computing, Wuhan University, Wuhan 430072, Hubei, China
  • Received:2016-08-18 Online:2017-03-20 Published:2017-03-20

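Abstract: To address burst losses of many consecutive frames in harsh mobile audio transmission environments, this paper proposes an HMM-based packet loss concealment method that selects an appropriate concealment strategy by analyzing statistical changes in the speech signal over a wider range of context. When a packet is lost, the HMM-based recovery method uses the state and density function information to compute estimates of the lost frame's parameters. Experimental results show that, compared with the original method of the AVS-P10 standard speech codec, the proposed method raises the average objective PESQ score by about 0.33 points and the average subjective CMOS score by about 0.05 points.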

Keywords: mobile audio coding, packet loss concealment, audio signal processing, hidden Markov model

Abstract: To handle the large bursts of consecutive packet loss that occur in poor-quality mobile voice transmission and audio communication services, a packet loss concealment scheme based on HMM for an ACELP codec is proposed. The scheme uses the HMM to select an appropriate packet loss concealment strategy according to statistical changes over a wider range of context in the audio signal. When a packet is lost, the parameters of the lost frame are estimated from the HMM state and probability density function information. Experimental results show that, compared with the original scheme employed in the AVS-P10 standard, the average objective PESQ score increases by about 0.33 points and the average subjective CMOS score increases by about 0.05 points.

Key words: HMM, audio signal processing, mobile audio coding, packet loss concealment

CLC number: 

  • TN912.3
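
The recovery step summarized in the abstract (estimating the parameters of a lost frame from HMM state and density information) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each state's density is summarized by a mean parameter vector, that the decoder holds a state posterior for the last correctly received frame, and that the estimate is the expectation of the state means under the predicted state distribution. The names `predict_state_probs`, `estimate_lost_frame`, and all numeric values are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def predict_state_probs(last_posterior: Sequence[float],
                        transition: Matrix,
                        steps: int) -> List[float]:
    """Propagate the state distribution `steps` frames past the last good frame.

    Each step applies p_next[j] = sum_i p[i] * transition[i][j].
    """
    probs = list(last_posterior)
    n = len(probs)
    for _ in range(steps):
        probs = [sum(probs[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return probs


def estimate_lost_frame(last_posterior: Sequence[float],
                        transition: Matrix,
                        state_means: Matrix,
                        steps: int = 1) -> List[float]:
    """Estimate a lost frame's parameter vector as a probability-weighted
    average of per-state mean vectors (an assumed stand-in for the densities)."""
    probs = predict_state_probs(last_posterior, transition, steps)
    dim = len(state_means[0])
    return [sum(p * mean[d] for p, mean in zip(probs, state_means))
            for d in range(dim)]


if __name__ == "__main__":
    # Hypothetical two-state model; values are illustrative only.
    transition = [[0.9, 0.1],
                  [0.2, 0.8]]
    state_means = [[1.0, 0.0],
                   [0.0, 1.0]]
    last_posterior = [1.0, 0.0]  # decoder was certainly in state 0
    for k in (1, 2, 3):
        print(k, estimate_lost_frame(last_posterior, transition, state_means, k))
```

As more consecutive frames are lost, the predicted state distribution drifts toward the chain's stationary distribution, so the estimate moves smoothly away from the last received frame's state instead of repeating it unchanged.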