G.729 Detailed Usage Documentation

https://tools.ietf.org/html/rfc4749

git://git.linphone.org/linphone-android.git

http://stackoverflow.com/questions/31099875/how-to-integrate-g-729-codec-with-pjsip-project

The API documentation for bcg729 is in the source code itself; see include/bcg729/encoder.h and decoder.h.
In the msbcg729 folder, bcg729_enc.c and bcg729_dec.c show an example of using the bcg729 library as a mediastreamer2 plugin.

The CSipSimple source code also contains real-world usage code. It does not use the bcg729 library, but the usage should be roughly the same.

========================

G.711
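    Also known as PCM (pulse-code modulation), G.711 is a speech coding standard defined by the International Telecommunication Union (ITU), used mainly in telephony. It samples audio with pulse-code modulation at 8,000 samples per second and carries the voice signal over a 64 kbps uncompressed channel. Its compression ratio is 2:1, that is, each 16-bit sample is compressed to 8 bits. G.711 is the mainstream waveform speech codec.
    The G.711 standard defines two main companding algorithms. One is the µ-law algorithm (also written u-law, ulaw, or mu-law), used mainly in North America and Japan; the other is the A-law algorithm, used mainly in Europe and the rest of the world. The latter was specifically designed to be easier for computers to process. Both take input sampled at 8 kHz and produce a 64 kbps digital output. G.711 uses a technique called packet loss concealment (PLC) to reduce the practical impact of lost packets. During silent periods, voice activity detection (VAD) reduces the effective signal bandwidth.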
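To make the 16-bit to 8-bit companding concrete, here is a minimal Python sketch of µ-law encoding and decoding. It follows the widely used segment-based reference approach with a bias of 0x84 and a clip level of 32635 for 16-bit input; the function names `linear_to_ulaw` and `ulaw_to_linear` are illustrative and not taken from any library mentioned above.

```python
BIAS = 0x84    # 132, added before locating the segment (common reference value)
CLIP = 32635   # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample into one 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # magnitude is now in 132..32767, so its top bit is at position 7..14.
    exponent = magnitude.bit_length() - 8          # segment number 0..7
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Expand one 8-bit mu-law byte back to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    for pcm in (0, 100, 1000, -1000, 32767):
        code = linear_to_ulaw(pcm)
        print(pcm, hex(code), ulaw_to_linear(code))
```

The decoded value is only an approximation of the input: small amplitudes are preserved closely while large ones are quantized coarsely, which is the point of companding.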

8 kHz sampling rate × 16-bit quantization = 128 kbps; after 2:1 compression this becomes 64 kbps


G.729
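   G.729 is an audio coding algorithm defined by the ITU-T. Its frame is only 10 ms long, and the point-to-point delay of G.729 is 25 ms. G.729 is based on the CELP model and uses the CS-ACELP (Conjugate-Structure Algebraic Code Excited Linear Prediction) method to encode speech at a bit rate of 8 kbps.
    G.729 is used almost exclusively in Voice over IP (VoIP). By comparison, the G.711 codec compresses VoIP audio to 64 kbps, which makes it suitable for transmission over a local area network.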
 
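    G.729 uses the Conjugate-Structure Algebraic Code Excited Linear Prediction (CS-ACELP) algorithm, which is based on the CELP coding model. Because the G.729 encoder achieves high speech quality (a MOS of 4.1) with low algorithmic delay, it is widely used across data communications, for example in IP phones and H.323 systems. G.729 encodes linear PCM speech sampled at 8 kHz with 16-bit quantization; the compressed data rate is 8 kbps, a high compression ratio of 16:1.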

8 kHz sampling rate × 16-bit quantization = 128 kbps; after 16:1 compression this becomes 8 kbps
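The arithmetic above also fixes the buffer sizes an application has to handle per 10 ms frame. The short Python sketch below works them out; the helper name `frame_bytes` is illustrative only.

```python
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 16
FRAME_MS = 10  # G.729 frame duration stated above


def frame_bytes(bit_rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Number of bytes needed to carry one frame at the given bit rate."""
    bits_per_frame = bit_rate_bps * frame_ms // 1000
    return bits_per_frame // 8


pcm_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # raw linear PCM bit rate
g711_rate = pcm_rate // 2                     # 2:1 compression
g729_rate = pcm_rate // 16                    # 16:1 compression

if __name__ == "__main__":
    print("PCM   :", pcm_rate, "bps,", frame_bytes(pcm_rate), "bytes per frame")
    print("G.711 :", g711_rate, "bps,", frame_bytes(g711_rate), "bytes per frame")
    print("G.729 :", g729_rate, "bps,", frame_bytes(g729_rate), "bytes per frame")
```

So each G.729 frame takes 80 input samples (160 bytes of 16-bit PCM) and produces 10 bytes of encoded data.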

DESCRIPTION

The Adaptive Digital Technologies' G.729 voice coder software is an implementation of ITU Recommendation G.729 8 kbit/s CS-ACELP Speech Codec.

G.729 is an umbrella of vocoder standards. The G.729 vocoders perform voice compression at bit rates that vary between 6.4 and 12.4 kbps. The data sheet's example shows a G.729 vocoder connected to a digital communication channel. The input speech is fed into the G.729 encoder as a stream of 16-bit PCM samples, sampled at a rate of 8000 samples/second. The G.729 encoder compresses the data into the encode stream. The encoder also outputs the DTX status, which is discussed later in this data sheet. The digital channel carries the data stream and DTX status to the decoder, which regenerates a representation of the original speech and outputs it as the output speech, again as 16-bit PCM at a sampling rate of 8000 samples/second. Since G.729 uses lossy compression, the output speech is not identical to the input speech.

The decoder is also fed a frame erase flag, which indicates that the decode stream has temporarily been corrupted. The decoder is able to “smooth over” the output, doing its best to conceal the loss of data and minimize the loss in voice quality. This process is known as packet loss concealment (PLC). It works surprisingly well even under high packet loss rates.
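To illustrate how a frame erase flag drives concealment, here is a deliberately naive Python sketch. It is not the G.729 concealment algorithm, which works on the decoder's internal parameters. It simply repeats the last good frame with decaying gain, and the name `conceal_stream` and the `attenuation` parameter are hypothetical.

```python
FRAME_SAMPLES = 80  # 10 ms at 8000 samples/second


def conceal_stream(frames, attenuation=0.5):
    """Return output frames, filling erased ones from the last good frame.

    `frames` is a list where each item is either a list of FRAME_SAMPLES
    integers (a decoded frame) or None (frame erase flag set).
    """
    output = []
    last_good = [0] * FRAME_SAMPLES  # silence until a good frame arrives
    gain = 1.0
    for frame in frames:
        if frame is None:
            # Erased frame: replay the last good frame, fading out so that
            # long bursts of loss decay toward silence.
            gain *= attenuation
            output.append([int(s * gain) for s in last_good])
        else:
            last_good = frame
            gain = 1.0
            output.append(list(frame))
    return output


if __name__ == "__main__":
    stream = [[1000] * FRAME_SAMPLES, None, None, [200] * FRAME_SAMPLES]
    for out in conceal_stream(stream):
        print(out[0])
```

A real decoder would be given the flag together with the (possibly missing) bitstream and would extrapolate from its own filter and gain state instead of replaying samples.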

Adaptive Digital's G.729AB voice compression algorithm is a highly optimized version of the ITU G.729 Annex A and Annex B standard. G.729AB offers toll-quality speech at a reasonably low bit rate of 8 kbps. The G.729AB codec uses Discontinuous Transmission (DTX), Voice Activity Detection (VAD), and Comfort Noise Generation (CNG) to reduce bandwidth usage. G.729AB is used in wireless voice, voice-over-packet networks, multimedia, and voice circuit multiplexing applications.

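VAD: Voice Activity Detection. Examines the current frame for speech activity and provides the basis for silence compression.

CNG: Comfort Noise Generation. Generates a low level of background noise during silence, because complete silence feels uncomfortable to the listener.

DTX: Discontinuous Transmission. Instead of sending every frame, only noise frames are transmitted during silence.

BFI: Bad Frame Indicator. Flags a bad frame; the source also describes it as a silence frame marker.

SID: Silence Insertion Descriptor. Represented in 2 bytes.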
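Because a speech frame is 10 bytes and a SID frame is 2 bytes, a receiver can tell them apart from the payload length alone. The Python sketch below assumes the RTP packing described in RFC 3551 (zero or more 10-byte speech frames, optionally followed by one 2-byte SID frame); the function name `split_g729_payload` is illustrative.

```python
SPEECH_FRAME_BYTES = 10  # one 10 ms G.729 frame at 8 kbps
SID_FRAME_BYTES = 2      # Annex B comfort noise descriptor


def split_g729_payload(payload: bytes):
    """Split a G.729 payload into (speech_frames, sid_frame_or_None)."""
    n_speech, remainder = divmod(len(payload), SPEECH_FRAME_BYTES)
    if remainder not in (0, SID_FRAME_BYTES):
        raise ValueError(f"unexpected G.729 payload length: {len(payload)}")
    speech = [
        payload[i * SPEECH_FRAME_BYTES:(i + 1) * SPEECH_FRAME_BYTES]
        for i in range(n_speech)
    ]
    tail = payload[n_speech * SPEECH_FRAME_BYTES:]
    sid = tail if remainder == SID_FRAME_BYTES else None
    return speech, sid


if __name__ == "__main__":
    frames, sid = split_g729_payload(bytes(22))
    print(len(frames), "speech frames, SID present:", sid is not None)
```

When a SID frame is present, the decoder switches to comfort noise generation until speech frames resume.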

Original article: https://www.cnblogs.com/welhzh/p/6884515.html