The Pros and Cons of Various Speech Codecs

January 7, 2016

Author: Debora Hlee

This article compares some of the speech codecs most commonly used by developers and offers information to help you make the right selection.

Speech codecs allow developers to compress and decompress audio that contains speech, whether it is stored in files or streamed. A variety of speech codecs are available, built on different algorithms, and each one suits particular applications. Here is the lowdown on some of the most popular speech codecs available:

G.711

G.711 is the ITU standard for pulse code modulation (PCM) of voice frequencies on a 64 kbps channel. It samples audio at 8000 samples per second and represents each sample with 8 bits using non-uniform quantization, which results in a bit rate of 64 kbps.
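The bit rate arithmetic and the idea of non-uniform quantization can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the companding function uses the continuous mu-law curve with mu = 255 (one of the two laws G.711 defines, the other being A-law), not the bit-exact segmented encoder from the standard.

```python
import math

SAMPLE_RATE_HZ = 8000   # samples per second, as stated in the article
BITS_PER_SAMPLE = 8     # bits per quantized sample, as stated in the article
MU = 255                # assumption: mu-law parameter; A-law uses a different curve


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Return the raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] onto [-1, 1] with the continuous mu-law curve.

    Quiet samples are stretched and loud samples are squeezed, so a uniform
    8-bit quantizer applied afterwards spends more levels on quiet speech.
    """
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Invert mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


if __name__ == "__main__":
    print(pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE))  # bits per second
    print(mu_law_compress(0.01))  # a quiet sample after companding
```

A sample at 1% of full scale ends up above 20% of the companded range, which is why 8 bits per sample can still sound acceptable for speech.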

One of the benefits of this codec is that it delivers very accurate speech transmission with relatively low processing overhead. That said, many developers complain of unsatisfactory network efficiency: the codec needs 64 kbps of audio payload in each direction, or 128 kbps for a two-way call, before any packet overhead is added. The base codec also lacks packet loss concealment, meaning it does not interpolate missing packets.
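How much bandwidth a G.711 call actually consumes on an IP network depends on packet overhead as well as the codec rate. The following is a rough sketch under stated assumptions that are not in the article: 20 ms of audio per packet and 40 bytes of RTP, UDP, and IPv4 headers (12 + 8 + 20). Link-layer framing is ignored, and the function name is hypothetical.

```python
def ip_bandwidth_bps(codec_bps: int, packet_ms: int = 20, header_bytes: int = 40) -> float:
    """Estimate one-way IP bandwidth for a constant bit rate codec.

    codec_bps    -- codec payload rate in bits per second
    packet_ms    -- audio duration carried per packet (assumed 20 ms)
    header_bytes -- RTP + UDP + IPv4 header size per packet (assumed 40 bytes)
    """
    payload_bytes = codec_bps * packet_ms / 1000 / 8
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second


if __name__ == "__main__":
    one_way = ip_bandwidth_bps(64_000)
    print(f"one way:  {one_way / 1000:.0f} kbps")
    print(f"two ways: {2 * one_way / 1000:.0f} kbps")
```

Under these assumptions the headers add 16 kbps per direction regardless of codec, so the relative overhead is modest for G.711 and much larger for low bit rate codecs.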

G.722

G.722 is an ITU standard wideband speech codec that operates at 48, 56, or 64 kbps. It is based on sub-band (split-band) ADPCM. This codec is well suited to fixed-network VoIP applications, where bandwidth is not tightly constrained. It also offers a significant improvement in speech quality over older narrowband codecs. That said, this codec is not ideal for broadcast remotes. You will also find related standards such as G.722.1 and G.722.2, the latter of which is also known as AMR-WB.
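The three G.722 rates follow from how the bits are divided between the two sub-bands. The sketch below relies on details that are not in the article: each sub-band is sampled at 8000 samples per second, the upper band always uses 2 bits per sample, and the lower band uses 6, 5, or 4 bits depending on the mode. The names are hypothetical.

```python
SUBBAND_SAMPLE_RATE_HZ = 8000  # assumption: each sub-band runs at 8 kHz
UPPER_BAND_BITS = 2            # assumption: fixed 2 bits per upper-band sample
LOWER_BAND_BITS_BY_MODE = {1: 6, 2: 5, 3: 4}  # assumption: bits per lower-band sample


def g722_audio_bit_rate(mode: int) -> int:
    """Return the audio bit rate in bits per second for G.722 mode 1, 2, or 3."""
    bits_per_sample_pair = LOWER_BAND_BITS_BY_MODE[mode] + UPPER_BAND_BITS
    return SUBBAND_SAMPLE_RATE_HZ * bits_per_sample_pair


if __name__ == "__main__":
    for mode in (1, 2, 3):
        print(mode, g722_audio_bit_rate(mode))
```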

G.723 and G.723.1

When you read up on the G.723 codec, you will find that it is completely different from the G.723.1 variant. G.723 is an ITU speech codec standard that relies on the ADPCM method and offers good quality audio at 24 and 40 kbps. This codec is most commonly used for Digital Circuit Multiplication Equipment (DCME) applications.

The G.723.1 speech codec compresses voice audio in 30 ms frames. An algorithmic look-ahead of 7.5 ms is added to the frame duration, which results in a total algorithmic delay of 37.5 ms. This codec offers very high compression without greatly compromising speech quality. Furthermore, on fast computers it can encode and decode simultaneously, which makes it effective for the audio channel of video conferencing systems.
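The delay figure, and the size of each compressed frame, can be worked out as follows. This is a small sketch with hypothetical function names. The nominal bit rates of 6.3 and 5.3 kbps are not given in the article and are assumed here as the two G.723.1 operating rates.

```python
import math

FRAME_MS = 30.0      # frame duration, from the article
LOOKAHEAD_MS = 7.5   # algorithmic look-ahead, from the article
NOMINAL_RATES_BPS = (6300, 5300)  # assumption: nominal G.723.1 bit rates


def algorithmic_delay_ms(frame_ms: float, lookahead_ms: float) -> float:
    """Delay before the encoder can emit a frame: one full frame plus look-ahead."""
    return frame_ms + lookahead_ms


def frame_bytes(bit_rate_bps: int, frame_ms: float) -> int:
    """Bytes needed per frame at a nominal bit rate, rounded up to whole bytes."""
    bits = bit_rate_bps * frame_ms / 1000
    return math.ceil(bits / 8)


if __name__ == "__main__":
    print(algorithmic_delay_ms(FRAME_MS, LOOKAHEAD_MS), "ms")
    for rate in NOMINAL_RATES_BPS:
        print(rate, "bps:", frame_bytes(rate, FRAME_MS), "bytes per frame")
```

For comparison, 30 ms of 64 kbps G.711 audio occupies 240 bytes, so the compression is roughly tenfold at these rates. Processing and network delays come on top of the algorithmic delay.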

That said, this codec requires a lot of processing power and is not well suited to sound effects or music. Many developers find that it has lower quality than some other codecs at similar data rates. Before settling on a codec, read up on G.723.1 and the alternatives so that you can make an informed decision.