The AMBE Vocoder: How Voice Becomes DMR Data

Category: Basics | Difficulty: ★★☆ | ~9 min

When you key the PTT, the radio doesn't "record" your voice like a dictaphone. Instead, a specialized algorithm — the vocoder — builds a mathematical model of your speech in a fraction of a second and sends a tiny stream of data over the air. On the receiving end, that same model is expanded back into sound. This is exactly why DMR sounds clean where analog FM is already hissing: the chief enemy of digital is not a weak signal as such, but bit errors.

What a vocoder is and where MBE fits in

The word vocoder is short for voice coder. Unlike audio codecs (MP3, AAC), which "photograph" the signal, a vocoder builds a parametric model of speech production: how the vocal cords work, how the shape of the mouth changes timbre. Transmitting those parameters is far cheaper than the sound itself.

AMBE (Advanced Multi-Band Excitation) is a family of vocoders from DVSI (Digital Voice Systems, Inc.), built on the MBE (Multi-Band Excitation) method. The method splits the speech signal into harmonic bands and, for each one, estimates how "voiced" (tonal) or "unvoiced" (noise-like) it is. This preserves speech intelligibility at very low bit rates.

AMBE+2: the version for DMR

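The ETSI standard for DMR (TS 102 361-1) defines the air interface and the voice payload format, but it does not itself mandate a vocoder. AMBE+2, the latest generation of the family at the time, was adopted by DMR manufacturers as the common voice codec so that radios from different vendors interoperate. The key numbers: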

Speech stream: 2 450 bit/s — the speech parameters themselves;
FEC (Forward Error Correction): 1 150 bit/s — redundancy for error recovery;
Total stream per slot: 3 600 bit/s;
Frame size: 20 ms — every 20 ms a fragment of speech is "sliced" from the microphone and turned into a single AMBE frame;
DMR voice burst: occupies one 30 ms timeslot and carries three AMBE frames of 72 bits each (216 bits of voice payload per burst, that is, 3 × 20 ms = 60 ms of speech).

Why 3 600 and not 2 450? FEC adds about 47% redundancy. With burst errors over the air, this allows a frame to be recovered without a retransmission. If there are too many errors, the frame is "masked" by interpolating from neighboring frames; you hear a characteristic "bubbling" artifact rather than the white noise of analog FM.
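The arithmetic behind these figures can be checked in a few lines of Python. This is only a sanity check of the numbers quoted above, not part of any vocoder; the 8 kHz, 16-bit PCM input used for the compression comparison is an assumption that this article does not state.

```python
# Sanity check of the AMBE+2 bit budget quoted above.
SPEECH_BPS = 2450      # speech parameters, bit/s
FEC_BPS = 1150         # forward error correction, bit/s
FRAME_MS = 20          # one AMBE frame covers 20 ms of speech
FRAMES_PER_BURST = 3   # AMBE frames carried in one DMR voice burst


def bits_per_frame(bps: int, frame_ms: int = FRAME_MS) -> int:
    """Bits accumulated during one frame at the given bit rate."""
    return bps * frame_ms // 1000


speech_bits = bits_per_frame(SPEECH_BPS)      # 49 bits of speech parameters
fec_bits = bits_per_frame(FEC_BPS)            # 23 bits of FEC
frame_bits = speech_bits + fec_bits           # 72 bits per AMBE frame
burst_bits = frame_bits * FRAMES_PER_BURST    # 216 bits per voice burst
redundancy = FEC_BPS / SPEECH_BPS             # about 0.47

# Assumption (not from the article): 8 kHz, 16-bit mono PCM at the microphone.
PCM_BPS = 8000 * 16
compression = PCM_BPS / SPEECH_BPS

print(speech_bits, fec_bits, frame_bits, burst_bits)
print(f"FEC overhead: {redundancy:.0%}, compression vs PCM: {compression:.0f}x")
```

Under that PCM assumption, the speech stream is roughly fifty times smaller than the raw audio it represents.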

Where the vocoder physically lives

In professional and amateur radios, AMBE+2 is implemented in one of two ways:

A DVSI hardware chip, for example the AMBE-3000. (The older AMBE-2020 implements the earlier AMBE generation used in D-STAR, not AMBE+2.) This is a dedicated DSP chip that performs encoding/decoding in real time with minimal latency. It is used in most portable DMR radios for working use (TYT, AnyTone, Hytera, Motorola).
A software implementation — running on a sufficiently powerful embedded processor. This is usually found in IP communicators (PoC devices) and smartphone apps, where the available compute power means no separate chip is needed.

In both cases the vocoder is part of the radio, not the network. The server infrastructure only ever sees the already-encoded frames.

Encoding latency: One AMBE frame = 20 ms. The typical end-to-end latency in a DMR hotspot with a server is on the order of 100–300 ms. That is comparable to IP telephony and unnoticeable in an ordinary PTT conversation.

A patented algorithm: what that means in practice

AMBE/AMBE+2 is a patented DVSI technology. The radio manufacturer pays DVSI a licensing fee. For the end user, this has several important consequences:

There is no complete free (open-source) implementation of AMBE+2 that is publicly available; a software decoder would require a license.
Existing "open" tools for decoding DMR voice (for example, in SDR scanners) use reverse-engineered approximations or require a DVSI hardware dongle.
The alternative vocoder Codec 2 (fully open) is used in other modes (FreeDV, M17), but it is a different codec and therefore not compatible with ETSI DMR.

Important: Software AMBE implementations for personal use exist in a "gray area" of patent law. Commercial use without a DVSI license is illegal. In amateur radio, the situation varies by jurisdiction.

How voice travels through the DMRhub network

In a pure DMR network, voice travels the entire path in one and the same format — encoded AMBE+2 frames. This is a fundamental difference from an analog repeater, which "hears" and re-emits the sound.

You key the PTT — the chip in the radio encodes your voice into a stream of AMBE+2 frames.
The frames go over the air to the MMDVM hotspot.
The hotspot wraps them in IP packets (the MMDVM/Homebrew protocol) and sends them to the DMRhub master server.
The master server relays those same packets to every subscriber on the active talkgroup.
The recipients' radios decode the AMBE+2 back into sound.

Along this entire path there is no transcoding. The master server acts as a switch: it routes ready-made voice frames without ever expanding them into PCM. This keeps the infrastructure lightweight: because nothing is transcoded, the server needs neither DVSI hardware chips nor significant compute resources.
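The "switch" role can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not DMRhub's actual server code: the class name VoiceSwitch, the peer identifiers and talkgroup 91 are all hypothetical, and the real MMDVM/Homebrew packet layout is deliberately not modeled. The point is only that the payload bytes are forwarded untouched.

```python
from __future__ import annotations

from collections import defaultdict


class VoiceSwitch:
    """Toy model of a master server that forwards frames without decoding them."""

    def __init__(self) -> None:
        # talkgroup number -> set of peer identifiers (hypothetical naming)
        self._subscribers: dict[int, set[str]] = defaultdict(set)

    def subscribe(self, peer: str, talkgroup: int) -> None:
        self._subscribers[talkgroup].add(peer)

    def relay(self, sender: str, talkgroup: int, packet: bytes) -> dict[str, bytes]:
        """Return what to send: the same bytes to every subscriber except the sender."""
        return {
            peer: packet
            for peer in sorted(self._subscribers.get(talkgroup, ()))
            if peer != sender
        }


switch = VoiceSwitch()
for peer in ("hotspot-a", "hotspot-b", "hotspot-c"):
    switch.subscribe(peer, talkgroup=91)

payload = bytes(27)  # 216 bits of opaque voice payload; never inspected
out = switch.relay("hotspot-a", 91, payload)
print(sorted(out))  # ['hotspot-b', 'hotspot-c']
```

Nothing in relay() looks inside the packet, which is why no codec, licensed or otherwise, has to exist on the server in this model.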

Transcoding: when it is needed after all

The problem arises at cross-mode gateways. D-STAR uses an older AMBE codec (without the "+2"); C4FM/YSF technically uses the same AMBE+2 but in a different "wrapper" with different signaling. If you want to link DMR and D-STAR, or DMR and YSF, "by voice," the gateway must:

Decode the DMR AMBE+2 frames into PCM audio;
Re-encode the PCM back into the format the other network needs.

For this, the gateway needs either a DVSI hardware USB dongle (AMBE-3000 or equivalent) or a licensed software library. This is exactly why cross-mode reflectors are a distinct class of infrastructure. In our network, which operates in the DMR standard only, no such equipment is required.
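The two steps above amount to a decode-then-encode pipeline, which can be sketched as follows. The function transcode and the Decoder/Encoder signatures are hypothetical, and the stand-in codecs only shuffle bytes so that the example runs; in a real gateway those two callables would be backed by a DVSI dongle or a licensed library.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical codec signatures: a decoder turns one coded frame into PCM
# samples, an encoder turns PCM samples into one coded frame.
Decoder = Callable[[bytes], list[int]]
Encoder = Callable[[list[int]], bytes]


def transcode(frames: Iterable[bytes], decode: Decoder, encode: Encoder) -> Iterator[bytes]:
    """Decode each incoming frame to PCM, then re-encode it for the other network."""
    for frame in frames:
        pcm = decode(frame)   # step 1: source codec -> PCM
        yield encode(pcm)     # step 2: PCM -> target codec


# Stand-in codecs so the sketch runs; they are NOT real vocoders.
def fake_decode(frame: bytes) -> list[int]:
    return list(frame)


def fake_encode(pcm: list[int]) -> bytes:
    return bytes(reversed(pcm))


result = list(transcode([b"\x01\x02\x03"], fake_decode, fake_encode))
print(result)  # [b'\x03\x02\x01']
```

Every frame passes through both codecs, so the gateway carries the full encoding cost that a pure DMR switch avoids.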

C4FM and AMBE+2: "relatives," but not compatible. Motorola, Hytera (DMR) and Yaesu (C4FM/YSF) all use the very same AMBE+2 vocoder in their systems. But DMR and C4FM have different modulation, framing and signaling protocols. Direct "native" reception of one on the other is impossible without transcoding.

Audio quality and the limits of what's possible

AMBE+2 sounds better than its predecessor AMBE (used in D-STAR) thanks to an improved algorithm and built-in FEC. With a BER below about 2–3%, the quality stays stable and intelligible. Under heavy errors the vocoder masks the damaged frames — you hear a "blub-blub" or a brief dropout, after which speech recovers.
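To make the BER figure concrete, the sketch below counts differing bits between a transmitted and a received frame. It is an illustration only: the helper bit_error_rate is hypothetical, and a real receiver does not have the transmitted bits available, so it has to estimate the error rate indirectly.

```python
def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Fraction of differing bits between two equal-length byte strings."""
    if len(sent) != len(received):
        raise ValueError("frames must have the same length")
    if not sent:
        return 0.0
    # XOR leaves a 1 wherever the two bytes disagree; count those bits.
    errors = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return errors / (len(sent) * 8)


sent = bytes(9)                              # one 72-bit frame, all zeros
received = bytes([0b00000001]) + bytes(8)    # same frame with one flipped bit
ber = bit_error_rate(sent, received)
print(f"{ber:.2%}")  # 1.39%
```

A single flipped bit in a 72-bit frame is already about 1.4%, so the 2–3% range quoted above corresponds to only a couple of bit errors per AMBE frame.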

A fundamental limitation of a parametric vocoder: it is optimized for human speech. Music, whistles and strong microphone interference come through AMBE+2 sounding recognizably "synthetic" — the algorithm models the vocal tract, not an arbitrary sound signal.

Voice in DMRhub goes end-to-end with no transcoding

In our network the vocoder works only in the radio. The master server passes the already-encoded AMBE+2 frames as-is — no re-encoding, no DVSI chips on the server. To get on the air, all you need is a DMR-capable radio, your own DMR ID, and a hotspot or a nearby repeater.

Sources

DVSI AMBE+2 Vocoder Technology (official product page) — dvsinc.com
ETSI TS 102 361-1: DMR Air Interface protocol / 216-bit voice frame, 3 × 72-bit AMBE+2 — cartoonman.github.io (DMR frame)
Multi-Band Excitation — Wikipedia (history of MBE/IMBE/AMBE/AMBE+2, licensing) — en.wikipedia.org
What are Transcoding Reflectors (Silvercreek ARA) — DMR/YSF/D-STAR transcoding — w8wky.org