theonlineoasis

For further information please see

H.264/MPEG-4 AVC defines several different profiles, which specify which coding methods and parameters are allowed in a stream. The Main and High profiles allow the use of context adaptive binary arithmetic coding (CABAC), which offers improved compression performance (around 10% bit saving) compared to the context adaptive variable length coding (CAVLC) method which is available as an alternative, though it is more expensive to implement.

Overview of the entropy coding process

The entropy encoder takes as input a sequence of symbols representing samples and control information and maps this onto a binary bitstream which is output into the container. In contrast with earlier compression stages, the entropy coding is lossless; the decoder will reproduce the exact sequence of symbols which was input to the entropy encoder during compression.

H.264's implementation of CABAC creates the bitstream in three stages.

  1. In binarization, each symbol to be output is uniquely mapped onto a binary string, called a bin string. Each bit position in the bin string is called a bin. Each bin is then passed to one of two coding modes: in regular coding mode, the next step, context modelling, is applied and the resulting context model and bin value are passed to the binary arithmetic coding engine; in bypass mode, context modelling is skipped and the bin is passed directly to a bypass coding engine, skipping the context modelling stage.
  2. In context modelling (only used for regular coding mode) a bin is categorised for coding under a particular probability model. Each probability model has its state represented by a context variable which is a pair (most probable symbol in {0, 1}, probability of less probable symbol). Arithmetic coding is applied using the chosen context model and updates its context variable.
  3. In binary arithmetic coding the value of the bin is used to update the context variable if applicable, and bits are output into the bitstream.

Binarization

The input to this process is a symbol (syntax element) to be coded, such as a quantization transform coefficient, macroblock type specifier or a motion vector component. The mapping onto bin strings should be close to a minimum redundancy code.

Main types of binarization

Four main types of binarization are defined:

There are also five unstructed binary trees defined manually for coding macroblock and submacroblock types.

Concatenated binarizations

The codes can also be concatenated. There are three situations where concatenations of the four basic types are used:

Context modelling

The context modelling stage associates a context model with each bin output by the binarization stage.

There are four basic types of context modelling, which associate a probability with each bin based on previously coded values or other symbols in the neighbourhood:

The context modelling process only ever references past values within the same slice.

Each syntax element may use one of a range of models, each of which is denoted by a context index. The possible models for each syntax element are given in Table 9-11 of the standard, which specifies the allowable values for the context index γ for each element. The range of allowed values of context index for mb_type, sub_mb_type and mb_skip_flag depends on the slice type being coded (SI/I, SP/P or B).

Each probability model (uniquely associated with a context index) consists of a pair of two values: a 6-bit probability state index σγ and a single bit which is the most probable symbol (MPS). Each model is therefore represented by a 7-bit value.

Macroblock type, submacroblock type, spatial and temporal prediction modes, slice- and macroblock-based control information syntax elements all use context indices between 0 and 72. The context index is calculated as γ = ΓS + χS where ΓS is the context index offset, which is the lowest value in the allowable range for the syntax element's context index, and χS is a context index increment, which specifies the offset within the range. χS may either depend only on the bin index (giving a fixed assignment of probability model to each bin), or it may specify one of the first two context modelling types above.

Context indices in the range 73 to 398 are used for coding residual data (except for γ = 276 which is associated with the end of slice flag).

significant_coeff_flag and last_significant_coeff_flag use different models depending on whether they are in frame or field mode. Not all context models are used in frame-only/field-only pictures.

The model for coded_block_pattern is specified using γ = ΓS + χS. All other syntax elements of residual data use the relation γ = ΓS + ΔS(ctx_cat) + χS, where the context category dependent offset ΔS. Table 9-40 in the standard specifies the value of this offset, in terms of the context category which is given for each block type in Table 9-42.

Binary arithmetic coding

Arithmetic coding works by representing an interval within [0, 1] by two values: a lower bound L and a range R, and recursively subdividing this interval using the probability and value of each input bit: on reception of a more probably symbol (MPS), with probability pMPS, the interval is updated to have width RMPS := RpMPS (the corresponding operation for a less probable symbol would update the interval to have width RLPS := RpLPS then update the lower bound L := L + R - RLPS).

H.264/MPEG-4 AVC CABAC uses a modulo-coder (M coder) as its binary arithmetic coding implementation. It avoids the multiplication above by quantizing the value of the interval width R onto a small set of values Q = {Q0, Q1, …, QK - 1}, and the probability range of the the less probable symbol pLPS (0, 0.5] onto another set of values P = {p0, p1, …, pN - 1}. The tradeoff chosen for H.264 was K = 4 quantized range values and N = 64 probability vaulues.

For the bypass-mode coding engine, the probability estimation stage is omitted.