This part of ISO/IEC TR 14496 describes the desired joint behavior of MPEG-4 Systems (MPEG-4 File
Format) and MPEG-4 Audio codecs. It is desired that MPEG-4 Audio encoders and decoders permit finite-length
signals to be encoded to a file (particularly an MPEG-4 file) and decoded again to obtain the identical
signal, subject to codec distortions. This will allow the use of audio in systems implementations (particularly
MPEG-4 Systems), perhaps with other media such as video, in a deterministic fashion. Most importantly, the
decoded signal will have nothing “extra” at the beginning or “missing” at the end.
This permits:
a) an exact ‘round trip’ from raw audio to encoded file back to raw audio (excepting encoding artifacts);
b) predictable synchronization between audio and other media such as video;
c) correct behavior when performing random access as well as when starting at the beginning of a
stream;
d) identical behavior when edits are applied in the raw domain and the encoded domain (again,
excepting encoding artifacts).
It is also required that there be predictable interoperability between encoders (as represented by files) and
decoders. There are two kinds of audio ‘offsets’ (or ‘delay’ in the context of transmission): those that result
from the encoding process, and those that result from the decoding process. This document is primarily
concerned with the latter.
These issues are resolved by the following:
• The handling of composition time stamps for audio composition units is specified. Special care is
taken in the case of compressed data, such as HE-AAC coded audio, that can be decoded in a
backward-compatible fashion as well as in an enhanced fashion.
• Examples are given that show how finite-length signals can be encoded to an MPEG-4 file and
decoded again to obtain the identical signal, excepting codec distortions. Most importantly, the
decoded signal has nothing “extra” at the beginning or “missing” at the end.
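
As an informal illustration of the "nothing extra at the beginning, nothing missing at the end" requirement, the following minimal sketch simulates a decoder whose output begins with a number of leading samples that are not part of the signal and is padded at the end to a whole number of frames. The helper name trim_decoded, the frame size of 1024 samples, and the offset of 2112 samples are assumptions chosen for illustration only; they are not values taken from this document, and a real system would obtain the offset and the signal length from the file rather than from constants.

```python
# Illustrative sketch only: FRAME and PRIMING are assumed values, and
# trim_decoded is a hypothetical helper, not part of any standard API.

FRAME = 1024      # assumed samples per audio access unit
PRIMING = 2112    # assumed number of leading decoder samples to discard


def trim_decoded(decoded, priming, original_length):
    """Return only the samples that belong to the original signal."""
    if priming < 0 or original_length < 0:
        raise ValueError("priming and original_length must be non-negative")
    if priming + original_length > len(decoded):
        raise ValueError("decoder output is shorter than expected")
    return decoded[priming:priming + original_length]


# A finite-length "raw" signal (codec distortion is ignored here).
original = list(range(5000))

# Simulated decoder output: leading samples that are not part of the signal,
# then the signal, then padding up to a whole number of frames.
total = PRIMING + len(original)
padded_len = -(-total // FRAME) * FRAME   # round up to a multiple of FRAME
decoded = [0] * PRIMING + original + [0] * (padded_len - total)

recovered = trim_decoded(decoded, PRIMING, len(original))
```

With these assumed values, recovered equals original sample for sample, whereas the untrimmed decoder output is longer than the input and shifted in time relative to it.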