Audio Terminology

This document provides a glossary of audio-related terminology, including a list of widely used, generic terms and a list of terms that are specific to Android.

Generic Terms

These are audio terms that are widely used, with their conventional meanings.

Digital Audio

acoustics
The study of the mechanical properties of sound, for example how the physical placement of transducers such as speakers and microphones on a device affects perceived audio quality.
attenuation
A multiplicative factor less than or equal to 1.0, applied to an audio signal to decrease the signal level. Compare to "gain."
audiophile
An audiophile is an individual who is concerned with a superior music reproduction experience, especially someone willing to make tradeoffs (of expense, component size, room design, etc.) beyond what an ordinary person might choose.
bits per sample or bit depth
Number of bits of information per sample.
channel
A single stream of audio information, usually corresponding to one location of recording or playback.
downmixing
To decrease the number of channels, e.g. from stereo to mono, or from 5.1 to stereo. This can be accomplished by dropping some channels, mixing channels, or more advanced signal processing. Simple mixing without attenuation or limiting has the potential for overflow and clipping. Compare to "upmixing."
DSD
Direct Stream Digital, a proprietary audio encoding based on pulse-density modulation. Whereas PCM encodes a waveform as a sequence of individual audio samples of multiple bits, DSD encodes a waveform as a sequence of bits at a very high sample rate. For DSD, there is no concept of "samples" in the conventional PCM sense. Both PCM and DSD represent multiple channels by independent sequences. DSD is better suited to content distribution than as an internal representation for processing, as it can be difficult to apply traditional DSP algorithms to DSD. DSD is used in Super Audio CD (SACD) and in DSD over PCM (DoP) for USB. See the Wikipedia article Direct Stream Digital for more information.
duck
To temporarily reduce the volume of one stream when another stream becomes active. For example, if music is playing and a notification arrives, the music stream could be ducked while the notification plays. Compare to "mute."
FIFO
A hardware module or software data structure that implements First In, First Out queueing of data. In the context of audio, the data stored in the queue are typically audio frames. A FIFO can be implemented by a circular buffer.
frame
A set of samples, one per channel, at a point in time.
frames per buffer
The number of frames handed from one module to the next at once; for example the audio HAL interface uses this concept.
gain
A multiplicative factor greater than or equal to 1.0, applied to an audio signal to increase the signal level. Compare to "attenuation."
HD audio
High-Definition audio, a synonym for "high-resolution audio." Not to be confused with Intel High Definition Audio.
Hz
The units for sample rate or frame rate.
high-resolution audio
There is no standard definition, but high-resolution usually means any representation with greater bit-depth and sample rate than CDs (which are stereo 16-bit PCM at 44.1 kHz), and with no lossy data compression applied. Equivalent to "HD audio." See the Wikipedia article high-resolution audio for more information.
latency
Time delay as a signal passes through a system.
lossless
A lossless data compression algorithm preserves bit accuracy across encoding and decoding. The result of decoding any previously encoded data is equivalent to the original data. Examples of lossless audio content distribution formats include CDs, PCM within WAV, and FLAC. Note that the authoring process may reduce the bit depth or sample rate from that of the masters. Distribution formats that preserve the resolution and bit accuracy of masters are the subject of "high-resolution audio."
lossy
A lossy data compression algorithm attempts to preserve the most important features of media across encoding and decoding. The result of decoding any previously encoded data is perceptually similar to the original data, but it is not identical. Examples of lossy audio compression algorithms include MP3 and AAC. As analog values are from a continuous domain, whereas digital values are discrete, ADC and DAC are lossy conversions with respect to amplitude. See also "transparency."
mono
One channel.
multichannel
See "surround sound." Strictly, since stereo is more than one channel, it is also "multi" channel. But that usage would be confusing.
mute
To (temporarily) force volume to be zero, independently from the usual volume controls.
overrun
An audible glitch caused by failure to accept supplied data in sufficient time. See the Wikipedia article buffer underrun; note that the Wikipedia article on "buffer overrun" describes an unrelated failure. Compare to "underrun."
pan
To direct a signal to a desired position within a stereo or multi-channel field.
PCM
Pulse Code Modulation, the most common low-level encoding of digital audio. The audio signal is sampled at a regular interval, called the sample rate, and then quantized to discrete values within a particular range depending on the bit depth. For example, for 16-bit PCM, the sample values are integers between -32768 and +32767.
ramp
To gradually increase or decrease the level of a particular audio parameter, for example volume or the strength of an effect. A volume ramp is commonly applied when pausing and resuming music, to avoid a hard audible transition.
sample
A number representing the audio value for a single channel at a point in time.
sample rate or frame rate
Number of frames per second; note that "frame rate" is thus more accurate, but "sample rate" is conventionally used to mean "frame rate."
sonification
The use of sound to express feedback or information, for example touch sounds and keyboard sounds.
stereo
Two channels.
stereo widening
An effect applied to a stereo signal, to make another stereo signal which sounds fuller and richer. The effect can also be applied to a mono signal, in which case it is a type of upmixing.
surround sound
Various techniques for increasing the ability of a listener to perceive sound position beyond stereo left and right.
transparency
The ideal result of lossy data compression: a lossy data conversion is said to be transparent if it is perceptually indistinguishable from the original by a human subject. See the Wikipedia article Transparency for more information.
underrun
An audible glitch caused by failure to supply needed data in sufficient time. See the Wikipedia article buffer underrun. Compare to "overrun."
upmixing
To increase the number of channels, e.g. from mono to stereo, or from stereo to surround sound. This can be accomplished by duplication, panning, or more advanced signal processing. Compare to "downmixing."
virtualizer
An effect that attempts to spatialize audio channels, such as trying to simulate more speakers, or give the illusion that various sound sources have position.
volume
Loudness, the subjective strength of an audio signal.

Hardware and Accessories

These terms are related to audio hardware and accessories.

Inter-device interconnect

These technologies connect audio and video components between devices, and are readily visible at the external connectors. Both the HAL implementor and the end user may need to be aware of these.

Bluetooth
A short range wireless technology. The major audio-related Bluetooth profiles and protocols, such as A2DP (Advanced Audio Distribution Profile) for streaming and AVRCP (Audio/Video Remote Control Profile) for remote control, are described in their Wikipedia articles.
DisplayPort
Digital display interface by VESA.
HDMI
High-Definition Multimedia Interface, an interface for transferring audio and video data. For mobile devices, either a micro-HDMI (type D) or MHL connector is used.
Intel HDA
Intel High Definition Audio (commonly shortened to HDA) is a specification for, among other things, a front-panel connector. Not to be confused with generic "high-definition audio" or "high-resolution audio."
MHL
Mobile High-Definition Link is a mobile audio/video interface, often over a micro-USB connector.
phone connector
A mini or sub-mini phone connector connects a device to wired headphones, headset, or line-level amplifier.
SlimPort
An adapter from micro-USB to HDMI.
S/PDIF
Sony/Philips Digital Interface Format is an interconnect for uncompressed PCM. See the Wikipedia article S/PDIF.
Thunderbolt
Thunderbolt is a multimedia interface that competes with USB and HDMI for connecting to high-end peripherals.
USB
Universal Serial Bus. See the Wikipedia article USB.

Intra-device interconnect

These technologies connect internal audio components within a given device, and are not visible without disassembling the device. The HAL implementor may need to be aware of these, but not the end user.

Common intra-device audio interconnects, described in their Wikipedia articles, include I²S and SLIMbus.

Audio Signal Path

These terms are related to the signal path that audio data follows from an application to the transducer, or vice-versa.

ADC
Analog to digital converter, a module that converts an analog signal (continuous in both time and amplitude) to a digital signal (discrete in both time and amplitude). Conceptually, an ADC consists of a periodic sample-and-hold followed by a quantizer, although it does not have to be implemented that way. An ADC is usually preceded by a low-pass filter to remove any high frequency components that are not representable using the desired sample rate. See the Wikipedia article Analog-to-digital converter.
AP
Application processor, the main general-purpose computer on a mobile device.
codec
Coder-decoder, a module that encodes and/or decodes an audio signal from one representation to another. Typically this is analog to PCM, or PCM to analog. Strictly, the term "codec" is reserved for modules that both encode and decode, though it can also more loosely refer to only one of these. See the Wikipedia article Audio codec.
DAC
Digital to analog converter, a module that converts a digital signal (discrete in both time and amplitude) to an analog signal (continuous in both time and amplitude). A DAC is usually followed by a low-pass filter to remove any high frequency components introduced by digital quantization. See the Wikipedia article Digital-to-analog converter.
DSP
Digital Signal Processor, an optional component which is typically located after the application processor (for output), or before the application processor (for input). The primary purpose of a DSP is to off-load the application processor, and provide signal processing features at a lower power cost.
PDM
Pulse-density modulation is a form of modulation used to represent an analog signal by a digital signal, where the relative density of 1s versus 0s indicates the signal level. It is commonly used by digital to analog converters. See the Wikipedia article Pulse-density modulation.
PWM
Pulse-width modulation is a form of modulation used to represent an analog signal by a digital signal, where the relative width of a digital pulse indicates the signal level. It is commonly used by analog to digital converters. See the Wikipedia article Pulse-width modulation.
transducer
A transducer converts variations in physical "real-world" quantities to electrical signals. In audio, the physical quantity is sound pressure, and the transducers are the loudspeaker and microphone. See the Wikipedia article Transducer.

Sample Rate Conversion

downsample
To resample, where sink sample rate < source sample rate.
Nyquist frequency
The Nyquist frequency, equal to 1/2 of a given sample rate, is the maximum frequency component that can be represented by a discretized signal at that sample rate. For example, the human hearing range is typically assumed to extend up to approximately 20 kHz, and so a digital audio signal must have a sample rate of at least 40 kHz to represent that range. In practice, sample rates of 44.1 kHz and 48 kHz are commonly used, with Nyquist frequencies of 22.05 kHz and 24 kHz respectively. See Nyquist frequency and Hearing range for more information.
resampler
Synonym for sample rate converter.
sample rate conversion
The process of converting sample rate.
sample rate converter
A module that resamples.
sink
The output of a resampler.
source
The input to a resampler.
upsample
To resample, where sink sample rate > source sample rate.

Android-Specific Terms

These are terms specific to the Android audio framework, or that may have a special meaning within Android beyond their general meaning.

ALSA
Advanced Linux Sound Architecture. As the name suggests, it is an audio framework primarily for Linux, but it has influenced other systems. See the Wikipedia article ALSA for the general definition. As used within Android, it refers primarily to the kernel audio framework and drivers, not to the user-mode API. See tinyalsa.
audio device
Any audio I/O end-point that is backed by a HAL implementation.
audio effect
An API and implementation framework for output (post-processing) effects and input (pre-processing) effects. The API is defined in the header audio_effect.h.
AudioFlinger
The sound server implementation for Android. AudioFlinger runs within the mediaserver process. See the Wikipedia article Sound server for the generic definition.
audio focus
A set of APIs for managing audio interactions across multiple independent apps. See Managing Audio Focus and the focus-related methods and constants of android.media.AudioManager.
AudioMixer
The module within AudioFlinger responsible for combining multiple tracks and applying attenuation (volume) and certain effects. The Wikipedia article Audio mixing (recorded music) may be useful for understanding the generic concept, though it describes a mixer more as a hardware device or a software application than as a software module within a system.
audio policy
Service responsible for all actions that require a policy decision to be made first, such as opening a new I/O stream, re-routing after a change, and stream volume management.
AudioRecord
The primary low-level client API for receiving data from an audio input device such as a microphone. The data is usually in pulse-code modulation (PCM) format. The API is defined by android.media.AudioRecord.
AudioResampler
The module within AudioFlinger responsible for sample rate conversion.
audio source
An audio source is an enumeration of constants that indicates the desired use case for capturing audio input. As of API level 21 and above, audio attributes are preferred.
AudioTrack
The primary low-level client API for sending data to an audio output device such as a speaker. The data is usually in PCM format. The API is defined by android.media.AudioTrack.
audio_utils
An audio utility library for features such as PCM format conversion, WAV file I/O, and non-blocking FIFO, which is largely independent of the Android platform.
client
Usually the same as application or app, but sometimes the "client" of AudioFlinger is actually a thread running within the mediaserver system process. An example of that is when playing media that is decoded by a MediaPlayer object.
HAL
Hardware Abstraction Layer. HAL is a generic term in Android. With respect to audio, it is a layer between AudioFlinger and the kernel device driver with a C API, which replaces the earlier C++ libaudio.
FastCapture
A thread within AudioFlinger that sends audio data to lower latency "fast tracks" and drives the input device when configured for reduced latency.
FastMixer
A thread within AudioFlinger that receives and mixes audio data from lower latency "fast tracks" and drives the primary output device when configured for reduced latency.
fast track
An AudioTrack or AudioRecord client with lower latency but fewer features, on some devices and routes.
MediaPlayer
A higher-level client API than AudioTrack, for playing either encoded content, or content which includes multimedia audio and video tracks.
media.log
An AudioFlinger debugging feature, available in custom builds only, for logging audio events to a circular buffer where they can then be dumped retroactively when needed.
mediaserver
An Android system process that contains a number of media-related services, including AudioFlinger.
NBAIO
An abstraction for "non-blocking" audio input/output ports used within AudioFlinger. The name can be misleading, as some implementations of the NBAIO API actually do support blocking. The key implementations of NBAIO are for pipes of various kinds.
normal mixer
A thread within AudioFlinger that services most full-featured AudioTrack clients, and either directly drives an output device or feeds its sub-mix into FastMixer via a pipe.
OpenSL ES
An audio API standard by The Khronos Group. Android versions since API level 9 support a native audio API that is based on a subset of OpenSL ES 1.0.1.
silent mode
A user-settable feature to mute the phone ringer and notifications, without affecting media playback (music, videos, games) or alarms.
SoundPool
A higher-level client API than AudioTrack, used for playing sampled audio clips. It is useful for triggering UI feedback, game sounds, etc. The API is defined by android.media.SoundPool.
StageFright
See Media.
StateQueue
A module within AudioFlinger responsible for synchronizing state among threads. Whereas NBAIO is used to pass data, StateQueue is used to pass control information.
strategy
A grouping of stream types with similar behavior, used by the audio policy service.
stream type
An enumeration that expresses a use case for audio output. The audio policy implementation uses the stream type, along with other parameters, to determine volume and routing decisions. Specific stream types are listed at android.media.AudioManager.
tee sink
See the separate article on tee sink in Audio Debugging.
tinyalsa
A small user-mode API above the ALSA kernel interface, with a BSD license, recommended for use in HAL implementations.
ToneGenerator
A higher-level client API than AudioTrack, used for playing DTMF signals. See the Wikipedia article Dual-tone multi-frequency signaling, and the API definition at android.media.ToneGenerator.
track
An audio stream, controlled by the AudioTrack or AudioRecord API.
volume attenuation curve
A device-specific mapping from a generic volume index to a particular attenuation factor for a given output.
volume index
A unitless integer that expresses the desired relative volume of a stream. The volume-related APIs of android.media.AudioManager operate in volume indices rather than absolute attenuation factors.