
Audio terminology

This glossary of audio-related terminology includes widely used generic terms and Android-specific terms. See the central Android Platform Glossary for the canonical definitions of terms.

Generic terms

Generic audio-related terms have conventional meanings.

Digital audio

Digital audio terms relate to handling sound using audio signals encoded in digital form. For details, refer to Digital audio.

AC-3: An audio codec by Dolby. For details, refer to Dolby Digital.
acoustics: Study of the mechanical properties of sound, such as how the physical placement of transducers (for example, speakers, microphones) on a device affects perceived audio quality.
attenuation: Multiplicative factor less than or equal to 1.0, applied to an audio signal to decrease the signal level. Compare to gain.
audiophile: Person concerned with a superior music reproduction experience, especially willing to make substantial tradeoffs (for example, expense, component size, room design) for sound quality. For details, refer to Audiophile.
AVB: A standard for real-time transmission of digital audio over Ethernet. For details, refer to Audio Video Bridging.
bits per sample or bit depth: Number of bits of information per sample.
channel: Single stream of audio information, usually corresponding to one location of recording or playback.
downmixing: Decreasing the number of channels, such as from stereo to mono or from 5.1 to stereo. Accomplished by dropping channels, mixing channels, or more advanced signal processing. Simple mixing without attenuation or limiting has the potential for overflow and clipping. Compare to upmixing.
DSD: Direct Stream Digital. Proprietary audio encoding based on pulse-density modulation. While pulse-code modulation (PCM) encodes a waveform as a sequence of individual audio samples of multiple bits, DSD encodes a waveform as a sequence of bits at a very high sample rate (without the concept of samples). Both PCM and DSD represent multiple channels by independent sequences. DSD is better suited to content distribution than as an internal representation for processing as it can be difficult to apply traditional digital signal processing (DSP) algorithms to DSD. DSD is used in Super Audio CD (SACD) and in DSD over PCM (DoP) for USB. For details, refer to Direct Stream Digital.
duck: Temporarily reduce the volume of a stream when another stream becomes active. For example, if music is playing when a notification arrives, the music ducks while the notification plays. Compare to mute.
FIFO: First in, first out. Hardware module or software data structure that implements FIFO queueing of data. In an audio context, the data stored in the queue are typically audio frames. FIFO can be implemented by a circular buffer.
frame: Set of samples, one per channel, at a point in time.
frames per buffer: Number of frames handed from one module to the next at one time. The audio HAL interface uses the concept of frames per buffer.
gain: Multiplicative factor greater than or equal to 1.0, applied to an audio signal to increase the signal level. Compare to attenuation.
HD audio: High-definition audio. Synonym for high-resolution audio (but different from Intel High Definition Audio).
headphones: Loudspeakers that fit over the ears, without a microphone. Compare with headset.
headset: Headphones with microphone. Compare with headphones.
Hz: Units for sample rate or frame rate.
high-resolution audio: Representation with greater bit depth and sample rate than CDs (stereo 16 bit PCM at 44.1 kHz) and without lossy data compression. Equivalent to HD audio. For details, refer to high-resolution audio.
interleaved: A representation for multichannel digital audio that alternates data among channels. For example, stereo digital audio expressed in interleaved format alternates left, right, left, right.
latency: Time delay as a signal passes through a system.
lossless: A lossless data compression algorithm that preserves bit accuracy across encoding and decoding, where the result of decoding previously encoded data is equivalent to the original data. Examples of lossless audio content distribution formats include CDs, PCM within WAV, and FLAC. The authoring process can reduce the bit depth or sample rate from that of the masters; distribution formats that preserve the resolution and bit accuracy of masters are the subject of high-resolution audio.
lossy: A lossy data compression algorithm that attempts to preserve the most important features of media across encoding and decoding where the result of decoding previously encoded data is perceptually similar to the original data but not identical. Examples of lossy audio compression algorithms include MP3 and AAC. As analog values are from a continuous domain and digital values are discrete, ADC and DAC are lossy conversions with respect to amplitude. See also transparency.
mono: One channel.
multichannel: See surround sound. In strict terms, stereo is more than one channel and could be considered multichannel; however, such usage is confusing and thus avoided.
mute: Temporarily force volume to be zero, independent from the usual volume controls. Compare to duck.
overrun: Audible glitch caused by failure to accept supplied data in sufficient time. For details, refer to buffer underrun. Compare to underrun.
panning: Directing a signal to a desired position within a stereo or multichannel field.
PCM: Pulse-code modulation. Most common low-level encoding of digital audio. The audio signal is sampled at a regular interval, called the sample rate, then quantized to discrete values within a particular range depending on the bit depth. For example, for 16-bit PCM, the sample values are integers between -32768 and +32767.
ramp: Gradually increase or decrease the level of a particular audio parameter, such as the volume or the strength of an effect. A volume ramp is commonly applied when pausing and resuming music to avoid a hard audible transition.
sample: Number representing the audio value for a single channel at a point in time.
sample rate or frame rate: Number of frames per second. While frame rate is more accurate, sample rate is conventionally used to mean frame rate.
sonification: Use of sound to express feedback or information, such as touch sounds and keyboard sounds.
SPL: Sound pressure level, a relative measurement of sound pressure.
stereo: Two channels. Compare to multichannel.
stereo widening: Effect applied to a stereo signal to make another stereo signal that sounds fuller and richer. The effect can also be applied to a mono signal, where it's a type of upmixing.
surround sound: Technique for increasing the ability of a listener to perceive sound position beyond stereo left and right.
transparency: Ideal result of lossy data compression. Lossy data conversion is transparent if it's perceptually indistinguishable from the original by a human subject. For details, refer to Transparency.
underrun: Audible glitch caused by failure to supply needed data in sufficient time. For details, refer to buffer underrun. Compare to overrun.
upmixing: Increasing the number of channels, such as from mono to stereo or from stereo to surround sound. Accomplished by duplication, panning, or more advanced signal processing. Compare to downmixing.
USAC: Unified Speech and Audio Coding. An audio codec for low bit-rate apps. For details, refer to Unified Speech and Audio Coding.
virtualizer: Effect that attempts to spatialize audio channels, such as trying to simulate more speakers or give the illusion that sound sources have position.
volume: Loudness, the subjective strength of an audio signal.
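
Several of the terms above (sample, frame, interleaved, PCM, attenuation, downmixing, frames per buffer) can be tied together in a short sketch. The Python below is illustrative only and isn't Android framework code. All function names are hypothetical, float samples are assumed to be nominally in the range -1.0 to 1.0, and the 0.5 downmix attenuation is just one reasonable choice.

```python
def quantize_pcm16(x):
    """Quantize a float sample (nominally -1.0..1.0) to a 16-bit PCM integer."""
    value = int(round(x * 32767))
    # Clamp to the 16-bit range so out-of-range input clips instead of wrapping.
    return max(-32768, min(32767, value))


def interleave(left, right):
    """Build interleaved stereo data: left, right, left, right, ..."""
    out = []
    for l, r in zip(left, right):
        out.extend((l, r))  # one frame = one sample per channel
    return out


def downmix_to_mono(interleaved_stereo, attenuation=0.5):
    """Mix each stereo frame down to a single 16-bit sample.

    The attenuation factor (<= 1.0) keeps the sum of two full-scale
    samples within range; the clamp clips anything that still overflows.
    """
    mono = []
    for i in range(0, len(interleaved_stereo) - 1, 2):
        mixed = (interleaved_stereo[i] + interleaved_stereo[i + 1]) * attenuation
        mono.append(max(-32768, min(32767, int(mixed))))
    return mono


def buffer_duration_ms(frames_per_buffer, sample_rate_hz):
    """Time covered by one buffer, a lower bound on the latency it adds."""
    return 1000.0 * frames_per_buffer / sample_rate_hz
```

For example, a buffer of 240 frames at 48 kHz covers 5 ms, and mixing two full-scale samples with an attenuation of 1.0 clips at +32767, which is the overflow risk that the downmixing entry describes.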

Interdevice interconnect

Interdevice interconnection technologies connect audio and video components between devices and are readily visible at the external connectors. The HAL implementer and end user should be aware of these terms.

Bluetooth: Short-range wireless technology. For details on the audio-related Bluetooth profiles and Bluetooth protocols, refer to A2DP for music, SCO for telephony, and Audio/Video Remote Control Profile (AVRCP).
DisplayPort: Digital display interface by the Video Electronics Standards Association (VESA).
dongle: A small gadget, especially one that hangs off another device. For details, refer to Dongle.
FireWire: See IEEE 1394.
HDMI: High-Definition Multimedia Interface. Interface for transferring audio and video data. For mobile devices, a micro-HDMI (type D) or MHL connector is used.
IEEE 1394: A serial bus used for real-time low-latency apps such as audio. Also called FireWire. For details, refer to IEEE 1394.
Intel HDA: Intel High Definition Audio (don't confuse with generic high-definition audio or high-resolution audio). Specification for a front-panel connector. For details, refer to Intel High Definition Audio.
interface: An interface converts a signal from one representation to another. Common interfaces include a USB audio interface and MIDI interface.
line level: The strength of an analog audio signal that passes between audio components, not transducers. For details, refer to Line level.
MHL: Mobile High-Definition Link. Mobile audio-video interface, often over micro-USB connector.
phone connector: Mini or sub-mini component that connects a device to wired headphones, headset, or line-level amplifier.
SlimPort: Adapter from micro-USB to HDMI.
S/PDIF: Sony/Philips Digital Interface Format. Interconnect for uncompressed PCM and IEC 61937. For details, refer to S/PDIF. S/PDIF is the consumer grade variant of AES3.
Thunderbolt: Multimedia interface that competes with USB and HDMI for connecting to high-end peripherals. For details, refer to Thunderbolt.
TOSLINK: An optical audio cable used with S/PDIF. For details, refer to TOSLINK.
USB: Universal serial bus. For details, refer to USB.

Intradevice interconnect

Intradevice interconnection technologies connect internal audio components within a given device and aren't visible without disassembling the device. The HAL implementer might need to be aware of these, but not the end user. For details on intradevice interconnections, refer to the following articles:

GPIO
I²C, for control channel
I²S, for audio data, simpler than SLIMbus
McASP
SLIMbus
SPI
AC'97
Intel HDA
SoundWire
TDM

In ALSA system on chip (ASoC), these are collectively called digital audio interfaces (DAIs).

Audio signal path

Audio signal path terms relate to the signal path that audio data follows from an app to the transducer or the transducer to an app.

ADC: Analog-to-digital converter. Module that converts an analog signal (continuous in time and amplitude) to a digital signal (discrete in time and amplitude). Conceptually, an ADC consists of a periodic sample-and-hold followed by a quantizer, although it doesn't have to be implemented that way. An ADC is usually preceded by a low-pass filter to remove any high frequency components that aren't representable using the desired sample rate. For details, refer to Analog-to-digital converter.
AP: App processor. Main general-purpose computer on a mobile device.
codec: Coder-decoder. Module that encodes and decodes an audio signal from one representation to another (typically analog to PCM or PCM to analog). In strict terms, codec is reserved for modules that both encode and decode but can be used loosely to refer to only one of these. For details, refer to Audio codec.
DAC: Digital-to-analog converter. Module that converts a digital signal (discrete in time and amplitude) to an analog signal (continuous in time and amplitude). Often followed by a low-pass filter to remove high-frequency components introduced by digital quantization. For details, refer to Digital-to-analog converter.
DSP: Digital signal processor. Optional component typically located after the app processor (for output) or before the app processor (for input). Primary purpose is to offload the app processor and provide signal processing features at a lower power cost.
PDM: Pulse-density modulation. Form of modulation used to represent an analog signal by a digital signal, where the relative density of 1s versus 0s indicates the signal level. Commonly used by digital-to-analog converters. For details, refer to Pulse-density modulation.
PWM: Pulse-width modulation. Form of modulation used to represent an analog signal by a digital signal, where the relative width of a digital pulse indicates the signal level. Commonly used by analog-to-digital converters. For details, refer to Pulse-width modulation.
transducer: Converts variations in physical real-world quantities to electrical signals. In audio, the physical quantity is sound pressure, and the transducers are the loudspeaker and microphone. For details, refer to Transducer.
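
The PDM definition above says that the density of 1s tracks the signal level. One common way to produce such a bitstream is a first-order delta-sigma modulator, sketched below in Python. This is a minimal illustration under the assumption that input samples lie in the range -1.0 to 1.0. The function names are hypothetical, and real converters typically use higher-order modulators.

```python
def pdm_encode(samples):
    """Encode samples in -1.0..1.0 as a list of bits.

    Uses a first-order delta-sigma modulator: an accumulator integrates
    the input, and each emitted bit feeds back +1.0 or -1.0 so that the
    running average of the bits follows the input level.
    """
    bits = []
    accumulator = 0.0
    for x in samples:
        accumulator += x
        if accumulator >= 0.0:
            bits.append(1)
            accumulator -= 1.0  # feedback for an emitted 1
        else:
            bits.append(0)
            accumulator += 1.0  # feedback for an emitted 0
    return bits


def ones_density(bits):
    """Fraction of bits that are 1."""
    return sum(bits) / len(bits)
```

With this encoder, a constant input of level x settles at a density of about (x + 1) / 2, so silence (0.0) gives an even mix of 1s and 0s, and a full-scale positive input gives all 1s.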

Sample rate conversion

Sample rate conversion terms relate to the process of converting from one sampling rate to another.

downsample: Resample, where sink sample rate < source sample rate.
Nyquist frequency: Maximum frequency component that a discretized signal can represent, equal to 1/2 of the sample rate. For example, the human hearing range extends to approximately 20 kHz, so a digital audio signal must have a sample rate of at least 40 kHz to represent that range. In practice, sample rates of 44.1 kHz and 48 kHz are commonly used, with Nyquist frequencies of 22.05 kHz and 24 kHz respectively. For details, refer to Nyquist frequency and Hearing range.
resampler: Synonym for sample rate converter.
resampling: Process of converting sample rate.
sample rate converter: Module that resamples.
sink: Output of a resampler.
source: Input to a resampler.
upsample: Resample, where sink sample rate > source sample rate.
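
The relationship between source, sink, and the Nyquist frequency can be sketched as follows. This Python example is a deliberately naive illustration using linear interpolation on a mono signal. It isn't how AudioResampler works, the function names are hypothetical, and a production resampler would apply a low-pass filter so that downsampling doesn't alias content above the sink's Nyquist frequency.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest representable frequency: half the sample rate."""
    return sample_rate_hz / 2.0


def resample_linear(source, source_rate, sink_rate):
    """Resample a mono signal by linear interpolation (no anti-alias filter).

    sink_rate > source_rate upsamples; sink_rate < source_rate downsamples.
    """
    if not source:
        return []
    sink_length = int(len(source) * sink_rate / source_rate)
    last = len(source) - 1
    sink = []
    for n in range(sink_length):
        position = n * source_rate / sink_rate  # position in source samples
        i = min(int(position), last)
        frac = position - i
        following = source[min(i + 1, last)]  # hold the last sample at the end
        sink.append(source[i] * (1.0 - frac) + following * frac)
    return sink
```

For instance, resampling the four-sample ramp 0, 1, 2, 3 to twice the rate yields eight samples with the midpoints filled in, and resampling it to half the rate keeps every other sample.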

Telephony

AEC: Acoustic echo cancellation. A means to reduce echo from a signal. For details, see Echo suppression and cancellation.
ANC: Active noise control. A means to improve the quality of a primary signal by actively adding the inverse of an unwanted secondary signal. For details, see Active noise control.
dialer: The app that provides the user interface for telephony.
HCO: Hearing carry over. A TTY mode in which a message is sent as text and received as speech.
sidetone: Audible feedback from the local microphone into the local earpiece. For details, see Sidetone.
TDD: Telecommunications device for the deaf. A specific kind of teletypewriter (TTY) for people with impaired hearing or speech.
TTY: Teletypewriter. Often used interchangeably with TDD.
UE: User equipment. The consumer phone device.
UMTS: Universal Mobile Telecommunications System. A type of mobile cellular system.
VCO: Voice carry over. A TTY mode in which a message is sent as audio and received as text.

Android-specific terms

Android-specific terms include terms used only in the Android audio framework and generic terms that have special meaning within Android.

ALSA

Advanced Linux Sound Architecture. An audio framework for Linux that has also influenced other systems. For a generic definition, refer to ALSA. In Android, ALSA refers to the kernel audio framework and drivers and not to the user-mode API. See also TinyALSA.

audio device

Audio I/O endpoint backed by a HAL implementation.

audio effect, AudioEffect

Implementation framework and class for output (postprocessing) effects and input (preprocessing) effects. The class is defined at android.media.audiofx.AudioEffect.

AudioFlinger

Android sound server implementation. AudioFlinger runs within the mediaserver process. For a generic definition, refer to Sound server.

audio focus

Set of APIs for managing audio interactions across multiple independent apps. For details, see Handling changes in audio output and the focus-related methods and constants of android.media.AudioManager.

AudioMixer

Module in AudioFlinger responsible for combining multiple tracks and applying attenuation (volume) and effects. For a generic definition, refer to Audio mixing (recorded music) (discusses a mixer as a hardware device or software app, rather than a software module within a system).

audio policy

Service responsible for all actions that require a policy decision to be made first, such as opening a new I/O stream, rerouting after a change, and stream volume management.

AudioRecord

Primary low-level client class for receiving data from an audio input device such as a microphone. The data is usually PCM format. The class is defined at android.media.AudioRecord.

AudioResampler

Module in AudioFlinger responsible for sample rate conversion.

audio source, AudioSource

An enumeration of constants that indicates the desired use case for capturing audio input. The class is defined at android.media.MediaRecorder.AudioSource. For API level 21 and higher, audio attributes are preferred.

AudioTrack

Primary low-level client class for sending data to an audio output device such as a speaker. The data is usually in PCM format. The class is defined at android.media.AudioTrack.

audio_utils

Audio utility library for features such as PCM format conversion, WAV file I/O, and nonblocking FIFO, which is largely independent of the Android platform.
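
To make the idea of a nonblocking FIFO concrete, here is a minimal Python sketch of a frame FIFO backed by a circular buffer. It isn't the audio_utils implementation. The class name FrameFifo and its methods are hypothetical, and the sketch is single-threaded, whereas a real audio FIFO must be safe for a reader and a writer on different threads.

```python
class FrameFifo:
    """Fixed-capacity FIFO of audio frames backed by a circular buffer.

    write() and read() never block; they transfer as many frames as
    currently fit and report how much was actually transferred.
    """

    def __init__(self, capacity_frames):
        self._buffer = [None] * capacity_frames
        self._capacity = capacity_frames
        self._read_index = 0
        self._count = 0

    def write(self, frames):
        """Store frames while there is room; return the number stored.

        A short count means the reader is falling behind (overrun risk).
        """
        written = 0
        for frame in frames:
            if self._count == self._capacity:
                break
            write_index = (self._read_index + self._count) % self._capacity
            self._buffer[write_index] = frame
            self._count += 1
            written += 1
        return written

    def read(self, max_frames):
        """Return up to max_frames frames in arrival order.

        Fewer frames than requested means the FIFO ran dry (underrun risk).
        """
        out = []
        while self._count > 0 and len(out) < max_frames:
            out.append(self._buffer[self._read_index])
            self._read_index = (self._read_index + 1) % self._capacity
            self._count -= 1
        return out
```

Returning a short count instead of waiting is what makes the calls nonblocking: the caller decides how to handle a full or empty queue, for example by dropping data or substituting silence.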

client

Usually an app or app client. However, an AudioFlinger client can be a thread running within the mediaserver system process, such as when playing media decoded by a MediaPlayer object.

HAL

Hardware abstraction layer. HAL is a generic term in Android; in audio, it's a layer between AudioFlinger and the kernel device driver with a C API (which replaces the C++ libaudio).

FastCapture

Thread within AudioFlinger that sends audio data to lower-latency fast tracks and drives the input device when configured for reduced latency.

FastMixer

Thread within AudioFlinger that receives and mixes audio data from lower-latency fast tracks and drives the primary output device when configured for reduced latency.

fast track

AudioTrack or AudioRecord client with lower latency but fewer features on some devices and routes.

MediaPlayer

Higher-level client class than AudioTrack. Plays encoded content or content that includes multimedia audio and video tracks. The class is defined at android.media.MediaPlayer.

media.log

AudioFlinger debugging feature available in custom builds only. Used for logging audio events to a circular buffer where the events can then be retroactively dumped when needed.

mediaserver

Android system process that contains media-related services, including AudioFlinger.

NBAIO

Nonblocking audio input and output. Abstraction for AudioFlinger ports. The term can be misleading as some implementations of the NBAIO API support blocking. The key implementations of NBAIO are for different types of pipes.

normal mixer

Thread within AudioFlinger that services most full-featured AudioTrack clients. Directly drives an output device or feeds its sub-mix into FastMixer using a pipe.

OpenSL ES

Audio API standard by The Khronos Group. Android versions with API level 9 and higher support a native audio API that is based on a subset of OpenSL ES 1.0.1.

pro audio

Abbreviation for the feature flag android.hardware.audio.pro. The requirements are documented in section 5.10 Professional Audio of the Android CDD. The pro in feature android.hardware.audio.pro refers to the level of predictable real-time performance, not the intended user.

real time (noun), real-time (adjective)

Real-time computing systems guarantee a response to relevant events within a required time limit. Device implementation support for real-time computing is a necessary, but insufficient, prerequisite for meeting the requirements of the android.hardware.audio.pro feature described in pro audio.

Real-time performance also has benefits in other fields beyond audio, such as gaming, graphics, camera, video, sensor processing, virtual reality (VR), and augmented reality (AR).

silent mode

User-settable feature to mute the phone ringer and notifications without affecting media playback (music, videos, games) or alarms.

SoundPool

Higher-level client class than AudioTrack. Plays sampled audio clips. Useful for triggering things such as UI feedback and game sounds. The class is defined at android.media.SoundPool.

Stagefright

A media playback engine. See Media.

StateQueue

Module within AudioFlinger responsible for synchronizing state among threads. Whereas NBAIO is used to pass data, StateQueue is used to pass control information.

strategy

Group of stream types with similar behavior. Used by the audio policy service.

stream type

Enumeration that expresses a use case for audio output. The audio policy implementation uses the stream type, along with other parameters, to determine volume and routing decisions. For a list of stream types, see android.media.AudioManager.

tee sink

See Audio debugging.

TinyALSA, tinyalsa

TinyALSA is a small, BSD-licensed user-mode API above the ALSA kernel framework. tinyalsa is the name of a package in the TinyALSA library. The library is recommended for HAL implementations.

ToneGenerator

Higher-level client class than AudioTrack. Plays dual-tone multi-frequency (DTMF) signals. For details, refer to Dual-tone multi-frequency signaling and the class definition at android.media.ToneGenerator.

track

Audio stream. Controlled by the AudioTrack or AudioRecord class.

volume attenuation curve

Device-specific mapping from a generic volume index to a specific attenuation factor for a given output.

volume index

Unitless integer that expresses the desired relative volume of a stream. The volume-related API elements of android.media.AudioManager operate in volume indexes rather than absolute attenuation factors.
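
The relationship between a volume index and an attenuation factor can be sketched with a simple curve. The Python below is a hypothetical example, not an actual Android volume curve: it assumes 15 as the maximum index, treats index 0 as silence, and spaces the remaining indexes evenly in decibels down to an assumed floor of -60 dB. Real curves are device-specific, as the volume attenuation curve entry notes.

```python
def index_to_attenuation(index, max_index=15, min_db=-60.0):
    """Map a volume index to a linear attenuation factor in 0.0..1.0.

    Hypothetical curve: index 0 is silent, max_index is unity (0 dB),
    and index 1 sits at min_db with even dB steps in between.
    """
    if not 0 <= index <= max_index:
        raise ValueError("volume index out of range")
    if index == 0:
        return 0.0
    if index == max_index:
        return 1.0
    db = min_db * (max_index - index) / (max_index - 1)
    # Convert decibels to a multiplicative amplitude factor.
    return 10.0 ** (db / 20.0)
```

Spacing the steps in decibels rather than in linear amplitude is a common design choice because perceived loudness is closer to logarithmic, so each index step sounds like a similar change.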
