Avoid priority inversion
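This article explains how the Android audio system attempts to avoid priority inversion and highlights techniques that you can use too.
These techniques may be useful to developers of high-performance audio apps, original equipment manufacturers (OEMs), and SoC providers who are implementing an audio HAL. Note that implementing these techniques is not guaranteed to prevent glitches or other failures, particularly if they are used outside of the audio context. Your results may vary, and you should conduct your own evaluation and testing.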
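Background
The Android AudioFlinger audio server and the AudioTrack/AudioRecord client implementation are being re-architected to reduce latency. This work started in Android 4.1 and continued with further improvements in 4.2, 4.3, 4.4, and 5.0.
Achieving this lower latency required many changes throughout the system. One important change is to assign CPU resources to time-critical threads with a more predictable scheduling policy. Reliable scheduling allows the audio buffer sizes and counts to be reduced while still avoiding underruns and overruns.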
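Priority inversion
Priority inversion is a classic failure mode of real-time systems, where a higher-priority task is blocked for an unbounded time waiting for a lower-priority task to release a resource, such as shared state protected by a mutex.
In an audio system, priority inversion typically manifests as a glitch (click, pop, dropout), as repeated audio when circular buffers are used, or as a delay in responding to a command.
A common workaround for priority inversion is to increase audio buffer sizes. However, this method increases latency and merely hides the problem instead of solving it. It is better to understand and prevent priority inversion, as described below.
In the Android audio implementation, priority inversion is most likely to occur in the following places, so this is where you should focus your attention: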
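- Between the normal mixer thread and the fast mixer thread in AudioFlinger
- Between the application callback thread for a fast AudioTrack and the fast mixer thread (both have elevated priority, but slightly different priorities)
- Between the application callback thread for a fast AudioRecord and the fast capture thread (similar to the previous case)
- Within the audio Hardware Abstraction Layer (HAL) implementation, for example for telephony or echo cancellation
- Within the audio driver in the kernel
- Between the AudioTrack or AudioRecord callback thread and other app threads (this is out of our control)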
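Common solutions
Typical solutions include:
- Disabling interrupts
- Priority inheritance mutexes
Disabling interrupts is not feasible in Linux user space and does not work for symmetric multiprocessors (SMP).
Priority inheritance futexes (fast user-space mutexes) are not used in the audio system because they are relatively heavyweight and because they rely on a trusted client.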
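Techniques used by Android
Experiments started with "try lock" and lock with timeout. These are the non-blocking and bounded-blocking variants of the mutex lock operation. Try lock and lock with timeout worked fairly well but were susceptible to a couple of obscure failure modes: the server was not guaranteed to be able to access the shared state if the client happened to be busy, and the cumulative timeout could be too long if there was a long sequence of unrelated locks that all timed out.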
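We also use atomic operations such as:
- Increment
- Bitwise "or"
- Bitwise "and"
All of these return the previous value and include the necessary SMP barriers. The disadvantage is that they can require unbounded retries. In practice, we've found that the retries are not a problem.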
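Note: Atomic operations and their interactions with memory barriers are notoriously misunderstood and misused. We include these methods here for completeness, but recommend that you also read the article SMP Primer for Android for further information.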
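As a rough illustration of the three atomic operations listed above, the following sketch uses C++11 `std::atomic`, which is an assumption here rather than the primitives the platform itself uses. The `Counters` type, the flag constants, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical flag bits, for illustration only.
constexpr uint32_t kFlagUnderrun = 1u << 0;
constexpr uint32_t kFlagFlush = 1u << 1;

struct Counters {
    std::atomic<uint32_t> framesWritten{0};
    std::atomic<uint32_t> flags{0};
};

// Increment: adds n and returns the value held before the addition.
uint32_t addFrames(Counters& c, uint32_t n) {
    return c.framesWritten.fetch_add(n, std::memory_order_acq_rel);
}

// Bitwise "or": sets a flag and returns the flags held before the update.
uint32_t raiseFlag(Counters& c, uint32_t flag) {
    return c.flags.fetch_or(flag, std::memory_order_acq_rel);
}

// Bitwise "and": clears a flag and returns the flags held before the update.
uint32_t clearFlag(Counters& c, uint32_t flag) {
    return c.flags.fetch_and(~flag, std::memory_order_acq_rel);
}
```

Each call returns the previous value, matching the description in the text. On architectures that implement read-modify-write with load-linked/store-conditional instructions, the compiler typically emits a retry loop for these calls, which is where the theoretically unbounded retries come from.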
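We still have and use most of the above tools, and have recently added these techniques:
- Use non-blocking single-reader single-writer FIFO queues for data.
- Try to copy state rather than share state between high- and low-priority modules.
- When state does need to be shared, limit the state to the maximum-size word that can be accessed atomically in one bus operation without retries.
- For complex multi-word state, use a state queue. A state queue is basically just a non-blocking single-reader single-writer FIFO queue used for state rather than data, except that the writer collapses adjacent pushes into a single push.
- Pay attention to memory barriers for SMP correctness.
- Trust, but verify. When sharing state between processes, don't assume that the state is well-formed. For example, check that indices are within bounds. This verification isn't needed between threads in the same process or between mutually trusting processes (which typically have the same UID). It's also unnecessary for shared data, such as PCM audio, where a corruption is inconsequential.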
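Two of the techniques in the list above, limiting shared state to a single atomically accessible word and verifying it before use, can be combined in a short sketch. Everything here is an assumption made for illustration: the bit layout, the `TrackState` type, and the function names are hypothetical and do not describe AudioFlinger's actual control block.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: low 16 bits hold a buffer index, bit 16 is a mute flag.
constexpr uint32_t kIndexMask = 0xFFFFu;
constexpr uint32_t kMuteBit = 1u << 16;

struct TrackState {
    uint32_t bufferIndex;
    bool muted;
};

// Packs the whole state into one 32-bit word.
inline uint32_t pack(const TrackState& s) {
    return (s.bufferIndex & kIndexMask) | (s.muted ? kMuteBit : 0u);
}

// Writer side: publishes the complete state with a single atomic store.
inline void publish(std::atomic<uint32_t>& word, const TrackState& s) {
    word.store(pack(s), std::memory_order_release);
}

// Reader side: one atomic load, then work on a private copy.
// Returns false and leaves `out` unchanged if the index is out of range.
inline bool snapshot(const std::atomic<uint32_t>& word, size_t bufferCount,
                     TrackState& out) {
    const uint32_t raw = word.load(std::memory_order_acquire);
    const uint32_t index = raw & kIndexMask;
    if (index >= bufferCount) {
        return false;  // do not trust a malformed word from another process
    }
    out.bufferIndex = index;
    out.muted = (raw & kMuteBit) != 0;
    return true;
}
```

Because the reader copies the word once and then validates the copy, a misbehaving writer cannot change the value between the check and the use. Whether a 32-bit atomic is actually lock-free depends on the target; `std::atomic<uint32_t>::is_lock_free()` can confirm this at run time.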
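Non-blocking algorithms
Non-blocking algorithms have been a subject of much recent study. But with the exception of single-reader single-writer FIFO queues, we've found them to be complex and error-prone.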
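To show why the single-reader single-writer FIFO is the manageable case, here is a minimal sketch of one. It is not the platform's implementation: the `SpscFifo` class, its member names, and the choice of 16-bit samples are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Minimal single-reader single-writer FIFO of 16-bit samples (illustrative).
// Capacity must be a power of two so that indices can wrap with a mask.
template <size_t Capacity>
class SpscFifo {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    static_assert(Capacity <= (size_t{1} << 31), "Capacity too large");

public:
    // Called only by the writer thread. Returns the number of samples stored.
    size_t write(const int16_t* src, size_t count) {
        const uint32_t rear = mRear.load(std::memory_order_relaxed);
        const uint32_t front = mFront.load(std::memory_order_acquire);
        const size_t space = Capacity - static_cast<size_t>(rear - front);
        if (count > space) count = space;
        for (size_t i = 0; i < count; ++i) {
            mBuffer[(rear + i) & (Capacity - 1)] = src[i];
        }
        // Release: the samples become visible before the new rear index.
        mRear.store(rear + static_cast<uint32_t>(count), std::memory_order_release);
        return count;
    }

    // Called only by the reader thread. Returns the number of samples copied.
    size_t read(int16_t* dst, size_t count) {
        const uint32_t front = mFront.load(std::memory_order_relaxed);
        const uint32_t rear = mRear.load(std::memory_order_acquire);
        const size_t avail = static_cast<size_t>(rear - front);
        if (count > avail) count = avail;
        for (size_t i = 0; i < count; ++i) {
            dst[i] = mBuffer[(front + i) & (Capacity - 1)];
        }
        // Release: the slots are free for the writer only after the copy.
        mFront.store(front + static_cast<uint32_t>(count), std::memory_order_release);
        return count;
    }

private:
    int16_t mBuffer[Capacity];
    std::atomic<uint32_t> mFront{0};  // advanced only by the reader
    std::atomic<uint32_t> mRear{0};   // advanced only by the writer
};
```

Each index has exactly one writer, so neither side needs a lock or a retry loop: `write` and `read` always finish in bounded time and simply transfer fewer samples when the queue is full or empty. The sketch assumes both threads are in the same process; across processes, the indices would also need the bounds checks described earlier.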
Starting in Android 4.2, you can find our non-blocking single-reader/writer classes in these locations:
- frameworks/av/include/media/nbaio/
- frameworks/av/media/libnbaio/
- frameworks/av/services/audioflinger/StateQueue*
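These classes were designed specifically for AudioFlinger and are not general-purpose. Non-blocking algorithms are notorious for being difficult to debug. You can look at this code as a model, but be aware that there may be bugs, and the classes are not guaranteed to be suitable for other purposes.
For developers, some of the sample OpenSL ES application code should be updated to use non-blocking algorithms or to reference a non-Android open source library.
We have published an example non-blocking FIFO implementation that is specifically designed for application code. See these files located in the platform source directory system/media/audio_utils:
- include/audio_utils/fifo.h
- fifo.c
- include/audio_utils/roundup.h
- roundup.c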
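Tools
To the best of our knowledge, there are no automatic tools for finding priority inversion, especially before it happens. Some research static code analysis tools are capable of finding priority inversions if they can access the entire codebase. Of course, if arbitrary user code is involved (as it is here for the application) or the codebase is large (as for the Linux kernel and device drivers), static analysis may be impractical. The most important thing is to read the code very carefully and get a good grasp of the entire system and its interactions. Tools such as systrace and ps -t -p are useful for seeing priority inversion after it occurs, but they do not warn you in advance.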
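A final word
After all of this discussion, don't be afraid of mutexes. Mutexes are your friend when used and implemented correctly in ordinary, non-time-critical use cases. But between high- and low-priority tasks, and in time-sensitive systems, mutexes are more likely to cause trouble.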
Last updated 2025-03-26 UTC.