Quality of service
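Deprecated: Starting in Android 15, the NNAPI (NDK API) is deprecated. The Neural Networks HAL interface continues to be supported. For more information, see the NNAPI Migration Guide.
Starting in Android 11, the NNAPI offers better quality of service (QoS) by letting an app indicate the relative priorities of its models, the maximum amount of time expected for a given model to be prepared, and the maximum amount of time expected for a given execution to complete. Android 11 also introduces additional NNAPI error values that let a service indicate more accurately what went wrong when a failure occurs, so that the client app can react and recover more effectively.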
Priority
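In Android 11 and higher, models are prepared with a priority in NN HAL 1.3. This priority is relative to other prepared models owned by the same app. Higher-priority executions can use more compute resources than lower-priority executions, and can preempt or starve lower-priority executions.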
The NN HAL 1.3 call that includes Priority as an explicit argument is IDevice::prepareModel_1_3. Note that IDevice::prepareModelFromCache_1_3 implicitly includes Priority in the cache arguments.
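There are many possible strategies for supporting priorities, depending on the capabilities of the driver and accelerator. Here are several strategies:
For drivers that have built-in priority support, directly propagate the Priority field to the accelerator.
Use a per-app priority queue to support different priorities even before an execution reaches the accelerator.
Pause or cancel low-priority models that are being executed to free the accelerator to execute high-priority models. Do this either by inserting checkpoints in low-priority models that, when reached, query a flag to determine whether the current execution should be halted prematurely, or by partitioning the model into submodels and querying the flag between submodel executions. Note that the use of checkpoints or submodels in models prepared with a priority can introduce additional overhead that isn't present for models without a priority in versions lower than NN HAL 1.3.
To support preemption, preserve the execution context, including the next operation or submodel to be executed and any relevant intermediate operand data. Use this execution context to resume the execution at a later time.
Full preemption support isn't necessary, so the execution context doesn't need to be preserved. Because NNAPI model executions are deterministic, executions can be restarted from scratch at a later time.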
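One way to pause a low-priority execution is to split the model into submodels and check a halt flag between them. The following is a minimal sketch of that idea, not the driver's actual implementation: RunResult, runWithCheckpoints, and the representation of a submodel as a std::function are all hypothetical names chosen for illustration. The index of the next submodel stands in for the preserved execution context.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type for this sketch.
enum class RunResult { COMPLETED, HALTED };

// Runs the submodels in order, starting at *nextSubmodel, and queries the
// halt flag before each one (the "checkpoint"). On a halt, *nextSubmodel
// records where to resume, which is the minimal execution context needed
// when intermediate operand data is kept elsewhere.
inline RunResult runWithCheckpoints(
        const std::vector<std::function<void()>>& submodels,
        const std::atomic<bool>& haltRequested,
        std::size_t* nextSubmodel) {
    for (std::size_t i = *nextSubmodel; i < submodels.size(); ++i) {
        if (haltRequested.load()) {
            *nextSubmodel = i;  // resume from this submodel later
            return RunResult::HALTED;
        }
        submodels[i]();
    }
    *nextSubmodel = submodels.size();
    return RunResult::COMPLETED;
}
```

A driver that doesn't preserve context could instead reset the index to zero after a halt and rerun the whole model, which is safe because executions are deterministic.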
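Android enables services to differentiate between calling apps through the use of an AID (Android UID). HIDL has a built-in mechanism to retrieve the calling app's UID through the ::android::hardware::IPCThreadState::getCallingUid method. A list of AIDs can be found in libcutils/include/cutils/android_filesystem_config.h.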
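A per-app priority queue keyed by the calling UID could look roughly like the sketch below. It is an illustration under stated assumptions: the Priority enum is a local stand-in rather than the real HAL type, PendingExecution, ExecutionOrder, and PerAppScheduler are hypothetical names, and the UID is passed in as a plain integer instead of being read from IPCThreadState so that the example stays self-contained.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <queue>
#include <vector>

// Local stand-in for the NN HAL 1.3 Priority type (assumed ordering:
// LOW < MEDIUM < HIGH). The real HAL headers are not included here.
enum class Priority : int32_t { LOW = 0, MEDIUM = 1, HIGH = 2 };

// Hypothetical record describing one queued execution request.
struct PendingExecution {
    uint64_t sequence;  // arrival order, used to break ties (FIFO)
    Priority priority;  // priority the model was prepared with
    int modelId;        // hypothetical handle to a prepared model
};

// Puts the highest priority on top; within one priority level, the oldest
// request is served first.
struct ExecutionOrder {
    bool operator()(const PendingExecution& a, const PendingExecution& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.sequence > b.sequence;
    }
};

// Keeps one queue per calling UID, because a priority is only meaningful
// relative to other models owned by the same app.
class PerAppScheduler {
  public:
    void enqueue(uint32_t callingUid, Priority priority, int modelId) {
        mQueues[callingUid].push({mNextSequence++, priority, modelId});
    }

    // Returns the next execution for the given app, or nullopt if none.
    std::optional<PendingExecution> dequeue(uint32_t callingUid) {
        auto it = mQueues.find(callingUid);
        if (it == mQueues.end() || it->second.empty()) return std::nullopt;
        PendingExecution next = it->second.top();
        it->second.pop();
        return next;
    }

  private:
    using Queue = std::priority_queue<PendingExecution,
                                      std::vector<PendingExecution>,
                                      ExecutionOrder>;
    std::map<uint32_t, Queue> mQueues;
    uint64_t mNextSequence = 0;
};
```

How the driver arbitrates between the queues of different apps is left open here; the sketch only orders work within one app.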
Deadlines
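Starting in Android 11, model preparation and executions can be launched with an OptionalTimePoint deadline argument. For drivers that can estimate how long a task takes, the deadline lets the driver abort the task before it starts if the driver estimates that the task can't be completed before the deadline. Similarly, the deadline lets the driver abort an ongoing task that it estimates won't be completed before the deadline. The deadline argument doesn't force a driver to abort a task that isn't complete by the deadline or whose deadline has already passed. Instead, the driver can use the deadline to free up compute resources and return control to the app faster than is possible without a deadline.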
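The deadline check described above can be sketched as a single predicate that a driver evaluates before starting a task and again at any point where it can interrupt one. This is a hedged illustration: Deadline is a std::optional stand-in for OptionalTimePoint rather than the HAL type, and wouldMissDeadline and its estimate parameter are hypothetical.

```cpp
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;

// Stand-in for OptionalTimePoint: an empty value means "no deadline".
using Deadline = std::optional<Clock::time_point>;

// Returns true if the driver predicts that the task cannot finish in time.
// estimatedRemaining is the driver's own estimate of the work left; before
// the task starts it is the estimate for the whole task.
inline bool wouldMissDeadline(const Deadline& deadline,
                              Clock::time_point now,
                              Clock::duration estimatedRemaining) {
    if (!deadline.has_value()) return false;  // nothing to miss
    if (now >= *deadline) return true;        // deadline already passed
    return now + estimatedRemaining > *deadline;
}
```

Because aborting is optional, a driver that can't estimate task duration can skip this check and run every task to completion.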
The following NN HAL 1.3 calls take an OptionalTimePoint deadline as an argument:
IDevice::prepareModel_1_3
IDevice::prepareModelFromCache_1_3
IPreparedModel::execute_1_3
IPreparedModel::executeSynchronously_1_3
IPreparedModel::executeFenced
For a reference implementation of the deadline feature for each of these methods, see the NNAPI sample driver at frameworks/ml/nn/driver/sample/SampleDriver.cpp.
Error codes
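Android 11 adds four error code values to NN HAL 1.3 to improve error reporting, letting drivers communicate their state more precisely and letting apps recover more gracefully. These are the new error code values in ErrorStatus: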
MISSED_DEADLINE_TRANSIENT
MISSED_DEADLINE_PERSISTENT
RESOURCE_EXHAUSTED_TRANSIENT
RESOURCE_EXHAUSTED_PERSISTENT
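In Android 10 and lower, a driver could only indicate a failure through the GENERAL_FAILURE error code. Starting in Android 11, the two MISSED_DEADLINE error codes indicate that the workload was aborted because the deadline was reached or because the driver predicted that the workload wouldn't complete by the deadline. The two RESOURCE_EXHAUSTED error codes indicate that the task failed because of a resource limitation within the driver, such as the driver not having enough memory for the call.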
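The TRANSIENT version of both errors indicates that the problem is temporary and that a future call for the same task might succeed after a short delay. For example, return this error code when the driver is busy with prior long-running or resource-intensive work, and the new task would complete successfully if the driver weren't busy with that work. The PERSISTENT version of both errors indicates that future calls for the same task are always expected to fail. For example, return this error code when the driver estimates that the task wouldn't complete by the deadline even under perfect conditions, or when the model is inherently too large and exceeds the driver's resources.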
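The TRANSIENT and PERSISTENT distinction can be illustrated with a small classification sketch. Everything here is an assumption made for illustration: the ErrorStatus enum is a local stand-in that lists only the values the sketch needs, and classifyMemoryRequest, classifyDeadline, and isWorthRetrying are hypothetical helpers, not part of the HAL.

```cpp
#include <chrono>
#include <cstddef>

// Local stand-in for the NN HAL 1.3 ErrorStatus type. It does not
// reproduce the real HAL definition or its numeric values.
enum class ErrorStatus {
    NONE,
    GENERAL_FAILURE,
    MISSED_DEADLINE_TRANSIENT,
    MISSED_DEADLINE_PERSISTENT,
    RESOURCE_EXHAUSTED_TRANSIENT,
    RESOURCE_EXHAUSTED_PERSISTENT,
};

// Driver side: compare a memory request against the memory free right now
// and against the total the driver could ever provide.
inline ErrorStatus classifyMemoryRequest(std::size_t requiredBytes,
                                         std::size_t freeBytes,
                                         std::size_t totalBytes) {
    if (requiredBytes <= freeBytes) return ErrorStatus::NONE;
    if (requiredBytes > totalBytes) {
        return ErrorStatus::RESOURCE_EXHAUSTED_PERSISTENT;  // can never fit
    }
    return ErrorStatus::RESOURCE_EXHAUSTED_TRANSIENT;  // fits once memory frees up
}

// Driver side: compare the time budget against the estimate under current
// load and the estimate on an otherwise idle device.
inline ErrorStatus classifyDeadline(std::chrono::nanoseconds timeBudget,
                                    std::chrono::nanoseconds estimateUnderLoad,
                                    std::chrono::nanoseconds estimateWhenIdle) {
    if (estimateUnderLoad <= timeBudget) return ErrorStatus::NONE;
    if (estimateWhenIdle > timeBudget) {
        return ErrorStatus::MISSED_DEADLINE_PERSISTENT;  // too slow even when idle
    }
    return ErrorStatus::MISSED_DEADLINE_TRANSIENT;  // only current load is the problem
}

// Client side: only transient failures are worth retrying after a delay.
inline bool isWorthRetrying(ErrorStatus status) {
    return status == ErrorStatus::MISSED_DEADLINE_TRANSIENT ||
           status == ErrorStatus::RESOURCE_EXHAUSTED_TRANSIENT;
}
```

A client might retry a transient failure after a short delay and, for a persistent one, fall back to a smaller model or to another device.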
Validation
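The quality of service functionality is tested in the NNAPI VTS tests (VtsHalNeuralnetworksV1_3Target). This includes a set of validation tests (TestGenerated/ValidationTest#Test/) that ensure the driver rejects invalid priorities, and a set of tests called DeadlineTest (TestGenerated/DeadlineTest#Test/) that ensure the driver handles deadlines correctly.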
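To pass the validation tests, a driver has to reject priority values outside the defined set. A minimal sketch of such a check follows. It assumes that Priority has exactly the values LOW, MEDIUM, and HIGH with the numeric encoding shown; the enum is a local stand-in, and isValidPriority is a hypothetical helper rather than a HAL function.

```cpp
#include <cstdint>

// Local stand-in for the NN HAL 1.3 Priority type. The numeric values are
// an assumption made for this sketch.
enum class Priority : int32_t { LOW = 0, MEDIUM = 1, HIGH = 2 };

// Returns true only for raw values that map to a defined Priority.
inline bool isValidPriority(int32_t raw) {
    switch (static_cast<Priority>(raw)) {
        case Priority::LOW:
        case Priority::MEDIUM:
        case Priority::HIGH:
            return true;
    }
    return false;  // anything else must be rejected by the driver
}
```

A driver would run a check like this at the start of model preparation and return an error without doing any further work when it fails.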
Last updated 2025-07-27 UTC.