NNAPI driver implementation best practices
This page describes best practices for implementing Neural Networks API (NNAPI)
drivers to allow for broad adoption of the NNAPI by app developers.
Keep startup times short
If your driver transforms the weights of a model on first use, make sure the
driver supports compilation caching, which reduces the time spent on compilation
when an app starts. This matters because apps might avoid hardware
acceleration if startup times are too long. For example, some apps have
more than 100 MB of weights, and transforming them on every app
launch is wasteful.
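The idea behind compilation caching can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the NN HAL caching interface: `transformWeights`, `loadOrTransform`, the `.bin` file naming, and the string token are all hypothetical names chosen for the example, and the "transformation" is a placeholder byte reversal.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for an expensive, device-specific weight transformation.
// Here it only reverses the bytes so that the result is easy to check.
std::vector<uint8_t> transformWeights(const std::vector<uint8_t>& raw) {
    return std::vector<uint8_t>(raw.rbegin(), raw.rend());
}

// Returns the transformed weights for `token`, reusing a cached copy from
// `cacheDir` when one exists. `cacheHit` reports whether the transformation
// was skipped. The token and file layout are assumptions of this sketch.
std::vector<uint8_t> loadOrTransform(const fs::path& cacheDir,
                                     const std::string& token,
                                     const std::vector<uint8_t>& raw,
                                     bool* cacheHit) {
    assert(!token.empty());
    const fs::path entry = cacheDir / (token + ".bin");

    if (fs::exists(entry)) {
        // Cache hit: read the previously transformed bytes back from disk.
        std::ifstream in(entry, std::ios::binary);
        *cacheHit = true;
        return std::vector<uint8_t>(std::istreambuf_iterator<char>(in),
                                    std::istreambuf_iterator<char>());
    }

    // Cache miss: pay the transformation cost once and persist the result.
    std::vector<uint8_t> transformed = transformWeights(raw);
    fs::create_directories(cacheDir);
    std::ofstream out(entry, std::ios::binary);
    out.write(reinterpret_cast<const char*>(transformed.data()),
              static_cast<std::streamsize>(transformed.size()));
    *cacheHit = false;
    return transformed;
}
```

On the first call for a given token the weights are transformed and written to the cache directory. Later calls with the same token read the stored result and skip the transformation, which is where the startup-time saving comes from.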
Reduce minimal latency
To ensure that models use hardware acceleration, it's important to reduce the
minimal latency in drivers. Many apps use small models that are executed
multiple times. If the minimal latency to execute a workload is too high,
such as a few milliseconds, models might run on the CPU, which takes only
one or two milliseconds, instead of
using hardware acceleration. Be careful of costly thread synchronization.
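One way to reason about minimal latency is to time a trivial workload many times and keep the smallest observed duration. The sketch below assumes that approach. `measureMinLatencyUs` and `worthOffloading` are hypothetical helpers written for this example, not part of NNAPI or the NN HAL, and the comparison rule is a simplification of the trade-off described above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <limits>

// Runs `workload` `iterations` times and returns the smallest observed
// wall-clock duration in microseconds. The minimum approximates the fixed
// per-invocation overhead, because noise only ever adds time.
int64_t measureMinLatencyUs(const std::function<void()>& workload, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;
    int64_t best = std::numeric_limits<int64_t>::max();
    for (int i = 0; i < iterations; ++i) {
        const auto start = Clock::now();
        workload();
        const auto elapsed = Clock::now() - start;
        const int64_t us =
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
        best = std::min(best, us);
    }
    return best;
}

// Hypothetical policy check: offloading a small model only pays off when the
// driver's minimal latency is below the time the CPU needs for the same work.
bool worthOffloading(int64_t driverMinLatencyUs, int64_t cpuLatencyUs) {
    return driverMinLatencyUs < cpuLatencyUs;
}
```

With the figures used in the text, a driver whose minimal latency is a few milliseconds loses to a CPU path that takes one or two milliseconds, so `worthOffloading(3000, 1500)` is false while `worthOffloading(500, 1500)` is true.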
Use the NN HAL SchedTune group
In Android 11 or higher, AOSP includes a dedicated NN HAL SchedTune
group that allows interprocess NN HAL processes to use big
cores, similar to a same-process implementation within the predefined
top-app cgroup. Using this
SchedTune group reduces driver overhead, especially for small models.
To use the SchedTune group, add the following line to the init.rc file of
the NN HAL process:
    writepid /dev/stune/nnapi-hal/tasks
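For readers unfamiliar with `writepid`, the init command writes the PID of the forked service into the listed files. The C++ sketch below shows roughly the same effect done from inside a process, under the assumption that the tasks file is writable by the caller. `joinTaskGroup` is a hypothetical helper for illustration. A real NN HAL service would normally rely on the `init.rc` line above instead.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <unistd.h>  // getpid()

// Writes the calling process's PID to a cgroup-style `tasks` file, which is
// approximately what init's `writepid` does for a service it starts.
// Returns true when the write succeeded.
bool joinTaskGroup(const std::string& tasksPath) {
    assert(!tasksPath.empty());
    std::ofstream tasks(tasksPath);
    if (!tasks.is_open()) {
        return false;  // Missing group or insufficient permissions.
    }
    tasks << getpid() << '\n';
    tasks.flush();
    return tasks.good();
}
```

On a device, the path would be the one from the text, `/dev/stune/nnapi-hal/tasks`. The call returns false if the group does not exist or the process is not allowed to write to it.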
Last updated 2025-06-12 UTC.
Deprecated: Starting in Android 15, the NNAPI (NDK API) is deprecated. The Neural Networks HAL interface continues to be supported. For more information, see the NNAPI Migration Guide (https://developer.android.com/ndk/guides/neuralnetworks/migration-guide).