Thermal mitigation is needed when a device starts to overheat. As algorithm complexity, system core frequencies, and levels of integration continually increase, with packaging and form-factor sizes decreasing, thermal mitigation has become increasingly important.
Android 10 introduced a thermal system in the Android framework and a new version of the HAL that abstracts the interface to the thermal subsystem hardware devices. The hardware interface includes temperature sensors and thermistors for the skin, battery, GPU, CPU, and USB port. The device skin temperature is the most important one to track to keep the device surface temperature within specified thermal limits.
With the Android framework, device manufacturers and app developers can use thermal data to ensure a consistent UX if a device begins to overheat. For example, when a system undergoes thermal stress, jobscheduler jobs get throttled, and if necessary, a framework thermal shutdown is initiated. Apps that receive thermal-stress notifications through a registered callback (in the PowerManager class) can gracefully adjust their UX.
Thermal HAL
Android 9 and lower used a polling interface defined in Thermal HAL 1.0 to obtain temperature readings. The legacy Thermal HAL allowed the Android framework and other trusted clients (such as a device manufacturer's HAL) to read the current temperature and product-policy specific throttling and shutdown thresholds for each sensor through the same API.
Thermal HAL 2.0, introduced in Android 10, provides multiple clients with thermal sensor readings and associated severity levels to indicate thermal stress. Figure 1 shows two warning messages from the Android System UI, created based on IThermalEventListener for the USB_PORT and SKIN sensors, respectively, when they reach the EMERGENCY severity level.

The current temperatures are retrieved for the different types of thermal sensors through the IThermal HAL. Each function call returns a status value of either SUCCESS or FAILURE. If SUCCESS is returned, the process continues. If FAILURE is returned, an error message (which must be human-readable) goes to status.debugMessage.
In addition to the polling interface that returns the current temperatures, thermal HAL clients, such as the framework’s thermal service, can use the HIDL callback interface IThermalChangedCallback. For example, registerThermalChangedCallback and unregisterThermalChangedCallback register and unregister a callback for severity-changed events. If the thermal severity of a given sensor has changed, notifyThrottling sends a thermal throttling event callback to thermal-event listeners.
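The notify-on-change behavior described above can be modeled in plain Java. This is only a sketch of the pattern, not the HIDL interface itself: ThrottlingCallback, SeverityNotifier, and their method signatures are hypothetical names chosen for illustration, and severities are represented as bare integers where 0 is assumed to mean no throttling.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a thermal-changed callback; not the real HIDL interface.
interface ThrottlingCallback {
    void notifyThrottling(String sensorName, int severity);
}

// Hypothetical model of a registry that notifies clients only when a sensor's severity changes.
class SeverityNotifier {
    private final List<ThrottlingCallback> callbacks = new ArrayList<>();
    private final Map<String, Integer> lastSeverity = new HashMap<>();

    // Registers a callback once; repeated registrations are ignored.
    void register(ThrottlingCallback callback) {
        if (!callbacks.contains(callback)) {
            callbacks.add(callback);
        }
    }

    void unregister(ThrottlingCallback callback) {
        callbacks.remove(callback);
    }

    // Records a sensor's severity. Returns true if it changed, in which case
    // every registered callback has been notified.
    boolean onSensorSeverity(String sensorName, int severity) {
        Integer previous = lastSeverity.put(sensorName, severity);
        int previousSeverity = (previous == null) ? 0 : previous; // unseen sensors start at 0
        if (previousSeverity == severity) {
            return false;
        }
        // Iterate over a copy so a callback may unregister itself safely.
        for (ThrottlingCallback callback : new ArrayList<>(callbacks)) {
            callback.notifyThrottling(sensorName, severity);
        }
        return true;
    }
}
```

A client would register a callback, then receive one event per severity transition rather than one per temperature sample, which keeps listeners quiet while a sensor stays at the same level.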
In addition to thermal sensor information, a list of the cooling devices that have undergone mitigation is exposed in getCurrentCoolingDevices. The list order is persistent, even if a cooling device has gone offline. Device manufacturers can use it to collect statsd metrics.
For more information, see the Reference implementation. While you may add your own extensions, you must never disable the thermal mitigation function.
Thermal service
In Android 10, the thermal service in the framework provides constant monitoring using the various mitigation signals from Thermal HAL 2.0, and gives throttling severity feedback to its clients. These include internal components and Android apps. The service uses two binder callback interfaces, IThermalEventListener and IThermalStatusListener. The former is for internal platform and device manufacturer use, and the latter is for Android apps.
Through the callback interfaces, a device’s
current thermal status is retrievable as an integer value
ranging from 0x00000000
(no throttling) to 0x00000006
(device shutdown). Only a trusted system service, such as an
Android API or device manufacturer API, can access the detailed
thermal sensor and thermal event information. Figure 2
provides a model of the thermal mitigation process flow in
Android 10.

Device manufacturer guidelines
Device manufacturers must implement the HIDL aspect of Thermal HAL 2.0 (as provided in IThermal.hal) to report device temperature sensor and throttling status.
If you’re an app developer, use the reported thermal status to enhance app UX under thermal stress.
Anything that throttles device performance, including battery power constraints, must be reported through the thermal HAL. To ensure this happens, put all sensors that may indicate a need for mitigation (based on status changes) into the thermal HAL, and report the severity of any mitigation actions taken.
The temperature value returned from a sensor reading doesn’t have to be the actual temperature, so long as it accurately reflects the corresponding severity threshold. For example, you may pass different numerical values instead of your actual temperature threshold values, or you may build guardbanding into threshold specifications to provide hysteresis. However, the severity corresponding to that value must match what’s needed at that threshold. (For example, you may report 72°C as your critical temperature threshold when the actual temperature at that point is 65°C, as long as the reported value corresponds to the critical severity you specified.) The severity level must always be accurate for best thermal framework functionality.
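One way to picture guardbanding for hysteresis is a classifier that enters a severity level at a rising threshold but leaves it only after the reading falls a fixed margin below that threshold. The following is a minimal sketch under stated assumptions: HysteresisClassifier is a hypothetical name, the thresholds are assumed to be sorted in ascending order, and the threshold and guardband values used with it are illustrative rather than taken from any device policy.

```java
// Hypothetical severity classifier with hysteresis; thresholds and guardband are example inputs.
class HysteresisClassifier {
    // Entry i is the temperature at which severity i + 1 is entered (ascending order assumed).
    private final float[] risingThresholdsC;
    // How far below a threshold the reading must fall before that level is cleared.
    private final float guardbandC;
    // Current severity; 0 means no throttling.
    private int severity = 0;

    HysteresisClassifier(float[] risingThresholdsC, float guardbandC) {
        this.risingThresholdsC = risingThresholdsC.clone();
        this.guardbandC = guardbandC;
    }

    // Feeds one temperature reading and returns the resulting severity.
    int update(float temperatureC) {
        // Escalate while the reading meets the next level's rising threshold.
        while (severity < risingThresholdsC.length
                && temperatureC >= risingThresholdsC[severity]) {
            severity++;
        }
        // De-escalate only once the reading is below the current level's
        // threshold minus the guardband.
        while (severity > 0
                && temperatureC < risingThresholdsC[severity - 1] - guardbandC) {
            severity--;
        }
        return severity;
    }
}
```

With thresholds of 40, 45, and 50 and a guardband of 2, a reading of 41 enters level 1, a later reading of 39 stays at level 1, and the level clears only once the reading drops below 38. This avoids rapid toggling when a temperature hovers around a threshold.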
To read more about the threshold levels in the framework and how they correspond to mitigation actions, see the usage suggestions for each thermal status code.
Thermal API
Apps can add and remove listeners, and access thermal status information
through the
PowerManager
class. The IThermal
interface provides all
the functionality needed, including returning the thermal status values. The
IThermal binder interface
is wrapped as the OnThermalStatusChangedListener
interface, which apps can use when registering or removing thermal status
listeners.
The Android thermal APIs have both callback and polling methods for apps to be notified of the thermal severity levels through status codes, which are defined in the PowerManager class. The methods are:
- getCurrentThermalStatus() returns the current thermal status of the device as an integer, unless the device is undergoing throttling.
- addThermalStatusListener() adds a listener.
- removeThermalStatusListener() removes a previously added listener.
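To show how an app might act on the integer status it receives, here is a plain-Java sketch that runs off-device. On Android the integer would come from getCurrentThermalStatus() or a registered listener; the ThermalStatus enum and AppThermalPolicy class below are hypothetical mirrors written for illustration, and the workload labels are assumptions, not platform behavior.

```java
// Plain-Java mirror of the thermal status codes 0x0 through 0x6 used in this document.
enum ThermalStatus {
    NONE(0), LIGHT(1), MODERATE(2), SEVERE(3), CRITICAL(4), EMERGENCY(5), SHUTDOWN(6);

    final int code;

    ThermalStatus(int code) {
        this.code = code;
    }

    // Converts the integer an app receives into a named status.
    static ThermalStatus fromCode(int code) {
        for (ThermalStatus status : values()) {
            if (status.code == code) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown thermal status: " + code);
    }
}

// Hypothetical app-side policy: which workload level to run at each status.
class AppThermalPolicy {
    static String workloadFor(ThermalStatus status) {
        switch (status) {
            case NONE:
            case LIGHT:
                return "full";       // UX is not impacted
            case MODERATE:
                return "reduced";    // cut power use in foreground work
            case SEVERE:
            case CRITICAL:
                return "minimal";    // keep only essential work running
            default:
                return "save-state"; // EMERGENCY or SHUTDOWN: persist user data and stop
        }
    }
}
```

In an app, the same policy call could sit inside the listener callback, so that each status change selects a workload level in one place.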
The status codes translate to specific throttling levels, which can be used for gathering data, and for designing an optimal UX. For example, apps may receive a status of 0x0 (NONE), which may later change to 0x1 (LIGHT). Marking the 0x0 state as t0, then measuring the time elapsed from status NONE to status LIGHT (t1) enables device manufacturers to design and test mitigation strategies for specific use cases. You may want to use the thermal status codes in the ways suggested below.
Thermal status code | Description and suggested use |
---|---|
NONE (0x0) | No throttling. Use this status to implement protective actions, such as detecting the start of the time period (t0 to t1) from NONE (0x0) to LIGHT (0x1). |
LIGHT (0x1) | Light throttling. UX isn't impacted. Use gentle device mitigation for this stage. For example, skip boosting or using inefficient frequencies, but only on big cores. |
MODERATE (0x2) | Moderate throttling. UX isn't greatly impacted. Thermal mitigation impacts foreground activities, so apps should reduce power immediately. |
SEVERE (0x3) | Severe throttling. UX is largely impacted. In this stage, device thermal mitigation should limit the system capacity. This may cause side effects, such as display jank and audio jitter. |
CRITICAL (0x4) | Platform has done everything to reduce power. The device thermal mitigation software has placed all components to run at their lowest capacity. |
EMERGENCY (0x5) | Key components in the platform are shutting down due to thermal conditions. Device functionalities are limited. This is the last warning before device shutdown. At this stage, some functions, such as the modem and cellular data, are turned off completely. |
SHUTDOWN (0x6) | Shutdown immediately. Due to the severity of this stage, apps may not be able to receive this notification. |
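The t0-to-t1 measurement mentioned for NONE (0x0) and LIGHT (0x1) can be sketched as a small tracker fed with status changes. NoneToLightTimer is a hypothetical class, and the sketch assumes the caller supplies a monotonic timestamp in milliseconds with every status change.

```java
// Hypothetical tracker for the t0-to-t1 interval: time spent at NONE before LIGHT is reported.
class NoneToLightTimer {
    private static final int NONE = 0x0;
    private static final int LIGHT = 0x1;

    private long t0Millis = -1;           // when the current NONE period began; -1 if not in NONE
    private long lastIntervalMillis = -1; // most recent NONE-to-LIGHT duration; -1 if none yet

    // Feed every status change together with a monotonic timestamp in milliseconds.
    void onStatusChanged(int status, long nowMillis) {
        if (status == NONE) {
            if (t0Millis < 0) {
                t0Millis = nowMillis; // mark t0 once per NONE period
            }
        } else {
            if (status == LIGHT && t0Millis >= 0) {
                lastIntervalMillis = nowMillis - t0Millis; // t1 - t0
            }
            t0Millis = -1; // any non-NONE status ends the NONE period
        }
    }

    // Returns the last measured interval, or -1 if no NONE-to-LIGHT transition was seen.
    long getLastIntervalMillis() {
        return lastIntervalMillis;
    }
}
```

Transitions that skip LIGHT (for example, NONE directly to MODERATE) are deliberately not counted in this sketch, so the recorded value always describes a NONE-to-LIGHT ramp.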
API testing
Device manufacturers must pass the VTS test for thermal HAL, and may use emul_temp from the kernel sysfs interface to simulate temperature changes.
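As a rough illustration of driving emul_temp from a test tool, the sketch below writes an emulated temperature to a caller-supplied path. It rests on several assumptions to verify for your kernel: that the node lives at a path such as /sys/class/thermal/thermal_zone0/emul_temp, that it expects an integer in millidegrees Celsius, and that the process has permission to write it. EmulTempWriter is a hypothetical helper, not part of any Android or kernel API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for writing an emulated temperature to a thermal zone's emul_temp node.
class EmulTempWriter {
    // Converts degrees Celsius to the millidegree integer string this sketch assumes the node expects.
    static String toMillidegrees(double celsius) {
        return Long.toString(Math.round(celsius * 1000.0));
    }

    // Writes the emulated temperature to the given path, replacing any previous content.
    static void write(Path emulTempNode, double celsius) throws IOException {
        Files.write(emulTempNode, toMillidegrees(celsius).getBytes(StandardCharsets.US_ASCII));
    }
}
```

Taking the path as a parameter keeps the helper testable against an ordinary file and avoids hard-coding a thermal zone index, which varies between devices.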