A/B system updates, also known as seamless updates, ensure a workable booting system remains on the disk during an over-the-air (OTA) update. This approach reduces the likelihood of an inactive device after an update, which means fewer device replacements and device reflashes at repair and warranty centers. Other commercial-grade operating systems such as ChromeOS also use A/B updates successfully.
A/B system updates provide the following benefits:
- OTA updates can occur while the system is running, without interrupting the user (including app optimizations that occur after a reboot). This means users can continue to use their devices during an OTA—the only downtime during an update is when the device reboots into the updated disk partition.
- If an OTA fails, the device boots into the pre-OTA disk partition and remains usable. The download of the OTA can be attempted again.
- Any errors (such as I/O errors) affect only the unused partition set and can be retried. Such errors also become less likely because the I/O load is deliberately low to avoid degrading the user experience.
- Updates can be streamed to A/B devices, removing the need to download the package before installing it. Streaming means the user doesn't need enough free space to store the update package on the device.
- The cache partition is no longer used to store OTA update packages, so there is no need for sizing the cache partition.
- dm-verity guarantees a device will boot an uncorrupted image. If a device doesn't boot due to a bad OTA or dm-verity issue, the device can reboot into an old image. (Android Verified Boot does not require A/B updates.)
About A/B system updates
A/B system updates affect the following:
- Partition selection (slots), the update_engine daemon, and bootloader interactions (described below)
- Build process and OTA update package generation (described in Implementing A/B Updates)
Partition selection (slots)
A/B system updates use two sets of partitions referred to as slots (normally slot A and slot B). The system runs from the current slot while the partitions in the unused slot are not accessed by the running system during normal operation. This approach makes updates fault-resistant by keeping the unused slot as a fallback: if an error occurs during or immediately after an update, the system can roll back to the old slot and continue to have a working system. To achieve this goal, no partition used by the current slot should be updated as part of the OTA update (including partitions for which there is only one copy).
Each slot has a bootable attribute that states whether the slot contains a correct system from which the device can boot. The current slot is bootable when the system is running, but the other slot may have an old (still correct) version of the system, a newer version, or invalid data. Regardless of what the current slot is, one slot is the active slot (the one the bootloader will boot from on the next boot), also called the preferred slot.
Each slot also has a successful attribute set by user space, which is relevant only if the slot is also bootable. A successful slot should be able to boot, run, and update itself. A bootable slot that was not marked as successful (after several attempts were made to boot from it) should be marked as unbootable by the bootloader, which should also change the active slot to another bootable slot (normally the slot running immediately before the attempt to boot into the new, active one). The specific details of the interface are defined in
Update engine daemon
A/B system updates use a background daemon called update_engine to prepare the system to boot into a new, updated version. This daemon can perform the following actions:
- Read from the current slot A/B partitions and write any data to the unused slot A/B partitions as instructed by the OTA package.
- Call the boot_control interface in a pre-defined workflow.
- Run a post-install program from the new partition after writing all the unused slot partitions, as instructed by the OTA package. (For details, see Post-installation).
Because the update_engine daemon is not involved in the boot process itself, it is limited in what it can do during an update by the SELinux policies and features in the current slot (such policies and features can't be updated until the system boots into a new version). To maintain a robust system, the update process should not modify the partition table, the contents of partitions in the current slot, or the contents of non-A/B partitions that can't be wiped with a factory reset.
update_engine source is located in
The A/B OTA dexopt files are split between installd and a package manager:
frameworks/native/cmds/installd/ota* includes the postinstall script, the binary for chroot, the installd clone that calls dex2oat, the post-OTA move-artifacts script, and the rc file for the move script.
OtaDexoptShellCommand) is the package manager that prepares dex2oat commands for applications.
For a working example, refer to
The boot_control HAL is used by update_engine (and possibly other daemons) to instruct the bootloader what to boot from. Common example scenarios and their associated states include the following:
- Normal case: The system is running from its current slot, either slot A or B. No updates have been applied so far. The system's current slot is bootable, successful, and the active slot.
- Update in progress: The system is running from slot B, so slot B is the bootable, successful, and active slot. Slot A was marked as unbootable since the contents of slot A are being updated but not yet completed. A reboot in this state should continue booting from slot B.
- Update applied, reboot pending: The system is running from slot B, slot B is bootable and successful, but slot A was marked as active (and therefore is marked as bootable). Slot A is not yet marked as successful and some number of attempts to boot from slot A should be made by the bootloader.
- System rebooted into new update: The system is running from slot A for the first time, slot B is still bootable and successful while slot A is only bootable, and still active but not successful. A user space daemon should mark slot A as successful after some checks are made.
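The four scenarios above can be walked through with a small model of the slot attributes. This is only a sketch: the class and method names are hypothetical and are not the boot_control HAL interface, and the retry budget of three attempts is an assumed value (the real number is up to the bootloader).

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Per-slot attributes described in the text (field names are illustrative)."""
    bootable: bool = False
    successful: bool = False
    tries_remaining: int = 0


class SlotMetadata:
    """Toy model of A/B slot bookkeeping; not the real boot_control HAL."""

    MAX_TRIES = 3  # assumed retry budget; the real value is bootloader-specific

    def __init__(self):
        # Normal case: running from slot B, which is bootable, successful, active.
        self.slots = {"A": Slot(), "B": Slot(bootable=True, successful=True)}
        self.active = "B"

    def begin_update(self, target):
        # Update in progress: the target is unbootable while it is rewritten.
        self.slots[target] = Slot()

    def finish_update(self, target):
        # Update applied, reboot pending: target becomes bootable and active.
        self.slots[target] = Slot(bootable=True, tries_remaining=self.MAX_TRIES)
        self.active = target

    def bootloader_pick(self):
        """Return the slot the bootloader would try on the next boot."""
        slot = self.slots[self.active]
        if slot.bootable and not slot.successful:
            if slot.tries_remaining == 0:
                # Out of attempts: mark unbootable and fall back to a bootable slot.
                slot.bootable = False
                self.active = next(
                    name for name, s in self.slots.items() if s.bootable
                )
            else:
                slot.tries_remaining -= 1
        return self.active

    def mark_successful(self, name):
        # Done by a user space daemon after its post-boot checks pass.
        self.slots[name].successful = True
```

In this model, a slot that is never marked successful is tried three times and then abandoned in favor of the previous slot, while a slot that is marked successful after its first boot stays active indefinitely.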
Streaming update support
User devices don't always have enough space on
/data to download
the update package. As neither OEMs nor users want to waste space on a
/cache partition, some users go without updates because the device
has nowhere to store the update package. To address this issue, Android 8.0
added support for streaming A/B updates that write blocks directly to the B
partition as they are downloaded, without having to store the blocks on
/data. Streaming A/B updates need almost no temporary storage and
require just enough storage for roughly 100 KiB of metadata.
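The idea of writing blocks as they arrive, rather than staging the whole package first, can be illustrated with a generic sketch. Everything here is a stand-in: `fake_download` plays the role of the network client and the `io.BytesIO` object plays the role of the unused slot's block device. It does not reflect how update_engine is actually implemented.

```python
import io


def fake_download(payload, chunk_size):
    """Yield a payload in fixed-size pieces, as a network client might."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]


def stream_to_partition(chunks, target):
    """Write each chunk straight to the target, holding one chunk at a time.

    `chunks` is any iterable of bytes objects and `target` is a writable
    binary file object. Returns the number of bytes written.
    """
    written = 0
    for chunk in chunks:
        target.write(chunk)    # applied immediately
        written += len(chunk)  # the chunk is dropped when the loop advances
    return written


# Usage: the full payload is never stored anywhere except the target itself.
payload = bytes(range(256)) * 10
target = io.BytesIO()
total = stream_to_partition(fake_download(payload, 100), target)
```

Because the generator hands over one chunk at a time, peak temporary storage is bounded by the chunk size rather than by the size of the package.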
To enable streaming updates in Android 7.1, cherrypick the following patches:
- Allow to cancel a proxy resolution request
- Fix terminating a transfer while resolving proxies
- Add unittest for TerminateTransfer between ranges
- Cleanup the RetryTimeoutCallback()
These patches are required to support streaming A/B updates in Android 7.1 whether using Google Mobile Services (GMS) or any other update client.
Life of an A/B update
The update process starts when an OTA package (referred to in code as a payload) is available for downloading. Policies in the device may defer the payload download and application based on battery level, user activity, charging status, or other policies. In addition, because the update runs in the background, users might not know an update is in progress. All of this means the update process might be interrupted at any point due to policies, unexpected reboots, or user actions.
Optionally, metadata in the OTA package itself indicates the update can be streamed; the same package can also be used for non-streaming installation. The server may use the metadata to tell the client it's streaming so the client will hand off the OTA to update_engine correctly. Device manufacturers with their own server and client can enable streaming updates by ensuring the server identifies the update as streaming (or assumes all updates are streaming) and the client makes the correct call to update_engine for streaming. Manufacturers can use the fact that the package is of the streaming variant to send a flag to the client to trigger hand off to the framework side.
After a payload is available, the update process is as follows:
|1||The current slot (or "source slot") is marked as successful (if not already
|2||The unused slot (or "target slot") is marked as unbootable by calling the
The update payload is an opaque blob with the instructions to update to the new version. The update payload consists of the following:
|3||The payload metadata is downloaded.|
|4||For each operation defined in the metadata, in order, the associated data (if any) is downloaded to memory, the operation is applied, and the associated memory is discarded.|
|5||The whole partitions are re-read and verified against the expected hash.|
|6||The post-install step (if any) is run. In the case of an error during the execution of any step, the update fails and is re-attempted with possibly a different payload. If all the steps so far have succeeded, the update succeeds and the last step is executed.|
|7||The unused slot is marked as active by calling
|8||Post-installation (described below) involves running a program from the
"new update" version while still running in the old version. If defined in the
OTA package, this step is mandatory and the program must return
with exit code
For every partition where a post-install step is defined,
update_engine mounts the new partition into a specific location and
executes the program specified in the OTA relative to the mounted partition. For
example, if the post-install program is defined as
usr/bin/postinstall in the system partition, this partition from
the unused slot will be mounted in a fixed location (such as
/postinstall_mount) and the
/postinstall_mount/usr/bin/postinstall command is executed.
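The path resolution in that example amounts to joining the fixed mount point with the OTA-relative program path. A minimal sketch follows; the function name is hypothetical, and stripping a leading slash is an assumption made here to keep the result inside the mount point.

```python
import posixpath


def postinstall_command(mount_point, program):
    """Resolve an OTA-relative post-install program against the mount point."""
    # The program path is relative to the mounted partition. A leading slash
    # would make posixpath.join discard the mount point, so strip it first.
    return posixpath.join(mount_point, program.lstrip("/"))


command = postinstall_command("/postinstall_mount", "usr/bin/postinstall")
```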
For post-installation to succeed, the old kernel must be able to:
- Mount the new filesystem format. The filesystem type cannot change unless there's support for it in the old kernel, including details such as the compression algorithm used if using a compressed filesystem (such as SquashFS).
- Understand the new partition's post-install program format.
If using an Executable and Linkable Format (ELF) binary, it should be compatible with the old kernel (for example, a new 64-bit program will not run on an old 32-bit kernel if the architecture switched from 32- to 64-bit builds). Unless the loader (ld) is instructed to use other paths, or the program is built as a static binary, libraries will be loaded from the old system image and not the new one.
For example, you could use a shell script as a post-install program
(interpreted by the old system's shell binary with a
#! marker at
the top), then set up library paths from the new environment for executing a
more complex binary post-install program. Alternatively, you could run the
post-install step from a dedicated smaller partition to enable the filesystem
format in the main system partition to be updated without incurring backward
compatibility issues or stepping-stone updates; this would allow users to update
directly to the latest version from a factory image.
The new post-install program is limited by the SELinux policies defined in the old system. As such, the post-install step is suitable for performing tasks required by design on a given device or other best-effort tasks (such as updating the A/B-capable firmware or bootloader, or preparing copies of databases for the new version). The post-install step is not suitable for one-off bug fixes before reboot that require unforeseen permissions.
The selected post-install program runs in a dedicated post-install SELinux context. All the files in the new mounted partition will be tagged with postinstall_file, regardless of what their attributes are after rebooting into that new system. Changes to the SELinux attributes in the new system won't impact the post-install step. If the post-install program needs extra permissions, those must be added to the post-install context.
Frequently asked questions
Has Google used A/B OTAs on any devices?
Yes. The marketing name for A/B updates is seamless updates. Pixel and Pixel XL phones from October 2016 shipped with A/B, and all Chromebooks use the same update_engine implementation of A/B. The necessary platform code implementation is public in Android 7.1 and higher.
Why are A/B OTAs better?
A/B OTAs provide a better user experience when taking updates. Measurements from monthly security updates show this feature has already proven a success: As of May 2017, 95% of Pixel owners are running the latest security update after a month compared to 87% of Nexus users, and Pixel users update sooner than Nexus users. Failures to update blocks during an OTA no longer result in a device that won't boot; until the new system image has successfully booted, Android retains the ability to fall back to the previous working system image.
How did A/B affect the 2016 Pixel partition sizes?
The following table contains details on the shipping A/B configuration versus the internally-tested non-A/B configuration:
|Pixel partition sizes||A/B||Non-A/B|
A/B updates require an increase of only 320 MiB in flash, with a savings of 32MiB from removing the recovery partition and another 100MiB preserved by removing the cache partition. This balances the cost of the B partitions for the bootloader, the boot partition, and the radio partition. The vendor partition doubled in size (the vast majority of the size increase). Pixel's A/B system image is half the size of the original non-A/B system image.
For the Pixel A/B and non-A/B variants tested internally (only A/B shipped), the space used differed by only 320MiB. On a 32GiB device, this is just under 1%. For a 16GiB device this would be less than 2%, and for an 8GiB device almost 4% (assuming all three devices had the same system image).
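The percentages quoted above follow directly from dividing the 320MiB overhead by the total flash size. A short check, using only the figures given in the text:

```python
MIB_PER_GIB = 1024
AB_OVERHEAD_MIB = 320  # overhead figure quoted in the text


def overhead_percent(device_gib, overhead_mib=AB_OVERHEAD_MIB):
    """Return the A/B overhead as a percentage of total flash."""
    return 100 * overhead_mib / (device_gib * MIB_PER_GIB)


for size_gib in (32, 16, 8):
    # Prints 0.98%, 1.95%, and 3.91% respectively.
    print(f"{size_gib}GiB: {overhead_percent(size_gib):.2f}%")
```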
Why didn't you use SquashFS?
We experimented with SquashFS but weren't able to achieve the performance desired for a high-end device. We don't use or recommend SquashFS for handheld devices.
More specifically, SquashFS provided about 50% size savings on the system partition, but the overwhelming majority of the files that compressed well were the precompiled .odex files. Those files had very high compression ratios (approaching 80%), but the compression ratio for the rest of the system partition was much lower. In addition, SquashFS in Android 7.0 raised the following performance concerns:
- Pixel has very fast flash compared to earlier devices but not a huge number of spare CPU cycles, so reading fewer bytes from flash but needing more CPU for I/O was a potential bottleneck.
- I/O changes that perform well on an artificial benchmark run on an unloaded system sometimes don't work well on real-world use cases under real-world load (such as crypto on Nexus 6).
- Benchmarking showed 85% regressions in some places.
As SquashFS matures and adds features to reduce CPU impact (such as a whitelist of commonly-accessed files that shouldn't be compressed), we will continue to evaluate it and offer recommendations to device manufacturers.
How did you halve the size of the system partition without SquashFS?
Applications are stored in .apk files, which are actually ZIP archives. Each .apk file has inside it one or more .dex files containing portable Dalvik bytecode. An .odex file (optimized .dex) lives separately from the .apk file and can contain machine code specific to the device. If an .odex file is available, Android can run applications at ahead-of-time compiled speeds without having to wait for the code to be compiled each time the application is launched. An .odex file isn't strictly necessary: Android can actually run the .dex code directly via interpretation or Just-In-Time (JIT) compilation, but an .odex file provides the best combination of launch speed and run-time speed if space is available.
Example: For the installed-files.txt from a Nexus 6P running Android 7.1 with a total system image size of 2628MiB (2755792836 bytes), the breakdown of the largest contributors to overall system image size by file type is as follows:
|.so (native C/C++ code)||202162479 bytes||7.3%|
|.oat files/.art images||163892188 bytes||5.9%|
|icu locale data||27468687 bytes||0.9%|
These figures are similar for other devices too, so on Nexus/Pixel
devices, .odex files take up approximately half the system partition. This meant
we could continue to use ext4 but write the .odex files to the B partition
at the factory and then copy them to
/data on first boot. The
actual storage used with ext4 A/B is identical to SquashFS A/B, because if we
had used SquashFS we would have shipped the preopted .odex files on system_a
instead of system_b.
Doesn't copying .odex files to /data mean the space saved on /system is lost on /data?
Not exactly. On Pixel, most of the space taken by .odex files is for apps,
which typically exist on
/data. These apps take Google Play
updates, so the .apk and .odex files on the system image are unused for most of
the life of the device. Such files can be excluded entirely and replaced by
small, profile-driven .odex files when the user actually uses each app (thus
requiring no space for apps the user doesn't use). For details, refer to the
Google I/O 2016 talk The
Evolution of Art.
The comparison is difficult for a few key reasons:
- Apps updated by Google Play have always had their .odex files on /data as soon as they receive their first update.
- Apps the user doesn't run don't need an .odex file at all.
- Profile-driven compilation generates smaller .odex files than ahead-of-time compilation (because the former optimizes only performance-critical code).
For details on the tuning options available to OEMs, see Configuring ART.
Aren't there two copies of the .odex files on /data?
It's a little more complicated ... After the new system image has been
written, the new version of dex2oat is run against the new .dex files to
generate the new .odex files. This occurs while the old system is still running,
so the old and new .odex files are both on
/data at the same time.
The code in OtaDexoptService calls getAvailableSpace before optimizing each package to avoid overfilling /data. Note that available here is still conservative: it's the amount of space left before hitting the usual system low space threshold (measured as both a percentage and a byte count). So if /data is full, there won't be two copies of every .odex file.
The same code also has a BULK_DELETE_THRESHOLD: If the device gets that close
to filling the available space (as just described), the .odex files belonging to
apps that aren't used are removed. That's another case without two copies of
every .odex file.
In the worst case where /data is completely full, the update waits until the device has rebooted into the new system and no longer needs the old system's .odex files. The PackageManager handles this (frameworks/base/+/nougat-mr1-release/services/core/java/com/android/server/pm/PackageManagerService.java#7215). After the new system has booted, the .odex files that were used by the old system can be removed, returning the device to the steady state where there's only one copy.
So, while it is possible that
/data contains two copies of all
the .odex files, (a) this is temporary and (b) only occurs if you had plenty of
free space on
/data anyway. Except during an update, there's only
one copy. And as part of ART's general robustness features, it will never fill
/data with .odex files anyway (because that would be a problem on a
non-A/B system too).
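The space checks described in this answer can be sketched as a simple planning loop. This is loosely modeled on the prose above, not on the actual OtaDexoptService source: the function, its parameters, and the tuple layout are all hypothetical, and the sketch does not model the space that pruning would free.

```python
def plan_dexopt(packages, free_bytes, low_space_threshold, bulk_delete_threshold):
    """Decide which packages to compile before the reboot.

    packages: list of (name, estimated_odex_bytes, recently_used) tuples.
    Returns (to_compile, to_prune): names to compile now, and names of unused
    apps whose existing artifacts could be deleted to make room.
    """
    to_compile, to_prune = [], []
    # "Available" space is what remains before the low-space threshold is hit.
    available = max(0, free_bytes - low_space_threshold)
    for name, size, recently_used in packages:
        if available < bulk_delete_threshold and not recently_used:
            # Close to the limit: drop artifacts for apps nobody runs.
            to_prune.append(name)
            continue
        if size > available:
            # Not enough room: leave this package until after the reboot.
            continue
        to_compile.append(name)
        available -= size
    return to_compile, to_prune
```

With plenty of free space every package is compiled ahead of time, so two copies briefly coexist. As space runs out, packages are skipped or pruned instead, which is why a full /data never ends up holding two complete sets.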
Doesn't all this writing/copying increase flash wear?
Only a small portion of flash is rewritten: a full Pixel system update writes about 2.3GiB. (Apps are also recompiled, but that's true of non-A/B too.) Traditionally, block-based full OTAs wrote a similar amount of data, so flash wear rates should be similar.
Does flashing two system partitions increase factory flashing time?
No. Pixel didn't increase in system image size (it merely divided the space across two partitions).
Doesn't keeping .odex files on B make rebooting after factory data reset slow?
Yes. If you've actually used a device, taken an OTA, and performed a factory
data reset, the first reboot will be slower than it would otherwise be (1m40s vs
40s on a Pixel XL) because the .odex files will have been lost from B after the
first OTA and so can't be copied to
/data. That's the trade-off.
Factory data reset should be a rare operation when compared to regular boot
so the time taken is less important. (This doesn't affect users or reviewers who
get their device from the factory, because in that case the B partition is
available.) Use of the JIT compiler means we don't need to recompile
everything, so it's not as bad as you might think. It's also possible
to mark apps as requiring ahead-of-time compilation using
coreApp="true" in the manifest.
This is currently used by
system_server because it's not allowed to
JIT for security reasons.
Doesn't keeping .odex files on /data rather than /system make rebooting after an OTA slow?
No. As explained above, the new dex2oat is run while the old system image is still running to generate the files that will be needed by the new system. The update isn't considered available until that work has been done.
Can (should) we ship a 32GiB A/B device? 16GiB? 8GiB?
32GiB works well, as proven on Pixel. 320MiB out of 16GiB is a reduction of 2%; similarly, 320MiB out of 8GiB is a reduction of 4%. A/B would not be the recommended choice on devices with 4GiB, as the 320MiB overhead is almost 10% of the total available space.
Does AVB2.0 require A/B OTAs?
No. Android Verified Boot has always required block-based updates, but not necessarily A/B updates.
Do A/B OTAs require AVB2.0?
Do A/B OTAs break AVB2.0's rollback protection?
No. There's some confusion here because if an A/B system fails to boot into the new system image it will (after some number of retries determined by your bootloader) automatically revert to the "previous" system image. The key point here though is that "previous" in the A/B sense is actually still the "current" system image. As soon as the device successfully boots a new image, rollback protection kicks in and ensures that you can't go back. But until you've actually successfully booted the new image, rollback protection doesn't consider it to be the current system image.
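The interaction can be made concrete with a toy model in which the device stores a single version counter. The class and method names are hypothetical and this is not the Android Verified Boot implementation; it only illustrates the ordering argument made above.

```python
class RollbackState:
    """Toy model of a stored rollback counter; names are illustrative only."""

    def __init__(self, stored_index):
        self.stored_index = stored_index

    def may_boot(self, image_index):
        # An image is acceptable if it is not older than the stored counter.
        return image_index >= self.stored_index

    def on_successful_boot(self, image_index):
        # Only a confirmed-good boot advances the stored counter.
        self.stored_index = max(self.stored_index, image_index)


state = RollbackState(stored_index=5)
fallback_allowed_before = state.may_boot(5)  # old image is still "current"
state.on_successful_boot(6)                  # new image booted successfully
fallback_allowed_after = state.may_boot(5)   # old image is now rejected
```

Until `on_successful_boot` runs for the new image, falling back to the old slot is not a rollback in this model, because the stored counter still matches the old image.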
If you're installing an update while the system is running, isn't that slow?
With non-A/B updates, the aim is to install the update as quickly as possible because the user is waiting and unable to use their device while the update is applied. With A/B updates, the opposite is true; because the user is still using their device, as little impact as possible is the goal, so the update is deliberately slow. Via logic in the Java system update client (which for Google is GmsCore, the core package provided by GMS), Android also attempts to choose a time when the users aren't using their devices at all. The platform supports pausing/resuming the update, and the client can use that to pause the update if the user starts to use the device and resume it when the device is idle again.
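The pause-while-active, resume-when-idle policy described above could look roughly like the following in an update client. This is a hypothetical sketch of the policy only; it does not call the platform update API, and the class and method names are invented for illustration.

```python
class UpdateThrottle:
    """Toy policy: pause the update while the user is active, resume when idle."""

    def __init__(self):
        self.paused = False
        self.log = []  # records the pause/resume requests that would be issued

    def on_user_activity(self, active):
        if active and not self.paused:
            self.paused = True
            self.log.append("pause")
        elif not active and self.paused:
            self.paused = False
            self.log.append("resume")


throttle = UpdateThrottle()
for active in (True, True, False, False, True):
    throttle.on_user_activity(active)
```

Repeated activity signals do not produce repeated requests; only a change between active and idle triggers a pause or resume.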
There are two phases while taking an OTA, shown clearly in the UI as Step 1 of 2 and Step 2 of 2 under the progress bar. Step 1 corresponds with writing the data blocks, while step 2 is pre-compiling the .dex files. These two phases are quite different in terms of performance impact. The first phase is simple I/O. This requires little in the way of resources (RAM, CPU, I/O) because it's just slowly copying blocks around.
The second phase runs dex2oat to precompile the new system image. This obviously has less clear bounds on its requirements because it compiles actual apps. And there's obviously much more work involved in compiling a large and complex app than a small and simple app; whereas in phase 1 there are no disk blocks that are larger or more complex than others.
The process is similar to when Google Play installs an app update in the background before showing the 5 apps updated notification, as has been done for years.
What if a user is actually waiting for the update?
The current implementation in GmsCore doesn't distinguish between background updates and user-initiated updates but may do so in the future. In the case where the user explicitly asked for the update to be installed or is watching the update progress screen, we'll prioritize the update work on the assumption that they're actively waiting for it to finish.
What happens if there's a failure to apply an update?
With non-A/B updates, if an update failed to apply, the user was usually left with an unusable device. The only exception was if the failure occurred before the update had even started to be applied (because the package failed to verify, say). With A/B updates, a failure to apply an update does not affect the currently running system. The update can simply be retried later.
What does GmsCore do?
In Google's A/B implementation, the platform APIs and
update_engine provide the mechanism while GmsCore provides the
policy. That is, the platform knows how to apply an A/B update and all
that code is in AOSP (as mentioned above); but it's GmsCore that decides
what and when to apply.
If you’re not using GmsCore, you can write your own replacement using the
same platform APIs. The platform Java API for controlling
Callers can provide an
UpdateEngineCallback to be notified of status
Refer to the reference files for the core classes to use the interface.
Which systems on a chip (SoCs) support A/B?
As of 2017-03-15, we have the following information:
|SoC vendor||Android 7.x Release||Android 8.x Release|
|Qualcomm||Depending on OEM requests||All chipsets will get support|
|Mediatek||Depending on OEM requests||All chipsets will get support|
For details on schedules, check with your SoC contacts. For SoCs not listed above, reach out to your SoC vendor directly.