Implementing Vulkan

Vulkan is a low-overhead, cross-platform API for high-performance 3D graphics. Like OpenGL ES, Vulkan provides tools for creating high-quality, real-time graphics in applications. Vulkan advantages include reductions in CPU overhead and support for the SPIR-V Binary Intermediate language.

Note: This section describes Vulkan implementation; for details on Vulkan architecture, advantages, API, and other resources, see Vulkan Architecture.

To implement Vulkan, a device:

  • Must include the Vulkan Loader (provided by Android) in the build.
  • Must include a Vulkan driver (provided by SoC vendors such as GPU IHVs) that implements the Vulkan API. To support Vulkan functionality, the Android device needs capable GPU hardware and the associated driver. Consult your SoC vendor to request driver support.

If a Vulkan driver is available on the device, the device needs to declare FEATURE_VULKAN_HARDWARE_LEVEL and FEATURE_VULKAN_HARDWARE_VERSION system features, with versions that accurately reflect the capabilities of the device.
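As a rough illustration of what an accurate version value looks like, the sketch below assumes the version feature uses the same packed layout as Vulkan version numbers (major in bits 31-22, minor in bits 21-12, patch in bits 11-0). The helper names are hypothetical and are not part of any Android or Vulkan header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the packed layout of Vulkan version
 * numbers: major in bits 31-22, minor in bits 21-12, patch in bits 11-0. */
static uint32_t pack_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

static uint32_t version_major(uint32_t v) { return v >> 22; }
static uint32_t version_minor(uint32_t v) { return (v >> 12) & 0x3ffu; }
static uint32_t version_patch(uint32_t v) { return v & 0xfffu; }
```

For example, pack_version(1, 0, 3) yields 0x400003, which is the value a device whose driver supports Vulkan 1.0.3 would report under this assumption.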

Vulkan Loader

The primary interface between Vulkan applications and a device's Vulkan driver is the Vulkan loader, which is part of the Android Open Source Project (AOSP) (platform/frameworks/native/vulkan) and installed at /system/lib[64]/. The loader provides the core Vulkan API entry points, as well as entry points of a few extensions that are required on Android and always present. In particular, Window System Integration (WSI) extensions are exported by the loader and primarily implemented in it rather than the driver. The loader also supports enumerating and loading layers that can expose additional extensions and/or intercept core API calls on their way to the driver.

The NDK includes a stub library that exports the same symbols as the loader and is used for linking. When running on a device, applications call the Vulkan functions exported from the real library (not the stub) to enter trampoline functions in the loader, which then dispatch to the appropriate layer or driver based on their first argument. The vkGetDeviceProcAddr call returns the function pointers to which the trampolines would dispatch (that is, calls through them go directly into the core API code), so calling through these function pointers rather than the exported symbols is slightly more efficient because it skips the trampoline and dispatch. However, vkGetInstanceProcAddr must still call into trampoline code.
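The difference between calling an exported symbol and calling a pointer obtained from vkGetDeviceProcAddr can be sketched with mock types. Everything below is an assumption made for illustration: the MockDevice layout, the DispatchTable struct, and the mock_ function names are hypothetical and do not reflect the loader's actual internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Everything here is a mock: no real Vulkan types or loader internals. */
typedef void (*VoidFn)(void);

/* Table of entry points provided by the next layer or the driver. */
typedef struct {
    int (*DeviceWaitIdle)(void *device);
} DispatchTable;

/* A dispatchable object starts with a pointer to its dispatch table. */
typedef struct {
    const DispatchTable *dispatch;
    int idle_calls;              /* mock driver state */
} MockDevice;

/* Mock driver implementation of the entry point. */
static int driver_DeviceWaitIdle(void *device) {
    ((MockDevice *)device)->idle_calls++;
    return 0;
}

static const DispatchTable kDriverTable = { driver_DeviceWaitIdle };

/* Exported symbol: a trampoline that dispatches on its first argument. */
int mock_vkDeviceWaitIdle(void *device) {
    const DispatchTable *table = *(const DispatchTable *const *)device;
    return table->DeviceWaitIdle(device);
}

/* Mock vkGetDeviceProcAddr: returns the trampoline's target directly,
 * so later calls skip the table lookup. */
VoidFn mock_GetDeviceProcAddr(void *device, const char *name) {
    const DispatchTable *table = *(const DispatchTable *const *)device;
    if (strcmp(name, "vkDeviceWaitIdle") == 0)
        return (VoidFn)table->DeviceWaitIdle;
    return NULL;
}
```

An application that calls mock_vkDeviceWaitIdle pays for the table lookup on every call, whereas one that caches the pointer returned by mock_GetDeviceProcAddr jumps straight to the driver function.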

Driver enumeration and loading

Android expects the GPUs available to the system to be known when the system image is built. The loader uses the existing HAL mechanism (see hardware.h) for discovering and loading the driver. Preferred paths for 32-bit and 64-bit Vulkan drivers are:


Where <ro.product.platform> is replaced by the value of the system property of that name. For details and supported alternative locations, refer to libhardware/hardware.c.

In Android 7.0, the Vulkan hw_module_t derivative is trivial; only one driver is supported and the constant string HWVULKAN_DEVICE_0 is passed to open. If support for multiple drivers is added in future versions of Android, the HAL module will export a list of strings that can be passed to the module open call.

The Vulkan hw_device_t derivative corresponds to a single driver, though that driver can support multiple physical devices. The hw_device_t structure can be extended to export vkGetGlobalExtensionProperties, vkCreateInstance, and vkGetInstanceProcAddr functions. The loader can find all other VkInstance, VkPhysicalDevice, and vkGetDeviceProcAddr functions by calling vkGetInstanceProcAddr.
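The way the loader can bootstrap from a handful of exported functions can be sketched as follows. This is a minimal mock, not the real HAL header: MockVulkanHalDevice, the drv_ functions, and loader_count_physical_devices are hypothetical names, and the real structure embeds hw_device_t and uses Vulkan types rather than void pointers and ints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*VoidFn)(void);

/* Hypothetical stand-in for the extended device struct. */
typedef struct {
    int    (*CreateInstance)(void **out_instance);
    VoidFn (*GetInstanceProcAddr)(void *instance, const char *name);
} MockVulkanHalDevice;

static int mock_instance_storage;

/* Mock driver entry points. */
static int drv_CreateInstance(void **out_instance) {
    *out_instance = &mock_instance_storage;
    return 0;
}

static int drv_EnumeratePhysicalDevices(void *instance, int *count) {
    (void)instance;
    *count = 1;                  /* one driver, one physical device */
    return 0;
}

static VoidFn drv_GetInstanceProcAddr(void *instance, const char *name) {
    (void)instance;
    if (strcmp(name, "vkEnumeratePhysicalDevices") == 0)
        return (VoidFn)drv_EnumeratePhysicalDevices;
    return NULL;
}

static const MockVulkanHalDevice kMockHalDevice = {
    drv_CreateInstance, drv_GetInstanceProcAddr
};

/* Loader side: only the exported entry points are known up front;
 * every other function is resolved by name. Returns -1 on failure. */
int loader_count_physical_devices(const MockVulkanHalDevice *hal) {
    void *instance = NULL;
    if (hal->CreateInstance(&instance) != 0) return -1;
    int (*enumerate)(void *, int *) = (int (*)(void *, int *))
        hal->GetInstanceProcAddr(instance, "vkEnumeratePhysicalDevices");
    if (enumerate == NULL) return -1;
    int count = 0;
    return enumerate(instance, &count) == 0 ? count : -1;
}
```

Keeping the exported surface this small means the HAL structure does not need to change when new instance-level or device-level functions are added.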

Layer discovery and loading

The Vulkan loader supports enumerating and loading layers that can expose additional extensions and/or intercept core API calls on their way to the driver. Android 7.0 does not include layers on the system image; however, applications may include layers in their APK.

When using layers, keep in mind that Android's security model and policies differ significantly from other platforms. In particular, Android does not allow loading external code into a non-debuggable process on production (non-rooted) devices, nor does it allow external code to inspect or control the process's memory, state, etc. This includes a prohibition on saving core dumps, API traces, etc. to disk for later inspection. Only layers delivered as part of the application are enabled on production devices, and drivers must not provide functionality that violates these policies.

Use cases for layers include:

  • Development-time layers. These layers (validation layers, shims for tracing/profiling/debugging tools, etc.) should not be installed on the system image of production devices as they waste space for users and should be updateable without requiring a system update. Developers who want to use one of these layers during development can modify the application package (e.g. adding a file to their native libraries directory). IHV and OEM engineers who want to diagnose failures in shipping, unmodifiable apps are assumed to have access to non-production (rooted) builds of the system image.
  • Utility layers. These layers almost always expose extensions, such as a layer that implements a memory manager for device memory. Developers choose layers (and versions of those layers) to use in their application; different applications using the same layer may still use different versions. Developers choose which of these layers to ship in their application package.
  • Injected (implicit) layers. Includes layers such as framerate, social network, or game launcher overlays provided by the user or some other application without the application's knowledge or consent. These violate Android's security policies and are not supported.

In the normal state, the loader searches for layers only in the application's native library directory and attempts to load any library with a name matching a particular pattern. The loader does not need a separate manifest file, because the developer deliberately included these layers and the reasons to avoid loading libraries before enabling them don't apply.
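Name-based discovery of this kind can be sketched as a simple prefix-and-suffix check. The function name matches_layer_pattern and the prefix "libExampleLayer_" used in the usage note are hypothetical; the sketch makes no claim about the pattern the loader actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when `filename` starts with `prefix` and ends with
 * `suffix`, with at least one character in between. */
bool matches_layer_pattern(const char *filename,
                           const char *prefix,
                           const char *suffix) {
    size_t n = strlen(filename);
    size_t p = strlen(prefix);
    size_t s = strlen(suffix);
    if (n <= p + s) return false;
    return strncmp(filename, prefix, p) == 0 &&
           strcmp(filename + n - s, suffix) == 0;
}
```

With the hypothetical prefix, matches_layer_pattern("libExampleLayer_foo.so", "libExampleLayer_", ".so") is true, while an ordinary library such as "libfoo.so" is skipped.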

Android allows layers to be ported with build-environment changes between Android and other platforms. For details on the interface between layers and the loader, refer to Vulkan Loader Specification and Architecture Overview. Versions of the LunarG validation layers that have been verified to build and work on Android are hosted in the android_layers branch of the KhronosGroup/Vulkan-LoaderAndValidationLayers project on GitHub.

Window System Integration (WSI)

The Window System Integration (WSI) extensions VK_KHR_surface, VK_KHR_android_surface, and VK_KHR_swapchain are implemented by the platform. The VkSurfaceKHR and VkSwapchainKHR objects and all interaction with ANativeWindow are handled by the platform and are not exposed to drivers. The WSI implementation relies on the VK_ANDROID_native_buffer extension (described below), which must be supported by the driver; this extension is used only by the WSI implementation and is not exposed to applications.

Gralloc usage flags

Implementations may need swapchain buffers to be allocated with implementation-defined private gralloc usage flags. When creating a swapchain, the platform asks the driver to translate the requested format and image usage flags into gralloc usage flags by calling:

VkResult VKAPI vkGetSwapchainGrallocUsageANDROID(
    VkDevice            device,
    VkFormat            format,
    VkImageUsageFlags   imageUsage,
    int*                grallocUsage
);

The format and imageUsage parameters are taken from the VkSwapchainCreateInfoKHR structure. The driver should fill *grallocUsage with the gralloc usage flags required for the format and usage (which are combined with the usage flags requested by the swapchain consumer when allocating buffers).
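A driver-side translation of this kind might look like the sketch below. All constants are mock values invented for the example, not real Vulkan or gralloc definitions, and the assumption that the platform combines the two sets of flags with a bitwise OR is ours.

```c
#include <assert.h>
#include <stdint.h>

/* Mock stand-ins; none of these values come from real headers. */
enum { MOCK_SUCCESS = 0, MOCK_ERROR_FORMAT_NOT_SUPPORTED = -1 };
enum { MOCK_FORMAT_RGBA8 = 1, MOCK_FORMAT_RGB565 = 2 };
enum { MOCK_IMAGE_USAGE_SAMPLED = 0x1,
       MOCK_IMAGE_USAGE_COLOR_ATTACHMENT = 0x2 };
enum { MOCK_GRALLOC_HW_TEXTURE = 0x100,
       MOCK_GRALLOC_HW_RENDER = 0x200,
       MOCK_GRALLOC_PRIVATE_COMPRESSED = 0x10000000 };

/* Driver side: translate format and image usage into gralloc flags. */
int mock_GetSwapchainGrallocUsage(int format, uint32_t imageUsage,
                                  int *grallocUsage) {
    if (format != MOCK_FORMAT_RGBA8 && format != MOCK_FORMAT_RGB565)
        return MOCK_ERROR_FORMAT_NOT_SUPPORTED;
    int usage = 0;
    if (imageUsage & MOCK_IMAGE_USAGE_SAMPLED)
        usage |= MOCK_GRALLOC_HW_TEXTURE;
    if (imageUsage & MOCK_IMAGE_USAGE_COLOR_ATTACHMENT)
        usage |= MOCK_GRALLOC_HW_RENDER;
    /* Implementation-defined private flag (hypothetical vendor layout). */
    if (format == MOCK_FORMAT_RGBA8)
        usage |= MOCK_GRALLOC_PRIVATE_COMPRESSED;
    *grallocUsage = usage;
    return MOCK_SUCCESS;
}

/* Platform side (assumed): OR the driver's flags with the consumer's. */
int combine_usage(int driverUsage, int consumerUsage) {
    return driverUsage | consumerUsage;
}
```

The private flag is the reason the query exists at all: only the driver knows which implementation-defined bits the allocator needs for a given format and usage.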

Gralloc-backed images

VkNativeBufferANDROID is a vkCreateImage extension structure for creating an image backed by a gralloc buffer. This structure is provided to vkCreateImage in the VkImageCreateInfo structure chain. Calls to vkCreateImage with this structure happen during the first call to vkGetSwapChainInfoWSI(.. VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI ..). The WSI implementation allocates the number of native buffers requested for the swapchain, then creates a VkImage for each one:

typedef struct {
    VkStructureType             sType; // must be VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID
    const void*                 pNext;

    // Buffer handle and stride returned from gralloc alloc()
    buffer_handle_t             handle;
    int                         stride;

    // Gralloc format and usage requested when the buffer was allocated.
    int                         format;
    int                         usage;
} VkNativeBufferANDROID;
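Because the structure arrives through the create-info structure chain, a driver typically has to walk the pNext pointers to find it. The sketch below shows that walk with mock types; the MOCK_STYPE_ values, MockBaseStruct, MockNativeBuffer, and find_in_chain are hypothetical stand-ins for the real Vulkan definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Mock values and types; the real ones live in the Vulkan headers. */
enum { MOCK_STYPE_IMAGE_CREATE_INFO = 1,
       MOCK_STYPE_OTHER_EXT = 2,
       MOCK_STYPE_NATIVE_BUFFER = 3 };

/* Every chainable struct begins with sType and pNext. */
typedef struct {
    int         sType;
    const void *pNext;
} MockBaseStruct;

typedef struct {
    int         sType;   /* MOCK_STYPE_NATIVE_BUFFER */
    const void *pNext;
    const void *handle;  /* stands in for buffer_handle_t */
    int         stride;
    int         format;
    int         usage;
} MockNativeBuffer;

/* Walks a pNext chain and returns the first struct whose sType
 * matches `wanted`, or NULL if the chain does not contain one. */
const void *find_in_chain(const void *chain, int wanted) {
    for (const MockBaseStruct *s = chain; s != NULL; s = s->pNext) {
        if (s->sType == wanted) return s;
    }
    return NULL;
}
```

A mock vkCreateImage would call find_in_chain on its create info and, when the native buffer structure is present, bind the image to the gralloc buffer instead of allocating new memory.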

When creating a gralloc-backed image, the VkImageCreateInfo has the following data:

  .imageType           = VK_IMAGE_TYPE_2D
  .format              = a VkFormat matching the format requested for the gralloc buffer
  .extent              = the 2D dimensions requested for the gralloc buffer
  .mipLevels           = 1
  .arraySize           = 1
  .samples             = 1
  .tiling              = VK_IMAGE_TILING_OPTIMAL
  .usage               = VkSwapChainCreateInfoWSI::imageUsageFlags
  .flags               = 0
  .sharingMode         = VkSwapChainCreateInfoWSI::sharingMode
  .queueFamilyCount    = VkSwapChainCreateInfoWSI::queueFamilyCount
  .pQueueFamilyIndices = VkSwapChainCreateInfoWSI::pQueueFamilyIndices

Acquiring images

vkAcquireImageANDROID acquires ownership of a swapchain image and imports an externally-signalled native fence into both an existing VkSemaphore object and an existing VkFence object:

VkResult VKAPI vkAcquireImageANDROID(
    VkDevice            device,
    VkImage             image,
    int                 nativeFenceFd,
    VkSemaphore         semaphore,
    VkFence             fence
);

This function is called during vkAcquireNextImageWSI to import a native fence into the VkSemaphore and VkFence objects provided by the application (however, both semaphore and fence objects are optional in this call). The driver may also use this opportunity to recognize and handle any external changes to the gralloc buffer state; many drivers won't need to do anything here. This call puts the VkSemaphore and VkFence into the same pending state as vkQueueSignalSemaphore and vkQueueSubmit respectively, so queues can wait on the semaphore and the application can wait on the fence.

Both objects become signalled when the underlying native fence signals; if the native fence has already signalled, then the semaphore is in the signalled state when this function returns. The driver takes ownership of the fence fd and is responsible for closing it when no longer needed. It must do so even if neither a semaphore nor a fence object is provided, or even if vkAcquireImageANDROID fails and returns an error. If nativeFenceFd is -1, it is as if the native fence was already signalled.
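The ownership rules for the fence fd can be sketched with a mock acquire function. MockSyncObject, mock_AcquireImage, and the fail parameter are hypothetical, and the mock closes the fd immediately as a simplification; a real driver would keep it until the imported fence is no longer needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Mock sync object; a real driver imports the fence into its own
 * synchronization primitives. */
typedef struct {
    bool pending;
    bool signalled;
} MockSyncObject;

enum { MOCK_SUCCESS = 0, MOCK_ERROR = -1 };

/* `fail` simulates an internal driver error so that the error path
 * can be exercised. Either object pointer may be NULL. */
int mock_AcquireImage(int nativeFenceFd, MockSyncObject *semaphore,
                      MockSyncObject *fence, bool fail) {
    int result = fail ? MOCK_ERROR : MOCK_SUCCESS;
    if (result == MOCK_SUCCESS) {
        /* An fd of -1 behaves like a fence that has already signalled. */
        bool already_signalled = (nativeFenceFd == -1);
        MockSyncObject *objs[2] = { semaphore, fence };
        for (int i = 0; i < 2; i++) {
            if (objs[i] == NULL) continue;   /* both objects are optional */
            objs[i]->pending = !already_signalled;
            objs[i]->signalled = already_signalled;
        }
    }
    /* The driver owns the fd on every path, including failure and the
     * case where no sync object was provided. */
    if (nativeFenceFd >= 0) close(nativeFenceFd);
    return result;
}
```

The important property is that the caller never closes the fd itself after the call, whatever the return value.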

Releasing images

vkQueueSignalReleaseImageANDROID prepares a swapchain image for external use, and creates a native fence and schedules it to be signalled when prior work on the queue has completed:

VkResult VKAPI vkQueueSignalReleaseImageANDROID(
    VkQueue             queue,
    VkImage             image,
    int*                pNativeFenceFd
);

This API is called during vkQueuePresentWSI on the provided queue. Effects are similar to vkQueueSignalSemaphore, except with a native fence instead of a semaphore. Unlike vkQueueSignalSemaphore, however, this call creates and returns the synchronization object that will be signalled rather than having it provided as input. If the queue is already idle when this function is called, it is allowed (but not required) to set *pNativeFenceFd to -1. The file descriptor returned in *pNativeFenceFd is owned and will be closed by the caller.
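On the caller's side, handling the returned file descriptor might look like the sketch below. It assumes that the fence fd becomes readable (POLLIN) once signalled, as file descriptors waited on with poll() do; wait_and_close_fence is a hypothetical helper, not a platform function.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Waits up to timeout_ms for a fence fd, then closes it.
 * Returns 1 if signalled, 0 on timeout, -1 on error. */
int wait_and_close_fence(int fenceFd, int timeout_ms) {
    /* -1 means the queue was already idle: nothing to wait for. */
    if (fenceFd == -1) return 1;

    struct pollfd pfd = { .fd = fenceFd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    /* The caller owns the returned fd and must close it. */
    close(fenceFd);

    if (n < 0) return -1;
    if (n == 0) return 0;
    if (pfd.revents & (POLLERR | POLLNVAL)) return -1;
    return 1;
}
```

The helper can be exercised without a GPU by passing the read end of a pipe as a stand-in fence: it reports a timeout while the pipe is empty and reports signalled once a byte has been written.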

Updating drivers

Many drivers can ignore the image parameter of vkQueueSignalReleaseImageANDROID, but some may need to prepare CPU-side data structures associated with a gralloc buffer for use by external image consumers. Preparing buffer contents for use by external consumers should have been done asynchronously as part of transitioning the image to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.


OEMs can test their Vulkan implementation using CTS, which includes drawElements Quality Program (dEQP) tests that exercise the Vulkan Runtime.