Implement a pKVM vendor module

This page explains how to implement a protected kernel-based virtual machine (pKVM) vendor module.

For android16-6.12 and later, when you are done with these steps, you should have a directory tree similar to:

BUILD.bazel
el1.c
hyp/
    BUILD.bazel
    el2.c

For a complete example, see Build a pKVM module with DDK.

For android15-6.6 and earlier:

Makefile
el1.c
hyp/
    Makefile
    el2.c

To implement the module, follow these steps:

  1. Add the EL2 hypervisor code (el2.c). At a minimum, this code must declare an init function accepting a reference to the pkvm_module_ops struct:

    #include <asm/kvm_pkvm_module.h>
    
    int pkvm_driver_hyp_init(const struct pkvm_module_ops *ops)
    {
      /* Init the EL2 code */
    
      return 0;
    }
    

    The pKVM vendor module API is a struct encapsulating callbacks to the pKVM hypervisor. This struct follows the same ABI rules as GKI interfaces.

  2. Create the hyp/Makefile to build the hypervisor code:

    hyp-obj-y := el2.o
    include $(srctree)/arch/arm64/kvm/hyp/nvhe/Makefile.module
    
  3. Add the EL1 kernel code (el1.c). This code's init section must contain a call to pkvm_load_el2_module to load the EL2 hypervisor code from step 1.

    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <asm/kvm_pkvm_module.h>
    
    int __kvm_nvhe_pkvm_driver_hyp_init(const struct pkvm_module_ops *ops);
    
    static int __init pkvm_driver_init(void)
    {
        unsigned long token;
    
        return pkvm_load_el2_module(__kvm_nvhe_pkvm_driver_hyp_init, &token);
    }
    module_init(pkvm_driver_init);
    
  4. Create the build rules.

    For android16-6.12 and later, refer to Build a pKVM module with DDK to create ddk_library() for EL2 and ddk_module() for EL1.

    For android15-6.6 and earlier, create the root makefile to tie the EL1 and EL2 code together:

    ifneq ($(KERNELRELEASE),)
    clean-files := hyp/hyp.lds hyp/hyp-reloc.S
    
    obj-m := pkvm_module.o
    pkvm_module-y := el1.o hyp/kvm_nvhe.o
    
    $(PWD)/hyp/kvm_nvhe.o: FORCE
             $(Q)$(MAKE) $(build)=$(obj)/hyp $(obj)/hyp/kvm_nvhe.o
    else
    all:
            make -C $(KDIR) M=$(PWD) modules
    clean:
            make -C $(KDIR) M=$(PWD) clean
    endif
    

Load a pKVM module

As with GKI vendor modules, pKVM vendor modules can be loaded using modprobe. However, for security reasons, loading must occur before deprivileging. To load a pKVM module, you must ensure your modules are included in the root filesystem (initramfs) and you must add the following to your kernel command-line:

kvm-arm.protected_modules=mod1,mod2,mod3,...

pKVM vendor modules stored in the initramfs inherit the signature and protection of initramfs.

If one of the pKVM vendor modules fails to load, the system is considered insecure and it won't be possible to start a protected virtual machine.
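
The command-line parameter can be generated and checked from a shell. The following is a minimal sketch: the function names are hypothetical helpers, not part of any Android tool, and only the kvm-arm.protected_modules= parameter itself comes from the text above.

```shell
# Join module names into the kvm-arm.protected_modules= parameter.
# (Hypothetical helper name, for illustration only.)
protected_modules_arg() {
    list=""
    for mod in "$@"; do
        list="${list:+$list,}$mod"
    done
    printf 'kvm-arm.protected_modules=%s\n' "$list"
}

# Print, one per line, the modules requested on a kernel command line.
# Defaults to the command line of the running kernel.
requested_protected_modules() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        case "$word" in
            kvm-arm.protected_modules=*)
                printf '%s\n' "${word#kvm-arm.protected_modules=}" | tr ',' '\n'
                ;;
        esac
    done
}
```

For example, protected_modules_arg mod1 mod2 prints the string to append to the boot arguments, and requested_protected_modules with no argument lists what the running kernel was asked to load.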

Call an EL2 (hypervisor) function from EL1 (kernel module)

A hypervisor call (HVC) is an instruction that lets the kernel call the hypervisor. With the introduction of pKVM vendor modules, an HVC can be used to run a function at EL2 (in the hypervisor module) from EL1 (the kernel module):

  1. In the EL2 code (el2.c), declare the EL2 handler:

Android 14

   void pkvm_driver_hyp_hvc(struct kvm_cpu_context *ctx)
   {
     /* Handle the call */

     cpu_reg(ctx, 1) = 0;
   }

Android 15 or higher

   void pkvm_driver_hyp_hvc(struct user_pt_regs *regs)
   {
     /* Handle the call */

     regs->regs[0] = SMCCC_RET_SUCCESS;
     regs->regs[1] = 0;
   }
  2. In your EL1 code (el1.c), register the EL2 handler in your pKVM vendor module:

    int __kvm_nvhe_pkvm_driver_hyp_init(const struct pkvm_module_ops *ops);
    /* Keep only the prototype that matches your release. */
    void __kvm_nvhe_pkvm_driver_hyp_hvc(struct kvm_cpu_context *ctx); // Android 14
    void __kvm_nvhe_pkvm_driver_hyp_hvc(struct user_pt_regs *regs);   // Android 15 or higher
    
    static int hvc_number;
    
    static int __init pkvm_driver_init(void)
    {
      unsigned long token;
      int ret;
    
      ret = pkvm_load_el2_module(__kvm_nvhe_pkvm_driver_hyp_init, &token);
      if (ret)
        return ret;
    
      /* A non-negative return value is used as the HVC number below. */
      ret = pkvm_register_el2_mod_call(__kvm_nvhe_pkvm_driver_hyp_hvc, token);
      if (ret < 0)
        return ret;
    
      hvc_number = ret;
    
      return 0;
    }
    module_init(pkvm_driver_init);
    
  3. In your EL1 code (el1.c), call the HVC:

    pkvm_el2_mod_call(hvc_number);
    

Debug and profile EL2 code

This section contains several options to debug pKVM module EL2 code.

Emit and read hypervisor trace events

Tracefs supports the pKVM hypervisor. The root user has access to the interface, which is located in /sys/kernel/tracing/hypervisor/:

  • tracing_on: Turns tracing on or off.
  • trace: Writing to this file resets the trace.
  • trace_pipe: Reading this file prints the hypervisor events.
  • buffer_size_kb: The size of the per-CPU buffer holding events. Increase this value if events are lost.

By default, events are turned off. To enable an event, use the corresponding /sys/kernel/tracing/hypervisor/events/my_event/enable file in Tracefs. You can also enable any hypervisor event at boot time with the kernel command-line parameter hyp_event=event1,event2.
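
The files above can be driven from a root shell. The following is a minimal sketch, assuming that tracing_on and the per-event enable files accept 0 and 1 like their counterparts in the main tracefs. The HYP_TRACEFS variable and the function names are hypothetical and exist only for this example.

```shell
# Location of the hypervisor tracing interface. Override to test elsewhere.
HYP_TRACEFS="${HYP_TRACEFS:-/sys/kernel/tracing/hypervisor}"

# Enable one hypervisor event, then turn tracing on.
# Assumption: writing 1 enables, as in the main tracefs.
hyp_trace_start() {
    event="$1"
    enable_file="$HYP_TRACEFS/events/$event/enable"
    if [ ! -e "$enable_file" ]; then
        echo "unknown hypervisor event: $event" >&2
        return 1
    fi
    echo 1 > "$enable_file" &&
    echo 1 > "$HYP_TRACEFS/tracing_on"
}

# Turn tracing off, then reset the trace by writing to the trace file.
hyp_trace_stop() {
    echo 0 > "$HYP_TRACEFS/tracing_on" &&
    echo > "$HYP_TRACEFS/trace"
}
```

After hyp_trace_start pkvm_driver_event, reading $HYP_TRACEFS/trace_pipe (for example with cat) prints the events as they arrive. If events are lost, increase buffer_size_kb before starting.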

Before declaring an event, the EL2 code of the module must declare the following boilerplate, where pkvm_ops is the struct pkvm_module_ops * passed to the module init function:

  #include "events.h"
  #define HYP_EVENT_FILE ../../../../relative/path/to/hyp/events.h
  #include <nvhe/define_events.h>

  #ifdef CONFIG_TRACING
  void *tracing_reserve_entry(unsigned long length)
  {
      return pkvm_ops->tracing_reserve_entry(length);
  }

  void tracing_commit_entry(void)
  {
      pkvm_ops->tracing_commit_entry();
  }
  #endif

Declare events

Declare events in their own .h file:

  $ cat hyp/events.h
  #if !defined(__PKVM_DRIVER_HYPEVENTS_H_) || defined(HYP_EVENT_MULTI_READ)
  #define __PKVM_DRIVER_HYPEVENTS_H_

  #ifdef __KVM_NVHE_HYPERVISOR__
  #include <nvhe/trace.h>
  #endif

  HYP_EVENT(pkvm_driver_event,
          HE_PROTO(u64 id),
          HE_STRUCT(
                  he_field(u64, id)
          ),
          HE_ASSIGN(
                  __entry->id = id;
          ),
          HE_PRINTK("id=0x%08llx", __entry->id)
  );
  #endif

Emit events

You can log events in EL2 code by calling the generated C function:

  trace_pkvm_driver_event(id);

Add additional registration (Android 15 or lower)

For Android 15 and lower, include an additional registration during the module initialization. This isn't required in Android 16 and higher.

  #ifdef CONFIG_TRACING
  extern char __hyp_event_ids_start[];
  extern char __hyp_event_ids_end[];
  #endif

  int pkvm_driver_hyp_init(const struct pkvm_module_ops *ops)
  {
  #ifdef CONFIG_TRACING
      ops->register_hyp_event_ids((unsigned long)__hyp_event_ids_start,
                                        (unsigned long)__hyp_event_ids_end);
  #endif

      /* init module ... */

      return 0;
  }

Emit events without prior declaration (Android 16 and higher)

Declaring events can be cumbersome for quick debugging. trace_hyp_printk() lets the caller pass up to four arguments to a format string without any event declaration:

  trace_hyp_printk("This is my debug");
  trace_hyp_printk("This is my variable: %d", (int)foo);
  trace_hyp_printk("This is my address: 0x%llx", phys);

Boilerplate in the EL2 code is also required. trace_hyp_printk() is a macro that calls the function trace___hyp_printk():

  #include <nvhe/trace.h>

  #ifdef CONFIG_TRACING
  void trace___hyp_printk(u8 fmt_id, u64 a, u64 b, u64 c, u64 d)
  {
          pkvm_ops->tracing_mod_hyp_printk(fmt_id, a, b, c, d);
  }
  #endif

Enable the event __hyp_printk in /sys/kernel/tracing/hypervisor/events/ or at boot time with the kernel command-line hyp_event=__hyp_printk.

Redirect events to dmesg

The kernel command-line parameter hyp_trace_printk=1 makes the hypervisor tracing interface forward each logged event to the kernel's dmesg. This is useful to read events when trace_pipe is inaccessible.
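
Because events must also be enabled before they can be forwarded, the two boot parameters are typically set together. The following is a minimal sketch that builds the boot arguments and checks a command line for the redirect. The function names are hypothetical; only hyp_event= and hyp_trace_printk=1 come from this page.

```shell
# Build boot arguments that enable the given events and forward them to dmesg.
# (Hypothetical helper name, for illustration only.)
hyp_dmesg_boot_args() {
    events=""
    for ev in "$@"; do
        events="${events:+$events,}$ev"
    done
    printf 'hyp_event=%s hyp_trace_printk=1\n' "$events"
}

# Succeed if a kernel command line sets hyp_trace_printk=1.
# Defaults to the command line of the running kernel.
hyp_trace_printk_enabled() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        if [ "$word" = "hyp_trace_printk=1" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, hyp_dmesg_boot_args pkvm_driver_event __hyp_printk prints the arguments to append to the kernel command line.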

Dump events during a kernel panic (Android 16 and higher)

Hypervisor events are polled. There is therefore a window between the last poll and a kernel panic during which events have been emitted but not yet dumped to the console. The kernel config option CONFIG_PKVM_DUMP_TRACE_ON_PANIC attempts to dump the most recent events to the console if hyp_trace_printk has been enabled.

This option is disabled by default for GKI.

Use Ftrace to trace function call and return (Android 16 and higher)

Ftrace is a kernel feature that lets you trace each function call and return. In a similar fashion, the pKVM hypervisor offers two events: func and func_ret.

You can select the traced functions with the kernel command-line hyp_ftrace_filter= or with one of the tracefs files:

  • /sys/kernel/tracing/hypervisor/set_ftrace_filter
  • /sys/kernel/tracing/hypervisor/set_ftrace_notrace

Filters use shell-style glob matching.

The following filter traces the functions starting with pkvm_hyp_driver:

  echo "__kvm_nvhe_pkvm_hyp_driver*" > /sys/kernel/tracing/hypervisor/set_ftrace_filter

func and func_ret events are available only with CONFIG_PKVM_FTRACE=y. This option is disabled by default for GKI.
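
Setting the filter and enabling both events can be wrapped in one helper. The following is a minimal sketch, assuming that func and func_ret are enabled through events/func/enable and events/func_ret/enable like other hypervisor events, and that tracing_on accepts 1. The HYP_TRACEFS variable and the function names are hypothetical.

```shell
# Location of the hypervisor tracing interface. Override to test elsewhere.
HYP_TRACEFS="${HYP_TRACEFS:-/sys/kernel/tracing/hypervisor}"

# Add the __kvm_nvhe_ prefix used in the filter example, unless already present.
hyp_symbol_glob() {
    case "$1" in
        __kvm_nvhe_*) printf '%s\n' "$1" ;;
        *)            printf '__kvm_nvhe_%s\n' "$1" ;;
    esac
}

# Trace calls to, and returns from, the EL2 functions matching a glob.
# Assumption: func and func_ret have enable files like other events.
hyp_ftrace_start() {
    hyp_symbol_glob "$1" > "$HYP_TRACEFS/set_ftrace_filter" &&
    echo 1 > "$HYP_TRACEFS/events/func/enable" &&
    echo 1 > "$HYP_TRACEFS/events/func_ret/enable" &&
    echo 1 > "$HYP_TRACEFS/tracing_on"
}
```

For example, hyp_ftrace_start 'pkvm_hyp_driver*' writes the same filter as the echo command above. Quote the glob so that the shell passes it through unexpanded.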