Android includes an eBPF loader and library that load eBPF programs at boot time to extend kernel functionality. These programs can be used for collecting statistics from the kernel, monitoring, or debugging.
About eBPF
Extended Berkeley Packet Filter (eBPF) is an in-kernel virtual machine that
runs user-supplied eBPF programs. These programs can be hooked to probes or
events in the kernel, collect useful statistics, and store the results in rich
data structures. A program is loaded into the kernel using the bpf(2)
syscall
and is provided by the user as a binary blob of eBPF machine instructions.
The Android build system has support for compiling C programs to eBPF using
simple build file syntax described later.
More information about eBPF internals and architecture can be found at Brendan Gregg's eBPF page.
Android BPF loader
During Android boot, all eBPF programs located at /system/etc/bpf/ are
loaded. These programs are binary objects built by the Android build system
from C programs accompanied by Android.bp files in the Android source tree.
The build system stores the generated objects at /system/etc/bpf/, and
they become part of the system image.
Format of an Android eBPF C program
An eBPF C program loaded on an Android device must have the following format:
    #include <bpf_helpers.h>

    /* Define one or more maps in the maps section, for example
     * define a map of type array int -> uint32_t, with 10 entries
     */
    DEFINE_BPF_MAP(name_of_my_map, ARRAY, int, uint32_t, 10);

    /* this will also define type-safe accessors:
     *   value * bpf_name_of_my_map_lookup_elem(&key);
     *   int bpf_name_of_my_map_update_elem(&key, &value, flags);
     *   int bpf_name_of_my_map_delete_elem(&key);
     * as such it is heavily suggested to use lowercase *_map names.
     * Also note that due to compiler deficiencies you cannot use a type
     * of 'struct foo' but must instead use just 'foo'. As such structs
     * must not be defined as 'struct foo {}' and must instead be
     * 'typedef struct {} foo'.
     */

    SEC("PROGTYPE/PROGNAME")
    int PROGFUNC(..args..) {
        <body-of-code
         ... read or write to MY_MAPNAME
         ... do other things
        >
    }

    char _license[] SEC("license") = "GPL"; // or other license
Here, name_of_my_map is the name of your map variable. The DEFINE_BPF_MAP
line tells the BPF loader what kind of map to create and with what parameters.
This definition is provided by the bpf_helpers.h header that the C program
includes. The above code creates an array map of 10 entries.
Next, the program defines a PROGFUNC function. When compiled, this function is
placed in a section. The section must have a name of the format
PROGTYPE/PROGNAME. PROGTYPE can be any one of the following. More types can
be found in the Loader source code.
PROGTYPE | Description |
---|---|
kprobe | Hooks PROGFUNC onto a kernel instruction using the kprobe infrastructure. PROGNAME must be the name of the kernel function being kprobed. Refer to the kprobe kernel documentation for more information about kprobes. |
tracepoint | Hooks PROGFUNC onto a tracepoint. PROGNAME must be of the format SUBSYSTEM/EVENT. For example, a tracepoint section for attaching functions to scheduler context switch events would be SEC("tracepoint/sched/sched_switch"), where sched is the name of the trace subsystem, and sched_switch is the name of the trace event. Check the trace events kernel documentation for more information about tracepoints. |
skfilter | Program functions as a networking socket filter. |
schedcls | Program functions as a networking traffic classifier. |
cgroupskb, cgroupsock | Program runs whenever processes in a CGroup create an AF_INET or AF_INET6 socket. |
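The PROGTYPE/PROGNAME convention can be illustrated with a small userspace sketch. The helper below is hypothetical and is not taken from the Android loader; it only assumes that the section name is split at the first '/', so that a tracepoint PROGNAME such as sched/sched_switch keeps its own slash.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the Android loader): holds the two parts
// of a section name of the form "PROGTYPE/PROGNAME".
struct SectionName {
    std::string progType;  // for example "tracepoint"
    std::string progName;  // for example "sched/sched_switch"
};

// Splits the section name at the first '/'. Returns std::nullopt when there
// is no '/', or when either part would be empty.
std::optional<SectionName> splitSectionName(std::string_view section) {
    const auto slash = section.find('/');
    if (slash == std::string_view::npos || slash == 0 ||
        slash + 1 == section.size()) {
        return std::nullopt;
    }
    return SectionName{std::string(section.substr(0, slash)),
                       std::string(section.substr(slash + 1))};
}
```

With this sketch, splitSectionName("tracepoint/sched/sched_switch") yields the type tracepoint and the name sched/sched_switch, while a section such as license, which has no '/', is rejected.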
As an example of a complete C program, the following program creates a map and
defines a tp_sched_switch function, which can be attached to the
sched:sched_switch trace event (see the Attaching programs to tracepoints and
kprobes section for how to attach).
The program records the PID of the latest task that ran on each CPU, using the
CPU number as the map key. Name this file myschedtp.c. We'll refer to this file
later in this document.
    #include <linux/bpf.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <bpf_helpers.h>

    DEFINE_BPF_MAP(cpu_pid_map, ARRAY, int, uint32_t, 1024);

    struct switch_args {
        unsigned long long ignore;
        char prev_comm[16];
        int prev_pid;
        int prev_prio;
        long long prev_state;
        char next_comm[16];
        int next_pid;
        int next_prio;
    };

    SEC("tracepoint/sched/sched_switch")
    int tp_sched_switch(struct switch_args* args) {
        int key;
        uint32_t val;

        key = bpf_get_smp_processor_id();
        val = args->next_pid;

        bpf_cpu_pid_map_update_elem(&key, &val, BPF_ANY);
        return 0;
    }

    char _license[] SEC("license") = "GPL";
The license section is used by the kernel to verify whether the program is
compatible with the kernel's license when the program makes use of BPF helper
functions provided by the kernel. Set _license to your project's license.
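A tracepoint program receives a pointer to the raw trace event record, so the switch_args struct in myschedtp.c has to mirror the field layout of that record. The sketch below checks the layout at compile time. The offsets are an assumption based on a typical 64-bit kernel, where an 8-byte common header precedes prev_comm; compare them with the sched_switch format file under the tracing events directory on your device before relying on them.

```cpp
#include <cstddef>

// Same layout as the switch_args struct in myschedtp.c.
struct switch_args {
    unsigned long long ignore;  // common trace header, not used by the program
    char prev_comm[16];
    int prev_pid;
    int prev_prio;
    long long prev_state;
    char next_comm[16];
    int next_pid;
    int next_prio;
};

// Assumed offsets for a typical 64-bit kernel. These values are not taken
// from the Android documentation; verify them against the format file.
static_assert(offsetof(switch_args, prev_comm) == 8, "prev_comm offset");
static_assert(offsetof(switch_args, prev_pid) == 24, "prev_pid offset");
static_assert(offsetof(switch_args, prev_prio) == 28, "prev_prio offset");
static_assert(offsetof(switch_args, prev_state) == 32, "prev_state offset");
static_assert(offsetof(switch_args, next_comm) == 40, "next_comm offset");
static_assert(offsetof(switch_args, next_pid) == 56, "next_pid offset");
static_assert(offsetof(switch_args, next_prio) == 60, "next_prio offset");
```

If a field is added, removed, or resized without updating the struct, args->next_pid would read from the wrong offset, so a compile-time check of this kind is a cheap safeguard.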
Format of the Android.bp file
For the Android build system to build an eBPF .c program, you must
make an entry in the Android.bp file of the project.
For example, to build an eBPF C program named bpf_test.c, make the following
entry in your project's Android.bp file:
    bpf {
        name: "bpf_test.o",
        srcs: ["bpf_test.c"],
        cflags: [
            "-Wall",
            "-Werror",
        ],
    }
This compiles the C program, resulting in the object
/system/etc/bpf/bpf_test.o. On boot, the Android system automatically loads
the bpf_test.o program into the kernel.
Files available in sysfs
During boot up, the Android system automatically loads all the eBPF objects from
/system/etc/bpf/, creates the maps that the program needs, and pins the loaded
program with its maps to the BPF file system. These files can then be used for
further interaction with the eBPF program or reading maps. This section
describes the conventions used for naming these files and their locations in
sysfs.
The following files are created and pinned:
- For any programs loaded, assuming PROGNAME is the name of the program and FILENAME is the name of the eBPF C file, the Android loader creates and pins each program at /sys/fs/bpf/prog_FILENAME_PROGTYPE_PROGNAME.
  For example, for the above sched_switch tracepoint example in myschedtp.c, a program file is created and pinned to /sys/fs/bpf/prog_myschedtp_tracepoint_sched_sched_switch.
- For any maps created, assuming MAPNAME is the name of the map and FILENAME is the name of the eBPF C file, the Android loader creates and pins each map to /sys/fs/bpf/map_FILENAME_MAPNAME.
  For example, for the above sched_switch tracepoint example in myschedtp.c, a map file is created and pinned to /sys/fs/bpf/map_myschedtp_cpu_pid_map.
- bpf_obj_get() in the Android BPF library returns a file descriptor from the pinned /sys/fs/bpf file. This file descriptor can be used for further operations, such as reading maps or attaching a program to a tracepoint.
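The naming convention can be expressed as two small string helpers. This is a minimal sketch, not loader code: the function names are hypothetical, and it assumes, based on the myschedtp.c example, that FILENAME is the source file name without the .c extension and that each '/' in the section name becomes '_'.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: builds the pinned program path from the source file
// name (without ".c") and the section name "PROGTYPE/PROGNAME".
std::string pinnedProgPath(const std::string& filename,
                           const std::string& section) {
    std::string suffix = section;
    // Every '/' in the section name becomes '_' in the pinned file name.
    std::replace(suffix.begin(), suffix.end(), '/', '_');
    return "/sys/fs/bpf/prog_" + filename + "_" + suffix;
}

// Hypothetical helper: builds the pinned map path from the source file name
// (without ".c") and the map variable name.
std::string pinnedMapPath(const std::string& filename,
                          const std::string& mapName) {
    return "/sys/fs/bpf/map_" + filename + "_" + mapName;
}
```

For the myschedtp.c example, pinnedProgPath("myschedtp", "tracepoint/sched/sched_switch") and pinnedMapPath("myschedtp", "cpu_pid_map") produce the two paths listed above.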
Android BPF library
The Android BPF library is named libbpf_android.so
and is part of the system
image. This library provides the user with low-level eBPF functionality needed
for creating and reading maps, creating probes, tracepoints, perf buffers, and
so on.
Attaching programs to tracepoints and kprobes
When tracepoint and kprobe programs are loaded (which is done automatically
at boot up as previously described), they need to be activated. To activate them,
first, use bpf_obj_get()
to obtain the program fd
from the pinned file's
location (see the Files available in sysfs
section). Next, call bpf_attach_tracepoint()
in the BPF library, passing it the program fd
and the tracepoint name.
For example, to attach the sched_switch
tracepoint defined in the myschedtp.c
source file in the example above, do the following (error checking isn't shown):
    const char *tp_prog_path = "/sys/fs/bpf/prog_myschedtp_tracepoint_sched_sched_switch";
    const char *tp_map_path = "/sys/fs/bpf/map_myschedtp_cpu_pid_map";

    // Attach tracepoint and wait for 4 seconds
    int mProgFd = bpf_obj_get(tp_prog_path);
    int mMapFd = bpf_obj_get(tp_map_path);
    int ret = bpf_attach_tracepoint(mProgFd, "sched", "sched_switch");
    sleep(4);

    // Read the map to find the last PID that ran on CPU 0
    android::bpf::BpfMap<int, int> myMap(mMapFd);
    printf("last PID running on CPU %d is %d\n", 0, myMap.readValue(0));
Reading from the maps
BPF maps support arbitrarily complex key and value structures or types. The
Android BPF library includes an android::bpf::BpfMap class that makes use of C++
templates to instantiate BpfMap based on the key and value's type for the
map in question. The above code shows an example of using a BpfMap with key
and value as integers. The key and value can also be arbitrary structures.
Thus the templatized BpfMap class makes it easy to define a custom BpfMap
object suitable for the particular map. The map can then be accessed using the
custom-generated functions, which are type aware, resulting in cleaner code.
For more information about BpfMap, refer to the Android sources.
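To show why a templated wrapper gives type-aware access, here is a self-contained toy. It is not android::bpf::BpfMap and does not talk to the kernel: TypedMap, RawStore, UidTag, and Counters are hypothetical names, and an in-memory byte store stands in for the untyped key and value buffers that the kernel map interface works with.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <optional>
#include <type_traits>
#include <vector>

// Stand-in for the untyped interface: keys and values are raw bytes.
using RawStore = std::map<std::vector<uint8_t>, std::vector<uint8_t>>;

// Toy typed wrapper (hypothetical, not the Android class).
template <typename Key, typename Value>
class TypedMap {
    static_assert(std::is_trivially_copyable_v<Key> &&
                      std::is_trivially_copyable_v<Value>,
                  "keys and values are copied as raw bytes");

  public:
    explicit TypedMap(RawStore& store) : mStore(store) {}

    void writeValue(const Key& key, const Value& value) {
        mStore[toBytes(key)] = toBytes(value);
    }

    // Returns the stored value, or std::nullopt if the key is absent.
    std::optional<Value> readValue(const Key& key) const {
        auto it = mStore.find(toBytes(key));
        if (it == mStore.end() || it->second.size() != sizeof(Value)) {
            return std::nullopt;
        }
        Value out;
        std::memcpy(&out, it->second.data(), sizeof(Value));
        return out;
    }

  private:
    template <typename T>
    static std::vector<uint8_t> toBytes(const T& obj) {
        std::vector<uint8_t> bytes(sizeof(T));
        std::memcpy(bytes.data(), &obj, sizeof(T));
        return bytes;
    }

    RawStore& mStore;
};

// Hypothetical structured key and value types without padding bytes.
struct UidTag { uint32_t uid; uint32_t tag; };
struct Counters { uint64_t packets; uint64_t bytes; };
```

A caller would declare TypedMap<UidTag, Counters> stats(store) and then call stats.writeValue(...) and stats.readValue(...) with the struct types directly; the byte copying stays inside the wrapper, which is the convenience the templated class provides.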
Debugging issues
During boot time, several messages related to BPF loading are logged. If the loading process fails for any reason, a detailed log message is provided in logcat. Filtering the logcat logs by "bpf" prints all the messages and any detailed errors during load time, such as eBPF verifier errors.
Users of eBPF in Android
There are two eBPF C programs in Android that you can refer to for examples.
The netd
eBPF C
program
is used by the networking daemon (netd) in Android for various purposes such as
socket filtering and statistics gathering. To see how this program is used,
check the eBPF traffic
monitor sources.
The time_in_state
eBPF C
program
calculates the amount of time an Android app spends at different
CPU frequencies, which is used to calculate power. This program is currently
under development.
Licensing considerations
If you want to contribute an eBPF C program, contribute to the correct
project depending on its license. Contribute a GPL-licensed eBPF C program
to the system/bpfprogs
AOSP project. Contribute an
Apache-licensed program to the system/bpf
AOSP project.