Reading Bug Reports

Bugs are a reality in any type of development—and bug reports are critical to identifying and solving problems. All versions of Android support capturing bug reports with Android Debug Bridge (adb); Android versions 4.2 and higher support a Developer Option for taking bug reports and sharing via email, Drive, etc.

Android bug reports contain dumpsys, dumpstate, and logcat data in text (.txt) format, enabling you to easily search for specific content. The following sections detail bug report components, describe common problems, and give helpful tips and grep commands for finding logs associated with those bugs. Most sections also include examples for grep command and output and/or dumpsys output.

Logcat

The logcat log is a string-based dump of all logcat information. The system part is reserved for the framework and has a longer history than main which contains everything else. Each line starts with timestamp PID TID log-level.

Viewing the event log

This log contains string representations of binary-formatted log messages. It is less noisy than the logcat log but also a little harder to read. When viewing event logs, you can search this section for specific process ID (PID) to see what a process has been doing. The basic format is: timestamp PID TID log-level log-tag tag-values.

Log levels include the following:

  • V: verbose
  • D: debug
  • I: information
  • W: warning
  • E: error

 

For other useful event log tags, refer to /services/core/java/com/android/server/EventLogTags.logtags.

ANRs and deadlocks

Bugreports can help you identify what's causing Application Not Responding (ANR) errors and deadlock events.

Identifying unresponsive apps

When an application does not respond within a certain time, usually due to a blocked or busy main thread, the system kills the process and dumps the stack to /data/anr. To discover the culprit behind an ANR, grep for am_anr in the binary event log.

You can also grep for ANR in in the logcat log, which contains more information about what was using CPU at the time of the ANR.

Finding stack traces

You can often find stack traces that correspond to an ANR. Make sure the timestamp and PID on the VM traces match the ANR you are investigating, then check the main thread of the process. Keep in mind:

  • The main thread tells you only what the thread was doing at the time of the ANR, which may or may not correspond to the true cause of the ANR. (The stack in the bug report may be innocent; something else may have been stuck for a long time—but not quite long enough to ANR—before becoming unstuck.)
  • More than one set of stack traces (VM TRACES JUST NOW and VM TRACES AT LAST ANR) might exist. Make sure you are viewing the correct section.

Finding deadlocks

Deadlocks often first appear as ANRs because threads are getting stuck. If the deadlock hits the system server, the watchdog will eventually kill it, leading to an entry in the log similar to: WATCHDOG KILLING SYSTEM PROCESS. From the user perspective, the device reboots, although technically this is a runtime restart rather than a true reboot.

  • In a runtime restart, the system server dies and is restarted; the user sees the device return to the boot animation.
  • In a reboot, the kernel has crashed; the user sees the device return to the Google boot logo.

To find deadlocks, check the VM traces sections for a pattern of thread A waiting on something held by thread B, which in turn is waiting on something held by thread A.

Activities

An Activity is an application component that provides a screen users interact with to do something such as dial a number, take a photo, send an email, etc. From a bug report perspective, an activity is a single, focused thing a user can do, which makes locating the activity that was in focus during a crash very important. Activities (via ActivityManager) run processes, so locating all process stops and starts for a given activity can also aid troubleshooting.

Viewing focused activities

To view a history of focused activities, search for am_focused_activity.

Viewing process starts

To view a history of process starts, search for Start proc.

Is the device thrashing?

To determine if the device is thrashing, check for an abnormal increase in activity around am_proc_died and am_proc_start in a short amount of time.

Memory

Because Android devices often have constrained physical memory, managing random-access memory (RAM) is critical. Bug reports contain several indicators of low memory as well as a dumpstate that provides a memory snapshot.

Identifying low memory

Low memory can cause the system to thrash as it kills some processes to free memory but continues to start other processes. To view corroborating evidence of low memory, check for concentrations of am_proc_died and am_proc_start entries in the binary event log.

Low memory can also slow task switching and thwart return attempts (because the task the user was trying to return to was killed). If the launcher was killed, it restarts when the user touches the home button and logs show the launcher reload its content.

Viewing historical indicators

The am_low_memory entry in the binary event log indicates the last cached process has died. After this, the system starts killing services.

Viewing thrashing indicators

Other indicators of system thrashing (paging, direct reclaim, etc.) include kswapd, kworker, and mmcqd consuming cycles. (Keep in mind the bugreport being gathered can influence thrashing indicators.)

ANR logs can provide a similar memory snapshot.

Getting a memory snapshot

The memory snapshot is a dumpstate that lists running Java and native processes (for details, refer to Viewing Overall Memory Allocations). Keep in mind the snapshot gives only the state at a specific moment in time; the system might have been in better (or worse) shape before the snapshot.

Broadcasts

Applications generate broadcasts to send events within the current application or to another application. Broadcast receivers subscribe to specific messages (via filters), enabling them to both listen and respond to a broadcast. Bug reports contain information about sent broadcasts and unsent broadcasts, as well as a dumpsys of all receivers listening to a specific broadcast.

Viewing historical broadcasts

Historical broadcasts are those that have already been sent, listed in reverse chronological order.

The summary section is an overview of the last 300 foreground broadcasts and the last 300 background broadcasts.

The detail section contains complete information for the last 50 foreground broadcasts and the last 50 background broadcasts, as well as the receivers for each broadcast. Receivers that have a:

  • BroadcastRecord entry are registered at runtime and are sent only to already running processes.
  • ResolveInfo entry are registered through manifest entries. The ActivityManager starts the process for each ResolveInfo if it is not already running.

Viewing active broadcasts

Active broadcasts are those that have yet to be sent. A large number in the queue means the system can't dispatch the broadcasts fast enough to keep up.

Viewing broadcast listeners

To view a list of receivers listening for a broadcast, check the Receiver Resolver Table in the dumpsys activity broadcasts. The following example displays all receivers listening for USER_PRESENT.

Monitor contention

Monitor contention logging can sometimes indicate actual monitor contention, but most often indicates the system is so loaded that everything has slowed down. You might see long monitor events logged by ART in system or event log.

In the system log:

10-01 18:12:44.343 29761 29914 W art     : Long monitor contention event with owner method=void android.database.sqlite.SQLiteClosable.acquireReference() from SQLiteClosable.java:52 waiters=0 for 3.914s

In the event log:

10-01 18:12:44.364 29761 29914 I dvm_lock_sample: [com.google.android.youtube,0,pool-3-thread-9,3914,ScheduledTaskMaster.java,138,SQLiteClosable.java,52,100]

Background compilation

Compilation can be expensive and load the device.

Compilation might occur in the background when Google Play store updates are downloading. In this case, messages from the Google Play store app (finsky) and installd appear prior to dex2oat messages.

Compilation might also occur in the background when an application is loading a dex file that has not yet been compiled. In this case, you won't see finsky or installd logging.

Narrative

Establishing the narrative of a problem (how it started, what happened, how the system reacted) requires a solid timeline of events. You can use the information in the bug report to sync timelines across multiple logs and determine the exact timestamp of the bug report.

Syncing timelines

A bugreport reflects multiple parallel timelines: system log, event log, kernel log, and multiple specialized timelines for broadcasts, battery stats, etc. Unfortunately, timelines are often reported using different time bases.

The system and event log timestamps are in the same timezone as the user (as are most other timestamps). For example, when user taps the home button, the system log reports:

10-03 17:19:52.939  1963  2071 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.HOME] flg=0x10200000 cmp=com.google.android.googlequicksearchbox/com.google.android.launcher.GEL (has extras)} from uid 1000 on display 0

For the same action, the event log reports:

10-03 17:19:54.279  1963  2071 I am_focused_activity: [0,com.google.android.googlequicksearchbox/com.google.android.launcher.GEL]

Kernel (dmesg) logs use a different time base, tagging log items with seconds since bootloader completes. To register this timescale to other timescales, search for suspend exit and suspend entry messages:

<6>[201640.779997] PM: suspend exit 2015-10-03 19:11:06.646094058 UTC
…
<6>[201644.854315] PM: suspend entry 2015-10-03 19:11:10.720416452 UTC

Because kernel logs might not include time while in suspend, you should piecewise register the log between the suspend entry and exit messages. Additionally, kernel logs use UTC timezone and must be adjusted to the user timezone.

Identifying bugreport time

To determine when a bugreport was taken, first check the system log (Logcat) for the dumpstate: begin:

10-03 17:19:54.322 19398 19398 I dumpstate: begin

Next, check the kernel log (dmesg) timestamps for the Starting service 'bugreport' message:

<5>[207064.285315] init: Starting service 'bugreport'...

Work backwards to correlate the two events, keeping in mind the caveats mentioned in Syncing timelines. While there's a lot happening after the bugreport is initiated, most activity isn't very useful as the act of taking the bugreport loads the system substantially.

Power

The event log contains screen power status, where 0 is screen off, 1 is screen on, and 2 is for keyguard done.

Bug reports also contain statistics about wake locks, a mechanism used by application developers to indicate their application needs to have the device stay on. (For details on wake locks, refer to PowerManager.WakeLock and Keep the CPU on.)

The aggregated wake lock duration statistics track only the time a wake lock is actually responsible for keeping the device awake and do not include time with the screen on. In addition, if multiple wake locks are held simultaneously, the wake lock duration time is distributed across those wake locks.

For more help visualizing power status, use Battery Historian, a Google open source tool to analyze battery consumers using Android bugreport files.

Packages

The DUMP OF SERVICE package contains application versions (and other useful info).

Processes

Bug reports contain a huge amount of data for processes, including start and stop time, runtime length, associated services, oom_adj score, etc. For details on how Android manages processes, refer to Processes and Threads.

Determining process runtime

The procstats section contains complete statistics on how long processes and associated services have been running. For a quick, human-readable summary, search for AGGREGATED OVER to view data from either the last three or 24 hours, then search for Summary: to view the list of processes, how long those processes have run at various priorities, and their RAM usage formatted as min-average-max PSS/min-average-max USS.

Why is a process running?

The dumpsys activity processes section lists all currently running processes ordered by oom_adj score (Android indicates process importance by assigning the process an oom_adj value, which can be dynamically updated by ActivityManager). The output is similar to that of a memory snapshot but includes additional information about what is causing the process to run. In the example below, the bolded entries indicate the gms.persistent process is running at vis (visible) priority because the system process is bound to its NetworkLocationService.

Scans

Use the following steps to identify applications performing excessive Bluetooth Low Energy (BLE) scans:

  • Find log messages for BluetoothLeScanner:
    $ grep 'BluetoothLeScanner' ~/downloads/bugreport.txt
    07-28 15:55:19.090 24840 24851 D BluetoothLeScanner: onClientRegistered() - status=0 clientIf=5
    
  • Locate the PID in the log messages. In this example, "24840" and "24851" are PID (process ID) and TID (thread ID).
  • Locate the application associated with the PID:
    PID #24840: ProcessRecord{4fe996a 24840:com.badapp/u0a105}
    

    In this example, the package name is com.badapp.

  • Look up the package name on Google Play to identify the responsible application: https://play.google.com/store/apps/details?id=com.badapp.

Note: For devices running Android 7.0, the system collects data for BLE scans and associates these activities with the initiating application. For details, see Low Energy (LE) and Bluetooth scans.