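Use the car watchdog to help debug the VHAL. The car watchdog monitors the health of processes and kills unhealthy ones. For a process to be monitored, it must be registered with the car watchdog. When the car watchdog kills an unhealthy process, it writes the status of the process to /data/anr, as with other Application Not Responding (ANR) dumps, which makes debugging easier.
This article describes how vendor HALs and services register a process with the car watchdog.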
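Vendor HAL
Typically, a vendor HAL uses a thread pool for hwbinder. However, the car watchdog client communicates with the car watchdog daemon through binder, which is different from hwbinder. Therefore, the vendor HAL uses a separate thread pool for binder.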
Specify the car watchdog AIDL in the makefile
- Include carwatchdog_aidl_interface-ndk_platform in shared_libs:

Android.bp

    cc_defaults {
        name: "vhal_v2_0_defaults",
        shared_libs: [
            "libbinder_ndk",
            "libhidlbase",
            "liblog",
            "libutils",
            "android.hardware.automotive.vehicle@2.0",
            "carwatchdog_aidl_interface-ndk_platform",
        ],
        cflags: [
            "-Wall",
            "-Wextra",
            "-Werror",
        ],
    }
Add an SELinux policy
- Allow system_server to kill your HAL. If you don't have system_server.te, create one. It is strongly recommended that you add an SELinux policy for each device.
- Allow the vendor HAL to use binder (binder_use macro) and add the vendor HAL to the carwatchdog client domain (carwatchdog_client_domain macro). See the following code for system_server.te and vehicle_default.te:

system_server.te

    # Allow system_server to kill vehicle HAL
    allow system_server hal_vehicle_server:process sigkill;

vehicle_default.te

    # Configuration for registering the VHAL with the car watchdog
    carwatchdog_client_domain(hal_vehicle_default)
    binder_use(hal_vehicle_default)
Implement a client class by inheriting BnCarWatchdogClient
- In checkIfAlive, perform a health check, for example by posting to the thread loop handler. If the process is healthy, call ICarWatchdog::tellClientAlive. See the following code for WatchdogClient.h and WatchdogClient.cpp:

WatchdogClient.h

    class WatchdogClient : public aidl::android::automotive::watchdog::BnCarWatchdogClient {
      public:
        explicit WatchdogClient(const ::android::sp<::android::Looper>& handlerLooper,
                                VehicleHalManager* vhalManager);

        ndk::ScopedAStatus checkIfAlive(
                int32_t sessionId,
                aidl::android::automotive::watchdog::TimeoutLength timeout) override;
        ndk::ScopedAStatus prepareProcessTermination() override;
    };

WatchdogClient.cpp

    ndk::ScopedAStatus WatchdogClient::checkIfAlive(int32_t sessionId, TimeoutLength /*timeout*/) {
        // Implement or call your health check logic here. When the process is
        // healthy, report it for sessionId through ICarWatchdog::tellClientAlive.
        return ndk::ScopedAStatus::ok();
    }
Start the binder thread and register the client
- Create a thread pool for binder communication. If the vendor HAL uses hwbinder for its own purposes, you must create another thread pool for car watchdog binder communication.
- Look up the daemon by name and call ICarWatchdog::registerClient. The car watchdog daemon interface name is android.automotive.watchdog.ICarWatchdog/default.
- Based on how responsive the service is, select one of the following three timeouts supported by the car watchdog and pass it in the call to ICarWatchdog::registerClient:
  - critical (3s)
  - moderate (5s)
  - normal (10s)

See the following code for VehicleService.cpp and WatchdogClient.cpp:

VehicleService.cpp

    int main(int /* argc */, char* /* argv */ []) {
        // Set up thread pool for hwbinder
        configureRpcThreadpool(4, false /* callerWillJoin */);

        // service is assumed to be created before this point; the excerpt omits that setup.
        ALOGI("Registering as service...");
        status_t status = service->registerAsService();
        if (status != OK) {
            ALOGE("Unable to register vehicle service (%d)", status);
            return 1;
        }

        // Set up a binder thread pool to be a car watchdog client.
        ABinderProcess_setThreadPoolMaxThreadCount(1);
        ABinderProcess_startThreadPool();
        sp<Looper> looper(Looper::prepare(0 /* opts */));
        std::shared_ptr<WatchdogClient> watchdogClient =
                ndk::SharedRefBase::make<WatchdogClient>(looper, service.get());
        // The current health check is done in the main thread, so it falls short of capturing the real
        // situation. Checking through HAL binder thread should be considered.
        if (!watchdogClient->initialize()) {
            ALOGE("Failed to initialize car watchdog client");
            return 1;
        }
        ALOGI("Ready");
        while (true) {
            looper->pollAll(-1 /* timeoutMillis */);
        }

        return 1;
    }

WatchdogClient.cpp

    bool WatchdogClient::initialize() {
        // Look up the car watchdog daemon by its interface name.
        ndk::SpAIBinder binder(
                AServiceManager_getService("android.automotive.watchdog.ICarWatchdog/default"));
        if (binder.get() == nullptr) {
            ALOGE("Failed to get carwatchdog daemon");
            return false;
        }
        std::shared_ptr<ICarWatchdog> server = ICarWatchdog::fromBinder(binder);
        if (server == nullptr) {
            ALOGE("Failed to connect to carwatchdog daemon");
            return false;
        }
        mWatchdogServer = server;

        // Wrap this object as an ICarWatchdogClient so it can be registered.
        binder = this->asBinder();
        if (binder.get() == nullptr) {
            ALOGE("Failed to get car watchdog client binder object");
            return false;
        }
        std::shared_ptr<ICarWatchdogClient> client = ICarWatchdogClient::fromBinder(binder);
        if (client == nullptr) {
            ALOGE("Failed to get ICarWatchdogClient from binder");
            return false;
        }
        mTestClient = client;

        // Report success only if the registration call itself succeeded.
        ndk::ScopedAStatus status =
                mWatchdogServer->registerClient(client, TimeoutLength::TIMEOUT_NORMAL);
        if (!status.isOk()) {
            ALOGE("Failed to register the client to car watchdog server");
            return false;
        }
        ALOGI("Successfully registered the client to car watchdog server");
        return true;
    }
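To make the timeout choice concrete, here is a minimal, platform-independent sketch. It assumes one possible heuristic that the source does not prescribe: pick the strictest level whose window still covers the worst-case latency of your health check. TimeoutLevel, timeoutDuration, and pickTimeout are hypothetical names introduced for illustration; only the three levels and their durations (3s, 5s, 10s) come from the list above. Real code passes a TimeoutLength value to ICarWatchdog::registerClient instead.

```cpp
#include <chrono>
#include <initializer_list>

// Hypothetical stand-in for the AIDL TimeoutLength enum. Only the three
// levels and their durations are taken from the documentation.
enum class TimeoutLevel { Critical, Moderate, Normal };

// Window the car watchdog allows for each level: 3s, 5s, and 10s.
constexpr std::chrono::seconds timeoutDuration(TimeoutLevel level) {
    return level == TimeoutLevel::Critical   ? std::chrono::seconds(3)
           : level == TimeoutLevel::Moderate ? std::chrono::seconds(5)
                                             : std::chrono::seconds(10);
}

// Assumed heuristic: choose the strictest level whose window is still longer
// than the worst-case health-check latency. Returns false when even the 10s
// window is too short, in which case `out` is left unchanged.
bool pickTimeout(std::chrono::milliseconds worstCaseLatency, TimeoutLevel& out) {
    for (TimeoutLevel level :
         {TimeoutLevel::Critical, TimeoutLevel::Moderate, TimeoutLevel::Normal}) {
        if (worstCaseLatency < timeoutDuration(level)) {
            out = level;
            return true;
        }
    }
    return false;
}
```

With this sketch, a health check that can take up to 4 seconds maps to the moderate level, while one that can take 10 seconds or more fits none of the three windows and needs a faster check.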
Vendor services (native)
Specify the car watchdog AIDL in the makefile
- Include carwatchdog_aidl_interface-ndk_platform in shared_libs.

Android.bp

    cc_binary {
        name: "sample_native_client",
        srcs: [
            "src/*.cpp"
        ],
        shared_libs: [
            "carwatchdog_aidl_interface-ndk_platform",
            "libbinder_ndk",
        ],
        vendor: true,
    }
Add an SELinux policy
- To add an SELinux policy, allow the vendor service domain to use binder (binder_use macro) and add the vendor service domain to the carwatchdog client domain (carwatchdog_client_domain macro). See the following code for sample_client.te and file_contexts:

sample_client.te

    type sample_client, domain;
    type sample_client_exec, exec_type, file_type, vendor_file_type;

    carwatchdog_client_domain(sample_client)

    init_daemon_domain(sample_client)
    binder_use(sample_client)

file_contexts

    /vendor/bin/sample_native_client  u:object_r:sample_client_exec:s0
Implement a client class by inheriting BnCarWatchdogClient
- In checkIfAlive, perform a health check. One option is to post to the thread loop handler. If the process is healthy, call ICarWatchdog::tellClientAlive. See the following code for SampleNativeClient.h and SampleNativeClient.cpp:

SampleNativeClient.h

    class SampleNativeClient : public BnCarWatchdogClient {
      public:
        ndk::ScopedAStatus checkIfAlive(int32_t sessionId, TimeoutLength timeout) override;
        ndk::ScopedAStatus prepareProcessTermination() override;
        void initialize();

      private:
        void respondToDaemon();

        ::android::sp<::android::Looper> mHandlerLooper;
        std::shared_ptr<ICarWatchdog> mWatchdogServer;
        std::shared_ptr<ICarWatchdogClient> mClient;
        int32_t mSessionId;
    };

SampleNativeClient.cpp

    ndk::ScopedAStatus SampleNativeClient::checkIfAlive(int32_t sessionId, TimeoutLength /*timeout*/) {
        // Drop any pending check so that only the latest session is answered.
        // mMessageHandler and WHAT_CHECK_ALIVE are declared elsewhere in the sample.
        mHandlerLooper->removeMessages(mMessageHandler, WHAT_CHECK_ALIVE);
        mSessionId = sessionId;
        mHandlerLooper->sendMessage(mMessageHandler, Message(WHAT_CHECK_ALIVE));
        return ndk::ScopedAStatus::ok();
    }

    // WHAT_CHECK_ALIVE triggers respondToDaemon from the thread handler
    void SampleNativeClient::respondToDaemon() {
        // Your health checking method here
        ndk::ScopedAStatus status = mWatchdogServer->tellClientAlive(mClient, mSessionId);
        if (!status.isOk()) {
            ALOGW("Failed to call tellClientAlive");
        }
    }
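The remove-then-post pattern in checkIfAlive can be tried without the Android platform. The sketch below is a simplified, single-threaded model, not the sample's implementation: MiniLooper is a hypothetical stand-in for android::Looper, SketchClient is a hypothetical client, and pushing the session ID into a vector stands in for the real ICarWatchdog::tellClientAlive call. It shows that when a second check arrives before the first one is handled, only the latest session is answered.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded message queue standing in for android::Looper.
class MiniLooper {
  public:
    struct Message {
        int what;
        std::function<void()> run;
    };

    // Drops every queued message with the given code, like removeMessages.
    void removeMessages(int what) {
        mQueue.erase(std::remove_if(mQueue.begin(), mQueue.end(),
                                    [what](const Message& m) { return m.what == what; }),
                     mQueue.end());
    }

    void sendMessage(int what, std::function<void()> run) {
        mQueue.push_back(Message{what, std::move(run)});
    }

    // Runs all queued messages in FIFO order.
    void drain() {
        while (!mQueue.empty()) {
            Message m = std::move(mQueue.front());
            mQueue.pop_front();
            m.run();
        }
    }

  private:
    std::deque<Message> mQueue;
};

constexpr int WHAT_CHECK_ALIVE = 1;

// Hypothetical client that records answered sessions instead of calling the daemon.
class SketchClient {
  public:
    SketchClient(MiniLooper& looper, std::function<bool()> isHealthy)
        : mLooper(looper), mIsHealthy(std::move(isHealthy)) {}

    // Mirrors checkIfAlive: replace any pending check with the newest session.
    void checkIfAlive(int32_t sessionId) {
        mLooper.removeMessages(WHAT_CHECK_ALIVE);
        mSessionId = sessionId;
        mLooper.sendMessage(WHAT_CHECK_ALIVE, [this] { respondToDaemon(); });
    }

    const std::vector<int32_t>& reportedSessions() const { return mReported; }

  private:
    // Stands in for tellClientAlive: answer only when the health check passes.
    void respondToDaemon() {
        if (mIsHealthy()) mReported.push_back(mSessionId);
    }

    MiniLooper& mLooper;
    std::function<bool()> mIsHealthy;
    int32_t mSessionId = 0;
    std::vector<int32_t> mReported;
};
```

Calling checkIfAlive(1) and then checkIfAlive(2) before drain() leaves a single queued message, so the client reports session 2 only. A client whose health check returns false reports nothing, which in the real system lets the watchdog's timeout expire.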
Start the binder thread and register the client
The car watchdog daemon interface name is android.automotive.watchdog.ICarWatchdog/default.
- Look up the daemon by name and call ICarWatchdog::registerClient. See the following code for main.cpp and SampleNativeClient.cpp:

main.cpp

    int main(int argc, char** argv) {
        sp<Looper> looper(Looper::prepare(/*opts=*/0));

        ABinderProcess_setThreadPoolMaxThreadCount(1);
        ABinderProcess_startThreadPool();
        std::shared_ptr<SampleNativeClient> client =
                ndk::SharedRefBase::make<SampleNativeClient>(looper);

        // The client is registered in initialize()
        client->initialize();
        ...
    }

SampleNativeClient.cpp

    void SampleNativeClient::initialize() {
        // Look up the car watchdog daemon by its interface name.
        ndk::SpAIBinder binder(AServiceManager_getService(
                "android.automotive.watchdog.ICarWatchdog/default"));
        std::shared_ptr<ICarWatchdog> server = ICarWatchdog::fromBinder(binder);
        mWatchdogServer = server;

        // Reuse the binder variable for this client's own binder object.
        binder = this->asBinder();
        std::shared_ptr<ICarWatchdogClient> client = ICarWatchdogClient::fromBinder(binder);
        mClient = client;
        server->registerClient(client, TimeoutLength::TIMEOUT_NORMAL);
    }
Vendor services (Android)
Implement a client by inheriting CarWatchdogClientCallback
- Edit the new file as follows:

    private final CarWatchdogClientCallback mClientCallback = new CarWatchdogClientCallback() {
        @Override
        public boolean onCheckHealthStatus(int sessionId, int timeout) {
            // Your health check logic here
            // Returning true implies the client is healthy
            // If false is returned, the client should call
            // CarWatchdogManager.tellClientAlive after health check is
            // completed
            return true;
        }

        @Override
        public void onPrepareProcessTermination() {}
    };
Register the client
- Call CarWatchdogManager.registerClient():

    private void startClient() {
        CarWatchdogManager manager =
                (CarWatchdogManager) car.getCarManager(Car.CAR_WATCHDOG_SERVICE);
        // Choose a proper executor according to your health check method
        ExecutorService executor = Executors.newFixedThreadPool(1);
        manager.registerClient(executor, mClientCallback,
                CarWatchdogManager.TIMEOUT_NORMAL);
    }
Unregister the client
- Call CarWatchdogManager.unregisterClient() when the service is finished:

    private void finishClient() {
        CarWatchdogManager manager =
                (CarWatchdogManager) car.getCarManager(Car.CAR_WATCHDOG_SERVICE);
        manager.unregisterClient(mClientCallback);
    }
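Detect processes terminated by the car watchdog
The car watchdog dumps and kills processes registered with it (vendor HAL, vendor native services, vendor Android services) when they are stuck and unresponsive. You can detect such dumps by checking logcat. After a problematic process is dumped or killed, the car watchdog outputs the log carwatchdog killed process_name (pid: process_id). To capture these logs, run:

    $ adb logcat -s CarServiceHelper | fgrep "carwatchdog killed"

For example, if the KitchenSink app (a car watchdog client) gets stuck, a line like the following is written to the log:

    05-01 09:50:19.683 578 5777 W CarServiceHelper: carwatchdog killed com.google.android.car.kitchensink (pid: 5574)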
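If you collect these log lines in a host-side tool, the process name and PID can be pulled out with a regular expression. The following is a minimal sketch that assumes the message always has the form shown in the KitchenSink example, carwatchdog killed process_name (pid: process_id). KilledProcess and parseKillLog are hypothetical names and are not part of any Android API.

```cpp
#include <regex>
#include <string>

// Hypothetical record for one "carwatchdog killed" log entry.
struct KilledProcess {
    std::string name;
    int pid = 0;
};

// Returns true and fills `out` when `line` contains a message of the form
// "carwatchdog killed <name> (pid: <pid>)". The space after "pid:" is optional.
bool parseKillLog(const std::string& line, KilledProcess& out) {
    static const std::regex kPattern(
            R"re(carwatchdog killed (\S+) \(pid:\s*(\d{1,9})\))re");
    std::smatch match;
    if (!std::regex_search(line, match, kPattern)) {
        return false;
    }
    out.name = match[1].str();
    out.pid = std::stoi(match[2].str());  // at most 9 digits, so this fits in int
    return true;
}
```

Applied to the KitchenSink line above, this yields the name com.google.android.car.kitchensink and the PID 5574, which is the value to search for in /data/anr.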
To determine why or where the KitchenSink app got stuck, use the process dump stored in /data/anr, just as you would for an Activity ANR.

    $ adb root
    $ adb shell grep -Hn "pid process_pid" /data/anr/*

The following sample output is specific to the KitchenSink app:

    $ adb shell su root grep -Hn "pid 5574" /data/anr/*
    /data/anr/anr_2020-05-01-09-50-18-290:3:----- pid 5574 at 2020-05-01 09:50:18 -----
    /data/anr/anr_2020-05-01-09-50-18-290:285:----- Waiting Channels: pid 5574 at 2020-05-01 09:50:18 -----

Find the dump file (for example, /data/anr/anr_2020-05-01-09-50-18-290 in the example above) and start your analysis.
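When grep reports many matches, it can help to reduce the output to the distinct dump files. The sketch below assumes the grep -Hn layout shown above, path:line_number:matched_text, and that dump file paths contain no colon, which holds for the anr_... file name in the example. dumpFilesFromGrep is a hypothetical helper name.

```cpp
#include <set>
#include <sstream>
#include <string>

// Collects the distinct file paths from `grep -Hn` output, where each line
// has the form path:line_number:matched_text. Lines without a leading path
// are skipped. Assumes the path itself contains no ':' character.
std::set<std::string> dumpFilesFromGrep(const std::string& grepOutput) {
    std::set<std::string> files;
    std::istringstream lines(grepOutput);
    std::string line;
    while (std::getline(lines, line)) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos || colon == 0) {
            continue;
        }
        files.insert(line.substr(0, colon));
    }
    return files;
}
```

For the two sample output lines above, the result contains the single path /data/anr/anr_2020-05-01-09-50-18-290.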