Starting March 27, 2025, we recommend using android-latest-release instead of aosp-main to build and contribute to AOSP. For more information, see Changes to AOSP.
About voice interaction
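The Voice Interaction Service API provides an abstraction over different potential voice control apps. Implementations can be developed following the guidelines described in App development. This integration guide describes how to integrate these apps into a specific Android Automotive OS (AAOS) system image.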
Terminology
These terms are used throughout this guide:
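- Assist data. When a voice interaction session starts, the system can capture views and screenshots and pass this information to the session. Apps can expose additional information by implementing Activity#onProvideAssistData() and Activity#onProvideAssistContent().
- Push-to-talk (PTT). Physical voice control button, usually located on the steering wheel.
- RecognitionService (RS). Voice recognition service that apps use through the SpeechRecognizer API. A VIA must include both a VoiceInteractionService and a RecognitionService.
- Tap-to-talk (TTT). Software voice control button, usually included as part of the system UI. In Android, this is also referred to as the Assist Gesture.
- VoiceInteractionService. Lightweight system service implemented by the VIA developer. The selected service is bound by a system service on boot and is always running.
- VoiceInteractionSession (VIS). This class encapsulates the user interaction business logic. It is responsible for presenting the status of the voice interaction to the user, handling VoiceInteractor requests, and receiving assist and screenshot data.
- VoiceInteractionSessionService (VSS). A service, part of a VIA, responsible for handling a voice interaction session. Android's system service binds to it during a voice interaction with a user. All business logic of this session is implemented in the VoiceSession class. This service is only guaranteed to stay alive for the duration of a single user voice session.
- Voice Interaction App (VIA). Android app designed for voice control (referred to as an assistant). These apps are identified by the VoiceInteractionService declared in their manifest. Only one such app can be selected as the default at a time. Only the default app is kept alive (bound by a system service) and receives push-to-talk (PTT) or tap-to-talk (TTT) events.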
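The default-app rule in the last entry can be modeled in a few lines of plain Java. This is only an illustrative sketch and not the platform implementation: DefaultViaRegistry, Trigger, and the package names are hypothetical, and on a real device the selection and binding are handled by Android system services.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of "one default VIA at a time"; not an Android API.
class DefaultViaRegistry {
    enum Trigger { PUSH_TO_TALK, TAP_TO_TALK }

    private final Set<String> installed = new LinkedHashSet<>();
    private String defaultVia; // null until a default is selected

    // Registers an app that declares a VoiceInteractionService in its manifest.
    void install(String packageName) {
        installed.add(packageName);
    }

    // Selecting a new default replaces the previous one.
    void selectDefault(String packageName) {
        if (!installed.contains(packageName)) {
            throw new IllegalArgumentException("Not an installed VIA: " + packageName);
        }
        defaultVia = packageName;
    }

    // PTT and TTT both go to the default VIA; the trigger type does not
    // change the receiver. Empty when no default has been selected.
    Optional<String> receiverOf(Trigger trigger) {
        return Optional.ofNullable(defaultVia);
    }

    // Only the default VIA is kept alive (bound by a system service).
    boolean isKeptAlive(String packageName) {
        return packageName.equals(defaultVia);
    }
}
```

In this model, installing two VIAs and selecting the second as default means that only the second is kept alive and receives either trigger.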
Responsibilities
This table describes the responsibilities of each party.
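| Car manufacturers (OEMs) | AOSP | App developers |
| --- | --- | --- |
| Build a compatible infotainment system with AAOS. | Define and evolve VoiceInteractionService and related APIs. | Implement the VoiceInteractionService API, RecognitionService API, and NotificationListenerService API (see the detailed description in App development). |
| Implement audio input and output, optionally including DSP hotword detection support. | Provide API documentation, sample code, and other support material to VIA developers. | Provide a customizable UI that OEMs can adjust to match each car design system. |
| Grant system-privileged permissions for the voice interaction services. | Provide UX guidance with requirements and recommendations. | |
| Respect VoiceInteractionService requirements regarding access to the app's settings screens. | | |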
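UX requirements
OEMs have the ultimate responsibility for providing a good user experience to customers. OEMs must ensure that all preinstalled voice interaction services fulfill the requirements described in Preloaded Assistants: UX Guidance.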
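Core assistant experience
An automotive Voice Interaction App (VIA) performs the following actions:
- [MUST] Respond to system-handled voice interaction triggers (PTT, TTT).
- [MUST] Display a visual representation of its progress (for example, listening, processing, and fulfilling).
- [MUST] Use voice or sounds to indicate understanding and completion of user requests.
- [MUST] Serve as a speech recognizer for other apps (see the SpeechRecognizer API).
- [SHOULD] Respond to a hotword trigger.
- [MAY] Display a settings activity where users can configure this VIA (for example, permissions, hotword configuration, and sign-in).
- [MAY] Handle assist data (Intent#ACTION_ASSIST).
- [MAY] Support voice interaction from Keyguard (lock screen).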
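One way to read the requirement levels above is as a checklist that an integrator could evaluate for a candidate VIA. The following is a minimal sketch under that assumption. The ViaRequirementCheck class and the capability names are hypothetical labels for the eight items in the list, not part of any Android API or official test suite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical checklist for the core assistant experience; not an Android API.
class ViaRequirementCheck {
    enum Level { MUST, SHOULD, MAY }

    // One entry per item in the list, tagged with its requirement level.
    enum Capability {
        RESPOND_TO_PTT_TTT(Level.MUST),
        SHOW_VISUAL_PROGRESS(Level.MUST),
        AUDIBLE_FEEDBACK(Level.MUST),
        SPEECH_RECOGNIZER_FOR_APPS(Level.MUST),
        HOTWORD_TRIGGER(Level.SHOULD),
        SETTINGS_ACTIVITY(Level.MAY),
        HANDLE_ASSIST_DATA(Level.MAY),
        KEYGUARD_INTERACTION(Level.MAY);

        final Level level;

        Capability(Level level) {
            this.level = level;
        }
    }

    // Returns the capabilities of the given level that the VIA does not support.
    static List<Capability> missing(Set<Capability> supported, Level level) {
        List<Capability> result = new ArrayList<>();
        for (Capability c : Capability.values()) {
            if (c.level == level && !supported.contains(c)) {
                result.add(c);
            }
        }
        return result;
    }

    // A VIA meets the core requirements only if no MUST item is missing.
    static boolean meetsCoreRequirements(Set<Capability> supported) {
        return missing(supported, Level.MUST).isEmpty();
    }
}
```

Calling missing(supported, Level.SHOULD) lists recommended items, such as hotword support, that are absent without failing the check.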
Components
At a high level, a voice interaction app interacts with these actors:

Figure 1. Voice interaction actors
Details:
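- VoiceInteractionManagerService. This system service manages the default VIA and exposes its functionality to the rest of the system.
- RecognitionService. This service exposes speech recognition capabilities to other apps in the system.
- SoundTrigger. Implements hotword management and is available to VIAs through AlwaysOnHotwordDetector.
- MediaRecorder. Provides access to audio input for both hotword detection (when using the CPU) and speech recognition.
- PhoneWindowManager/CarInputService. These services are responsible (among other things) for handling key events and routing PTT to the VIA by means of VoiceInteractionManagerService.
- User. The user interacts with a VIA by means of triggers (PTT, TTT, hotword) or the Voice Plate UI.
- CarService, Notifications, Media, Telephony, ContactsProvider, and so on. Services and apps used by the VoiceInteractionSession to fulfill the user's commands.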
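The PTT path described above (input service, then manager service, then VIA) can be pictured with a small stand-in model. This is a sketch under stated assumptions, not how the platform classes are written: InputService, ManagerService, FakeVia, and the key codes are hypothetical stand-ins for PhoneWindowManager/CarInputService, VoiceInteractionManagerService, and the default VIA.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of PTT routing; none of these classes are Android APIs.
class PttRoutingSketch {
    // Stand-in for the default VIA; records the source of each session it shows.
    static class FakeVia {
        final List<String> sessions = new ArrayList<>();

        void showSession(String source) {
            sessions.add(source);
        }
    }

    // Stand-in for VoiceInteractionManagerService: owns the default VIA.
    static class ManagerService {
        private final FakeVia defaultVia;

        ManagerService(FakeVia defaultVia) {
            this.defaultVia = defaultVia;
        }

        void startSession(String source) {
            if (defaultVia != null) {
                defaultVia.showSession(source);
            }
        }
    }

    // Stand-in for PhoneWindowManager/CarInputService: handles key events.
    static class InputService {
        static final int KEY_VOICE = 1; // hypothetical key codes
        static final int KEY_OTHER = 2;

        private final ManagerService manager;

        InputService(ManagerService manager) {
            this.manager = manager;
        }

        // Returns true when the key was consumed as a PTT trigger.
        boolean onKey(int keyCode) {
            if (keyCode != KEY_VOICE) {
                return false;
            }
            manager.startSession("PUSH_TO_TALK");
            return true;
        }
    }
}
```

The point of the model is the indirection: the input layer never talks to the VIA directly, so the manager service stays the single place that knows which VIA is the default.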
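Automotive-specific concepts
AAOS diverges from Android in the following aspects:
- Besides normal Assistant functionality, AAOS VIAs can control vehicle functions (for example, HVAC, seats, and interior lights). These functions can be integrated using the CarPropertyManager API (see Read a vehicle property), provided that OEMs configure access correctly as described in Privileged permission allowlisting.
- Customization and consistency are more relevant in Automotive than in any other form factor. To read more about implementing these guidelines, see Customization.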
Last updated 2025-07-27 UTC.