This platform requires you to deploy the infrastructure to your Google Cloud project.
Get the source code
The source code for the Cloud Telemetry Simulation platform is hosted on
sdv.googlesource.com, which requires authentication as described in
Access tooling repositories.
To access the source code, clone the Cloud Telemetry Simulation repository:
    git clone https://sdv.googlesource.com/external/cloud_telemetry_simulation-external
Prerequisites
To deploy the platform, ensure you meet the following prerequisites:
- A Google Cloud project with Billing enabled.
- Web demo security: If you deploy the Web Demo, you must configure an OAuth 2.0 Client ID in Google Cloud APIs & Services > Credentials to secure the App Engine application and restrict access to authorized Google Accounts.
- Software Defined Vehicle (SDV) build artifacts: You must have your own compiled SDV image artifacts. These are not provided in this repository.
  - `cvd-host_package.tar.gz`
  - `sdv_core_cf-img-<version>.zip`
- Permissions: The user or service account running Terraform must have sufficient permissions to create the resources defined in the configuration (for example, Project Editor, or a custom role with permissions for Compute Engine, Cloud Functions, Identity and Access Management, Cloud Storage, and other necessary services).
- Tools:
  - Google Cloud CLI (gcloud CLI)
  - Terraform (use the version specified in the repository)
  - Docker
  - Go (use the version specified in the repository for the orchestrator functions)
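Before you start, it can help to confirm that the tools listed above are installed. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper, and the binary names (`gcloud`, `terraform`, `docker`, `go`) are assumed to be the default command names. It does not verify versions.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if at least one tool is absent.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (binary names are assumptions):
# check_tools gcloud terraform docker go && echo "all tools found"
```

The check only looks up names on `PATH`. Compare the installed Terraform and Go versions against the ones the repository expects separately.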
Deploy the Google Cloud infrastructure
Deploying the simulation platform involves two main steps: using Terraform to deploy the core infrastructure to Google Cloud, and building and pushing the simulation agent Docker image to Artifact Registry. This section guides you through deploying the infrastructure.
In the code snippets on this page, replace the uppercase placeholders (ENVIRONMENT, TF_BUCKET_NAME, PROJECT_ID, REGION, and ZONE) with your own values.
1. Configure the Terraform backend: Create a file named `environments/ENVIRONMENT/backend.hcl` to specify where Terraform stores its state file in Cloud Storage.

       # environments/ENVIRONMENT/backend.hcl
       bucket = "TF_BUCKET_NAME"
       prefix = "sdv-telemetry-simulation"

2. Configure project variables: Create a file named `environments/ENVIRONMENT/variables.tfvars` with your project's details.

       # environments/ENVIRONMENT/variables.tfvars
       project_id         = "PROJECT_ID"
       default_region     = "REGION"
       default_zone       = "ZONE"
       agent_docker_image = "REGION-docker.pkg.dev/PROJECT_ID/sim-agents/simulation-agent"

       # Security: Map logical tags to SHA256 digests (optional)
       image_fingerprints = {
         "latest" = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
         "stable" = "sha256:88d4266fd4e6338d13b845fcf289579d209c897823b9217da3e161936f031589"
       }

       # Parallel Execution Limit (Default: 5)
       max_concurrent_simulations = 5

3. Apply the Terraform configuration: Navigate to the `infrastructure` directory, then initialize and apply the configuration:

       # Initialize Terraform with your backend configuration
       terraform init -backend-config=environments/ENVIRONMENT/backend.hcl

       # (Optional) Preview the changes
       terraform plan --var-file=environments/ENVIRONMENT/variables.tfvars

       # Apply the changes to deploy the infrastructure
       terraform apply --var-file=environments/ENVIRONMENT/variables.tfvars
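If you maintain several environments, you could generate the two configuration files with a small script instead of writing them by hand. The sketch below is an illustration only: `write_env_config` is a hypothetical helper that is not in the repository, and it leaves out the optional `image_fingerprints` map.

```shell
# Write backend.hcl and variables.tfvars for one environment.
# Arguments: ENV_DIR TF_BUCKET_NAME PROJECT_ID REGION ZONE
write_env_config() {
  env_dir=$1 bucket=$2 project=$3 region=$4 zone=$5
  mkdir -p "$env_dir"

  # Backend settings: where Terraform keeps its state in Cloud Storage.
  cat > "$env_dir/backend.hcl" <<EOF
bucket = "$bucket"
prefix = "sdv-telemetry-simulation"
EOF

  # Project variables; the image path follows the pattern used on this page.
  cat > "$env_dir/variables.tfvars" <<EOF
project_id = "$project"
default_region = "$region"
default_zone = "$zone"
agent_docker_image = "${region}-docker.pkg.dev/${project}/sim-agents/simulation-agent"
max_concurrent_simulations = 5
EOF
}

# Example usage (all values are placeholders):
# write_env_config environments/dev my-tf-state my-project us-central1 us-central1-a
```

Run the function from the `infrastructure` directory so that the generated paths match the ones passed to `terraform init` and `terraform apply`.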
Build and push the simulation agent image
The simulation agent runs the simulation on the Compute Engine virtual machine (VM). You build it with your SDV artifacts and push it to Artifact Registry.
To build and push the simulation agent image:
1. Place artifacts: Copy your `cvd-host_package.tar.gz` and `sdv_core_cf-img-<version>.zip` files into the `simulation-agent/sdv-image-resources/` directory.

2. Build and push: Navigate to the `simulation-agent` directory, then build and push the image. Replace the image path with the one you configured in your `variables.tfvars` file.

       # Example using the path from the .tfvars example above
       export AGENT_IMAGE="REGION-docker.pkg.dev/PROJECT_ID/sim-agents/simulation-agent:latest"

       # Build the image
       docker build -t "$AGENT_IMAGE" .

       # Push the image to Artifact Registry
       docker push "$AGENT_IMAGE"

3. Update fingerprints: After pushing a new image, you might need to get its SHA256 digest and update the `image_fingerprints` map in your `variables.tfvars` file, then rerun `terraform apply`.

       # Get the digest using gcloud
       gcloud container images describe "$AGENT_IMAGE" --format="value(image_summary.digest)"

Your Cloud Telemetry Simulation platform is deployed and ready to accept simulation requests.
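Two small checks can make the build-and-push steps less error-prone. The sketch below uses hypothetical helper names: `registry_host` extracts the registry host from the image path so you can pass it to `gcloud auth configure-docker`, and `is_sha256_digest` confirms that a value has the `sha256:` plus 64 hex characters shape before you paste it into `image_fingerprints`. Whether your environment still needs Docker authentication configured is an assumption. Skip that step if `docker push` already succeeds.

```shell
# Extract the registry host (for example REGION-docker.pkg.dev)
# from a full image path by dropping everything after the first "/".
registry_host() {
  echo "${1%%/*}"
}

# Return 0 if the argument looks like a SHA256 image digest:
# the literal prefix "sha256:" followed by exactly 64 lowercase hex characters.
is_sha256_digest() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Example usage (requires gcloud and a configured AGENT_IMAGE):
# gcloud auth configure-docker "$(registry_host "$AGENT_IMAGE")"
# digest="$(gcloud container images describe "$AGENT_IMAGE" --format='value(image_summary.digest)')"
# is_sha256_digest "$digest" || echo "unexpected digest format: $digest"
```

The digest check validates only the format. It does not prove that the digest belongs to the image you pushed.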
Operations and troubleshooting
This solution relies on Google Cloud's built-in tools for observability. It consumes compute resources only while it handles a request or runs a simulation.
Cost management
The architecture is designed to be cost-effective by using serverless and ephemeral resources. Costs are primarily driven by:
- Compute Engine: Billed for the time the simulation VMs are running. Using Spot VMs can significantly reduce this cost.
- Cloud Functions: Billed per invocation.
- Cloud Storage: Billed for storing input and output files and logs.
- Firestore: Billed for reads, writes, and data storage.
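Because Compute Engine time is usually the largest item, a rough estimate of VM-hours is a useful first step when budgeting. The sketch below is simple arithmetic with hypothetical helper names. It assumes every simulation runs for the same number of minutes and that the `max_concurrent_simulations` limit is the only constraint on parallelism. It contains no prices. Multiply the VM-hours by the current rate for your machine type.

```shell
# Total VM-hours for a batch: simulations * minutes per simulation / 60.
vm_hours() {
  awk -v n="$1" -v m="$2" 'BEGIN { printf "%.2f\n", n * m / 60 }'
}

# Approximate wall-clock minutes for a batch when at most `limit`
# simulations run in parallel: ceil(simulations / limit) * minutes.
batch_minutes() {
  awk -v n="$1" -v m="$2" -v k="$3" \
    'BEGIN { waves = int((n + k - 1) / k); print waves * m }'
}

# Example: 100 simulations of 15 minutes each, limit of 5 in parallel.
# vm_hours 100 15        prints 25.00
# batch_minutes 100 15 5 prints 300
```

Raising the parallel limit shortens the wall-clock time but leaves the total VM-hours unchanged under these assumptions.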
Observability
All components are integrated with Google Cloud's operations suite.
- Logs Explorer: This is your primary tool for troubleshooting. You can filter logs by resource:
  - Cloud Functions: Check logs for the `receive-request` or `schedule-simulation` functions to debug orchestration issues.
  - Compute Engine: Check VM instance logs for startup or shutdown problems.
  - Simulation agent: The agent running inside the Docker container forwards its logs to Logs Explorer. Filter by the VM instance name to see detailed simulation progress.
- Cloud Storage: For completed simulations, the `logcat` and `bugreport` files from the Cuttlefish device are uploaded to the simulation's output directory in the Cloud Storage bucket, providing deep insight into the Android environment's behavior.
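The same filters work from the command line with `gcloud logging read`. The sketch below builds two filter strings with hypothetical helper names. It rests on two assumptions you should verify in your project: that the functions are 2nd gen and therefore log under the `cloud_run_revision` resource type, and that VM log entries carry the `compute.googleapis.com/resource_name` label with the instance name.

```shell
# Filter for one orchestrator function.
# Assumption: 2nd gen functions, which log as cloud_run_revision resources.
function_log_filter() {
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="%s"\n' "$1"
}

# Filter for one simulation VM by instance name.
# Assumption: entries carry the compute.googleapis.com/resource_name label.
vm_log_filter() {
  printf 'resource.type="gce_instance" AND labels."compute.googleapis.com/resource_name"="%s"\n' "$1"
}

# Example usage (requires gcloud; VM_NAME is a placeholder):
# gcloud logging read "$(function_log_filter receive-request)" --limit=20 --freshness=1h
# gcloud logging read "$(vm_log_filter VM_NAME)" --limit=50 --freshness=1h
```

If a query returns nothing, open one log entry in Logs Explorer and check its resource type and labels, then adjust the filter to match.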
Service accounts
Terraform creates several service accounts to enable a secure, least-privilege environment. Key service accounts include:
Execution identity (VM):
- `simulation-agent`:
  - Attached to: The Compute Engine VMs running the simulation.
  - Role: Allows the VM to upload results and signal completion.
  - Permissions:
    - `roles/storage.objectUser`: Reads inputs and uploads artifacts (logs, reports) to Cloud Storage.
    - `roles/run.invoker`: Authenticates and invokes the `finish-simulation` function.
Orchestration identities (functions):
- `read-simulations-function`:
  - Attached to: The `read-simulation` Cloud Function.
  - Permissions:
    - `roles/datastore.user`: Reads simulation and running-vm records in Firestore.
- `receive-request-function`:
  - Attached to: The `receive-request` Cloud Function.
  - Permissions:
    - `roles/datastore.user`: Creates new `PENDING` simulation records in Firestore.
    - `roles/storage.objectUser`: Verifies the existence of input files in Cloud Storage.
- `scheduler-function`:
  - Attached to: The `schedule-simulation` Cloud Function.
  - Permissions:
    - `roles/pubsub.subscriber`: Pulls messages from the simulation queue.
    - `roles/datastore.user`: Performs atomic reads and writes to the `running-vms` counter.
    - `roles/compute.instanceAdmin.v1`: Creates and starts Compute Engine VMs.
    - `roles/iam.serviceAccountUser`: This permission allows this function to assign the `simulation-agent` service account to the VMs it creates.
- `simulation-finisher-function`:
  - Attached to: The `finish-simulation` Cloud Function.
  - Permissions:
    - `roles/compute.instanceAdmin.v1`: Deletes the VM after execution completes.
    - `roles/datastore.user`: Updates the simulation status to `COMPLETED` or `FAILED`.
- `delete-simulation-function`:
  - Attached to: The `delete-simulation` Cloud Function.
  - Permissions:
    - `roles/compute.instanceAdmin.v1`: Force-deletes virtual machines during cancellation.
    - `roles/datastore.user`: Updates the status for canceled jobs.
Trigger identities:
- `scheduler-trigger`:
  - Used by: Eventarc (events) and Cloud Scheduler triggers.
  - Permissions: `roles/eventarc.eventReceiver` and `roles/run.invoker` to trigger the orchestrator functions.
- `cleanup-scheduler`:
  - Used by: The Cloud Scheduler cron job for cleanup.
  - Permissions: `roles/run.invoker` to trigger the cleanup logic.
Managing Identity and Access Management policies for these service accounts is the primary way to control access and permissions within the system.
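To audit these bindings, you can compare the roles listed on this page with what a project actually grants. The sketch below uses hypothetical helper names and covers only two of the service accounts as an illustration. It assumes that the account IDs match the names shown above and that the roles are bound at the project level. Roles granted on an individual bucket or function do not appear in the project policy, so a reported gap is a prompt to investigate, not proof of a misconfiguration.

```shell
# Build a service account email from its account ID and project ID.
sa_email() {
  echo "$1@$2.iam.gserviceaccount.com"
}

# Print the roles this page lists for a service account, one per line.
# Only two accounts are included here; extend the table as needed.
expected_roles() {
  case "$1" in
    simulation-agent)
      printf '%s\n' roles/storage.objectUser roles/run.invoker ;;
    scheduler-function)
      printf '%s\n' roles/pubsub.subscriber roles/datastore.user \
        roles/compute.instanceAdmin.v1 roles/iam.serviceAccountUser ;;
    *) return 1 ;;
  esac
}

# Print every expected role (arg 1, newline-separated) that is
# absent from the actual roles (arg 2, newline-separated).
missing_roles() {
  printf '%s\n' "$1" | while IFS= read -r role; do
    printf '%s\n' "$2" | grep -Fxq "$role" || echo "$role"
  done
}

# Example usage (requires gcloud; PROJECT_ID is a placeholder):
# actual="$(gcloud projects get-iam-policy PROJECT_ID \
#   --flatten='bindings[].members' \
#   --filter="bindings.members:serviceAccount:$(sa_email simulation-agent PROJECT_ID)" \
#   --format='value(bindings.role)')"
# missing_roles "$(expected_roles simulation-agent)" "$actual"
```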