This repository provides Terraform configurations to automate the deployment of CANedge data processing infrastructure on Google Cloud Platform. The deployment is split into three parts:
- Input Bucket Deployment: Creates an input bucket for storing uploaded CANedge log files
- MF4-to-Parquet Deployment: Creates an output bucket and Cloud Function for DBC decoding uploaded MF4 files to Parquet
- BigQuery Deployment: Creates BigQuery resources for querying Parquet data
- Log in to the Google Cloud Console
- Select your project from the dropdown (top left)
- Click on the Cloud Shell icon (>_) to open Cloud Shell (top right)
- Once Cloud Shell is open, run the following command to clone this repository:
```bash
cd ~ && rm -rf canedge-google-cloud-terraform && git clone https://github.com/CSS-Electronics/canedge-google-cloud-terraform.git && cd canedge-google-cloud-terraform
```
If you're just getting started, first deploy the input bucket where your CANedge devices will upload MF4 files:
```bash
chmod +x deploy_input_bucket.sh && ./deploy_input_bucket.sh --project YOUR_PROJECT_ID --region YOUR_REGION --bucket YOUR_BUCKET_NAME
```

Replace:

- `YOUR_PROJECT_ID` with your active Google Cloud project ID (e.g. `bigquery7-464008`)
- `YOUR_REGION` with your desired region (e.g. `europe-west1` - see this link for available regions)
- `YOUR_BUCKET_NAME` with your desired bucket name (e.g. `canedge-test-bucket-20`)
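For example, with the placeholder values shown above (a sketch only; substitute your own project ID, region, and bucket name):

```bash
# Example invocation using the sample values from above
chmod +x deploy_input_bucket.sh && ./deploy_input_bucket.sh \
  --project bigquery7-464008 \
  --region europe-west1 \
  --bucket canedge-test-bucket-20
```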
Once you have an input bucket set up, you can optionally deploy the processing pipeline to automatically DBC decode uploaded MF4 files to Parquet format and provide backlog/aggregation processing capabilities:
chmod +x deploy_mdftoparquet.sh && ./deploy_mdftoparquet.sh --project YOUR_PROJECT_ID --bucket YOUR_INPUT_BUCKET_NAME --id YOUR_UNIQUE_ID --email YOUR_EMAILReplace:
YOUR_PROJECT_IDwith your Google Cloud project IDYOUR_INPUT_BUCKET_NAMEwith your input bucket nameYOUR_UNIQUE_IDwith a short unique identifier (e.g.datalake1)YOUR_EMAILwith your email address to receive notifications
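For example (a sketch; the email address is a placeholder):

```bash
# Example invocation; replace the email with your own
chmod +x deploy_mdftoparquet.sh && ./deploy_mdftoparquet.sh \
  --project bigquery7-464008 \
  --bucket canedge-test-bucket-20 \
  --id datalake1 \
  --email user@example.com
```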
Optional parameters:
- `--zip YOUR_FUNCTION_ZIP`: Override the default main function ZIP file
- `--zip-backlog YOUR_BACKLOG_FUNCTION_ZIP`: Override the default backlog function ZIP
- `--zip-aggregation YOUR_AGGREGATION_FUNCTION_ZIP`: Override the default aggregation function ZIP
- Download the ZIP files from the CANedge Intro (Process/MF4 decoders/Parquet data lake/Google)
> [!NOTE]
> Make sure to upload all the ZIP files to your input bucket root before deployment.
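One way to upload the ZIP files from Cloud Shell is with `gsutil` (a sketch, assuming the downloaded ZIP files are in your current directory):

```bash
# Copy the downloaded function ZIP files to the input bucket root
gsutil cp *.zip gs://YOUR_INPUT_BUCKET_NAME/
```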
> [!IMPORTANT]
> If the deployment fails with a message regarding Eventarc propagation delay, simply re-run the deployment after a few minutes to complete it.
After setting up the MF4-to-Parquet pipeline, you can deploy BigQuery to query your Parquet data lake:
```bash
chmod +x deploy_bigquery.sh && ./deploy_bigquery.sh --project YOUR_PROJECT_ID --bucket YOUR_INPUT_BUCKET_NAME --id YOUR_UNIQUE_ID --dataset YOUR_DATASET_ID
```

Replace:

- `YOUR_PROJECT_ID` with your Google Cloud project ID
- `YOUR_INPUT_BUCKET_NAME` with your input bucket name
- `YOUR_UNIQUE_ID` with a short unique identifier (e.g. `datalake1`)
- `YOUR_DATASET_ID` with your BigQuery dataset ID (e.g. `canedge_data`)
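For example, reusing the sample values from the earlier steps (a sketch; substitute your own values):

```bash
# Example invocation using the sample values from above
chmod +x deploy_bigquery.sh && ./deploy_bigquery.sh \
  --project bigquery7-464008 \
  --bucket canedge-test-bucket-20 \
  --id datalake1 \
  --dataset canedge_data
```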
Optional parameters:
- `--zip YOUR_FUNCTION_ZIP`: Override the default BigQuery function ZIP file
- Download the ZIP from the CANedge Intro (Process/MF4 decoders/Parquet data lake/Google)
> [!NOTE]
> Make sure to upload the ZIP to your input bucket root before deployment.
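After deployment, you can verify the setup by querying a mapped table from Cloud Shell with the `bq` CLI. A minimal sketch, assuming the sample dataset ID `canedge_data` from above and a hypothetical table name `tbl_example` (actual table names depend on your Parquet data lake):

```bash
# Preview a few rows from a mapped table (table name is a placeholder)
bq query --nouse_legacy_sql \
  'SELECT * FROM `bigquery7-464008.canedge_data.tbl_example` LIMIT 10'
```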
If you encounter issues with either deployment:
- When deploying the MF4-to-Parquet pipeline for the first time in a Google project, the deployment may fail due to propagation delay on Eventarc permissions - in this case, simply re-run the deployment after a few minutes
- Make sure you have proper permissions in your Google Cloud project
- Use unique identifiers with the `--id` parameter to avoid resource conflicts
- Check the Google Cloud Console logs for detailed error messages (see the `gcloud` sketch after this list)
- For the MF4-to-Parquet and BigQuery deployments, ensure the relevant function ZIP files are uploaded to your input bucket before deployment
- Contact us if you need deployment support
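As one way to inspect logs, you can read recent Cloud Function logs directly from Cloud Shell with `gcloud` (a sketch; the function name and region below are placeholders, as the deployed names depend on your `--id` value):

```bash
# Read the latest log entries for a deployed function (name/region are placeholders)
gcloud functions logs read mdf-to-parquet-datalake1 --region=europe-west1 --limit=50
```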
- `input_bucket/` - Terraform configuration for input bucket deployment
- `mdftoparquet/` - Terraform configuration for MF4-to-Parquet pipeline deployment
  - `modules/` - Terraform modules specific to the MF4-to-Parquet pipeline
    - `output_bucket/` - Module for creating the output bucket
    - `iam/` - Module for setting up IAM permissions
    - `cloud_function/` - Module for deploying the main Cloud Function
    - `cloud_function_backlog/` - Module for deploying the Backlog Cloud Function
    - `cloud_function_aggregation/` - Module for deploying the Aggregation Cloud Function
    - `cloud_scheduler_backlog/` - Module for the Backlog Cloud Scheduler (paused, manual trigger)
    - `cloud_scheduler_aggregation/` - Module for the Aggregation Cloud Scheduler
    - `monitoring/` - Module for setting up monitoring configurations
- `bigquery/` - Terraform configuration for BigQuery deployment
  - `modules/` - Terraform modules specific to the BigQuery deployment
    - `dataset/` - Module for creating the BigQuery dataset
    - `service_accounts/` - Module for setting up service accounts
    - `cloud_function/` - Module for deploying the BigQuery mapping function
    - `cloud_scheduler_map_tables/` - Module for the BigQuery Map Tables Cloud Scheduler (paused, manual trigger)
- `bigquery-function/` - Source code for BigQuery table mapping function
- `deploy_input_bucket.sh` - Script for input bucket deployment
- `deploy_mdftoparquet.sh` - Script for MF4-to-Parquet pipeline deployment
- `deploy_bigquery.sh` - Script for BigQuery deployment
