
Real-time Data Mesh

Prerequisites

Confluent

  • Confluent Cloud Account
  • Terraform (in order to build everything in Confluent Cloud)
  • Docker (in order to run the Kafka Producers)

Getting started

Since you'll need some secrets throughout this walkthrough, the first thing to do is create a file to hold them. This repo ignores the file env.sh, so using that name is a safe bet. Start by cloning the repo, then create the file with the following command.

git clone https://github.com/confluentinc/realtime-data-mesh && cd realtime-data-mesh
echo "# Confluent Cloud\nexport CONFLUENT_CLOUD_API_KEY="key"\nexport CONFLUENT_CLOUD_API_SECRET="secret"\n" > env.sh

With the secrets file created, go to Confluent Cloud and create a Cloud API key (the Confluent Cloud documentation has a guide for this), then paste the key and secret values into the secrets file.
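If you have the Confluent CLI installed, you can also create the Cloud API key from the terminal instead of the web console; this is just an optional alternative, and the key and secret it prints still need to be copied into env.sh.

confluent login
confluent api-key create --resource cloud

Either way, once env.sh contains the key and secret, load it into your shell.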

source env.sh
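Optionally, confirm both variables are visible in your shell before moving on.

printenv | grep CONFLUENT_CLOUD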

With the secrets available in your shell, switch to the Terraform directory, where you'll build the necessary Confluent Cloud resources. Follow these next few steps to create everything, and wait until it's done before moving on.

cd terraform && terraform init
terraform plan

When prompted during the apply, approve the plan by entering "yes", or pass the -auto-approve flag to the apply command to skip the prompt.

terraform apply 

Wait for Terraform to finish creating all the resources, then navigate back to the base directory.
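(Optional) Before changing directories, you can list everything Terraform created as a quick sanity check.

terraform state list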

cd ..

Station and Customer data products

Create the ksqlDB topologies

The stations-enriched and customers-enriched data products have topics, but nothing is producing data into them yet. To generate that data, you'll create the ksqlDB statements that process the raw data and output the enriched data. To create the statements, copy and paste the contents of the provided SQL files into the ksqlDB editor in Confluent Cloud.
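Each SQL file just needs to end up on your clipboard. On macOS, for example, you can pipe a file straight to the clipboard (on Linux, xclip or wl-copy serve the same purpose; adjust the path if the file lives in a subdirectory of the repo).

cat customers-enrichment.sql | pbcopy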

  1. Starting with the Customers data, open the file customers-enrichment.sql and copy its entire contents.
  2. Navigate to the ksqlDB editor in the Confluent Cloud console and paste the contents of the file into it.
  3. There is no data in the topics yet, so the latest/earliest offset setting doesn't matter much here. Leave auto.offset.reset as "latest" and execute the statements.

After a few seconds, everything should be created. Now repeat the process for the Stations data.

  1. Open the file stations-enrichment.sql and copy the entire contents.
  2. Navigate back to the ksqlDB editor in Confluent Cloud and paste the contents.
  3. Like before, leave auto.offset.reset set to "latest" and execute the statements.

Again, give it a moment to execute all the statements before moving on.

Produce some data

With the topologies set up to create your first two data products, it's now time to produce the data. Use the following steps to build and launch the services with Docker.

  1. Before launching the services defined in docker-compose.yaml, build them.

    docker compose build

    This might take a few minutes depending on your internet speed and computer.

  2. With the images built, you can now launch the services.

    docker compose up -d
  3. (Optional) Data is now being produced to Confluent Cloud in the background. To monitor the services for any issues, tail their logs.

    docker compose logs -f

    You can press Ctrl+C at any time to stop tailing the logs; a quick status check follows below.
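As that status check, you can list the services and confirm they're all up; the service names come from docker-compose.yaml.

docker compose ps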

With the data being produced, you can open the Confluent Cloud console and start looking at it. With these two data products available, you have now completed the "setup" part of this example. The idea of this exercise is to imagine creating a new data product from existing ones, and these are the existing data products.

Inventory Redistribution data product

Now, imagine that everything you set up in the previous steps already existed and that the following steps are your starting point. You will use the existing data products to create a new one based on a request from the business.

Create the ksqlDB topology

To create the new data product, you'll once again use ksqlDB. Thankfully, doing so is fairly simple, as the statements are already prepared for you. As you did previously, follow these steps to create the ksqlDB statements.

  1. Open the final SQL file inventory-redistribution.sql and copy its contents.
  2. Navigate to the ksqlDB editor in Confluent Cloud and paste the contents.
  3. This time, set auto.offset.reset to "earliest" and execute the statements.

Wait for everything to execute before moving on.

Adding Business Metadata and Tags

Creating the ksqlDB topology and having data that meets the business's expectations doesn't mean you have finished creating your data product. As defined by the business's Central Governance team, a data product needs Domain details, Team details, and Data Product details. These three pieces of Business Metadata capture information about the ownership, responsibility, contact information, and more for the data product.

  1. Find your new topic in the topics menu. It should be named inventory.redistribution.
  2. On the right-hand side of the UI, you should see Business Metadata and Tags. Click "Add business metadata", select "Team details", and fill in the values with the following.
    Metadata      Value
    team          Bicycle Inventory Team
    channel       #bike-inv-oncall
    slack_alias   @bike-inv-oncall
  3. Add more Business Metadata, this time selecting "Domain details" and adding the following information.
    Metadata            Value
    name                Inventory
    executive_owner     Will LaForest
    executive_contact   [email protected]
  4. Add the final Business Metadata, choosing "Data Product details" and adding the following information.
    Metadata           Value
    name               inventory-redistribution
    primary_owner      David the Data Engineer
    primary_contact    [email protected]
    responsible_team   Bicycle Inventory Team
    domain             Inventory
  5. As the final step, click "Add tags to this topic" in the "Tags" section above the Business Metadata, and select "DataProduct".

With all your Business Metadata and Tags added to your new topic, you've successfully created your new data product.

(Optional) Explore the new data product

With your new data product created, you can explore what was accomplished. The two ways you'll do this are with Stream Lineage and the message viewer; a CLI alternative is sketched after the steps below.

  1. Navigate back to the topic inventory.redistribution if you're not already there.
  2. Select the "Messages" tab and select the -1 offset for any partition of your choice by typing -1 in the "Jump to offset" search bar.
  3. Expand a message, and you should see the new data product, which contains a match between two bike stations where the surplus inventory of one is greater than or equal to the shortage inventory of the other.
  4. Select "Explore Stream Lineage" in the top right hand corner of the screen when you're done viewing messages.
  5. In this new view, you can see the consumption of the stations.enriched topic, the creation of new topics for high and low inventory, and the joining of them back together to perform the matching.
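If you'd rather inspect the data from a terminal, the Confluent CLI can consume the topic directly. This is a minimal sketch: it assumes you select the environment and cluster that Terraform created (the placeholder IDs below come from the list commands), you may be prompted to create or choose a Kafka API key the first time, and if the values are serialized as Avro you may also need --value-format avro plus Schema Registry credentials.

confluent login
confluent environment list
confluent environment use <env-id>
confluent kafka cluster list
confluent kafka cluster use <cluster-id>
confluent kafka topic consume inventory.redistribution --from-beginning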

Cleanup

Since the resources you created have billing implications, it's a good idea to tear things down when you're done. The main things to tear down are the Docker services and everything Terraform built.

  1. In the root directory of the repo, stop the containers with the following command.

    docker compose down
  2. In the Confluent Cloud console, navigate to the inventory.redistribution topic and remove the "DataProduct" tag.
  3. Once the services have been stopped, navigate to the Terraform directory.

    cd terraform/
  4. Destroy everything with the following command.

    terraform destroy -auto-approve
    • If you run into any issues, you might need to import the Business Metadata you created by hand into Terraform's state so that Terraform can destroy it.
    • If for some reason the above doesn't work, you can always delete the environment Terraform created, named realtime-data-mesh, directly in the Confluent Cloud console and then reset the Terraform configuration in your local directory (for example, by removing the local state files).
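However the teardown goes, you can double-check that nothing billable was left behind by confirming Terraform's state is empty and that the realtime-data-mesh environment no longer appears in Confluent Cloud (assuming you're still logged in with the Confluent CLI).

terraform state list
confluent environment list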
