The application is a demo webshop in which 10 services communicate both synchronously and asynchronously.
The aim of the application is to implement various error handling methods, try them out, and measure their effectiveness under load tests.
This project is part of a Bachelor's thesis in Computer Science.
GET /articles/${category}
Parameter | Type | Description |
---|---|---|
category | string | Optional. Filter the articles for a certain category. |
GET /exchange/${currency}
Parameter | Type | Description |
---|---|---|
currency | string | Required. Currently supported: USD,GBP,INR,CAS,JPY,SEK,PLN |
Mocks an exchange from ${currency} to €
POST /cart
JSON-Body | Type | Description |
---|---|---|
article_id | string | Required. Will create a cart with this article_id already in it |
Returns the new cart's ID.
PUT /cart/${id}
Parameter | Type | Description |
---|---|---|
id | string | Required. The ID of your cart. |

JSON-Body | Type | Description |
---|---|---|
article_id | string | Required. The ID of the article you want to add to the cart. |
GET /cart/${id}
Parameter | Type | Description |
---|---|---|
id | string | Required. The ID of your created cart. |
POST /order
JSON-Body | Type | Description |
---|---|---|
cartId | string | Required |
name | string | Required |
address | string | Required |
creditCard | string | Required |
email | string | Required |
Creates an order that will be validated and shipped in the background.
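As a rough illustration of the POST /order body, here is a minimal Go sketch that builds the JSON payload. Only the JSON keys are taken from the table above; the type name `OrderRequest`, the helper `encodeOrder`, and all sample values are hypothetical and not part of the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OrderRequest mirrors the JSON-Body fields listed for POST /order.
// The Go names are illustrative; only the JSON keys come from the table.
type OrderRequest struct {
	CartID     string `json:"cartId"`
	Name       string `json:"name"`
	Address    string `json:"address"`
	CreditCard string `json:"creditCard"`
	Email      string `json:"email"`
}

// encodeOrder rejects bodies with an empty required field and
// returns the JSON document that would be sent to POST /order.
func encodeOrder(o OrderRequest) (string, error) {
	if o.CartID == "" || o.Name == "" || o.Address == "" || o.CreditCard == "" || o.Email == "" {
		return "", fmt.Errorf("all fields of the order body are required")
	}
	b, err := json.Marshal(o)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// All values below are placeholders.
	body, err := encodeOrder(OrderRequest{
		CartID:     "example-cart-id",
		Name:       "Jane Doe",
		Address:    "Example Street 1",
		CreditCard: "0000-0000-0000-0000",
		Email:      "jane@example.com",
	})
	if err != nil {
		fmt.Println("invalid order:", err)
		return
	}
	fmt.Println(body)
}
```

The cartId is the ID returned by POST /cart.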
GET /order/${id}
Parameter | Type | Description |
---|---|---|
id |
string |
Required. The ID of your created order. |
Look at the current status of your order.
Order & Stock: MongoDB with ACID-Transactions
Cart: Redis
API-Gateway: gin-gonic/gin
Synchronous Communication: gRPC
Asynchronous Communication: RabbitMQ
Load Balancing: NGINX with Docker DNS
Load Testing: Locust
TSDB Metrics: Prometheus
TSDB Logs: InfluxDB
Monitoring Dashboard: Grafana
Metrics-Exporter: cAdvisor (Containers), Node Exporter (Hardware) (To make these Docker Swarm compatible, some configurations were based on https://github.com/stefanprodan/swarmprom)
The following error handling methods are used to make the application more resilient:
- Saga-Pattern:
  - Order Transaction-Chain
- Retry-Mechanism with Backoff-Algorithm:
  - All asynchronous Services
- Stateless Services:
  - All Services are stateless (only metrics data is stored in memory)
- Redundancy:
  - Round Robin Load Balancing using Docker's DNS-Server
  - Service replication using Compose
- Bulkheads:
  - Stock Service was split into two Services to separate Order-Duties and Catalogue-Duties
  - Limits inside Compose-File
- Circuit Breakers:
  - 4 Circuit Breakers inside each API-Gateway Instance
- Adaptive Timeouts:
  - Moving Average for every request-type inside API-Gateway's request-handlers
Some of these methods can be configured in config/docker.env
The Locust load tests were launched with Docker on a single PC, using 5 worker containers and one manager container.
The Demo-Application was started using the docker-swarm.yml on a Docker-Swarm-Cluster of 5 Raspberry Pi 4 Model B Rev 1.1, running Ubuntu 20.04.3 LTS.
All 6 computers are connected via a Gbit-Ethernet Switch.
All images for this application are built for amd64 and arm64.
The following test data was collected in this test setup:
- Testrun1 - Basic Application
- Testrun2 - Improve Resource Usage
- Testrun3 - Establish a Bulkhead
- Testrun4 - Usage of Circuit Breakers
- Testrun5 - Adaptive Timeouts
Clone the project
git clone https://github.com/Tobias-Pe/microservices-error-handling
Go to the project directory
cd microservices-error-handling
Start all containers with compose
docker-compose up
PS: there are some run configurations (generated with GoLand) in the .run folder
Init manager node for cluster
docker swarm init
On the worker nodes, paste the join command from the output of the init command
docker swarm join ...
Deploy stack onto swarm cluster on manager node
docker stack deploy --compose-file docker-swarm.yml app
Optional steps:
docker service create \
--name=viz \
--publish=8080:8080/tcp \
--constraint=node.role==manager \
--mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
alexellis2/visualizer-arm:latest
Configure Docker to publish metrics
If a Circuit Breaker recovers from the open state faster than Prometheus can scrape its metrics, Prometheus will not observe any change in the Circuit Breaker state.
The NGINX server will abort connections if there are too many of them.
Feel free to adjust its configuration if you need more concurrent connections.
Some metrics cannot be initialized to zero at launch (e.g. the API-Service web server request counters for every possible outcome). Such a metric will not show up in Grafana the first time it is recorded.
The reason is that the metric is initialized to zero and then immediately changed at the moment it is first increased or set.
Prometheus will very probably not record this change from 0 to X as a separate sample (the current scrape interval is 5 seconds).
Therefore, a dashboard panel that uses the increase() query will see no increase from 0 to X.
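The effect can be illustrated with a small simulation. Note that `simpleIncrease` is a deliberately simplified stand-in for PromQL's increase(): it only subtracts the first scraped sample from the last one and ignores extrapolation and counter resets. The sample values are made up.

```go
package main

import "fmt"

// simpleIncrease returns last minus first of the scraped samples.
// This is a simplified model, not Prometheus's actual algorithm.
func simpleIncrease(samples []float64) float64 {
	if len(samples) < 2 {
		return 0
	}
	return samples[len(samples)-1] - samples[0]
}

func main() {
	// Series created on first use: the first scrape already sees the value 3,
	// because the step from 0 to 3 happened between two scrapes.
	createdOnFirstUse := []float64{3, 3, 3}

	// Series registered with 0 at launch: one scrape sees 0 before the
	// requests arrive, so the step is visible.
	initializedAtLaunch := []float64{0, 3, 3}

	fmt.Println("created on first use:", simpleIncrease(createdOnFirstUse))
	fmt.Println("initialized at launch:", simpleIncrease(initializedAtLaunch))
}
```

Later increments of a series created on first use are visible as usual; only the first step from 0 to X is lost.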