The following content showcases ECK's capabilities and how easily a Kubernetes Observability use case can be implemented with the help of the Elastic Stack. The manifests are not intended for production use.
Components:
- GKE Cluster
- kube-state-metrics
- ECK (Elastic Cloud on Kubernetes) operator
- Elasticsearch Cluster with Kibana
- Elastic Agent acting as Fleet Server (Deployment)
- Elastic Agent for Kubernetes node-level metrics and logs (DaemonSet)
- Elastic Agent for Kubernetes cluster-level metrics (Deployment)
- APM Server (for APM demo)
- Example application (for the APM demo)
There's also an interesting guide here which doesn't use ECK: https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-kubernetes.html
Before installing ECK on GKE, ensure your user has cluster-admin permissions:
```bash
kubectl create clusterrolebinding cluster-admin-binding-gke \
  --clusterrole=cluster-admin \
  --user=$(gcloud auth list --filter=status:ACTIVE --format="value(account)")
```
- Deploy CRDs:
```bash
kubectl create -f https://download.elastic.co/downloads/eck/2.13.0/crds.yaml
```
- Deploy Operator:
```bash
kubectl apply -f https://download.elastic.co/downloads/eck/2.13.0/operator.yaml
```
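To verify the operator is up (ECK installs into the `elastic-system` namespace):
```bash
kubectl get pods -n elastic-system
kubectl -n elastic-system logs -f statefulset.apps/elastic-operator
```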
The following commands should be run from the `resources` directory of the repository:
```bash
cd resources
```
Kube-state-metrics is an external component used by some of the Kubernetes Integration metricsets in charge of cluster monitoring.
Installation instructions can be found in the kube-state-metrics repository (https://github.com/kubernetes/kube-state-metrics). In our case:
```bash
kubectl apply -f kube-state-metrics-v2.12.0/
```
After the installation, KSM will be available via http://kube-state-metrics.kube-system.svc.cluster.local:8080.
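A quick way to confirm it is reachable from inside the cluster (the curl image is an assumption; any image shipping curl and a shell works):
```bash
kubectl run ksm-check --rm -it --restart=Never --image=curlimages/curl --command -- \
  sh -c 'curl -s http://kube-state-metrics.kube-system.svc.cluster.local:8080/metrics | head'
```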
Deploy Elasticsearch and Kibana:
```bash
kubectl apply -f 01_es_quickstart.yaml -f 02_kb_quickstart.yaml
```
Review all objects created by ECK (see the commands after this list):
- Services
- Secrets
- ConfigMaps
- StatefulSets for Elasticsearch
- Deployment for Kibana
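For example (`elasticsearch` and `kibana` are resource types added by the ECK CRDs; the label selector follows ECK's labelling conventions):
```bash
kubectl get elasticsearch,kibana
kubectl get services,secrets,configmaps,statefulsets,deployments -l 'common.k8s.elastic.co/type'
```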
ECK creates and maintains a lot of Secrets:
- `elastic` user password
- CA certificates for the HTTPS endpoints (separate secrets for Elasticsearch and Kibana)
- Configuration files
- Passwords for certain users
The `elastic` password can be obtained from one of the created Secrets:
```bash
kubectl get secret elasticsearch-quickstart-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode
```
- Access Kibana via the Kubernetes service `kibana-quickstart-kb-http` (see the port-forward example below).
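Without an ingress or load balancer, a port-forward is enough for a quick look (then browse https://localhost:5601 and log in as `elastic`):
```bash
kubectl port-forward service/kibana-quickstart-kb-http 5601
```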
```bash
kubectl apply -f 10_RBAC_ElasticAgent.yaml
kubectl apply -f 11_FleetServer.yaml
kubectl apply -f 12_ElasticAgent-DaemonSet.yaml
```
Verify all is running properly.
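For example (the `agent` resource type comes from the ECK CRDs; the label value follows ECK's conventions):
```bash
kubectl get agent
kubectl get pods -l 'common.k8s.elastic.co/type=agent'
```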
You will see that the DaemonSet agents are providing system monitoring information, but nothing about Kubernetes. That's because the created policy doesn't include the Kubernetes integration. We will add it manually in the next step.
Open the Fleet policy used by the DaemonSet agents and add the Kubernetes integration with the following inputs configuration (a sketch of the resulting policy excerpt follows this list):
- All Kubelet API related metrics
- All kube-state-metrics related metrics (Leader Election enabled). Make sure to use the right host for all the state metricsets: `kube-state-metrics.kube-system.svc.cluster.local:8080`
- Kubernetes Events (Leader Election enabled)
- Apiserver metrics (Leader Election enabled)
- Container logs
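Once added, the agent policy should contain streams roughly like the following excerpt (a hedged sketch; the exact ids and structure depend on the integration version):
```yaml
inputs:
  - id: kubernetes/metrics-kube-state-metrics
    type: kubernetes/metrics
    streams:
      - data_stream:
          dataset: kubernetes.state_pod
        metricsets:
          - state_pod
        hosts:
          - kube-state-metrics.kube-system.svc.cluster.local:8080
        # leader election keeps only one agent collecting cluster-scoped metrics
        condition: ${kubernetes_leaderelection.leader} == true
```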
After applying the changes, all the data should be flowing. Interesting views:
- Discover for troubleshooting, filtering data or checking available values of different fields.
- Kubernetes Overview Dashboard
- Kubernetes Pods Dashboard
- Fleet UI
- Observability UI
Note: If the cluster-level metrics (with leader election) are not flowing, you might need to delete the lease so it's recreated:
```bash
kubectl delete lease elastic-agent-cluster-leader
```
This could happen if an unexpected agent has acquired the lease (for example, the Fleet Server).
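Before deleting it, you can check which agent currently holds the lease:
```bash
kubectl get lease elastic-agent-cluster-leader -o jsonpath='{.spec.holderIdentity}'
```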
We can add a standalone APM Server as described in the ECK documentation.
Prerequisite: the APM Server, even when run in standalone mode (not managed by Fleet), requires the APM integration to be added in Kibana.
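For reference, a minimal ApmServer resource in ECK looks roughly like this (a sketch; the name and `elasticsearchRef` are assumptions based on the names used in this demo):
```yaml
apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-server-quickstart
spec:
  version: 8.14.1
  count: 1
  elasticsearchRef:
    name: elasticsearch-quickstart  # assumed name of the demo Elasticsearch cluster
```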
- Deploy APM Server:
```bash
kubectl apply -f 40_apm_server-8.14.1.yaml
```
- Obtain the token for client connections:
```bash
kubectl get secret/apm-server-quickstart-apm-token -o go-template='{{index .data "secret-token" | base64decode}}'
```
- Obtain the URL for client connections:
```bash
kubectl get service --selector='common.k8s.elastic.co/type=apm-server'
```
Configure the URL and token in the application (apm-example/app-deployment.yaml), like:
```yaml
- name: ELASTIC_APM_SERVER_URL
  value: "https://apm-server-quickstart-apm-http.default.svc.cluster.local:8200"
- name: ELASTIC_APM_SECRET_TOKEN
  value: "db2KXI7Z011CvE54t5DJ37nQ"
```
Deploy the application:
```bash
kubectl apply -f apm-example/
```
Generate some traffic:
```bash
kubectl port-forward svc/flask-counter-svc 5000:5000
# and in another window
curl localhost:5000
```
- Deploy 2 extra clusters that will be sending monitoring data to the previously created clusters:
```bash
kubectl apply -f 20_es_prod_tiers_example.yaml
kubectl apply -f 21_es_dev_qa_example.yaml
```
- Create a central monitoring cluster and ship logs and metrics from all clusters to this new one instead:
```bash
kubectl apply -f 30_stack_monitoring_dedicated_cluster.yaml
# Then apply the changes in the other manifests to send monitoring data to this new cluster instead
```
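In ECK, this is configured in the Elasticsearch manifests themselves via the stack monitoring fields; a minimal excerpt, assuming the dedicated cluster is named `monitoring`:
```yaml
spec:
  monitoring:
    metrics:
      elasticsearchRefs:
        - name: monitoring  # assumed name of the dedicated monitoring cluster
    logs:
      elasticsearchRefs:
        - name: monitoring
```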
In order to delete all generated resources:
```bash
kubectl delete -f resources/
kubectl delete -f resources/apm-example
kubectl delete -f resources/kube-state-metrics-v2.12.0
```
If we also want to uninstall ECK:
```bash
kubectl delete -f https://download.elastic.co/downloads/eck/2.13.0/operator.yaml
kubectl delete -f https://download.elastic.co/downloads/eck/2.13.0/crds.yaml
```
- Elastic Agent possible issues:
Check for lease acquisition, needed for the data streams that require a leader election:
```bash
kubectl logs elastic-agent-quickstart-agent-8gbgt | grep -i lease
I0703 14:19:14.082880     996 leaderelection.go:248] attempting to acquire leader lease default/elastic-agent-cluster-leader...
I0703 14:19:42.378425     996 leaderelection.go:258] successfully acquired lease default/elastic-agent-cluster-leader
```
Check that all expected metricsets are being provided, for example `state_*`:
{"log.level":"info","@timestamp":"2024-07-03T14:52:09.977Z","message":"Non-zero metrics in the last 30s","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"kubernetes/metrics-default","type":"kubernetes/metrics"},"log":{"source":"kubernetes/metrics-default"},"log.logger":"monitoring","log.origin":{"file.line":187,"file.name":"log/log.go","function":"github.com/elastic/beats/v7/libbeat/monitoring/report/log.(*reporter).logSnapshot"},"service.name":"metricbeat","monitoring":{"ecs.version":"1.6.0","metrics":{"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":0}}}},"cpu":{"system":{"ticks":780,"time":{"ms":140}},"total":{"ticks":5400,"time":{"ms":1060},"value":5400},"user":{"ticks":4620,"time":{"ms":920}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":22},"info":{"ephemeral_id":"6c01ac15-4c51-41a5-8d41-8872aee5ae7d","uptime":{"ms":631602},"version":"8.14.1"},"memstats":{"gc_next":98676648,"memory_alloc":53293520,"memory_sys":26364944,"memory_total":742908704,"rss":228708352},"runtime":{"goroutines":314}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":23}},"output":{"events":{"acked":1872,"active":0,"batches":3,"duplicates":2,"total":1874},"read":{"bytes":49958,"errors":3},"write":{"bytes":578975,"latency":{"histogram":{"count":56,"max":981,"mean":103.03571428571429,"median":67,"min":49,"p75":83.75,"p95":287.5499999999994,"p99":981,"p999":981,"stddev":145.7396117696747}}}},"pipeline":{"clients":23,"events":{"active":211,"published":1854,"total":1854},"queue":{"acked":1874}}},"metricbeat":{"kubernetes":{"apiserver":{"events":993,"success":993},"container":{"events":57,"success":57},"node":{"events":3,"success":3},"pod":{"events":33,"success":33},"proxy":{"events":18,"success":18},"state_container":{"events":201,"success":201},"state_daemonset":{"events":81,"success":81},"state_deployment":{"events":36,"success":36},"state_namespace":{"events":24,"success":24},"state_node":{"events":9,"success":9},"state_persistentvolume":{"events":9,"success":9},"state_persistentvolumeclaim":{"events":9,"success":9},"state_pod":{"events":108,"success":108},"state_replicaset":{"events":45,"success":45},"state_resourcequota":{"events":18,"success":18},"state_service":{"events":42,"success":42},"state_statefulset":{"events":9,"success":9},"state_storageclass":{"events":9,"success":9},"system":{"events":9,"success":9},"volume":{"events":141,"success":141}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.42,"15":0.65,"5":0.44,"norm":{"1":0.105,"15":0.1625,"5":0.11}}}}},"ecs.version":"1.6.0"}
There's an issue where Elastic Agent might fail to update the token information in the state directory (mounted as a hostPath volume).
- Divide the Kubernetes Observability workload into a DaemonSet + a Deployment:
  - DaemonSet for node-level metrics and logs
  - Deployment for cluster-level metrics (kube-state-metrics related metrics, events, apiserver, etc.)
  That would require 2 different policies, both with the Kubernetes integration, but each of them configured with a different set of inputs (see the sketch after this list).
- Show agents with their real hostname in the Fleet UI instead of the pod name (for the DaemonSet).
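A rough sketch of what that split could look like with ECK's Fleet-managed Agent resources (names, policy IDs and refs are hypothetical):
```yaml
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent-node
spec:
  version: 8.14.1
  mode: fleet
  kibanaRef:
    name: kibana-quickstart        # assumed Kibana name
  fleetServerRef:
    name: fleet-server-quickstart  # assumed Fleet Server name
  policyID: eck-agent-node         # hypothetical policy with node-level inputs
  daemonSet: {}
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent-cluster
spec:
  version: 8.14.1
  mode: fleet
  kibanaRef:
    name: kibana-quickstart
  fleetServerRef:
    name: fleet-server-quickstart
  policyID: eck-agent-cluster      # hypothetical policy with cluster-level inputs
  deployment:
    replicas: 1
```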