diff --git a/docs/tutorials/k8s-metrics-monitor.md b/docs/tutorials/k8s-metrics-monitor.md new file mode 100644 index 000000000..7cb8ed5d9 --- /dev/null +++ b/docs/tutorials/k8s-metrics-monitor.md @@ -0,0 +1,319 @@ +--- +keywords: [Kubernetes, Prometheus, monitoring, metrics, observability, GreptimeDB, Prometheus Operator, Grafana] +description: Guide to monitoring Kubernetes metrics using Prometheus with GreptimeDB as the storage backend, including architecture overview, installation, and visualization with Grafana. +--- + +# Monitor Kubernetes Metrics with Prometheus and GreptimeDB + +This guide demonstrates how to set up a complete Kubernetes monitoring solution using Prometheus for metrics collection and GreptimeDB as the long-term storage backend. + +## What is Kubernetes Monitoring + +Kubernetes monitoring is the practice of collecting, analyzing, and acting on metrics and logs from a Kubernetes cluster. +It provides visibility into the health, performance, and resource utilization of your containerized applications and infrastructure. + +Key aspects of Kubernetes monitoring include: + +- **Resource Metrics**: CPU, memory, disk, and network usage for nodes, pods, and containers +- **Cluster Health**: Status of cluster components like kube-apiserver, etcd, and controller-manager +- **Application Metrics**: Custom metrics from your applications running in the cluster +- **Events and Logs**: Kubernetes events and container logs for troubleshooting + +Effective monitoring helps you: +- Detect and diagnose issues before they impact users +- Optimize resource utilization and reduce costs +- Plan capacity based on historical trends +- Ensure SLA compliance +- Troubleshoot performance bottlenecks + +## Architecture Overview + +The monitoring architecture consists of the following components: + +![Kubernetes Monitoring Architecture](/k8s-metrics-monitor-architecture.drawio.svg) + +**Components:** + +- **kube-state-metrics**: Exports cluster-level metrics about Kubernetes objects (deployments, pods, services, etc.) +- **Node Exporter**: Exports hardware and OS-level metrics from each Kubernetes node +- **Prometheus Operator**: Automates Prometheus deployment and configuration using Kubernetes custom resources +- **GreptimeDB**: Acts as the long-term storage backend for Prometheus metrics with high compression and query performance +- **Grafana**: Provides dashboards and visualizations for metrics stored in GreptimeDB + +## Prerequisites + +Before starting, ensure you have: + +- A running Kubernetes cluster (version >= 1.18) +- `kubectl` configured to access your cluster +- [Helm](https://helm.sh/docs/intro/install/) v3.0.0 or higher installed +- Sufficient cluster resources (at least 2 CPU cores and 4GB memory available) + +## Install GreptimeDB + +GreptimeDB serves as the long-term storage backend for Prometheus metrics. +For detailed installation steps, +please refer to the [Deploy GreptimeDB Cluster](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md) documentation. + +### Verify the GreptimeDB Installation + +After deploying GreptimeDB, verify that the cluster is running. +In this guide we assume the GreptimeDB cluster is deployed in the `greptime-cluster` namespace and named `greptimedb`. 
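+If the cluster was created under a different name or namespace, you can locate it first by listing every GreptimeDBCluster resource (this uses the same CRD name as the command below):
+
+```bash
+kubectl get greptimedbclusters.greptime.io --all-namespaces
+```
+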
+ +```bash +kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb +``` + +```bash +NAME FRONTEND DATANODE META FLOWNODE PHASE VERSION AGE +greptimedb 1 2 1 1 Running v0.17.2 33s +``` + +Check the pods: + +```bash +kubectl get pods -n greptime-cluster +``` + +```bash +NAME READY STATUS RESTARTS AGE +greptimedb-datanode-0 1/1 Running 0 71s +greptimedb-datanode-1 1/1 Running 0 97s +greptimedb-flownode-0 1/1 Running 0 64s +greptimedb-frontend-8bf9f558c-7wdmk 1/1 Running 0 90s +greptimedb-meta-fc4ddb78b-nv944 1/1 Running 0 87s +``` + +### Access GreptimeDB + +To interact with GreptimeDB directly, you can port-forward the frontend service to your local machine. +GreptimeDB supports multiple protocols, with MySQL protocol available on port `4002` by default. + +```bash +kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002 +``` + +Connect using any MySQL-compatible client: + +```bash +mysql -h 127.0.0.1 -P 4002 +``` + +### Storage Partitioning + +To improve query performance and reduce storage costs, +GreptimeDB automatically creates columns based on Prometheus metric labels and stores metrics in a physical table. +The default table name is `greptime_physical_table`. +Since we deployed a GreptimeDB cluster with [multiple datanodes](#verify-the-greptimedb-installation), +you can partition the table to distribute data across datanodes for better scalability and performance. + +In this Kubernetes monitoring scenario, we can use the `namespace` label as the partition key. +For example, with namespaces like `kube-public`, `kube-system`, `monitoring`, `default`, `greptime-cluster`, and `etcd-cluster`, +you can create a partitioning scheme based on the first letter of the namespace: + +```sql +CREATE TABLE greptime_physical_table ( + greptime_value DOUBLE NULL, + namespace STRING PRIMARY KEY, + greptime_timestamp TIMESTAMP TIME INDEX, +) +PARTITION ON COLUMNS (namespace) ( + namespace < 'f', + namespace >= 'f' AND namespace < 'g', + namespace >= 'g' AND namespace < 'k', + namespace >= 'k' +) +ENGINE = metric +WITH ( + "physical_metric_table" = "" +); +``` + +For more information about Prometheus metrics storage and query performance optimization, refer to the [Improve efficiency by using metric engine](/user-guide/ingest-data/for-observability/prometheus.md#improve-efficiency-by-using-metric-engine) guide. + +### Prometheus URLs in GreptimeDB + +GreptimeDB provides [Prometheus-compatible APIs](/user-guide/query-data/promql.md#prometheus-http-api) under the HTTP context `/v1/prometheus/`, +enabling seamless integration with existing Prometheus workflows. + +To integrate Prometheus with GreptimeDB, you need the GreptimeDB service address. +Since GreptimeDB runs inside the Kubernetes cluster, use the internal cluster address. 
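+Before filling in the address, you can confirm the frontend Service name and ports from the cluster itself; this is the same Service that was port-forwarded earlier in this guide:
+
+```bash
+kubectl -n greptime-cluster get svc greptimedb-frontend
+```
+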
+ +The GreptimeDB frontend service address follows this pattern: +``` +-frontend..svc.cluster.local: +``` + +In this guide: +- GreptimeDB cluster name: `greptimedb` +- Namespace: `greptime-cluster` +- Frontend port: `4000` + +So the service address is: +```bash +greptimedb-frontend.greptime-cluster.svc.cluster.local:4000 +``` + +The complete [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) for Prometheus is: + +```bash +http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write +``` + +This URL consists of: +- **Service endpoint**: `greptimedb-frontend.greptime-cluster.svc.cluster.local:4000` +- **API path**: `/v1/prometheus/write` + +## Install Prometheus + +Now that GreptimeDB is running, we'll install Prometheus to collect metrics and send them to GreptimeDB for long-term storage. + +### Add the Prometheus Community Helm Repository + +```bash +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### Install the kube-prometheus-stack + +The [`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) is a comprehensive monitoring solution that includes +Prometheus, Grafana, kube-state-metrics, and node-exporter components. +This stack automatically discovers and monitors all Kubernetes namespaces, +collecting metrics from cluster components, nodes, and workloads. + +In this deployment, we'll configure Prometheus to use GreptimeDB as the remote write destination for long-term metric storage and configure Grafana's default Prometheus data source to use GreptimeDB. + +Create a `kube-prometheus-values.yaml` file with the following configuration: + +```yaml +# Configure Prometheus remote write to GreptimeDB +prometheus: + prometheusSpec: + remoteWrite: + - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write + +# Configure Grafana to use GreptimeDB as the default Prometheus data source +grafana: + datasources: + datasources.yaml: + apiVersion: 1 + datasources: + - name: Prometheus + type: prometheus + url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus + access: proxy + editable: true +``` + +This configuration file specifies [the GreptimeDB service address](#prometheus-urls-in-greptimedb) for: +- **Prometheus remote write**: Sends collected metrics to GreptimeDB for long-term storage +- **Grafana data source**: Configures GreptimeDB as the default Prometheus data source for dashboard queries + +Install the `kube-prometheus-stack` using Helm with the custom values file: + +```bash +helm install kube-prometheus prometheus-community/kube-prometheus-stack \ + --namespace monitoring \ + --create-namespace \ + --values kube-prometheus-values.yaml +``` + +### Verify the Installation + +Check that all Prometheus components are running: + +```bash +kubectl get pods -n monitoring +``` + +```bash +NAME READY STATUS RESTARTS AGE +alertmanager-kube-prometheus-kube-prome-alertmanager-0 2/2 Running 0 60s +kube-prometheus-grafana-78ccf96696-sghx4 3/3 Running 0 78s +kube-prometheus-kube-prome-operator-775fdbfd75-w88n7 1/1 Running 0 78s +kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs 1/1 Running 0 78s +kube-prometheus-prometheus-node-exporter-ts9nn 1/1 Running 0 78s +prometheus-kube-prometheus-kube-prome-prometheus-0 2/2 Running 0 60s +``` + +### Verify the Monitoring Status + +Use [MySQL protocol](#access-greptimedb) to query GreptimeDB and verify that Prometheus metrics 
are being written. + +```sql +SHOW TABLES; +``` + +You should see tables created for various Prometheus metrics. + +```sql ++---------------------------------------------------------------------------------+ +| Tables | ++---------------------------------------------------------------------------------+ +| :node_memory_MemAvailable_bytes:sum | +| ALERTS | +| ALERTS_FOR_STATE | +| aggregator_discovery_aggregation_count_total | +| aggregator_unavailable_apiservice | +| alertmanager_alerts | +| alertmanager_alerts_invalid_total | +| alertmanager_alerts_received_total | +| alertmanager_build_info | +| ...... | ++---------------------------------------------------------------------------------+ +1553 rows in set (0.18 sec) +``` + +## Use Grafana for Visualization + +Grafana is included in the kube-prometheus-stack and comes pre-configured with dashboards for comprehensive Kubernetes monitoring. + +### Access Grafana + +Port-forward the Grafana service to access the web interface: + +```bash +kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80 +``` + +### Get Admin Credentials + +Retrieve the admin password using kubectl: + +```bash +kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo +``` + +### Login Grafana + +1. Open your browser and navigate to [http://localhost:3000](http://localhost:3000) +2. Login with: + - **Username**: `admin` + - **Password**: The password retrieved from the previous step + +### Explore Pre-configured Dashboards + +After logging in, navigate to **Dashboards** to explore the pre-configured Kubernetes monitoring dashboards: + +- **Kubernetes / Compute Resources / Cluster**: Overview of cluster-wide resource utilization +- **Kubernetes / Compute Resources / Namespace (Pods)**: Resource usage breakdown by namespace +- **Kubernetes / Compute Resources / Node (Pods)**: Node-level resource monitoring +- **Node Exporter / Nodes**: Detailed node hardware and OS metrics + +![Grafana Dashboard](/k8s-prom-monitor-grafana.jpg) + +## Conclusion + +You now have a complete Kubernetes monitoring solution with Prometheus collecting metrics and GreptimeDB providing efficient long-term storage. 
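+As a quick example of querying the stored metrics, you can read a series back over the MySQL connection set up earlier. `node_load1` is a standard node-exporter metric and is assumed here to have been scraped already; substitute any table name returned by `SHOW TABLES`:
+
+```sql
+-- Most recent samples of a node-exporter metric.
+-- Columns follow GreptimeDB's Prometheus mapping: greptime_timestamp,
+-- greptime_value, plus one column per metric label.
+SELECT *
+FROM node_load1
+ORDER BY greptime_timestamp DESC
+LIMIT 5;
+```
+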
This setup enables you to: + +- Monitor cluster and application health in real-time +- Store metrics for historical analysis and capacity planning +- Create rich visualizations and dashboards with Grafana +- Query metrics using both PromQL and SQL + +For more information about GreptimeDB and Prometheus integration, see: + +- [Prometheus Integration](/user-guide/ingest-data/for-observability/prometheus.md) +- [Query Data in GreptimeDB](/user-guide/query-data/overview.md) + diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/tutorials/k8s-metrics-monitor.md b/i18n/zh/docusaurus-plugin-content-docs/current/tutorials/k8s-metrics-monitor.md new file mode 100644 index 000000000..7f61e542f --- /dev/null +++ b/i18n/zh/docusaurus-plugin-content-docs/current/tutorials/k8s-metrics-monitor.md @@ -0,0 +1,328 @@ +--- +keywords: [Kubernetes, Prometheus, 监控, 指标, 可观测性, GreptimeDB, Prometheus Operator, Grafana] +description: 使用 Prometheus 监控 Kubernetes 指标的指南,以 GreptimeDB 作为存储后端,包括架构概览、安装和使用 Grafana 进行可视化。 +--- + +# 使用 Prometheus 和 GreptimeDB 监控 Kubernetes 指标 + +本指南演示如何建立一个完整的 Kubernetes 监控解决方案, +该方案使用 Prometheus 收集指标, +使用 GreptimeDB 作为长期存储后端。 + +## 什么是 Kubernetes 监控 + +Kubernetes 监控指的是从 Kubernetes 集群中收集、分析和处理指标和日志。 +它是检查容器化应用程序和基础设施的健康状况、性能和资源利用率的关键。 + +Kubernetes 主要监控以下信息: + +- **资源指标**:节点、Pod 和容器的 CPU、内存、磁盘和网络使用情况 +- **集群健康**:集群组件如 kube-apiserver、etcd 和 controller-manager 的状态 +- **应用程序指标**:在集群中运行的应用程序指标 +- **事件和日志**:用于故障诊断的 Kubernetes 事件和容器日志 + +有效的监控可以帮助你: +- 在问题影响用户之前检测和诊断问题 +- 优化资源利用率并降低成本 +- 基于历史趋势进行容量规划 +- 确保 SLA 合规性 +- 排查性能瓶颈 + +## 架构概览 + +监控架构由以下组件组成: + +![Kubernetes 监控架构](/k8s-metrics-monitor-architecture.drawio.svg) + +**组件:** + +- **kube-state-metrics**:导出关于 Kubernetes 对象(部署、Pod、服务等)的集群级指标 +- **Node Exporter**:从每个 Kubernetes 节点导出硬件和操作系统级指标 +- **Prometheus Operator**:使用 Kubernetes 自定义资源自动化 Prometheus 部署和配置 +- **GreptimeDB**:Prometheus 指标的长期存储后端,具有高压缩率和查询性能 +- **Grafana**:为存储在 GreptimeDB 中的指标提供仪表板和可视化 + +## 前提条件 + +在开始之前,确保你拥有: + +- 一个运行中的 Kubernetes 集群(版本 >= 1.18) +- 已配置 `kubectl` 以访问你的集群 +- 已安装 [Helm](https://helm.sh/docs/intro/install/) v3.0.0 或更高版本 +- 足够的集群资源(至少 2 个 CPU 核心和 4GB 可用内存) + +## 安装 GreptimeDB + +GreptimeDB 被作为 Prometheus 指标的长期存储后端, +请参考[部署 GreptimeDB 集群](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md)文档了解如何部署。 + +### 验证 GreptimeDB 的部署 + +部署 GreptimeDB 后,验证集群是否正常运行中。 +在本指南中,我们假设 GreptimeDB 集群部署在 `greptime-cluster` 命名空间,名称为 `greptimedb`。 + +```bash +kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb +``` + +```bash +NAME FRONTEND DATANODE META FLOWNODE PHASE VERSION AGE +greptimedb 1 2 1 1 Running v0.17.2 33s +``` + +检查 Pod 状态: + +```bash +kubectl get pods -n greptime-cluster +``` + +```bash +NAME READY STATUS RESTARTS AGE +greptimedb-datanode-0 1/1 Running 0 71s +greptimedb-datanode-1 1/1 Running 0 97s +greptimedb-flownode-0 1/1 Running 0 64s +greptimedb-frontend-8bf9f558c-7wdmk 1/1 Running 0 90s +greptimedb-meta-fc4ddb78b-nv944 1/1 Running 0 87s +``` + +### 访问 GreptimeDB + +可以将 frontend 服务的端口转发到本地来连接 GreptimeDB。 +GreptimeDB 支持多种协议,其中 MySQL 协议默认使用端口 `4002`。 + +```bash +kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002 +``` + +使用 MySQL 客户端连接 GreptimeDB: + +```bash +mysql -h 127.0.0.1 -P 4002 +``` + +### 存储分区 + +为了提高查询性能并降低存储成本, +GreptimeDB 会基于 Prometheus 指标标签自动创建列,并将指标存储在物理表中,默认使用的物理表名为 `greptime_physical_table`。 +在上方我们部署了具有[多个 datanode 节点](#验证-greptimedb-的部署)的 GreptimeDB 集群, +你可以对表进行分区将数据分布到各个 datanode 节点上,以获得更好的可扩展性和性能。 + +在此 Kubernetes 监控场景中, +可以使用 `namespace` 标签作为分区键。 +例如,对于 
`kube-public`、`kube-system`、`monitoring`、`default`、`greptime-cluster` 和 `etcd-cluster` 等命名空间, +你可以基于命名空间的首字母创建分区方案: + +```sql +CREATE TABLE greptime_physical_table ( + greptime_value DOUBLE NULL, + namespace STRING PRIMARY KEY, + greptime_timestamp TIMESTAMP TIME INDEX, +) +PARTITION ON COLUMNS (namespace) ( + namespace < 'f', + namespace >= 'f' AND namespace < 'g', + namespace >= 'g' AND namespace < 'k', + namespace >= 'k' +) +ENGINE = metric +WITH ( + "physical_metric_table" = "" +); +``` + +有关 Prometheus 指标存储和查询性能优化的更多信息, +请参阅[使用 metric engine 提高效率](/user-guide/ingest-data/for-observability/prometheus.md#通过使用-metric-engine-提高效率)指南。 + +### GreptimeDB 中的 Prometheus URL + +GreptimeDB 在 HTTP 上下文 `/v1/prometheus/` 下提供了[兼容 Prometheus 的 API](/user-guide/query-data/promql.md#prometheus-http-api), +使其能够与现有的 Prometheus 工作流程无缝集成。 + +你需要 GreptimeDB 服务地址来配置 Prometheus。 +由于 GreptimeDB 在 Kubernetes 集群内运行,所以使用内部集群地址。 + +GreptimeDB frontend 服务地址遵循以下模式: +``` +-frontend..svc.cluster.local: +``` + +在本指南中: +- GreptimeDB 集群名称:`greptimedb` +- 命名空间:`greptime-cluster` +- Frontend 端口:`4000` + +因此服务地址为: + +```bash +greptimedb-frontend.greptime-cluster.svc.cluster.local:4000 +``` + +Prometheus 的完整 [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) 为: + +```bash +http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write +``` + +此 URL 包含: +- **服务端点**:`greptimedb-frontend.greptime-cluster.svc.cluster.local:4000` +- **API 路径**:`/v1/prometheus/write` + +## 安装 Prometheus + +现在 GreptimeDB 正常运行中, +我们将安装 Prometheus 收集指标并将其发送到 GreptimeDB。 + +### 添加 Prometheus Community Helm 仓库 + +```bash +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 安装 kube-prometheus-stack + +[`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) 是一个综合的监控解决方案,包括 +Prometheus、Grafana、kube-state-metrics 和 node-exporter 组件。 +此 stack 自动发现和监控所有 Kubernetes 命名空间, +收集来自集群组件、节点和工作负载的指标。 + +在此部署中, +我们将配置 Prometheus 使用 GreptimeDB 作为 Remote Write 目标长期存储指标数据, +并配置 Grafana 的默认 Prometheus 数据源使用 GreptimeDB。 + +创建一个 `kube-prometheus-values.yaml` 文件,包含以下配置: + +```yaml +# 配置 Prometheus 远程写入到 GreptimeDB +prometheus: + prometheusSpec: + remoteWrite: + - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write + +# 配置 Grafana 使用 GreptimeDB 作为默认 Prometheus 数据源 +grafana: + datasources: + datasources.yaml: + apiVersion: 1 + datasources: + - name: Prometheus + type: prometheus + url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus + access: proxy + editable: true +``` + +此配置文件为以下用途指定了[GreptimeDB 服务地址](#greptimedb-中的-prometheus-url): + +- **Prometheus Remote Write**:将收集的指标发送到 GreptimeDB 进行长期存储 +- **Grafana 数据源**:将 GreptimeDB 配置为仪表板查询的默认 Prometheus 数据源 + +使用 Helm 和自定义配置文件安装 `kube-prometheus-stack`: + +```bash +helm install kube-prometheus prometheus-community/kube-prometheus-stack \ + --namespace monitoring \ + --create-namespace \ + --values kube-prometheus-values.yaml +``` + +### 验证安装 + +检查所有 Prometheus 组件是否正在运行: + +```bash +kubectl get pods -n monitoring +``` + +```bash +NAME READY STATUS RESTARTS AGE +alertmanager-kube-prometheus-kube-prome-alertmanager-0 2/2 Running 0 60s +kube-prometheus-grafana-78ccf96696-sghx4 3/3 Running 0 78s +kube-prometheus-kube-prome-operator-775fdbfd75-w88n7 1/1 Running 0 78s +kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs 1/1 Running 0 78s +kube-prometheus-prometheus-node-exporter-ts9nn 1/1 Running 0 78s 
+prometheus-kube-prometheus-kube-prome-prometheus-0 2/2 Running 0 60s +``` + +### 验证监控状态 + +使用 [MySQL protocol](#访问-greptimedb) 查询 GreptimeDB,验证 Prometheus 指标是否已写入。 + +```sql +SHOW TABLES; +``` + +你应该能看到为各种 Prometheus 指标创建的表名。 + +```sql ++---------------------------------------------------------------------------------+ +| Tables | ++---------------------------------------------------------------------------------+ +| :node_memory_MemAvailable_bytes:sum | +| ALERTS | +| ALERTS_FOR_STATE | +| aggregator_discovery_aggregation_count_total | +| aggregator_unavailable_apiservice | +| alertmanager_alerts | +| alertmanager_alerts_invalid_total | +| alertmanager_alerts_received_total | +| alertmanager_build_info | +| ...... | ++---------------------------------------------------------------------------------+ +1553 rows in set (0.18 sec) +``` + +## 使用 Grafana 进行可视化 + +Grafana 包含在 kube-prometheus-stack 中, +并预配置了 Prometheus 作为数据源的仪表盘。 + +### 访问 Grafana + +将 Grafana 服务的端口转发到本地以访问 Web 界面: + +```bash +kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80 +``` + +### 获取管理员凭证 + +使用 kubectl 检索登录使用的 admin 密码: + +```bash +kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo +``` + +### 登录 Grafana + +1. 打开浏览器并导航到 [http://localhost:3000](http://localhost:3000) +2. 使用以下凭证登录: + - **用户名**:`admin` + - **密码**:从上一步检索到的密码 + +### 查看预配置的仪表板 + +登录后,导航到**仪表板**以探索预配置的 Kubernetes 监控仪表板: + +- **Kubernetes / Compute Resources / Cluster**:集群范围的资源利用率概览 +- **Kubernetes / Compute Resources / Namespace (Pods)**:按命名空间分解的资源使用情况 +- **Kubernetes / Compute Resources / Node (Pods)**:节点级资源监控 +- **Node Exporter / Nodes**:详细的节点硬件和操作系统指标 + +![Grafana Dashboard](/k8s-prom-monitor-grafana.jpg) + +## 总结 + +你现在部署了完整的 Kubernetes 监控解决方案, +使用 Prometheus 收集指标,使用 GreptimeDB 提供高效的长期存储。 +该解决方案使你能够: + +- 实时监控集群和应用程序健康状况 +- 存储指标以进行历史分析和容量规划 +- 使用 Grafana 创建丰富的可视化和仪表板 +- 使用 PromQL 和 SQL 查询指标 + +有关 GreptimeDB 和 Prometheus 集成的更多信息,请参阅: + +- [Prometheus 集成](/user-guide/ingest-data/for-observability/prometheus.md) +- [在 GreptimeDB 中查询数据](/user-guide/query-data/overview.md) diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-0.17/tutorials/k8s-metrics-monitor.md b/i18n/zh/docusaurus-plugin-content-docs/version-0.17/tutorials/k8s-metrics-monitor.md new file mode 100644 index 000000000..7f61e542f --- /dev/null +++ b/i18n/zh/docusaurus-plugin-content-docs/version-0.17/tutorials/k8s-metrics-monitor.md @@ -0,0 +1,328 @@ +--- +keywords: [Kubernetes, Prometheus, 监控, 指标, 可观测性, GreptimeDB, Prometheus Operator, Grafana] +description: 使用 Prometheus 监控 Kubernetes 指标的指南,以 GreptimeDB 作为存储后端,包括架构概览、安装和使用 Grafana 进行可视化。 +--- + +# 使用 Prometheus 和 GreptimeDB 监控 Kubernetes 指标 + +本指南演示如何建立一个完整的 Kubernetes 监控解决方案, +该方案使用 Prometheus 收集指标, +使用 GreptimeDB 作为长期存储后端。 + +## 什么是 Kubernetes 监控 + +Kubernetes 监控指的是从 Kubernetes 集群中收集、分析和处理指标和日志。 +它是检查容器化应用程序和基础设施的健康状况、性能和资源利用率的关键。 + +Kubernetes 主要监控以下信息: + +- **资源指标**:节点、Pod 和容器的 CPU、内存、磁盘和网络使用情况 +- **集群健康**:集群组件如 kube-apiserver、etcd 和 controller-manager 的状态 +- **应用程序指标**:在集群中运行的应用程序指标 +- **事件和日志**:用于故障诊断的 Kubernetes 事件和容器日志 + +有效的监控可以帮助你: +- 在问题影响用户之前检测和诊断问题 +- 优化资源利用率并降低成本 +- 基于历史趋势进行容量规划 +- 确保 SLA 合规性 +- 排查性能瓶颈 + +## 架构概览 + +监控架构由以下组件组成: + +![Kubernetes 监控架构](/k8s-metrics-monitor-architecture.drawio.svg) + +**组件:** + +- **kube-state-metrics**:导出关于 Kubernetes 对象(部署、Pod、服务等)的集群级指标 +- **Node Exporter**:从每个 Kubernetes 节点导出硬件和操作系统级指标 +- **Prometheus Operator**:使用 Kubernetes 自定义资源自动化 Prometheus 部署和配置 +- **GreptimeDB**:Prometheus 
指标的长期存储后端,具有高压缩率和查询性能 +- **Grafana**:为存储在 GreptimeDB 中的指标提供仪表板和可视化 + +## 前提条件 + +在开始之前,确保你拥有: + +- 一个运行中的 Kubernetes 集群(版本 >= 1.18) +- 已配置 `kubectl` 以访问你的集群 +- 已安装 [Helm](https://helm.sh/docs/intro/install/) v3.0.0 或更高版本 +- 足够的集群资源(至少 2 个 CPU 核心和 4GB 可用内存) + +## 安装 GreptimeDB + +GreptimeDB 被作为 Prometheus 指标的长期存储后端, +请参考[部署 GreptimeDB 集群](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md)文档了解如何部署。 + +### 验证 GreptimeDB 的部署 + +部署 GreptimeDB 后,验证集群是否正常运行中。 +在本指南中,我们假设 GreptimeDB 集群部署在 `greptime-cluster` 命名空间,名称为 `greptimedb`。 + +```bash +kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb +``` + +```bash +NAME FRONTEND DATANODE META FLOWNODE PHASE VERSION AGE +greptimedb 1 2 1 1 Running v0.17.2 33s +``` + +检查 Pod 状态: + +```bash +kubectl get pods -n greptime-cluster +``` + +```bash +NAME READY STATUS RESTARTS AGE +greptimedb-datanode-0 1/1 Running 0 71s +greptimedb-datanode-1 1/1 Running 0 97s +greptimedb-flownode-0 1/1 Running 0 64s +greptimedb-frontend-8bf9f558c-7wdmk 1/1 Running 0 90s +greptimedb-meta-fc4ddb78b-nv944 1/1 Running 0 87s +``` + +### 访问 GreptimeDB + +可以将 frontend 服务的端口转发到本地来连接 GreptimeDB。 +GreptimeDB 支持多种协议,其中 MySQL 协议默认使用端口 `4002`。 + +```bash +kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002 +``` + +使用 MySQL 客户端连接 GreptimeDB: + +```bash +mysql -h 127.0.0.1 -P 4002 +``` + +### 存储分区 + +为了提高查询性能并降低存储成本, +GreptimeDB 会基于 Prometheus 指标标签自动创建列,并将指标存储在物理表中,默认使用的物理表名为 `greptime_physical_table`。 +在上方我们部署了具有[多个 datanode 节点](#验证-greptimedb-的部署)的 GreptimeDB 集群, +你可以对表进行分区将数据分布到各个 datanode 节点上,以获得更好的可扩展性和性能。 + +在此 Kubernetes 监控场景中, +可以使用 `namespace` 标签作为分区键。 +例如,对于 `kube-public`、`kube-system`、`monitoring`、`default`、`greptime-cluster` 和 `etcd-cluster` 等命名空间, +你可以基于命名空间的首字母创建分区方案: + +```sql +CREATE TABLE greptime_physical_table ( + greptime_value DOUBLE NULL, + namespace STRING PRIMARY KEY, + greptime_timestamp TIMESTAMP TIME INDEX, +) +PARTITION ON COLUMNS (namespace) ( + namespace < 'f', + namespace >= 'f' AND namespace < 'g', + namespace >= 'g' AND namespace < 'k', + namespace >= 'k' +) +ENGINE = metric +WITH ( + "physical_metric_table" = "" +); +``` + +有关 Prometheus 指标存储和查询性能优化的更多信息, +请参阅[使用 metric engine 提高效率](/user-guide/ingest-data/for-observability/prometheus.md#通过使用-metric-engine-提高效率)指南。 + +### GreptimeDB 中的 Prometheus URL + +GreptimeDB 在 HTTP 上下文 `/v1/prometheus/` 下提供了[兼容 Prometheus 的 API](/user-guide/query-data/promql.md#prometheus-http-api), +使其能够与现有的 Prometheus 工作流程无缝集成。 + +你需要 GreptimeDB 服务地址来配置 Prometheus。 +由于 GreptimeDB 在 Kubernetes 集群内运行,所以使用内部集群地址。 + +GreptimeDB frontend 服务地址遵循以下模式: +``` +-frontend..svc.cluster.local: +``` + +在本指南中: +- GreptimeDB 集群名称:`greptimedb` +- 命名空间:`greptime-cluster` +- Frontend 端口:`4000` + +因此服务地址为: + +```bash +greptimedb-frontend.greptime-cluster.svc.cluster.local:4000 +``` + +Prometheus 的完整 [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) 为: + +```bash +http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write +``` + +此 URL 包含: +- **服务端点**:`greptimedb-frontend.greptime-cluster.svc.cluster.local:4000` +- **API 路径**:`/v1/prometheus/write` + +## 安装 Prometheus + +现在 GreptimeDB 正常运行中, +我们将安装 Prometheus 收集指标并将其发送到 GreptimeDB。 + +### 添加 Prometheus Community Helm 仓库 + +```bash +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 安装 kube-prometheus-stack + 
+[`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) 是一个综合的监控解决方案,包括 +Prometheus、Grafana、kube-state-metrics 和 node-exporter 组件。 +此 stack 自动发现和监控所有 Kubernetes 命名空间, +收集来自集群组件、节点和工作负载的指标。 + +在此部署中, +我们将配置 Prometheus 使用 GreptimeDB 作为 Remote Write 目标长期存储指标数据, +并配置 Grafana 的默认 Prometheus 数据源使用 GreptimeDB。 + +创建一个 `kube-prometheus-values.yaml` 文件,包含以下配置: + +```yaml +# 配置 Prometheus 远程写入到 GreptimeDB +prometheus: + prometheusSpec: + remoteWrite: + - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write + +# 配置 Grafana 使用 GreptimeDB 作为默认 Prometheus 数据源 +grafana: + datasources: + datasources.yaml: + apiVersion: 1 + datasources: + - name: Prometheus + type: prometheus + url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus + access: proxy + editable: true +``` + +此配置文件为以下用途指定了[GreptimeDB 服务地址](#greptimedb-中的-prometheus-url): + +- **Prometheus Remote Write**:将收集的指标发送到 GreptimeDB 进行长期存储 +- **Grafana 数据源**:将 GreptimeDB 配置为仪表板查询的默认 Prometheus 数据源 + +使用 Helm 和自定义配置文件安装 `kube-prometheus-stack`: + +```bash +helm install kube-prometheus prometheus-community/kube-prometheus-stack \ + --namespace monitoring \ + --create-namespace \ + --values kube-prometheus-values.yaml +``` + +### 验证安装 + +检查所有 Prometheus 组件是否正在运行: + +```bash +kubectl get pods -n monitoring +``` + +```bash +NAME READY STATUS RESTARTS AGE +alertmanager-kube-prometheus-kube-prome-alertmanager-0 2/2 Running 0 60s +kube-prometheus-grafana-78ccf96696-sghx4 3/3 Running 0 78s +kube-prometheus-kube-prome-operator-775fdbfd75-w88n7 1/1 Running 0 78s +kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs 1/1 Running 0 78s +kube-prometheus-prometheus-node-exporter-ts9nn 1/1 Running 0 78s +prometheus-kube-prometheus-kube-prome-prometheus-0 2/2 Running 0 60s +``` + +### 验证监控状态 + +使用 [MySQL protocol](#访问-greptimedb) 查询 GreptimeDB,验证 Prometheus 指标是否已写入。 + +```sql +SHOW TABLES; +``` + +你应该能看到为各种 Prometheus 指标创建的表名。 + +```sql ++---------------------------------------------------------------------------------+ +| Tables | ++---------------------------------------------------------------------------------+ +| :node_memory_MemAvailable_bytes:sum | +| ALERTS | +| ALERTS_FOR_STATE | +| aggregator_discovery_aggregation_count_total | +| aggregator_unavailable_apiservice | +| alertmanager_alerts | +| alertmanager_alerts_invalid_total | +| alertmanager_alerts_received_total | +| alertmanager_build_info | +| ...... | ++---------------------------------------------------------------------------------+ +1553 rows in set (0.18 sec) +``` + +## 使用 Grafana 进行可视化 + +Grafana 包含在 kube-prometheus-stack 中, +并预配置了 Prometheus 作为数据源的仪表盘。 + +### 访问 Grafana + +将 Grafana 服务的端口转发到本地以访问 Web 界面: + +```bash +kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80 +``` + +### 获取管理员凭证 + +使用 kubectl 检索登录使用的 admin 密码: + +```bash +kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo +``` + +### 登录 Grafana + +1. 打开浏览器并导航到 [http://localhost:3000](http://localhost:3000) +2. 
使用以下凭证登录: + - **用户名**:`admin` + - **密码**:从上一步检索到的密码 + +### 查看预配置的仪表板 + +登录后,导航到**仪表板**以探索预配置的 Kubernetes 监控仪表板: + +- **Kubernetes / Compute Resources / Cluster**:集群范围的资源利用率概览 +- **Kubernetes / Compute Resources / Namespace (Pods)**:按命名空间分解的资源使用情况 +- **Kubernetes / Compute Resources / Node (Pods)**:节点级资源监控 +- **Node Exporter / Nodes**:详细的节点硬件和操作系统指标 + +![Grafana Dashboard](/k8s-prom-monitor-grafana.jpg) + +## 总结 + +你现在部署了完整的 Kubernetes 监控解决方案, +使用 Prometheus 收集指标,使用 GreptimeDB 提供高效的长期存储。 +该解决方案使你能够: + +- 实时监控集群和应用程序健康状况 +- 存储指标以进行历史分析和容量规划 +- 使用 Grafana 创建丰富的可视化和仪表板 +- 使用 PromQL 和 SQL 查询指标 + +有关 GreptimeDB 和 Prometheus 集成的更多信息,请参阅: + +- [Prometheus 集成](/user-guide/ingest-data/for-observability/prometheus.md) +- [在 GreptimeDB 中查询数据](/user-guide/query-data/overview.md) diff --git a/sidebars.ts b/sidebars.ts index 5eee5e0f4..bd85e1935 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -403,6 +403,13 @@ const sidebars: SidebarsConfig = { }, ], }, + { + type: 'category', + label: 'Tutorials', + items:[ + 'tutorials/k8s-metrics-monitor' + ] + }, { type: 'category', label: 'GreptimeCloud', diff --git a/static/k8s-metrics-monitor-architecture.drawio.svg b/static/k8s-metrics-monitor-architecture.drawio.svg new file mode 100644 index 000000000..397299b6b --- /dev/null +++ b/static/k8s-metrics-monitor-architecture.drawio.svg @@ -0,0 +1,205 @@ + + + + + + + + + + + + + +
+ [drawio SVG source omitted: architecture diagram with labeled nodes for Kubernetes, kube-state-metrics, Node Exporter, Prometheus, GreptimeDB, and Grafana, with "Metrics" flow labels between them]
\ No newline at end of file diff --git a/static/k8s-prom-monitor-grafana.jpg b/static/k8s-prom-monitor-grafana.jpg new file mode 100644 index 000000000..9c42ffe01 Binary files /dev/null and b/static/k8s-prom-monitor-grafana.jpg differ diff --git a/versioned_docs/version-0.17/tutorials/k8s-metrics-monitor.md b/versioned_docs/version-0.17/tutorials/k8s-metrics-monitor.md new file mode 100644 index 000000000..7cb8ed5d9 --- /dev/null +++ b/versioned_docs/version-0.17/tutorials/k8s-metrics-monitor.md @@ -0,0 +1,319 @@ +--- +keywords: [Kubernetes, Prometheus, monitoring, metrics, observability, GreptimeDB, Prometheus Operator, Grafana] +description: Guide to monitoring Kubernetes metrics using Prometheus with GreptimeDB as the storage backend, including architecture overview, installation, and visualization with Grafana. +--- + +# Monitor Kubernetes Metrics with Prometheus and GreptimeDB + +This guide demonstrates how to set up a complete Kubernetes monitoring solution using Prometheus for metrics collection and GreptimeDB as the long-term storage backend. + +## What is Kubernetes Monitoring + +Kubernetes monitoring is the practice of collecting, analyzing, and acting on metrics and logs from a Kubernetes cluster. +It provides visibility into the health, performance, and resource utilization of your containerized applications and infrastructure. + +Key aspects of Kubernetes monitoring include: + +- **Resource Metrics**: CPU, memory, disk, and network usage for nodes, pods, and containers +- **Cluster Health**: Status of cluster components like kube-apiserver, etcd, and controller-manager +- **Application Metrics**: Custom metrics from your applications running in the cluster +- **Events and Logs**: Kubernetes events and container logs for troubleshooting + +Effective monitoring helps you: +- Detect and diagnose issues before they impact users +- Optimize resource utilization and reduce costs +- Plan capacity based on historical trends +- Ensure SLA compliance +- Troubleshoot performance bottlenecks + +## Architecture Overview + +The monitoring architecture consists of the following components: + +![Kubernetes Monitoring Architecture](/k8s-metrics-monitor-architecture.drawio.svg) + +**Components:** + +- **kube-state-metrics**: Exports cluster-level metrics about Kubernetes objects (deployments, pods, services, etc.) +- **Node Exporter**: Exports hardware and OS-level metrics from each Kubernetes node +- **Prometheus Operator**: Automates Prometheus deployment and configuration using Kubernetes custom resources +- **GreptimeDB**: Acts as the long-term storage backend for Prometheus metrics with high compression and query performance +- **Grafana**: Provides dashboards and visualizations for metrics stored in GreptimeDB + +## Prerequisites + +Before starting, ensure you have: + +- A running Kubernetes cluster (version >= 1.18) +- `kubectl` configured to access your cluster +- [Helm](https://helm.sh/docs/intro/install/) v3.0.0 or higher installed +- Sufficient cluster resources (at least 2 CPU cores and 4GB memory available) + +## Install GreptimeDB + +GreptimeDB serves as the long-term storage backend for Prometheus metrics. +For detailed installation steps, +please refer to the [Deploy GreptimeDB Cluster](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md) documentation. + +### Verify the GreptimeDB Installation + +After deploying GreptimeDB, verify that the cluster is running. 
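+To find the namespace and name of your own deployment, you can list every GreptimeDBCluster resource in the cluster:
+
+```bash
+kubectl get greptimedbclusters.greptime.io --all-namespaces
+```
+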
+In this guide we assume the GreptimeDB cluster is deployed in the `greptime-cluster` namespace and named `greptimedb`. + +```bash +kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb +``` + +```bash +NAME FRONTEND DATANODE META FLOWNODE PHASE VERSION AGE +greptimedb 1 2 1 1 Running v0.17.2 33s +``` + +Check the pods: + +```bash +kubectl get pods -n greptime-cluster +``` + +```bash +NAME READY STATUS RESTARTS AGE +greptimedb-datanode-0 1/1 Running 0 71s +greptimedb-datanode-1 1/1 Running 0 97s +greptimedb-flownode-0 1/1 Running 0 64s +greptimedb-frontend-8bf9f558c-7wdmk 1/1 Running 0 90s +greptimedb-meta-fc4ddb78b-nv944 1/1 Running 0 87s +``` + +### Access GreptimeDB + +To interact with GreptimeDB directly, you can port-forward the frontend service to your local machine. +GreptimeDB supports multiple protocols, with MySQL protocol available on port `4002` by default. + +```bash +kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002 +``` + +Connect using any MySQL-compatible client: + +```bash +mysql -h 127.0.0.1 -P 4002 +``` + +### Storage Partitioning + +To improve query performance and reduce storage costs, +GreptimeDB automatically creates columns based on Prometheus metric labels and stores metrics in a physical table. +The default table name is `greptime_physical_table`. +Since we deployed a GreptimeDB cluster with [multiple datanodes](#verify-the-greptimedb-installation), +you can partition the table to distribute data across datanodes for better scalability and performance. + +In this Kubernetes monitoring scenario, we can use the `namespace` label as the partition key. +For example, with namespaces like `kube-public`, `kube-system`, `monitoring`, `default`, `greptime-cluster`, and `etcd-cluster`, +you can create a partitioning scheme based on the first letter of the namespace: + +```sql +CREATE TABLE greptime_physical_table ( + greptime_value DOUBLE NULL, + namespace STRING PRIMARY KEY, + greptime_timestamp TIMESTAMP TIME INDEX, +) +PARTITION ON COLUMNS (namespace) ( + namespace < 'f', + namespace >= 'f' AND namespace < 'g', + namespace >= 'g' AND namespace < 'k', + namespace >= 'k' +) +ENGINE = metric +WITH ( + "physical_metric_table" = "" +); +``` + +For more information about Prometheus metrics storage and query performance optimization, refer to the [Improve efficiency by using metric engine](/user-guide/ingest-data/for-observability/prometheus.md#improve-efficiency-by-using-metric-engine) guide. + +### Prometheus URLs in GreptimeDB + +GreptimeDB provides [Prometheus-compatible APIs](/user-guide/query-data/promql.md#prometheus-http-api) under the HTTP context `/v1/prometheus/`, +enabling seamless integration with existing Prometheus workflows. + +To integrate Prometheus with GreptimeDB, you need the GreptimeDB service address. +Since GreptimeDB runs inside the Kubernetes cluster, use the internal cluster address. 
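+Before filling in the address, you can confirm the frontend Service name and ports from the cluster itself; this is the same Service that was port-forwarded earlier in this guide:
+
+```bash
+kubectl -n greptime-cluster get svc greptimedb-frontend
+```
+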
+ +The GreptimeDB frontend service address follows this pattern: +``` +-frontend..svc.cluster.local: +``` + +In this guide: +- GreptimeDB cluster name: `greptimedb` +- Namespace: `greptime-cluster` +- Frontend port: `4000` + +So the service address is: +```bash +greptimedb-frontend.greptime-cluster.svc.cluster.local:4000 +``` + +The complete [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) for Prometheus is: + +```bash +http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write +``` + +This URL consists of: +- **Service endpoint**: `greptimedb-frontend.greptime-cluster.svc.cluster.local:4000` +- **API path**: `/v1/prometheus/write` + +## Install Prometheus + +Now that GreptimeDB is running, we'll install Prometheus to collect metrics and send them to GreptimeDB for long-term storage. + +### Add the Prometheus Community Helm Repository + +```bash +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### Install the kube-prometheus-stack + +The [`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) is a comprehensive monitoring solution that includes +Prometheus, Grafana, kube-state-metrics, and node-exporter components. +This stack automatically discovers and monitors all Kubernetes namespaces, +collecting metrics from cluster components, nodes, and workloads. + +In this deployment, we'll configure Prometheus to use GreptimeDB as the remote write destination for long-term metric storage and configure Grafana's default Prometheus data source to use GreptimeDB. + +Create a `kube-prometheus-values.yaml` file with the following configuration: + +```yaml +# Configure Prometheus remote write to GreptimeDB +prometheus: + prometheusSpec: + remoteWrite: + - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write + +# Configure Grafana to use GreptimeDB as the default Prometheus data source +grafana: + datasources: + datasources.yaml: + apiVersion: 1 + datasources: + - name: Prometheus + type: prometheus + url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus + access: proxy + editable: true +``` + +This configuration file specifies [the GreptimeDB service address](#prometheus-urls-in-greptimedb) for: +- **Prometheus remote write**: Sends collected metrics to GreptimeDB for long-term storage +- **Grafana data source**: Configures GreptimeDB as the default Prometheus data source for dashboard queries + +Install the `kube-prometheus-stack` using Helm with the custom values file: + +```bash +helm install kube-prometheus prometheus-community/kube-prometheus-stack \ + --namespace monitoring \ + --create-namespace \ + --values kube-prometheus-values.yaml +``` + +### Verify the Installation + +Check that all Prometheus components are running: + +```bash +kubectl get pods -n monitoring +``` + +```bash +NAME READY STATUS RESTARTS AGE +alertmanager-kube-prometheus-kube-prome-alertmanager-0 2/2 Running 0 60s +kube-prometheus-grafana-78ccf96696-sghx4 3/3 Running 0 78s +kube-prometheus-kube-prome-operator-775fdbfd75-w88n7 1/1 Running 0 78s +kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs 1/1 Running 0 78s +kube-prometheus-prometheus-node-exporter-ts9nn 1/1 Running 0 78s +prometheus-kube-prometheus-kube-prome-prometheus-0 2/2 Running 0 60s +``` + +### Verify the Monitoring Status + +Use [MySQL protocol](#access-greptimedb) to query GreptimeDB and verify that Prometheus metrics 
are being written. + +```sql +SHOW TABLES; +``` + +You should see tables created for various Prometheus metrics. + +```sql ++---------------------------------------------------------------------------------+ +| Tables | ++---------------------------------------------------------------------------------+ +| :node_memory_MemAvailable_bytes:sum | +| ALERTS | +| ALERTS_FOR_STATE | +| aggregator_discovery_aggregation_count_total | +| aggregator_unavailable_apiservice | +| alertmanager_alerts | +| alertmanager_alerts_invalid_total | +| alertmanager_alerts_received_total | +| alertmanager_build_info | +| ...... | ++---------------------------------------------------------------------------------+ +1553 rows in set (0.18 sec) +``` + +## Use Grafana for Visualization + +Grafana is included in the kube-prometheus-stack and comes pre-configured with dashboards for comprehensive Kubernetes monitoring. + +### Access Grafana + +Port-forward the Grafana service to access the web interface: + +```bash +kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80 +``` + +### Get Admin Credentials + +Retrieve the admin password using kubectl: + +```bash +kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo +``` + +### Login Grafana + +1. Open your browser and navigate to [http://localhost:3000](http://localhost:3000) +2. Login with: + - **Username**: `admin` + - **Password**: The password retrieved from the previous step + +### Explore Pre-configured Dashboards + +After logging in, navigate to **Dashboards** to explore the pre-configured Kubernetes monitoring dashboards: + +- **Kubernetes / Compute Resources / Cluster**: Overview of cluster-wide resource utilization +- **Kubernetes / Compute Resources / Namespace (Pods)**: Resource usage breakdown by namespace +- **Kubernetes / Compute Resources / Node (Pods)**: Node-level resource monitoring +- **Node Exporter / Nodes**: Detailed node hardware and OS metrics + +![Grafana Dashboard](/k8s-prom-monitor-grafana.jpg) + +## Conclusion + +You now have a complete Kubernetes monitoring solution with Prometheus collecting metrics and GreptimeDB providing efficient long-term storage. This setup enables you to: + +- Monitor cluster and application health in real-time +- Store metrics for historical analysis and capacity planning +- Create rich visualizations and dashboards with Grafana +- Query metrics using both PromQL and SQL + +For more information about GreptimeDB and Prometheus integration, see: + +- [Prometheus Integration](/user-guide/ingest-data/for-observability/prometheus.md) +- [Query Data in GreptimeDB](/user-guide/query-data/overview.md) + diff --git a/versioned_sidebars/version-0.17-sidebars.json b/versioned_sidebars/version-0.17-sidebars.json index 19a7fd1fe..3f8191dad 100644 --- a/versioned_sidebars/version-0.17-sidebars.json +++ b/versioned_sidebars/version-0.17-sidebars.json @@ -402,6 +402,13 @@ } ] }, + { + "type": "category", + "label": "Tutorials", + "items": [ + "tutorials/k8s-metrics-monitor" + ] + }, { "type": "category", "label": "GreptimeCloud",