bug: add flush_interval_sec #1459


Open
SohilShri opened this issue Jan 27, 2025 · 1 comment

Describe the issue

We are using:

  1. Fluent Bit image: kubesphere/fluentbit-3.1.8

  2. Fluent Operator version: https://github.com/fluent/fluent-operator/releases/tag/v3.2.0

We recently enabled the logToMetrics plugin through the Fluent Operator CRD.
Since then, Prometheus has been logging warnings that it is receiving duplicate metrics with different values but the same timestamp:

ts=2025-01-27T12:51:31.436Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.29.143:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=16
ts=2025-01-27T12:51:32.029Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.39.145:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=8
ts=2025-01-27T12:51:42.463Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.7.132:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=30
ts=2025-01-27T12:51:43.822Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.1.52:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=3

Below is the endpoint section of our ServiceMonitor:

  endpoints:
    - port: metrics
      path: /api/v2/metrics/prometheus
      interval: 30s

Below is our logToMetrics plugin configuration:
 - logToMetrics:
      addLabel:
      - timestamp os.time()
      kubernetesMode: true
      metricDescription: Count of logs processed by fluent-bit - Gauge
      metricMode: counter
      metricName: product_log_to_metrics
      tag: product_log_to_metrics

This issue has already been reported on the Fluent Bit side. The workaround suggested there, setting flush_interval_sec, is not available in the latest Fluent Operator CRD:
fluent/fluent-bit#9413
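
For reference, a minimal sketch of what the suggested workaround looks like in raw Fluent Bit classic configuration, which the operator CRD cannot currently express. The match pattern and the 30-second value are illustrative assumptions, not values taken from this report:

    [FILTER]
        name                log_to_metrics
        match               kube.*
        tag                 product_log_to_metrics
        metric_mode         counter
        metric_name         product_log_to_metrics
        metric_description  Count of logs processed by fluent-bit
        kubernetes_mode     on
        # flush_interval_sec is the option this issue asks to expose; the 30s value
        # is an assumption chosen to match the Prometheus scrape interval above.
        flush_interval_sec  30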

To Reproduce

Same as: fluent/fluent-bit#9413

Steps to reproduce the problem:

  1. Deploy Fluent Bit 3.1.4 as a DaemonSet into a Kubernetes cluster with a tail input config (as in the upstream issue) and confirm that container logs and metrics are produced as expected.

  2. Update the Fluent Bit image to 3.1.5 (or newer, up to 3.1.8) and inspect the /metrics endpoint on port 2021.

Expected behavior

No duplicate metrics on the additional /metrics endpoint exposed by the log_to_metrics feature (usually on port 2021), no warnings in the Prometheus logs, and no PrometheusDuplicateTimestamps errors.

Your Environment

- Fluent Operator version: 3.2.0
- Environment name and version (e.g. Kubernetes? What version?): EKS Kubernetes 1.27
- Filter and Plugins: kubernetes, log_to_metrics

How did you install fluent operator?

No response

Additional context

No response

@cw-Guo changed the title from "bug:" to "bug: add flush_interval_sec" on Jan 27, 2025
cw-Guo (Collaborator) commented Jan 27, 2025

@SohilShri Thanks for reporting. It would be great if you are able to contribute this.
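
For anyone picking this up, a rough sketch of how the option might look in the CRD once exposed; the field name flushIntervalSec is an assumption that simply follows the camelCase style of the existing fields and is not available in Fluent Operator v3.2.0:

 - logToMetrics:
      kubernetesMode: true
      metricMode: counter
      metricName: product_log_to_metrics
      tag: product_log_to_metrics
      # hypothetical field mapping to Fluent Bit's flush_interval_sec;
      # not present in the current CRD, naming is an assumption
      flushIntervalSec: 30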
