From 54eaee18831ebf99813a86b0eb1a54357dd63d2f Mon Sep 17 00:00:00 2001 From: marco Date: Tue, 26 Aug 2025 12:32:34 +0200 Subject: [PATCH 01/22] document post-install behavior of "cscli setup unattended" --- crowdsec-docs/docs/log_processor/intro.mdx | 44 +++++++++++++++++----- 1 file changed, 34 insertions(+), 10 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/intro.mdx b/crowdsec-docs/docs/log_processor/intro.mdx index 32df210a9..90c7e1dd1 100644 --- a/crowdsec-docs/docs/log_processor/intro.mdx +++ b/crowdsec-docs/docs/log_processor/intro.mdx @@ -4,13 +4,13 @@ title: Introduction sidebar_position: 1 --- -The Log Processor is one of the core component of the Security Engine to: +The Log Processor is a core component of the Security Engine. It: -- Read logs from [Data Sources](log_processor/data_sources/introduction.md) in the form of Acquistions. -- Parse the logs and extract relevant information using [Parsers](log_processor/parsers/introduction.mdx). -- Enrich the parsed information with additional context such as GEOIP, ASN using [Enrichers](log_processor/parsers/enricher.md). -- Monitor the logs for patterns of interest known as [Scenarios](log_processor/scenarios/introduction.mdx). -- Push alerts to the Local API (LAPI) for alert/decisions to be stored within the database. +- Reads logs from [Data Sources](log_processor/data_sources/introduction.md) via Acquistions. +- Parses logs and extract relevant information using [Parsers](log_processor/parsers/introduction.mdx). +- Enriches the parsed information with additional context such as GEOIP, ASN using [Enrichers](log_processor/parsers/enricher.md). +- Monitors patterns of interest via [Scenarios](log_processor/scenarios/introduction.mdx). +- Pushes alerts to the Local API (LAPI), where alert/decisions are stored. 
!TODO: Add diagram of the log processor pipeline - Read logs from datasources @@ -21,7 +21,7 @@ The Log Processor is one of the core component of the Security Engine to: ## Introduction -The Log Processor is an internal core component of the Security Engine in charge of reading logs from Data Sources, parsing them, enriching them, and monitoring them for patterns of interest. +The Log Processor reads logs from Data Sources, parses and enriches them, and monitors them for patterns of interest. Once a pattern of interest is detected, the Log Processor will push alerts to the Local API (LAPI) for alert/decisions to be stored within the database. @@ -35,10 +35,10 @@ Data Sources are individual modules that can be loaded at runtime by the Log Pro Acquisitions are the configuration files that define how the Log Processor should read logs from a Data Source. Acquisitions are defined in YAML format and are loaded by the Log Processor at runtime. -We have two ways to define Acquisitions within the [configuration directory](/u/troubleshooting/security_engine#where-is-configuration-stored) : +We support two ways to define Acquisitions in the [configuration directory](/u/troubleshooting/security_engine#where-is-configuration-stored): -- `acquis.yaml` file: This used to be only place to define Acquisitions prior to `1.5.0`. This file is still supported for backward compatibility. -- `acquis.d` folder: This is a directory where you can define multiple Acquisitions in separate files. This is useful when you want to auto generate files using an external application such as ansible. 
+- `acquis.yaml` file: the legacy, single-file configuration (still supported) +- `acquis.d` directory: a directory of multiple acquisition files (since v1.5.0, recommended for any non-trivial setup) ```yaml title="Example Acquisition Configuration" ## /etc/crowdsec/acquis.d/file.yaml @@ -50,8 +50,32 @@ labels: type: syslog ``` +When CrowdSec is installed via a package manager on a fresh system, the package manager may run `cscli setup` in **unattended** mode. +It detects installed services and common log file locations, installs the related Hub collections, and generates acquisition files under `acquis.d/setup..yaml`, e.g. `setup.linux.yaml`). + +Generated files are meant to be managed by crowdsec; don’t edit them in place. If you need changes, delete the generated file and create your own. + +When upgrading or reinstalling crowdsec, it detects non-generated or modified files and won’t overwrite your custom acquisitions. + +:::caution + +Make sure the same data sources aren’t ingested more than once: duplicating inputs can artificially increase scenario sensitivity. + +::: + +Examples: + + - If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. + + - If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yam`. You don’t need to add a separate acquisition for the same logs. + +For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup. +In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux). + For more information on Data Sources and Acquisitions, see the [Data Sources](log_processor/data_sources/introduction.md) documentation. +For more information on the automated configuration, see the command `cscli setup`. 
+ ## Collections Collections are used to group together Parsers, Scenarios, and Enrichers that are related to a specific application. For example the `crowdsecurity/nginx` collection contains all the Parsers, Scenarios, and Enrichers that are needed to parse logs from an NGINX web server and detect patterns of interest. From b2e2def63dccf9d3db4ca35c50827b3e5150cd60 Mon Sep 17 00:00:00 2001 From: marco Date: Thu, 28 Aug 2025 15:31:31 +0200 Subject: [PATCH 02/22] moved content --- .../data_sources/introduction.md | 97 ++++++++++++++----- crowdsec-docs/docs/log_processor/intro.mdx | 24 ----- .../post_installation/acquisition.mdx | 7 +- 3 files changed, 79 insertions(+), 49 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/data_sources/introduction.md b/crowdsec-docs/docs/log_processor/data_sources/introduction.md index 8cb0281d8..e8bfb2cdf 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/introduction.md +++ b/crowdsec-docs/docs/log_processor/data_sources/introduction.md @@ -6,26 +6,70 @@ sidebar_position: 1 ## Datasources -To be able to monitor applications, the Security Engine needs to access logs. -DataSources are configured via the [acquisition](/configuration/crowdsec_configuration.md#acquisition_path) configuration, or specified via the command-line when performing cold logs analysis. +To monitor applications, the Security Engine needs to read logs. +DataSources define where to access them (either as files, or over the network from a centralized logging service). +They can be defined: + +- in [Acquisition files](/configuration/crowdsec_configuration.md#acquisition_path). Each file can contain multiple DataSource definitions. +- for cold log analysis, you can also specify acquisitions via the command line. + + +### Service detection (automated setup) + +When CrowdSec is installed via a package manager on a fresh system, the package may run [`cscli setup`](/cscli/cscli_setup) in **unattended** mode. 
+ +The `cscli setup` command will: + +- detect installed services and common log file locations +- install the related Hub collections +- generate acquisition files under `acquis.d/` as `setup..yaml` (e.g., `setup.linux.yaml`) + +Generated files are meant to be managed by CrowdSec; don’t edit them in place. If you need changes, delete the generated file and create your own. + +When upgrading or reinstalling CrowdSec, it detects non-generated or modified files and won’t overwrite your custom acquisitions. + +:::caution + +Make sure the same data sources are not ingested more than once: duplicating inputs can artificially increase scenario sensitivity. + +::: + +Examples: + +- If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. +- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yam`. You don’t need to add a separate acquisition for the same logs. + +For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup. +In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux). + +### Assisted service detection (semi-automated setup) + +If you installed new applications and want to detect the service detection again, running [`cscli setup`](/cscli/cscli_setup) yourself will guide you through the +automated setup, with confirmation prompts. You will receive a warning if you already configured some acquisition yourself but they won't be +modified by `cscli`. + +Note that `cscli setup` will not remove any collection or acquisition file in `acquis.d/setup..yaml`, even if the service has been uninstalled since the file creation. 
+ + +## Datasources modules Name | Type | Stream | One-shot -----|------|--------|---------- -[Appsec](/log_processor/data_sources/appsec.md) | expose HTTP service for the Appsec component | yes | no -[AWS cloudwatch](/log_processor/data_sources/cloudwatch.md) | single stream or log group | yes | yes -[AWS kinesis](/log_processor/data_sources/kinesis.md)| read logs from a kinesis strean | yes | no -[AWS S3](/log_processor/data_sources/s3.md)| read logs from a S3 bucket | yes | yes -[docker](/log_processor/data_sources/docker.md) | read logs from docker containers | yes | yes -[file](/log_processor/data_sources/file.md) | single files, glob expressions and .gz files | yes | yes -[HTTP](/log_processor/data_sources/http.md) | read logs from an HTTP endpoint | yes | no -[journald](/log_processor/data_sources/journald.md) | journald via filter | yes | yes -[Kafka](/log_processor/data_sources/kafka.md)| read logs from kafka topic | yes | no -[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit.md) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no -[Loki](/log_processor/data_sources/loki.md) | read logs from loki | yes | yes -[VictoriaLogs](/log_processor/data_sources/victorialogs.md) | read logs from VictoriaLogs | yes | yes -[syslog service](/log_processor/data_sources/syslog_service.md) | read logs received via syslog protocol | yes | no -[Windows Event](/log_processor/data_sources/windows_event_log.md)| read logs from windows event log | yes | yes +[Appsec](/log_processor/data_sources/appsec) | expose HTTP service for the Appsec component | yes | no +[AWS cloudwatch](/log_processor/data_sources/cloudwatch) | single stream or log group | yes | yes +[AWS kinesis](/log_processor/data_sources/kinesis)| read logs from a kinesis strean | yes | no +[AWS S3](/log_processor/data_sources/s3)| read logs from a S3 bucket | yes | yes +[docker](/log_processor/data_sources/docker) | read logs from docker containers | yes | yes 
+[file](/log_processor/data_sources/file) | single files, glob expressions and .gz files | yes | yes +[HTTP](/log_processor/data_sources/http) | read logs from an HTTP endpoint | yes | no +[journald](/log_processor/data_sources/journald) | journald via filter | yes | yes +[Kafka](/log_processor/data_sources/kafka)| read logs from kafka topic | yes | no +[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no +[Loki](/log_processor/data_sources/loki) | read logs from loki | yes | yes +[VictoriaLogs](/log_processor/data_sources/victorialogs) | read logs from VictoriaLogs | yes | yes +[syslog service](/log_processor/data_sources/syslog_service) | read logs received via syslog protocol | yes | no +[Windows Event](/log_processor/data_sources/windows_event_log)| read logs from windows event log | yes | yes ## Common configuration parameters @@ -46,6 +90,7 @@ An expression that will run after the acquisition has read one line, and before It allows to modify an event (or generate multiple events from one line) before parsing. For example, if you acquire logs from a file containing a JSON object on each line, and each object has a `Records` array with multiple events, you can use the following to generate one event per entry in the array: + ``` map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#)) ``` @@ -70,31 +115,39 @@ If not set, then crowdsec will think all logs happened at once, which can lead t A map of labels to add to the event. The `type` label is mandatory, and used by the Security Engine to choose which parser to use. 
-## Acquisition configuration example +## Acquisition configuration examples -```yaml title="/etc/crowdsec/acquis.yaml" +```yaml title="/etc/crowdsec/acquis.d/nginx.yaml" filenames: - /var/log/nginx/*.log labels: type: nginx ---- +``` + +```yaml title="/etc/crowdsec/acquis.d/linux.yaml" filenames: - /var/log/auth.log - /var/log/syslog labels: type: syslog ---- +``` + +```yaml title="/etc/crowdsec/acquis.d/docker.yaml" source: docker container_name_regexp: - .*caddy* labels: type: caddy --- -... +source: docker +container_name_regexp: + - .*nginx* +labels: + type: nginx ``` :::warning The `labels` and `type` fields are necessary to dispatch the log lines to the right parser. -Also note between each datasource is `---` this is needed to separate multiple YAML documents (each datasource) in a single file. +In the last example we defined multiple datasources separated by the line `---`, which is the standard YAML marker. ::: diff --git a/crowdsec-docs/docs/log_processor/intro.mdx b/crowdsec-docs/docs/log_processor/intro.mdx index 90c7e1dd1..b11a16531 100644 --- a/crowdsec-docs/docs/log_processor/intro.mdx +++ b/crowdsec-docs/docs/log_processor/intro.mdx @@ -50,32 +50,8 @@ labels: type: syslog ``` -When CrowdSec is installed via a package manager on a fresh system, the package manager may run `cscli setup` in **unattended** mode. -It detects installed services and common log file locations, installs the related Hub collections, and generates acquisition files under `acquis.d/setup..yaml`, e.g. `setup.linux.yaml`). - -Generated files are meant to be managed by crowdsec; don’t edit them in place. If you need changes, delete the generated file and create your own. - -When upgrading or reinstalling crowdsec, it detects non-generated or modified files and won’t overwrite your custom acquisitions. - -:::caution - -Make sure the same data sources aren’t ingested more than once: duplicating inputs can artificially increase scenario sensitivity. 
- -::: - -Examples: - - - If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. - - - If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yam`. You don’t need to add a separate acquisition for the same logs. - -For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup. -In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux). - For more information on Data Sources and Acquisitions, see the [Data Sources](log_processor/data_sources/introduction.md) documentation. -For more information on the automated configuration, see the command `cscli setup`. - ## Collections Collections are used to group together Parsers, Scenarios, and Enrichers that are related to a specific application. For example the `crowdsecurity/nginx` collection contains all the Parsers, Scenarios, and Enrichers that are needed to parse logs from an NGINX web server and detect patterns of interest. diff --git a/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx b/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx index 5dcfd1bea..91ebcc8bb 100644 --- a/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx +++ b/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx @@ -5,13 +5,14 @@ title: Acquisition # Acquisition -By default when CrowdSec is installed it will attempt to detect the running services and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro). 
+By default when CrowdSec is installed it will attempt to [detect the running services](/log_processor/data_sources#service-detection) and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro). -However, we should check that this detection worked or you may want to manually acquire additional [Collections](https://docs.crowdsec.net/docs/next/collections/intro) for other services that are not detected. +However, we should check that this detection worked and the log locations are correct. +You may want to manually acquire additional [Collections](https://docs.crowdsec.net/docs/next/collections/intro) for the services that were not detected. ## What log sources are already detected? -To find what log sources are already detected, you can use the `cscli` command line tool. +To find out which log sources are providing data to crowdsec, you can query the CrowdSec metrics with the `cscli` command line tool. ```bash cscli metrics show acquisition From 6f3d358dfb13864bb3a70cd8fb26cfca6c81450c Mon Sep 17 00:00:00 2001 From: marco Date: Thu, 28 Aug 2025 15:55:57 +0200 Subject: [PATCH 03/22] formatting, title --- .../docs/log_processor/data_sources/introduction.md | 8 +++----- crowdsec-docs/docs/log_processor/intro.mdx | 2 +- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/data_sources/introduction.md b/crowdsec-docs/docs/log_processor/data_sources/introduction.md index e8bfb2cdf..77b23d7b1 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/introduction.md +++ b/crowdsec-docs/docs/log_processor/data_sources/introduction.md @@ -1,11 +1,9 @@ --- id: intro -title: Acquisition Datasources Introduction +title: Acquisition Datasources sidebar_position: 1 --- -## Datasources - To monitor applications, the Security Engine needs to read logs. DataSources define where to access them (either as files, or over the network from a centralized logging service). 
@@ -33,13 +31,13 @@ When upgrading or reinstalling CrowdSec, it detects non-generated or modified fi Make sure the same data sources are not ingested more than once: duplicating inputs can artificially increase scenario sensitivity. -::: - Examples: - If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. - If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yam`. You don’t need to add a separate acquisition for the same logs. +::: + For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup. In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux). diff --git a/crowdsec-docs/docs/log_processor/intro.mdx b/crowdsec-docs/docs/log_processor/intro.mdx index b11a16531..4d2c27c80 100644 --- a/crowdsec-docs/docs/log_processor/intro.mdx +++ b/crowdsec-docs/docs/log_processor/intro.mdx @@ -19,7 +19,7 @@ The Log Processor is a core component of the Security Engine. It: - Monitor the logs for patterns of interest -## Introduction +## Log Processor The Log Processor reads logs from Data Sources, parses and enriches them, and monitors them for patterns of interest. 
From c059a3e434c58782d9184f288f36562cae0389fd Mon Sep 17 00:00:00 2001 From: marco Date: Thu, 28 Aug 2025 16:28:38 +0200 Subject: [PATCH 04/22] typos --- crowdsec-docs/docs/appsec/alerts_and_scenarios.md | 2 +- crowdsec-docs/docs/appsec/configuration.md | 4 ++-- crowdsec-docs/docs/appsec/quickstart/traefik.mdx | 8 ++++---- crowdsec-docs/docs/getting_started/crowdsec_tour.mdx | 2 +- .../docs/log_processor/data_sources/introduction.md | 6 +++--- .../docs/log_processor/data_sources/syslog_service.md | 4 ++-- .../docs/log_processor/data_sources/troubleshoot.md | 2 +- crowdsec-docs/unversioned/bouncers/haproxy.mdx | 2 +- crowdsec-docs/unversioned/bouncers/ingress-nginx.mdx | 2 +- crowdsec-docs/unversioned/bouncers/nginx.mdx | 2 +- crowdsec-docs/unversioned/bouncers/openresty.mdx | 4 ++-- crowdsec-docs/unversioned/cti_api/api_introduction.md | 2 +- .../getting_started/installation/cloudways.mdx | 2 +- .../unversioned/getting_started/installation/docker.mdx | 2 +- .../troubleshooting/remediation_components.mdx | 2 +- 15 files changed, 23 insertions(+), 23 deletions(-) diff --git a/crowdsec-docs/docs/appsec/alerts_and_scenarios.md b/crowdsec-docs/docs/appsec/alerts_and_scenarios.md index c05f9d5eb..0fd252927 100644 --- a/crowdsec-docs/docs/appsec/alerts_and_scenarios.md +++ b/crowdsec-docs/docs/appsec/alerts_and_scenarios.md @@ -115,7 +115,7 @@ We can now create a scenario that will trigger when a single IPs triggers this r type: leaky format: 3.0 name: crowdsecurity/foobar-enum -description: "Ban IPs repeateadly triggering out of band rules" +description: "Ban IPs repeatedly triggering out of band rules" filter: "evt.Meta.log_type == 'appsec-info' && evt.Meta.rule_name == 'crowdsecurity/foobar-access'" distinct: evt.Meta.target_uri leakspeed: "60s" diff --git a/crowdsec-docs/docs/appsec/configuration.md b/crowdsec-docs/docs/appsec/configuration.md index a0f4d3ee8..dfd2c0018 100644 --- a/crowdsec-docs/docs/appsec/configuration.md +++ 
b/crowdsec-docs/docs/appsec/configuration.md @@ -6,7 +6,7 @@ sidebar_position: 6 ## Overview -This page explains the interraction between various files involved in AppSec configuration and the details about the processing pipeline AppSec request processing. +This page explains the interaction between various files involved in AppSec configuration and the details about the processing pipeline AppSec request processing. **Prerequisites**: - Familiarity with [AppSec concepts](/appsec/intro.md) @@ -24,7 +24,7 @@ The goals of the acquisition file are: - To specify the **address** and **port** where the AppSec-enabled Remediation Component(s) will forward the requests to. - And specify one or more [AppSec configuration files](#appsec-configuration) to use as definition of what rules to apply and how. -Details can be found in the [AppSec Datasource page](/log_processor/data_sources/apps). +Details can be found in the [AppSec Datasource page](/log_processor/data_sources/appsec.md). ### Defining Multiple AppSec Configurations diff --git a/crowdsec-docs/docs/appsec/quickstart/traefik.mdx b/crowdsec-docs/docs/appsec/quickstart/traefik.mdx index 1ecb98ad5..5663fe339 100644 --- a/crowdsec-docs/docs/appsec/quickstart/traefik.mdx +++ b/crowdsec-docs/docs/appsec/quickstart/traefik.mdx @@ -25,7 +25,7 @@ Additionally, we'll show how to monitor these alerts through the [console](https - Traefik Plugin **[Remediation Component](/u/bouncers/intro)**: Thanks to [maxlerebourg](https://github.com/maxlerebourg) and team they created a [Traefik Plugin](https://plugins.traefik.io/plugins/6335346ca4caa9ddeffda116/crowdsec-bouncer-traefik-plugin) that allows you to block requests directly from Traefik. 
:::info -Prior to starting the guide ensure you are using the [Traefik Plugin](https://plugins.traefik.io/plugins/6335346ca4caa9ddeffda116/crowdsec-bouncer-traefik-plugin) and **NOT** the older [traefik-crowdsec-bouncer](https://app.crowdsec.net/hub/author/fbonalair/remediation-components/traefik-crowdsec-bouncer) as it hasnt recieved updates to use the new AppSec Component. +Prior to starting the guide ensure you are using the [Traefik Plugin](https://plugins.traefik.io/plugins/6335346ca4caa9ddeffda116/crowdsec-bouncer-traefik-plugin) and **NOT** the older [traefik-crowdsec-bouncer](https://app.crowdsec.net/hub/author/fbonalair/remediation-components/traefik-crowdsec-bouncer) as it hasnt received updates to use the new AppSec Component. ::: :::warning @@ -77,7 +77,7 @@ If you have a folder in which you are persisting the configuration files, you ca There steps will change depending on how you are running the Security Engine. If you are running via `docker run` then you should launch the container within the same directory as the `appsec.yaml` file. If you are using `docker-compose` you can use a relative file mount to mount the `appsec.yaml` file. Steps: - 1. Change to the location where you exectued the `docker run` or `docker compose` command. + 1. Change to the location where you executted the `docker run` or `docker compose` command. 2. Create a `appsec.yaml` file at the base of the directory. 3. Add the following content to the `appsec.yaml` file. 
@@ -96,11 +96,11 @@ Since CrowdSec is running inside a container you must set the `listen_addr` to ` diff --git a/crowdsec-docs/docs/getting_started/crowdsec_tour.mdx b/crowdsec-docs/docs/getting_started/crowdsec_tour.mdx index 6230d19c5..1d0c90d11 100644 --- a/crowdsec-docs/docs/getting_started/crowdsec_tour.mdx +++ b/crowdsec-docs/docs/getting_started/crowdsec_tour.mdx @@ -250,7 +250,7 @@ Those metrics are a great way to know if your configuration is correct: The `Acquisition Metrics` is a great way to know if your parsers are setup correctly: - If you have 0 **LINES PARSED** for a source : You are probably *missing* a parser, or you have a custom log format that prevents the parser from understanding your logs. - - However, it's perfectly OK to have a lot of **LINES UNPARSED** : Crowdsec is not a SIEM, and only parses the logs that are relevant to its scenarios. For example, [ssh parser](https://hub.crowdsec.net/author/crowdsecurity/configurations/sshd-logs), only cares about failed authentication events (at the time of writting). + - However, it's perfectly OK to have a lot of **LINES UNPARSED** : Crowdsec is not a SIEM, and only parses the logs that are relevant to its scenarios. For example, [ssh parser](https://hub.crowdsec.net/author/crowdsecurity/configurations/sshd-logs), only cares about failed authentication events (at the time of writing). 
- **LINES POURED TO BUCKET** tell you that your scenarios are matching your log sources : it means that some events from this log source made all their way to an actual scenario diff --git a/crowdsec-docs/docs/log_processor/data_sources/introduction.md b/crowdsec-docs/docs/log_processor/data_sources/introduction.md index 77b23d7b1..9aed2f7a6 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/introduction.md +++ b/crowdsec-docs/docs/log_processor/data_sources/introduction.md @@ -34,7 +34,7 @@ Make sure the same data sources are not ingested more than once: duplicating inp Examples: - If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. -- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yam`. You don’t need to add a separate acquisition for the same logs. +- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yaml`. You don’t need to add a separate acquisition for the same logs. 
::: @@ -56,7 +56,7 @@ Name | Type | Stream | One-shot -----|------|--------|---------- [Appsec](/log_processor/data_sources/appsec) | expose HTTP service for the Appsec component | yes | no [AWS cloudwatch](/log_processor/data_sources/cloudwatch) | single stream or log group | yes | yes -[AWS kinesis](/log_processor/data_sources/kinesis)| read logs from a kinesis strean | yes | no +[AWS kinesis](/log_processor/data_sources/kinesis)| read logs from a kinesis stream | yes | no [AWS S3](/log_processor/data_sources/s3)| read logs from a S3 bucket | yes | yes [docker](/log_processor/data_sources/docker) | read logs from docker containers | yes | yes [file](/log_processor/data_sources/file) | single files, glob expressions and .gz files | yes | yes @@ -105,7 +105,7 @@ By default, when reading logs in real-time, crowdsec will use the time at which Setting this option to `true` will force crowdsec to use the timestamp from the log as the time of the event. -It is mandatory to set this if your application buffers logs before writting them (for example, IIS when writing to a log file, or logs written to S3 from almost any AWS service).
+It is mandatory to set this if your application buffers logs before writing them (for example, IIS when writing to a log file, or logs written to S3 from almost any AWS service).
If not set, then crowdsec will think all logs happened at once, which can lead to some false positive detections. ### `labels` diff --git a/crowdsec-docs/docs/log_processor/data_sources/syslog_service.md b/crowdsec-docs/docs/log_processor/data_sources/syslog_service.md index 691da461f..c595bb98b 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/syslog_service.md +++ b/crowdsec-docs/docs/log_processor/data_sources/syslog_service.md @@ -51,6 +51,6 @@ This module does not support command-line acquisition. :::warning This syslog datasource is currently intended for small setups, and is at risk of losing messages over a few hundreds events/second. -To process significant amounts of logs, rely on dedicated syslog server such as [rsyslog](https://www.rsyslog.com/), with this server writting logs to files that Security Engine will read from. +To process significant amounts of logs, rely on dedicated syslog server such as [rsyslog](https://www.rsyslog.com/), with this server writing logs to files that Security Engine will read from. This page will be updated with further improvements of this data source. -::: \ No newline at end of file +::: diff --git a/crowdsec-docs/docs/log_processor/data_sources/troubleshoot.md b/crowdsec-docs/docs/log_processor/data_sources/troubleshoot.md index 8eeaef795..c5f9121d3 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/troubleshoot.md +++ b/crowdsec-docs/docs/log_processor/data_sources/troubleshoot.md @@ -5,7 +5,7 @@ sidebar_position: 10 --- The [prometheus](/observability/prometheus.md) instrumentation exposes metrics about acquisition and data sources. 
-Those can as well be view via `cscli metrics` : +Those can as well be viewed via `cscli metrics` : ```bash INFO[19-08-2021 06:33:31 PM] Acquisition Metrics: diff --git a/crowdsec-docs/unversioned/bouncers/haproxy.mdx b/crowdsec-docs/unversioned/bouncers/haproxy.mdx index c9428dd20..b8ba85373 100644 --- a/crowdsec-docs/unversioned/bouncers/haproxy.mdx +++ b/crowdsec-docs/unversioned/bouncers/haproxy.mdx @@ -43,7 +43,7 @@ This component is compatible with HAProxy version 2.5 and higher. ## How does it work ? -This component leverages haproxy lua's API to check e IP address against the local API. +This component leverages haproxy lua's API to check the IP address against the local API. Supported features: diff --git a/crowdsec-docs/unversioned/bouncers/ingress-nginx.mdx b/crowdsec-docs/unversioned/bouncers/ingress-nginx.mdx index 735f3e914..b6cf50f4d 100644 --- a/crowdsec-docs/unversioned/bouncers/ingress-nginx.mdx +++ b/crowdsec-docs/unversioned/bouncers/ingress-nginx.mdx @@ -312,7 +312,7 @@ CAPTCHA_PROVIDER=recaptcha ``` :::info -For backwards compatability reasons `recaptcha` is the default if no value is set. +For backwards compatibility reasons `recaptcha` is the default if no value is set. 
::: ### `SECRET_KEY` diff --git a/crowdsec-docs/unversioned/bouncers/nginx.mdx b/crowdsec-docs/unversioned/bouncers/nginx.mdx index d6f86cae0..46bcb0121 100644 --- a/crowdsec-docs/unversioned/bouncers/nginx.mdx +++ b/crowdsec-docs/unversioned/bouncers/nginx.mdx @@ -515,7 +515,7 @@ CAPTCHA_PROVIDER= note: The ratio of fire to smoke is around 1% at the time of writting +> note: The ratio of fire to smoke is around 1% at the time of writing ## CTI Information diff --git a/crowdsec-docs/unversioned/getting_started/installation/cloudways.mdx b/crowdsec-docs/unversioned/getting_started/installation/cloudways.mdx index 78ab610c8..c8a85d78d 100644 --- a/crowdsec-docs/unversioned/getting_started/installation/cloudways.mdx +++ b/crowdsec-docs/unversioned/getting_started/installation/cloudways.mdx @@ -291,7 +291,7 @@ We want CrowdSec to run in the background and start at boot. For this we'll add a systemd service in the user level. ### Create the systemd service for user -- At the time of writting (for v1.6.3) you can use the following content: +- At the time of writing (for v1.6.3) you can use the following content: - Create and edit ~/.config/systemd/user/crowdsec.service ```bash [Unit] diff --git a/crowdsec-docs/unversioned/getting_started/installation/docker.mdx b/crowdsec-docs/unversioned/getting_started/installation/docker.mdx index 1eae6a609..4174096e1 100644 --- a/crowdsec-docs/unversioned/getting_started/installation/docker.mdx +++ b/crowdsec-docs/unversioned/getting_started/installation/docker.mdx @@ -64,7 +64,7 @@ crowdsec: #### Compose key aspects -If you dont find an example that fits your needs, you can create your own `docker-compose.yml` file. Here are the key aspects: +If you don't find an example that fits your needs, you can create your own `docker-compose.yml` file. 
Here are the key aspects: ##### Provide Access To Logs diff --git a/crowdsec-docs/unversioned/troubleshooting/remediation_components.mdx b/crowdsec-docs/unversioned/troubleshooting/remediation_components.mdx index e189be23f..df7ce8ae9 100644 --- a/crowdsec-docs/unversioned/troubleshooting/remediation_components.mdx +++ b/crowdsec-docs/unversioned/troubleshooting/remediation_components.mdx @@ -59,7 +59,7 @@ You can use the os related commands to filter the logs to only show errors. **Please make sure the log location matches your distribution.** -## My Remediaton Component is not showing any error messages within its log file but its failing to start/work +## My Remediation Component is not showing any error messages within its log file but its failing to start/work Most likely means the bouncer is failing to decode the configuration file provided. To find which line is causing the issue, you can use systemd/journalctl to get the error message: From c6204a31619fae0b76c61cf4203575f8c7beaa1a Mon Sep 17 00:00:00 2001 From: marco Date: Thu, 28 Aug 2025 23:33:34 +0200 Subject: [PATCH 05/22] unit --- crowdsec-docs/docs/appsec/benchmark.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crowdsec-docs/docs/appsec/benchmark.md b/crowdsec-docs/docs/appsec/benchmark.md index 307d91892..a78f76a2f 100644 --- a/crowdsec-docs/docs/appsec/benchmark.md +++ b/crowdsec-docs/docs/appsec/benchmark.md @@ -15,7 +15,7 @@ sidebar_position: 80 --> -The Application Security Component benchmarks have been run on a AWS EC2 Instance `t2.medium` (2vCPU/4Go RAM). +The Application Security Component benchmarks have been run on a AWS EC2 Instance `t2.medium` (2vCPU/4GiB RAM). All the benchmarks have been run with only one `routine` configured for the Application Security Component. 
From 0e24c8f8e016cbd4a5d065bb2aa99f4e346cd80b Mon Sep 17 00:00:00 2001 From: marco Date: Mon, 1 Sep 2025 13:43:42 +0200 Subject: [PATCH 06/22] typo --- crowdsec-docs/docs/appsec/quickstart/traefik.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crowdsec-docs/docs/appsec/quickstart/traefik.mdx b/crowdsec-docs/docs/appsec/quickstart/traefik.mdx index 0d260b663..926707cac 100644 --- a/crowdsec-docs/docs/appsec/quickstart/traefik.mdx +++ b/crowdsec-docs/docs/appsec/quickstart/traefik.mdx @@ -77,7 +77,7 @@ If you have a folder in which you are persisting the configuration files, you ca There steps will change depending on how you are running the Security Engine. If you are running via `docker run` then you should launch the container within the same directory as the `appsec.yaml` file. If you are using `docker-compose` you can use a relative file mount to mount the `appsec.yaml` file. Steps: - 1. Change to the location where you executted the `docker run` or `docker compose` command. + 1. Change to the location where you executed the `docker run` or `docker compose` command. 2. Create a `appsec.yaml` file at the base of the directory. 3. Add the following content to the `appsec.yaml` file. 
From c3dfda2d80380265decc60250cdf4a2728d53e5c Mon Sep 17 00:00:00 2001 From: marco Date: Mon, 1 Sep 2025 15:29:31 +0200 Subject: [PATCH 07/22] wip --- crowdsec-docs/docs/log_processor/intro.mdx | 6 + .../service-discovery-setup/detect-yaml.md | 91 +++++++++++ .../service-discovery-setup/intro.md | 148 ++++++++++++++++++ 3 files changed, 245 insertions(+) create mode 100644 crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md create mode 100644 crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md diff --git a/crowdsec-docs/docs/log_processor/intro.mdx b/crowdsec-docs/docs/log_processor/intro.mdx index 4d2c27c80..1697d5abb 100644 --- a/crowdsec-docs/docs/log_processor/intro.mdx +++ b/crowdsec-docs/docs/log_processor/intro.mdx @@ -87,3 +87,9 @@ You can see more information on Whitelists in the [documentation](log_processor/ Alert Context is additional context that can sent with an alert to the LAPI. This context can be shown locally via `cscli` or within the [CrowdSec Console](https://app.crowdsec.net/signup) if you opt in to share context when you enroll your instance. You can read more about Alert Context in the [documentation](log_processor/alert_context/intro.md). + +### Service Discovery & Setup + +On installation, CrowdSec can automatically detect existing services, download the relevant Hub collections, and generate acquisitions based on discovered log files. + +You can [customize or override these steps](log_processor/service-discovery-setup/intro.md), for example when provisioning multiple systems or using configuration management tools. 
diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md new file mode 100644 index 000000000..5ae299d35 --- /dev/null +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -0,0 +1,91 @@ +--- +id: detect-yaml +title: detect.yaml file format +sidebar_position: 1 +--- + +# File layout: `detect.yaml` +A minimal detection file is a YAML map with a top‐level `detect:` key. Under it, each entry describes one service plan: + +```yaml +# detect.yaml +--- +detect: + apache2-file-apache2: + when: + - Systemd.UnitInstalled("apache2.service") or len(Path.Glob("/var/log/apache2/*.log")) > 0 + hub_spec: + collections: + - crowdsecurity/apache2 + acquisition_spec: + filename: apache2.yaml + datasource: + source: file + filenames: + - /var/log/apache2/*.log + labels: + type: apache2 +``` + +Fields + +- `when`: a list of boolean expressions evaluated on the host. Examples include: + - `Systemd.UnitInstalled("")`, `Windows.ServiceEnabled("")` + - `Host.OS == "linux"`, `Host.OS == "windows"` + - `Path.Exists("/path/file")`, `len(Path.Glob("/path/*.log")) > 0` + - `System.ProcessRunning("")` +- `hub_spec`: which Hub items to install (collections/parsers/scenarios, etc.). Unknown item types are preserved and passed through. +- `acquisition_spec`: how to generate a per‐service acquisition file: + - `filename`: base name (no slashes). The actual path will be `acquis.d/setup..yaml`. + - `datasource`: a map validated against the selected `source` (e.g., `file`, `journalctl`, `docker`, `wineventlog`, `cloudwatch`, `kinesis`, …). Required fields vary per source; the CLI validates them for you. 
+ +Examples + +Basic OS / Hub only: + +```yaml +detect: + linux: + when: + - Host.OS == "linux" + hub_spec: + collections: [crowdsecurity/linux] +``` + +`journalctl` source with a filter: + +```yaml +detect: + caddy-journal: + when: + - Systemd.UnitInstalled("caddy.service") + - len(Path.Glob("/var/log/caddy/*.log")) == 0 + hub_spec: + collections: [crowdsecurity/caddy] + acquisition_spec: + filename: caddy.yaml + datasource: + source: journalctl + labels: {type: caddy} + journalctl_filter: + - "_SYSTEMD_UNIT=caddy.service" +``` + +Windows event log: + +```yaml +detect: + windows_auth: + when: [ Host.OS == "windows" ] + hub_spec: + collections: [crowdsecurity/windows] + acquisition_spec: + filename: windows_auth.yaml + datasource: + source: wineventlog + event_channel: Security + event_ids: [4625, 4623] + event_level: information + labels: {type: eventlog} +``` + diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md new file mode 100644 index 000000000..196fe782f --- /dev/null +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md @@ -0,0 +1,148 @@ +--- +id: intro +title: Service Discovery & Setup +sidebar_position: 1 +--- + + +> Implementation notes & validation +> +> - The CLI can list supported services from your file (`--list-supported-services`). It also validates each datasource by type and errors out on unknown/misplaced fields (e.g., `source` missing; wrong keys for `journalctl`/`docker`; filename with slashes). +> + +--- + +# Use your custom `detect.yaml` +You can provide your file in two ways: + +```bash +# Path flag (recommended) +cscli setup detect --detect-config /path/to/detect.yaml + +# Or via environment variable +CROWDSEC_SETUP_DETECT_CONFIG=/path/to/detect.yaml cscli setup detect +``` + +Helpful flags: +- `--yaml` – render the setup plan as YAML (easy to review/edit); default output is JSON. 
+- `--force ` – pretend detection matched for `` (repeatable). +- `--ignore ` – drop `` from the plan even if matched (repeatable). +- `--skip-systemd` – disable systemd‐based detection (useful in containers/chroots). +- `--list-supported-services` – print the service keys present in your file and exit. + +**End‑to‑end flow (typical)** +```bash +# 1) build a plan from your rules +cscli setup detect --detect-config ./detect.yaml --yaml > setup.yaml + +# 2) validate that plan (optional but recommended) +cscli setup validate ./setup.yaml + +# 3) install Hub items + write acquis files +cscli setup install-hub ./setup.yaml +cscli setup install-acquisition ./setup.yaml --acquis-dir /etc/crowdsec/acquis.d +``` + +# Examples: override defaults (nginx path, etc.) +If your logs live in non‑standard locations, just encode that in `acquisition_spec`. + +```yaml +# detect.yaml +--- +detect: + nginx-custom: + when: + - Systemd.UnitInstalled("nginx.service") or len(Path.Glob("/srv/logs/nginx/*.log")) > 0 + hub_spec: + collections: [crowdsecurity/nginx] + acquisition_spec: + filename: nginx.yaml + datasource: + source: file + filenames: + - /srv/logs/nginx/*.log # <- your path here + labels: {type: nginx} +``` + +You can also define detection purely by process name when systemd isn’t a good signal: +```yaml + app-by-process: + when: [ System.ProcessRunning("myappd") ] + acquisition_spec: + filename: myappd.yaml + datasource: + source: file + filenames: [ /var/log/myappd/*.log ] + labels: {type: myappd} +``` + +--- + +# Generated acquisition files & coexistence with your own files +When you install acquisition from a setup plan, the CLI writes one file per service as `setup..yaml` in the acquisition directory (typically `/etc/crowdsec/acquis.d`). The content is **prefixed with a header** that includes a truncated `cscli-checksum` and a comment stating it was generated by `cscli setup`. 
+ +- Files carrying a valid `cscli-checksum` are considered **generated** and may be overwritten by future runs. +- Files **without** a valid checksum are treated as **manually edited**; in interactive flows, the CLI shows a colorized diff and asks before overwriting. In unattended flows, the command refuses to proceed if manual files are detected. +- Either way, the safest practice is: **don’t edit generated files**. If you need changes, delete the generated `setup..yaml` and create your own hand‑managed file instead. + +> Tips +> - The actual on‑disk path is computed as `acquis.d/setup.` where `` comes from `acquisition_spec.filename`. +> - Use `--acquis-dir` to target a different directory. +> - `--dry-run` prints what would be created without writing files. + +--- + +# Unattended installs with a custom detect file +Package installers often call: + +```bash +cscli setup unattended +``` + +This mode: +- uses the same `--detect-config` and `--acquis-dir` flags; +- never prompts for confirmation; +- will skip itself entirely if `CROWDSEC_SETUP_UNATTENDED_DISABLE` is non‑empty (handy for Ansible/automation); +- installs Hub items and writes `setup.*.yaml` files if and only if there are no conflicting manual acquisitions. 
+ +--- + +# Validation & troubleshooting +- **Validate your setup plan** before writing files: + ```bash + cscli setup validate ./setup.yaml + ``` +- Common validation errors (examples): + - missing `datasource.source` + - wrong keys for a source type (e.g., `filename` under `journalctl`) + - missing mandatory fields (e.g., `journalctl_filter` for `journalctl`, `containers/services` for `docker`) + - `acquisition_spec.filename` contains slashes/backslashes + +--- + +# Reference snippets +- Linux collection detection: + ```yaml + detect: + linux: + when: [ Host.OS == "linux" ] + hub_spec: + collections: [crowdsecurity/linux] + ``` +- MariaDB/MySQL file detection with distro fallbacks: + ```yaml + detect: + mariadb: + when: + - Systemd.UnitInstalled("mariadb.service") or Path.Exists("/var/log/mariadb/mariadb.log") + hub_spec: { collections: [crowdsecurity/mariadb] } + acquisition_spec: + filename: mariadb.yaml + datasource: + source: file + labels: {type: mysql} + filenames: + - /var/log/mysql/error.log + - /var/log/mariadb/mariadb.log + ``` + From cc2dc07351018a5a40e4591394d6e01cf4c7fe11 Mon Sep 17 00:00:00 2001 From: marco Date: Mon, 1 Sep 2025 15:44:26 +0200 Subject: [PATCH 08/22] wip --- .../service-discovery-setup/detect-yaml.md | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index 5ae299d35..b56ed5138 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -89,3 +89,32 @@ detect: labels: {type: eventlog} ``` + +## Expression Helpers Reference + +Expressions run against an environment that exposes helpers and facts via these names: + +- Host — host facts from gopsutil/host.InfoStat. See https://pkg.go.dev/github.com/shirou/gopsutil/host#InfoStat + Example: Host.OS == "linux". 
+ +- Path — filesystem helpers: + - Path.Exists(path) -> bool + - Path.Glob(pattern) -> []string + Example: len(Path.Glob("/var/log/nginx/*.log")) > 0. + +- System — process helpers: + - System.ProcessRunning(name) -> bool (by process name) + +- Systemd (Linux) — systemd unit helpers: + - Systemd.UnitInstalled(unit) -> bool + - Systemd.UnitConfig(unit, key) -> string (empty string if unit missing; error if key missing) + - Systemd.UnitLogsToJournal(unit) -> bool (true if stdout/stderr go to journal or journal+console) + +- Windows (Windows builds only): + - Windows.ServiceEnabled(service) -> bool (true if the service exists and is Automatic start; returns false on non-Windows builds) + +- Version — semantic version checks (can be used with Host.PlatformVersion): + - Version.Check(version, constraint) -> bool + - Supports operators like =, !=, <, <=, >, >=, ranges (1.1.1 - 1.3.4), AND with commas (>1, <3), and ~ compatible ranges. + + From 912c3c59aefa309b2c4e3d634adb84da842be286 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Mon, 1 Sep 2025 23:55:07 +0200 Subject: [PATCH 09/22] up --- .../service-discovery-setup/detect-yaml.md | 118 ++++++----- .../service-discovery-setup/expr.md | 151 +++++++++++++++ .../service-discovery-setup/intro.md | 183 +++++++++--------- crowdsec-docs/sidebars.ts | 11 ++ .../post_installation/acquisition.mdx | 2 +- 5 files changed, 327 insertions(+), 138 deletions(-) create mode 100644 crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index b56ed5138..8d735c721 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -4,8 +4,11 @@ title: detect.yaml file format sidebar_position: 1 --- -# File layout: `detect.yaml` -A minimal detection file is a YAML map with 
a top‐level `detect:` key. Under it, each entry describes one service plan: +# `detect.yaml` syntax + +A minimal detection file is a YAML map with a top‐level `detect:` key. + +Under it, each entry describes one service plan: ```yaml # detect.yaml @@ -27,19 +30,58 @@ detect: type: apache2 ``` -Fields +## Fields + +### `when` + +A list of expressions that must each return a boolean. + +If multiple expressions are provided, they must all return `true` for the service to be included. + +```yaml +when: + - Host.OS == "linux" + - Systemd.UnitInstalled("") +``` + +You can use any of the helpers referenced [here](/log_processor/service-discovery-setup/expr.md). + + +### `hub_spec` + +A map of hub items to install. + +Specifying an invalid item type or item will log an error but will not prevent detection from continuing. + +```yaml +hub_spec: + collections: + - crowdsecurity/linux + parsers: + - crowdsecurity/nginx-logs + scenarios: + - crowdsecurity/http-bf +``` + +### `acquisition_spec` + +This item defines the acquisition that will be written to disk. + +```yaml +acquisition_spec: + filename: foobar.yaml + datasource: + source: docker + container_name: foo + labels: + type: bar +``` + +The `filename` attribute will be used to generate the name of the file in the form of `acquis.d/setup..yaml`. -- `when`: a list of boolean expressions evaluated on the host. Examples include: - - `Systemd.UnitInstalled("")`, `Windows.ServiceEnabled("")` - - `Host.OS == "linux"`, `Host.OS == "windows"` - - `Path.Exists("/path/file")`, `len(Path.Glob("/path/*.log")) > 0` - - `System.ProcessRunning("")` -- `hub_spec`: which Hub items to install (collections/parsers/scenarios, etc.). Unknown item types are preserved and passed through. -- `acquisition_spec`: how to generate a per‐service acquisition file: - - `filename`: base name (no slashes). The actual path will be `acquis.d/setup..yaml`. 
- - `datasource`: a map validated against the selected `source` (e.g., `file`, `journalctl`, `docker`, `wineventlog`, `cloudwatch`, `kinesis`, …). Required fields vary per source; the CLI validates them for you. +The content of `datasource` will be validated (syntax, required fields depending on the datasource configured) and be written as-is to the file. -Examples +## Examples Basic OS / Hub only: @@ -49,7 +91,8 @@ detect: when: - Host.OS == "linux" hub_spec: - collections: [crowdsecurity/linux] + collections: + - crowdsecurity/linux ``` `journalctl` source with a filter: @@ -61,12 +104,14 @@ detect: - Systemd.UnitInstalled("caddy.service") - len(Path.Glob("/var/log/caddy/*.log")) == 0 hub_spec: - collections: [crowdsecurity/caddy] + collections: + - crowdsecurity/caddy acquisition_spec: filename: caddy.yaml datasource: source: journalctl - labels: {type: caddy} + labels: + type: caddy journalctl_filter: - "_SYSTEMD_UNIT=caddy.service" ``` @@ -76,45 +121,22 @@ Windows event log: ```yaml detect: windows_auth: - when: [ Host.OS == "windows" ] + when: + - Host.OS == "windows" hub_spec: - collections: [crowdsecurity/windows] + collections: + - crowdsecurity/windows acquisition_spec: filename: windows_auth.yaml datasource: source: wineventlog event_channel: Security - event_ids: [4625, 4623] + event_ids: + - 4625 + - 4623 event_level: information - labels: {type: eventlog} + labels: + type: eventlog ``` -## Expression Helpers Reference - -Expressions run against an environment that exposes helpers and facts via these names: - -- Host — host facts from gopsutil/host.InfoStat. See https://pkg.go.dev/github.com/shirou/gopsutil/host#InfoStat - Example: Host.OS == "linux". - -- Path — filesystem helpers: - - Path.Exists(path) -> bool - - Path.Glob(pattern) -> []string - Example: len(Path.Glob("/var/log/nginx/*.log")) > 0. 
- -- System — process helpers: - - System.ProcessRunning(name) -> bool (by process name) - -- Systemd (Linux) — systemd unit helpers: - - Systemd.UnitInstalled(unit) -> bool - - Systemd.UnitConfig(unit, key) -> string (empty string if unit missing; error if key missing) - - Systemd.UnitLogsToJournal(unit) -> bool (true if stdout/stderr go to journal or journal+console) - -- Windows (Windows builds only): - - Windows.ServiceEnabled(service) -> bool (true if the service exists and is Automatic start; returns false on non-Windows builds) - -- Version — semantic version checks (can be used with Host.PlatformVersion): - - Version.Check(version, constraint) -> bool - - Supports operators like =, !=, <, <=, >, >=, ranges (1.1.1 - 1.3.4), AND with commas (>1, <3), and ~ compatible ranges. - - diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md new file mode 100644 index 000000000..aba6c3c0e --- /dev/null +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md @@ -0,0 +1,151 @@ +--- +id: setup-expr-helpers +title: Expr Helpers +sidebar_position: 1 +--- + +# Expression Helpers Reference + +Various helpers are available for use in the `detect.yaml` file to determine how crowdsec should be configured. + +## Host + + This object gives access to various information about the current state of the operating system + +### `Host.Hostname` + +    Returns the hostname of the machine + +> `Host.Hostname == "mymachine"` + +### `Host.Uptime` + +    Returns the uptime of the machine in seconds. + +### `Host.Boottime` + +    Returns the unix timestamp of the time the machine booted. + +### `Host.Procs` + +    Returns the number of processes on the machine. + +### `Host.OS` + +    Returns the name of the OS (`linux`, `freebsd`, `windows`, ...) + +> `Host.OS == "linux"` + +### `Host.Platform` + +    Returns the variant of the OS (`ubuntu`, `linuxmint`, ....) 
+ +> `Host.Platform == "ubuntu"` + +### `Host.PlatformFamily` + +    Returns the family of the OS (`debian`, `rhel`, ...) + +> `Host.PlatformFamily == "debian"` + +### `Host.KernelVersion` + +    Returns the current kernel version as returned by `uname -r` + +> `Host.KernelVersion == "6.16.2"` + +### `Host.KernelArch` + +    Returns the native architecture of the system (`x86_64`, ...) + +> `Host.KernelArch == "x86_64"` + +### `Host.VirtualizationSystem` + +    Returns the name of the virtualization system in use if any. + +> `Host.VirtualizationSystem == "kvm"` + +### `Host.VirtualizationRole` + +    Returns the virtualization role of the system if any (`guest`, `host`) + +> `Host.VirtualizationRole == "host"` + +### `Host.HostID` + +    Returns a unique ID specific to the system. + +## Path + +This object exposes helper functions for the filesystem + +### `Exists(path) bool` + +    Returns `true` if the specified path exists. + +> `Path.Exists("/var/log/nginx/access.log") == true` + +### `Glob(pattern) []string` + +    Returns a list of files matching the provided pattern. + +    Returns an empty list if the glob pattern is invalid + +> `len(Path.Glob("/var/log/nginx/*.log")) > 0` + +## System + +### `ProcessRunning(name) bool` + +    Returns `true` if any process with the specified name is running + +> `System.ProcessRunning("nginx") == true` + +## Systemd + +    This object exposes helpers to get information about systemd units. + +    Only available on Linux. + +### `UnitInstalled(unitName) bool` + +    Returns `true` if the provided unit is installed. + +> `Systemd.UnitInstalled("nginx") == true` + +### `UnitConfig(unitName, key) string` + +    Returns the value of the specified key from the specified unit. + +    Returns an empty value if the unit is not installed and an error if the key does not exist. 
+ +> `Systemd.UnitConfig("nginx", "StandardOutput") == "journal"` + +### `UnitLogsToJournal(unitName) bool` + +    Returns `true` if unit stdout/stderr are redirected to journal or journal+console. + +> `Systemd.UnitLogsToJournal("nginx") == true` + +## Windows + +    This object exposes helpers to get information about Windows services. + +    Only available on Windows. + +### `ServiceEnabled(serviceName) bool` + +    Returns `true` if the specified service exists and is configured to start automatically on boot. + +> `Windows.ServiceEnabled("MSSQLSERVER") == true` + +## Version + +### `Check(version, constraint) bool` + +    Performs a semantic version check. + +    Constraint supports operators like `=`, `!=`, `<`, `<=`, `>`, `>=`, ranges (1.1.1 - 1.3.4), AND with commas (`>1, <3`), and ~ compatible ranges. + +> `Version.Check(Host.KernelVersion, ">=6.24.0")` \ No newline at end of file diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md index 196fe782f..bbfb848a0 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md @@ -4,19 +4,58 @@ title: Service Discovery & Setup sidebar_position: 1 --- +# Service Discovery -> Implementation notes & validation -> -> - The CLI can list supported services from your file (`--list-supported-services`). It also validates each datasource by type and errors out on unknown/misplaced fields (e.g., `source` missing; wrong keys for `journalctl`/`docker`; filename with slashes). -> +## Basic Usage ---- +The main way to use the service discovery is with `cscli setup interactive` or `cscli setup unattended`. + +By default, it will use the detection file provided by crowdsec, stored in `/etc/crowdsec/detect.yaml`. 
+ +In interactive mode, `cscli` will ask you to choose which service to configure based on those that were detected, and will require confirmation before any operation (installing hub items, generating acquisition config, ...). + +If an `acquis.yaml` file exists, `cscli` will ask for confirmation before proceeding to avoid reading the same files multiple times. + +It is your responsibility to check the generated configuration to make sure each log file is only read once by crowdsec. + +As such, you should avoid putting your acquisition configuration in `/etc/crowdsec/acquis.yaml`, but instead create dedicated files in `/etc/crowdsec/acquis.d`. + +When run in unattended mode, `cscli` will automatically install any hub item, but will refuse to run if: + - `acquis.yaml` exists and is not empty + - An automatically generated acquisition file in `/etc/crowdsec/acquis.d` has been modified + +Linux packages (deb or rpm) will automatically call `cscli setup unattended` during installation. -# Use your custom `detect.yaml` -You can provide your file in two ways: +:::warning + +While `cscli setup` will check the generated configuration files for syntax errors or invalid configuration, it does *not* check for duplicate acquisitions. + +If using a custom `detect.yaml`, make sure no files are read multiple times (with the same `type` label), as this could lead to false positives. + +::: + +### Generated acquisition files & coexistence with your own files + +When you generate the acquisition configuration with `cscli setup`, `cscli` writes one file per service as `setup..yaml` in the acquisition directory (typically `/etc/crowdsec/acquis.d`). The content is **prefixed with a header** that includes a truncated `cscli-checksum` and a comment stating it was generated by `cscli setup`. + +- Files carrying a valid `cscli-checksum` are considered **generated** and may be overwritten by future runs. 
+- Files **without** a valid checksum are treated as **manually edited**; in interactive flows, `cscli` shows a colorized diff and asks before overwriting. In unattended flows, the command refuses to proceed if manual files are detected. +- Either way, the safest practice is: **don’t edit generated files**. If you need changes, delete the generated `setup..yaml` and create your own hand‑managed file instead or use a custom `detect.yaml` to generate the proper configuration automatically. + +> Tips +> - The actual on‑disk path is computed as `acquis.d/setup.` where `` comes from `acquisition_spec.filename`. +> - Use `--acquis-dir` to target a different directory. +> - `--dry-run` prints what would be created without writing files. + + +## Advanced Usage + +### Use a custom `detect.yaml` + +You can provide a custom `detect.yaml` in two ways: ```bash -# Path flag (recommended) +# Path flag cscli setup detect --detect-config /path/to/detect.yaml # Or via environment variable @@ -30,21 +69,11 @@ Helpful flags: - `--skip-systemd` – disable systemd‐based detection (useful in containers/chroots). - `--list-supported-services` – print the service keys present in your file and exit. -**End‑to‑end flow (typical)** -```bash -# 1) build a plan from your rules -cscli setup detect --detect-config ./detect.yaml --yaml > setup.yaml +You can see a list of all the available expr helpers in the [dedicated documentation](/log_processor/service-discovery-setup/expr.md). -# 2) validate that plan (optional but recommended) -cscli setup validate ./setup.yaml - -# 3) install Hub items + write acquis files -cscli setup install-hub ./setup.yaml -cscli setup install-acquisition ./setup.yaml --acquis-dir /etc/crowdsec/acquis.d -``` +For example, if you have configured nginx to log in a non-standard location, you can use a custom `detect.yaml` to automatically generate the configuration. -# Examples: override defaults (nginx path, etc.) 
-If your logs live in non‑standard locations, just encode that in `acquisition_spec`. +This example will generate an acquisition config for the file datasource with the pattern `/srv/logs/nginx/*.log` if the nginx service is installed OR if any file matches the glob pattern `/srv/logs/nginx/*.log`: ```yaml # detect.yaml @@ -54,95 +83,71 @@ detect: when: - Systemd.UnitInstalled("nginx.service") or len(Path.Glob("/srv/logs/nginx/*.log")) > 0 hub_spec: - collections: [crowdsecurity/nginx] + collections: + - crowdsecurity/nginx acquisition_spec: filename: nginx.yaml datasource: source: file filenames: - /srv/logs/nginx/*.log # <- your path here - labels: {type: nginx} + labels: + type: nginx ``` -You can also define detection purely by process name when systemd isn’t a good signal: +:::warning + +When using a custom detect configuration, the default one will be fully ignored. + +This means that on top of your custom detection, you will most likely want to add the basic OS detection, for example: + ```yaml - app-by-process: - when: [ System.ProcessRunning("myappd") ] - acquisition_spec: - filename: myappd.yaml - datasource: - source: file - filenames: [ /var/log/myappd/*.log ] - labels: {type: myappd} +detect: + linux: + when: + - Host.OS == "linux" + hub_spec: + collections: + - crowdsecurity/linux ``` +::: ---- +### Unattended installs with a custom detect file -# Generated acquisition files & coexistence with your own files -When you install acquisition from a setup plan, the CLI writes one file per service as `setup..yaml` in the acquisition directory (typically `/etc/crowdsec/acquis.d`). The content is **prefixed with a header** that includes a truncated `cscli-checksum` and a comment stating it was generated by `cscli setup`. +Linux packages (deb or rpm) will automatically call `cscli setup unattended` during installation. -- Files carrying a valid `cscli-checksum` are considered **generated** and may be overwritten by future runs. 
-- Files **without** a valid checksum are treated as **manually edited**; in interactive flows, the CLI shows a colorized diff and asks before overwriting. In unattended flows, the command refuses to proceed if manual files are detected. -- Either way, the safest practice is: **don’t edit generated files**. If you need changes, delete the generated `setup..yaml` and create your own hand‑managed file instead. +You can specify a custom detection file to use by setting the `CROWDSEC_SETUP_DETECT_CONFIG` environment variable. -> Tips -> - The actual on‑disk path is computed as `acquis.d/setup.` where `` comes from `acquisition_spec.filename`. -> - Use `--acquis-dir` to target a different directory. -> - `--dry-run` prints what would be created without writing files. +Alternatively, if you want to skip the automatic detection (because you deploy the configuration with Ansible for example), you can set the env var `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any value. ---- +### End-to-end workflow -# Unattended installs with a custom detect file -Package installers often call: +Behind the scenes, `cscli setup` uses multiple steps to configure crowdsec: + - Generate a setup file that contains the detected services, their associated hub items and acquisition configuration + - Validate this file + - Install the hub items + - Write the acquisition config to disk +If you wish, you can manually invoke any of those steps (if you only want to install the hub items for example). + +`cscli setup detect` can be used to generate the setup file: ```bash -cscli setup unattended +cscli setup detect --detect-config ./detect.yaml --yaml > setup.yaml ``` -This mode: -- uses the same `--detect-config` and `--acquis-dir` flags; -- never prompts for confirmation; -- will skip itself entirely if `CROWDSEC_SETUP_UNATTENDED_DISABLE` is non‑empty (handy for Ansible/automation); -- installs Hub items and writes `setup.*.yaml` files if and only if there are no conflicting manual acquisitions. 
+You can then validate its content for syntax error or issues with the acquisition configuration: ---- - -# Validation & troubleshooting -- **Validate your setup plan** before writing files: - ```bash - cscli setup validate ./setup.yaml - ``` -- Common validation errors (examples): - - missing `datasource.source` - - wrong keys for a source type (e.g., `filename` under `journalctl`) - - missing mandatory fields (e.g., `journalctl_filter` for `journalctl`, `containers/services` for `docker`) - - `acquisition_spec.filename` contains slashes/backslashes +```bash +cscli setup validate ./setup.yaml +``` ---- +Then, install the hub items: -# Reference snippets -- Linux collection detection: - ```yaml - detect: - linux: - when: [ Host.OS == "linux" ] - hub_spec: - collections: [crowdsecurity/linux] - ``` -- MariaDB/MySQL file detection with distro fallbacks: - ```yaml - detect: - mariadb: - when: - - Systemd.UnitInstalled("mariadb.service") or Path.Exists("/var/log/mariadb/mariadb.log") - hub_spec: { collections: [crowdsecurity/mariadb] } - acquisition_spec: - filename: mariadb.yaml - datasource: - source: file - labels: {type: mysql} - filenames: - - /var/log/mysql/error.log - - /var/log/mariadb/mariadb.log - ``` +```bash +cscli setup install-hub ./setup.yaml +``` +And finally, write the acquisition config: +```bash +cscli setup install-acquisition ./setup.yaml --acquis-dir /etc/crowdsec/acquis.d +``` \ No newline at end of file diff --git a/crowdsec-docs/sidebars.ts b/crowdsec-docs/sidebars.ts index e55234e7f..cda2d0160 100644 --- a/crowdsec-docs/sidebars.ts +++ b/crowdsec-docs/sidebars.ts @@ -126,6 +126,17 @@ const sidebarsConfig: SidebarConfig = { }, ], }, + { + type: "category", + label: "Service Discovery", + link: { + type: "doc", + id: "log_processor/service-discovery-setup/intro", + }, + items: [ + "log_processor/service-discovery-setup/detect-yaml", + "log_processor/service-discovery-setup/setup-expr-helpers"], + }, { type: "category", label: "Alert Context", 
diff --git a/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx b/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx index 91ebcc8bb..12edd4875 100644 --- a/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx +++ b/crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx @@ -5,7 +5,7 @@ title: Acquisition # Acquisition -By default when CrowdSec is installed it will attempt to [detect the running services](/log_processor/data_sources#service-detection) and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro). +By default when CrowdSec is installed it will attempt to [detect the running services](https://docs.crowdsec.net/next/log_processor/service-discovery-setup/intro) (for CrowdSec >= 1.7.0) and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro). However, we should check that this detection worked and the log locations are correct. You may want to manually acquire additional [Collections](https://docs.crowdsec.net/docs/next/collections/intro) for the services that were not detected. 
From 0ac911788cee4d5c58913a11b5bfc78ba3f95df8 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Mon, 1 Sep 2025 23:57:25 +0200 Subject: [PATCH 10/22] lint --- crowdsec-docs/sidebars.ts | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/crowdsec-docs/sidebars.ts b/crowdsec-docs/sidebars.ts index cda2d0160..de29afcc1 100644 --- a/crowdsec-docs/sidebars.ts +++ b/crowdsec-docs/sidebars.ts @@ -134,8 +134,9 @@ const sidebarsConfig: SidebarConfig = { id: "log_processor/service-discovery-setup/intro", }, items: [ - "log_processor/service-discovery-setup/detect-yaml", - "log_processor/service-discovery-setup/setup-expr-helpers"], + "log_processor/service-discovery-setup/detect-yaml", + "log_processor/service-discovery-setup/setup-expr-helpers" + ], }, { type: "category", From 5a29140c79cdc6a4cdfd6f42b21c4075046c0242 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Mon, 1 Sep 2025 23:59:13 +0200 Subject: [PATCH 11/22] lint --- crowdsec-docs/sidebars.ts | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/crowdsec-docs/sidebars.ts b/crowdsec-docs/sidebars.ts index de29afcc1..d643b0ff0 100644 --- a/crowdsec-docs/sidebars.ts +++ b/crowdsec-docs/sidebars.ts @@ -92,7 +92,11 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "log_processor/scenarios/intro", }, - items: ["log_processor/scenarios/format", "log_processor/scenarios/simulation", "log_processor/scenarios/create"], + items: [ + "log_processor/scenarios/format", + "log_processor/scenarios/simulation", + "log_processor/scenarios/create", + ], }, { type: "category", @@ -134,8 +138,8 @@ const sidebarsConfig: SidebarConfig = { id: "log_processor/service-discovery-setup/intro", }, items: [ - "log_processor/service-discovery-setup/detect-yaml", - "log_processor/service-discovery-setup/setup-expr-helpers" + "log_processor/service-discovery-setup/detect-yaml", + "log_processor/service-discovery-setup/setup-expr-helpers", ], }, { @@ -245,7 +249,10 @@ 
const sidebarsConfig: SidebarConfig = { type: "doc", id: "configuration/crowdsec_configuration", }, - items: ["configuration/feature_flags", "configuration/network_management"], + items: [ + "configuration/feature_flags", + "configuration/network_management", + ], }, { type: "category", @@ -335,7 +342,12 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "cscli/cscli_alerts", }, - items: ["cscli/cscli_alerts_delete", "cscli/cscli_alerts_flush", "cscli/cscli_alerts_inspect", "cscli/cscli_alerts_list"], + items: [ + "cscli/cscli_alerts_delete", + "cscli/cscli_alerts_flush", + "cscli/cscli_alerts_inspect", + "cscli/cscli_alerts_list", + ], }, { type: "category", @@ -659,7 +671,11 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "cscli/cscli_simulation", }, - items: ["cscli/cscli_simulation_disable", "cscli/cscli_simulation_enable", "cscli/cscli_simulation_status"], + items: [ + "cscli/cscli_simulation_disable", + "cscli/cscli_simulation_enable", + "cscli/cscli_simulation_status", + ], }, { type: "doc", From 8c91f71eb023b48f88696d29603b757a55d475d0 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 00:04:10 +0200 Subject: [PATCH 12/22] lint --- crowdsec-docs/sidebars.ts | 24 ++++-------------------- 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/crowdsec-docs/sidebars.ts b/crowdsec-docs/sidebars.ts index d643b0ff0..f03cf4914 100644 --- a/crowdsec-docs/sidebars.ts +++ b/crowdsec-docs/sidebars.ts @@ -92,11 +92,7 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "log_processor/scenarios/intro", }, - items: [ - "log_processor/scenarios/format", - "log_processor/scenarios/simulation", - "log_processor/scenarios/create", - ], + items: ["log_processor/scenarios/format", "log_processor/scenarios/simulation", "log_processor/scenarios/create"], }, { type: "category", @@ -249,10 +245,7 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "configuration/crowdsec_configuration", }, - items: [ - 
"configuration/feature_flags", - "configuration/network_management", - ], + items: ["configuration/feature_flags", "configuration/network_management"], }, { type: "category", @@ -342,12 +335,7 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "cscli/cscli_alerts", }, - items: [ - "cscli/cscli_alerts_delete", - "cscli/cscli_alerts_flush", - "cscli/cscli_alerts_inspect", - "cscli/cscli_alerts_list", - ], + items: ["cscli/cscli_alerts_delete", "cscli/cscli_alerts_flush", "cscli/cscli_alerts_inspect", "cscli/cscli_alerts_list"], }, { type: "category", @@ -671,11 +659,7 @@ const sidebarsConfig: SidebarConfig = { type: "doc", id: "cscli/cscli_simulation", }, - items: [ - "cscli/cscli_simulation_disable", - "cscli/cscli_simulation_enable", - "cscli/cscli_simulation_status", - ], + items: ["cscli/cscli_simulation_disable", "cscli/cscli_simulation_enable", "cscli/cscli_simulation_status"], }, { type: "doc", From 3a0c5ce6bd7fa55672aa8201906a6685944d6fae Mon Sep 17 00:00:00 2001 From: marco Date: Tue, 2 Sep 2025 09:39:02 +0200 Subject: [PATCH 13/22] repetitions --- .../service-discovery-setup/intro.md | 78 ++++++++++++------- 1 file changed, 48 insertions(+), 30 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md index bbfb848a0..e1414ff8f 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md @@ -1,6 +1,6 @@ --- id: intro -title: Service Discovery & Setup +title: Service Discovery sidebar_position: 1 --- @@ -10,40 +10,43 @@ sidebar_position: 1 The main way to use the service discovery is with `cscli setup interactive` or `cscli setup unattended`. -By default, it will use the detection file provided by crowdsec stored in `/etc/crowdsec/detect.yaml`. +By default, it will use the detection file provided by crowdsec stored in `/var/lib/crowdsec/data/detect.yaml`. 
In interactive mode, `cscli` will ask you to choose which service to configure based on those that were detected, and will require confirmation before any operation (installing hub items, generating acquisition config, ...). -If an `acquis.yaml` file exists, `cscli` will ask for confirmation before proceeding to avoid reading the same files multiple times. +It is your responsibility to check the compatibility of the generated acquisitions with the ones you add later or were already on the system. -It is your responsability to check the generated configuration to make sure each log file is only read once by crowdsec. +:::warning -As such, you should avoid putting your acquisition configuration in `/etc/crowdsec/acquis.yaml`, but instead create dedicated files in `/etc/crowdsec/acquis.d`. +While `cscli setup` validates the generated configuration files for syntax errors or invalid configuration, it does *not* check for duplicate acquisition. -When ran in unattended mode, `cscli` will automatically any hub item, but will refuse to run if: - - `acquis.yaml` exists and is not empty - - An automatically generated acquisition file in `/etc/crowdsec/acquis.d` has been modified +If using a custom `detect.yaml`, make sure no logs are read multiple times (with the same `type` label), as this could lead to false positives. -Linux packages (deb or rpm) will automatically call `cscli setup unattended` during installation. +::: -:::warning -While `cscli setup` will check the generated configuration files for syntax errors or invalid configuration, it does *not* check for duplicate acquisition. +`cscli` will ask for confirmation before proceeding if: -If using a custom `detect.yaml`, make sure no files are read multiple times (with the same `type` label), as this could lead to false positives. +- there is an `acquis.yaml` +- there is any non-generated file in `acquis.d` +- you modified the generated files in `acquis.d` (there is a checksum to detect modifications). 
Proceeding could overwrite them. -::: +Files composed only of comments are ignored. -### Generated acquisition files & coexistence with your own files +Linux packages (deb or rpm) will automatically call `cscli setup unattended` during installation. In the case above, instead of asking for confirmation, unattended mode will just skip the service detection. -When you generated the acquisition configuration with `cscli setup`, `cscli` writes one file per service as `setup..yaml` in the acquisition directory (typically `/etc/crowdsec/acquis.d`). The content is **prefixed with a header** that includes a truncated `cscli-checksum` and a comment stating it was generated by `cscli setup`. -- Files carrying a valid `cscli-checksum` are considered **generated** and may be overwritten by future runs. -- Files **without** a valid checksum are treated as **manually edited**; in interactive flows, `cscli` shows a colorized diff and asks before overwriting. In unattended flows, the command refuses to proceed if manual files are detected. +### Generated acquisition files & coexistence with your own files + +When you generate the acquisition configuration with `cscli setup`, `cscli` writes one file per service as `setup..yaml` in the acquisition directory (typically `/etc/crowdsec/acquis.d`). The content is prefixed with a header that includes a checksum and a comment stating it was generated by `cscli setup`. + +- Files carrying a valid checksum are considered generated and may be overwritten by future runs. +- Files without a valid checksum are treated as manually edited; in interactive mode, `cscli` shows a colorized diff and asks before overwriting. In unattended flows, the command refuses to proceed if manual files are detected. - Either way, the safest practice is: **don’t edit generated files**.
If you need changes, delete the generated `setup..yaml` and create your own hand‑managed file instead or use a custom `detect.yaml` to generate the proper configuration automatically. > Tips -> - The actual on‑disk path is computed as `acquis.d/setup.` where `` comes from `acquisition_spec.filename`. + +> - The actual on‑disk path is computed as `acquis.d/setup..yaml` where `` comes from `acquisition_spec.filename`. > - Use `--acquis-dir` to target a different directory. > - `--dry-run` prints what would be created without writing files. @@ -63,17 +66,18 @@ CROWDSEC_SETUP_DETECT_CONFIG=/path/to/detect.yaml cscli setup detect ``` Helpful flags: + - `--yaml` – render the setup plan as YAML (easy to review/edit); default output is JSON. - `--force ` – pretend detection matched for `` (repeatable). - `--ignore ` – drop `` from the plan even if matched (repeatable). -- `--skip-systemd` – disable systemd‐based detection (useful in containers/chroots). +- `--skip-systemd` – disable systemd‐based detection (default if systemctl can't be run). - `--list-supported-services` – print the service keys present in your file and exit. You can see a list of all the available expr helpers in the [dedicated documentation](/log_processor/service-discovery-setup/expr.md). -For example, if you have configured nginx to log in a non-standard location, you can use a custom `detect.yaml` to automatically generate the configuration. +For example, if you have configured nginx to log in a non-standard location, you can use a custom `detect.yaml` to override it. 
-This example will generate an acquisition config for the file datasource with the pattern `/srv/logs/nginx/*.log` if the nginx service is installed OR if any file matches the glob pattern `/srv/logs/nginx/*.log`: +This example will generate an acquisition with the pattern `/srv/logs/nginx/*.log` if the nginx service is installed OR if any file matches the glob pattern `/srv/logs/nginx/*.log`: ```yaml # detect.yaml @@ -90,9 +94,9 @@ detect: datasource: source: file filenames: - - /srv/logs/nginx/*.log # <- your path here + - /srv/logs/nginx/*.log labels: - type: nginx + type: nginx ``` :::warning @@ -109,28 +113,41 @@ detect: hub_spec: collections: - crowdsecurity/linux + acquisition_spec: + filename: linux.yaml + datasource: + source: file + labels: + type: syslog + filenames: + - /var/log/messages + - /var/log/syslog + - /var/log/kern.log ``` + ::: ### Unattended installs with a custom detect file Linux packages (deb or rpm) will automatically call `cscli setup unattended` during installation. -You can specify a custom detection file to use by setting the `CROWDSEC_SETUP_DETECT_CONFIG` environment variable. +You can specify a custom detection file to use by setting `CROWDSEC_SETUP_DETECT_CONFIG` before installing the package with `apt` or `dnf`. -Alternatively, if you want to skip the automatic detection (because you deploy the configuration with Ansible for example), you can set the env var `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any value. +Alternatively, if you want to skip the automatic detection completely, you can set the env var `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any value. 
### End-to-end workflow Behind the scenes, `cscli setup` use multiple steps to configure crowdsec: - - Generate a setup files that contains the detected services, their associated hub items and acquisition configuration - - Validate this file - - Install the hub items - - Write the acquisition config to disk + +- Generate a YAML plan that contains the detected services, their associated hub items and acquisition configuration +- Validate this file +- Install the hub items +- Write the acquisition config to disk If you wish, you can manually invoke any of those steps (if you only want to install the hub items for example). `cscli setup detect` can be used to generate the setup file: + ```bash cscli setup detect --detect-config ./detect.yaml --yaml > setup.yaml ``` @@ -148,6 +165,7 @@ cscli setup install-hub ./setup.yaml ``` And finally, write the acquisition config: + ```bash cscli setup install-acquisition ./setup.yaml --acquis-dir /etc/crowdsec/acquis.d -``` \ No newline at end of file +``` From 2dd95b4d0e915ee8a6f05f14d77ae6133a3a0e84 Mon Sep 17 00:00:00 2001 From: marco Date: Tue, 2 Sep 2025 09:46:13 +0200 Subject: [PATCH 14/22] add PlatformVersion --- .../log_processor/service-discovery-setup/expr.md | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md index aba6c3c0e..80bbd4d90 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/expr.md @@ -46,7 +46,13 @@ Various helpers are available for use in the `detect.yaml` file to determine how     Returns the family of the OS (`debian`, `rhel`, ...) 
-> `Host.Platform == "debian"` +> `Host.PlatformFamily == "debian"` + +### `Host.PlatformVersion` + +    Returns the version of the OS or distribution (for linux, /etc/os-release) + +> `Host.PlatformVersion == "25.04"` ### `Host.KernelVersion` @@ -98,13 +104,13 @@ This object exposes helpers functions for the filesystem ### `ProcessRunning(name) bool` -    Returns `true` if there's any with the specified name running +    Returns `true` if there's any process with the specified name running > `System.ProcessRunning("nginx") == true` ## Systemd -    This object exposes helpers to get informations about systemd units. +    This object exposes helpers to get information about Systemd units.     Only available on Linux. @@ -148,4 +154,4 @@ This object exposes helpers functions for the filesystem     Constraint supports operators like `=`, `!=`, `<`, `<=`, `>`, `>=`, ranges (1.1.1 - 1.3.4), AND with commas (`>1`, `<3`), and ~ compatible ranges. -> `Version.Check(Host.KernelVersion, ">=6.24.0")` \ No newline at end of file +> `Version.Check(Host.KernelVersion, ">=6.24.0")` From 94d2228b93c74fd1bef074974c5f29ad77a88635 Mon Sep 17 00:00:00 2001 From: marco Date: Tue, 2 Sep 2025 09:55:32 +0200 Subject: [PATCH 15/22] lint --- .../service-discovery-setup/detect-yaml.md | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index 8d735c721..72edde89f 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -1,12 +1,12 @@ --- id: detect-yaml -title: detect.yaml file format +title: detect.yaml syntax sidebar_position: 1 --- -# `detect.yaml` syntax +# `detect.yaml` syntax -A minimal detection file is a YAML map with a top‐level `detect:` key.
+A minimal detection file is a YAML map with a top‐level `detect:` key. Under it, each entry describes one service plan: @@ -44,8 +44,7 @@ when: - Systemd.UnitInstalled("") ``` -You can use any of the helper referenced [here](/log_processor/service-discovery-setup/expr.md). -) +You can use any of the helper referenced [here](/log_processor/service-discovery-setup/expr). ### `hub_spec` @@ -138,5 +137,3 @@ detect: labels: type: eventlog ``` - - From 6277114ca92a03f5f4c85a0ad98c71b1471db9b0 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 10:21:26 +0200 Subject: [PATCH 16/22] up --- .../data_sources/introduction.md | 39 +------------------ 1 file changed, 1 insertion(+), 38 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/data_sources/introduction.md b/crowdsec-docs/docs/log_processor/data_sources/introduction.md index 9aed2f7a6..3fd6f2e88 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/introduction.md +++ b/crowdsec-docs/docs/log_processor/data_sources/introduction.md @@ -9,47 +9,10 @@ DataSources define where to access them (either as files, or over the network fr They can be defined: -- in [Acquisition files](/configuration/crowdsec_configuration.md#acquisition_path). Each file can contain multiple DataSource definitions. +- in [Acquisition files](/configuration/crowdsec_configuration.md#acquisition_path). Each file can contain multiple DataSource definitions. This configuration can be generated automatically; see the [Service Discovery documentation](/log_processor/service-discovery-setup/intro.md). - for cold log analysis, you can also specify acquisitions via the command line. -### Service detection (automated setup) - -When CrowdSec is installed via a package manager on a fresh system, the package may run [`cscli setup`](/cscli/cscli_setup) in **unattended** mode.
- -The `cscli setup` command will: - -- detect installed services and common log file locations -- install the related Hub collections -- generate acquisition files under `acquis.d/` as `setup..yaml` (e.g., `setup.linux.yaml`) - -Generated files are meant to be managed by CrowdSec; don’t edit them in place. If you need changes, delete the generated file and create your own. - -When upgrading or reinstalling CrowdSec, it detects non-generated or modified files and won’t overwrite your custom acquisitions. - -:::caution - -Make sure the same data sources are not ingested more than once: duplicating inputs can artificially increase scenario sensitivity. - -Examples: - -- If an application logs to both `journald` and `/var/log/*`, you usually only need one of them. -- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yaml`. You don’t need to add a separate acquisition for the same logs. - -::: - -For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup. -In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux). - -### Assisted service detection (semi-automated setup) - -If you installed new applications and want to detect the service detection again, running [`cscli setup`](/cscli/cscli_setup) yourself will guide you through the -automated setup, with confirmation prompts. You will receive a warning if you already configured some acquisition yourself but they won't be -modified by `cscli`. - -Note that `cscli setup` will not remove any collection or acquisition file in `acquis.d/setup..yaml`, even if the service has been uninstalled since the file creation. 
- - ## Datasources modules Name | Type | Stream | One-shot From a4c795a7405b5a237f726fb05865b4e66543b59f Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 10:38:27 +0200 Subject: [PATCH 17/22] fix datasources links --- .../data_sources/introduction.md | 28 +++++++++---------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/data_sources/introduction.md b/crowdsec-docs/docs/log_processor/data_sources/introduction.md index 3fd6f2e88..25aff98c5 100644 --- a/crowdsec-docs/docs/log_processor/data_sources/introduction.md +++ b/crowdsec-docs/docs/log_processor/data_sources/introduction.md @@ -17,20 +17,20 @@ They can be defined: Name | Type | Stream | One-shot -----|------|--------|---------- -[Appsec](/log_processor/data_sources/appsec) | expose HTTP service for the Appsec component | yes | no -[AWS cloudwatch](/log_processor/data_sources/cloudwatch) | single stream or log group | yes | yes -[AWS kinesis](/log_processor/data_sources/kinesis)| read logs from a kinesis stream | yes | no -[AWS S3](/log_processor/data_sources/s3)| read logs from a S3 bucket | yes | yes -[docker](/log_processor/data_sources/docker) | read logs from docker containers | yes | yes -[file](/log_processor/data_sources/file) | single files, glob expressions and .gz files | yes | yes -[HTTP](/log_processor/data_sources/http) | read logs from an HTTP endpoint | yes | no -[journald](/log_processor/data_sources/journald) | journald via filter | yes | yes -[Kafka](/log_processor/data_sources/kafka)| read logs from kafka topic | yes | no -[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no -[Loki](/log_processor/data_sources/loki) | read logs from loki | yes | yes -[VictoriaLogs](/log_processor/data_sources/victorialogs) | read logs from VictoriaLogs | yes | yes -[syslog service](/log_processor/data_sources/syslog_service) | read logs received via 
syslog protocol | yes | no -[Windows Event](/log_processor/data_sources/windows_event_log)| read logs from windows event log | yes | yes +[Appsec](/log_processor/data_sources/appsec.md) | expose HTTP service for the Appsec component | yes | no +[AWS cloudwatch](/log_processor/data_sources/cloudwatch.md) | single stream or log group | yes | yes +[AWS kinesis](/log_processor/data_sources/kinesis.md)| read logs from a kinesis stream | yes | no +[AWS S3](/log_processor/data_sources/s3.md)| read logs from a S3 bucket | yes | yes +[docker](/log_processor/data_sources/docker.md) | read logs from docker containers | yes | yes +[file](/log_processor/data_sources/file.md) | single files, glob expressions and .gz files | yes | yes +[HTTP](/log_processor/data_sources/http.md) | read logs from an HTTP endpoint | yes | no +[journald](/log_processor/data_sources/journald.md) | journald via filter | yes | yes +[Kafka](/log_processor/data_sources/kafka.md)| read logs from kafka topic | yes | no +[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit.md) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no +[Loki](/log_processor/data_sources/loki.md) | read logs from loki | yes | yes +[VictoriaLogs](/log_processor/data_sources/victorialogs.md) | read logs from VictoriaLogs | yes | yes +[syslog service](/log_processor/data_sources/syslog_service.md) | read logs received via syslog protocol | yes | no +[Windows Event](/log_processor/data_sources/windows_event_log.md)| read logs from windows event log | yes | yes ## Common configuration parameters From 9a82b92b7cce9e3fda04b3315555a2c5be8ddba6 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 10:38:54 +0200 Subject: [PATCH 18/22] up --- .../docs/log_processor/service-discovery-setup/detect-yaml.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md 
b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index 72edde89f..93cdaca25 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -4,7 +4,7 @@ title: detect.yaml syntax sidebar_position: 1 --- -# `detect.yaml` syntax +# Syntax A minimal detection file is a YAML map with a top‐level `detect:` key. From 0b6cc9bebc6053b2d75b5332636eaaaa7f9e1d97 Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 10:45:33 +0200 Subject: [PATCH 19/22] up --- .../docs/log_processor/service-discovery-setup/intro.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md index e1414ff8f..9ccb22804 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/intro.md @@ -6,6 +6,11 @@ sidebar_position: 1 # Service Discovery +The goals of service discovery are to automatically: + - Detect services on your machine supported by crowdsec + - Install related hub items + - Generate acquisition configuration + ## Basic Usage The main way to use the service discovery is with `cscli setup interactive` or `cscli setup unattended`. 
From 6fd4482a25b5ef20684c9456bbcf0261f2ff4c29 Mon Sep 17 00:00:00 2001 From: marco Date: Tue, 2 Sep 2025 10:47:14 +0200 Subject: [PATCH 20/22] up --- .../docs/log_processor/service-discovery-setup/detect-yaml.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index 93cdaca25..e92e8f131 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -44,7 +44,7 @@ when: - Systemd.UnitInstalled("") ``` -You can use any of the helper referenced [here](/log_processor/service-discovery-setup/expr). +You can use any of the helpers referenced [here](/log_processor/service-discovery-setup/expr.md). ### `hub_spec` From 8c1993fb10b384e5356076da6dc961017715048f Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 10:50:08 +0200 Subject: [PATCH 21/22] up --- crowdsec-docs/docs/log_processor/intro.mdx | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/crowdsec-docs/docs/log_processor/intro.mdx b/crowdsec-docs/docs/log_processor/intro.mdx index 1697d5abb..d0b982470 100644 --- a/crowdsec-docs/docs/log_processor/intro.mdx +++ b/crowdsec-docs/docs/log_processor/intro.mdx @@ -8,16 +8,15 @@ The Log Processor is a core component of the Security Engine. It: - Reads logs from [Data Sources](log_processor/data_sources/introduction.md) via Acquistions. - Parses logs and extract relevant information using [Parsers](log_processor/parsers/introduction.mdx). -- Enriches the parsed information with additional context such as GEOIP, ASN using [Enrichers](log_processor/parsers/enricher.md). +- Enriches the parsed information with additional context such as GEOIP, ASN using [Enrichers](log_processor/parsers/enricher.md).
- Monitors patterns of interest via [Scenarios](log_processor/scenarios/introduction.mdx). - Pushes alerts to the Local API (LAPI), where alert/decisions are stored. - -!TODO: Add diagram of the log processor pipeline - Read logs from datasources - Parse the logs - Enrich the parsed information - Monitor the logs for patterns of interest + ## Log Processor @@ -44,10 +43,10 @@ We support two ways to define Acquisitions in the [configuration directory](/u/t ## /etc/crowdsec/acquis.d/file.yaml source: file ## The Data Source module to use filenames: - - /tmp/foo/*.log - - /var/log/syslog + - /tmp/foo/*.log + - /var/log/syslog labels: - type: syslog + type: syslog ``` For more information on Data Sources and Acquisitions, see the [Data Sources](log_processor/data_sources/introduction.md) documentation. From 59d5e42ee010108fb39e08b843e75ac3ec2a388e Mon Sep 17 00:00:00 2001 From: Sebastien Blot Date: Tue, 2 Sep 2025 12:42:11 +0200 Subject: [PATCH 22/22] fix name --- .../docs/log_processor/service-discovery-setup/detect-yaml.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md index e92e8f131..69d1e59c8 100644 --- a/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md +++ b/crowdsec-docs/docs/log_processor/service-discovery-setup/detect-yaml.md @@ -1,6 +1,6 @@ --- id: detect-yaml -title: detect.yaml syntax +title: Syntax sidebar_position: 1 ---