Commit a8464be

Update the Collect Multiline Logs doc with a warning callout (#5490)
* Update collect-multiline-logs.md
* Update collect-multiline-logs.md
1 parent ae3e787 commit a8464be

1 file changed

+14
-13
lines changed

docs/send-data/reference-information/collect-multiline-logs.md

Lines changed: 14 additions & 13 deletions
@@ -4,9 +4,11 @@ title: Collecting Multiline Logs
44
description: Sumo Logic Sources can be configured to detect log boundaries automatically or with a regular expression.
55
---
66

7-
Sumo Logic Sources by default have multiline processing enabled. Multiline processing is used to ensure a log message that is made up of multiple lines, separated by a line break or carriage return, are properly grouped as a single log message when ingested into Sumo Logic.
7+
Sumo Logic Sources, by default, have multiline processing enabled. Multiline processing is used to ensure that a log message made up of multiple lines, with each line separated by a line break or carriage return, is correctly grouped as a single log message when ingested into Sumo Logic.
88

9-
Multiline processing requires your logs to have line breaks or carriage returns between messages. If the logs are part of a larger individual message (for example, JSON array or XML) Sumo Logic will in most cases not be able to break these into individual logs.
9+
:::warning
10+
Line breaks and carriage returns are control characters used to create new lines. They are usually represented by the escape sequences `\n`, `\r`, or `\r\n`, but are often invisible in text editors. Sumo Logic cannot split log messages that do not contain these characters.
11+
:::
1012

1113
## Multiline Processing Caveats
1214

@@ -24,19 +26,19 @@ Sources have the option to be configured to automatically infer log boundaries o
2426

2527
## Infer Boundaries
2628

27-
By default, **Infer Boundaries** is selected when **Multiline Processing** is enabled. The Collector will attempt to detect a common pattern which denotes the first line of a multiline message. The Collector will look at each line coming in from a Source and attempt to match that line to the known expression. If the line matches then the Collector will mark this as the start of a new message and any additional lines that do not match the expression will be assumed as part of that message. Once the Collector detects another line matching the expression it will flush the previous lines as a single message and mark that next line as the start of a new message.
29+
By default, **Infer Boundaries** is selected when **Multiline Processing** is enabled. The Collector will attempt to detect a common pattern that denotes the first line of a multiline message. The Collector will look at each line coming in from a Source and attempt to match that line to the known expression. If the line matches, then the Collector will mark this as the start of a new message, and any additional lines that do not match the expression will be assumed as part of that message. Once the Collector detects another line matching the expression, it will flush the previous lines as a single message and mark that next line as the start of a new message.
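
The grouping behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Collector's implementation: the function name `group_multiline`, the sample lines, and the use of Python's `re` module are all hypothetical, and the first-line pattern is passed in rather than inferred.

```python
import re
from typing import Iterable, Iterator


def group_multiline(lines: Iterable[str], boundary: str) -> Iterator[str]:
    """Group lines into messages, starting a new one at each boundary match."""
    pattern = re.compile(boundary)
    buffer = []
    for line in lines:
        # A line matching the boundary marks the start of a new message,
        # so flush whatever has been collected so far as one message.
        if pattern.fullmatch(line) and buffer:
            yield "\n".join(buffer)
            buffer = []
        # Non-matching lines are assumed to belong to the current message.
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


# Hypothetical sample: a timestamped log line followed by a stack trace.
SAMPLE = [
    "2024-01-15 10:00:00 ERROR Unhandled exception",
    "Traceback (most recent call last):",
    '  File "app.py", line 3, in <module>',
    "2024-01-15 10:00:01 INFO Recovered",
]

messages = list(
    group_multiline(SAMPLE, r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*")
)
print(len(messages))
```

In this sketch the stack trace lines stay attached to the `ERROR` line because they do not match the timestamp anchor, which is why a consistent anchor at the start of each message matters.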
2830

29-
The Collector will attempt to use the first 1,000 lines, or as many lines as appear within 30 seconds, and an algorithm to try and determine a pattern that may denote a new message starting line. **Infer boundaries** works best if the log messages contain a common anchor to start the line, such as a timestamp, and the formatting of the messages being received by the source are in a consistent format.
31+
The Collector will attempt to use the first 1,000 lines, or as many lines as appear within 30 seconds, and an algorithm to try and determine a pattern that may denote a new message starting line. **Infer boundaries** works best if the log messages contain a common anchor to start the line, such as a timestamp, and the formatting of the messages being received by the source is in a consistent format.
3032

3133
## Boundary Regex
3234

3335
You can specify the boundary between messages using a regular expression. Enter a regular expression for the full first line of every multi-line message in your log files.
3436

35-
In cases where a single Source is being used to collect multiple different types of files of varying formats or if no consistent pattern is detected within the messages being received then it is possible for each line to be flushed as a single message or some messages to be improperly grouped into a single message.
37+
In cases where a single Source is being used to collect multiple different types of files of varying formats or if no consistent pattern is detected within the messages being received, then it is possible for each line to be flushed as a single message or some messages to be improperly grouped into a single message.
3638

37-
Even when ingesting a single Source type, auto detection is not guaranteed to work for all cases, this is noted within the Source configuration with the following text: `Please note, Infer Boundaries may not be accurate for all log types`. In this case, a custom **Boundary Regex** expression may be required for detecting the start of each log message.
39+
Even when ingesting a single Source type, auto-detection is not guaranteed to work for all cases. This is noted within the Source configuration with the following text: `Please note, Infer Boundaries may not be accurate for all log types`. In this case, a custom **Boundary Regex** expression may be required for detecting the start of each log message.
3840

39-
When the option for **Boundary Regex** is used with the multiline detection the Collector will use the supplied regular expression to try and match the first line of a multiline message.
41+
When the option for **Boundary Regex** is used with the multiline detection, the Collector will use the supplied regular expression to try and match the first line of a multiline message.
4042

4143
:::note
4244
The expression supplied must match the entire first line of a message up to, and in some cases including, the trailing line feed or carriage return.
@@ -54,8 +56,7 @@ Acceptable boundary expressions may be:
5456
* `.*\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*\n`
5557
* `^.*\[CPU-ResourceMonitor-1\].*`
5658

57-
Unacceptable boundary expressions would include the following since they
58-
do not match the entire first line:
59+
Unacceptable boundary expressions would include the following, since they do not match the entire first line:
5960

6061
* `^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}`
6162
* `\[CPU-ResourceMonitor-1\]`
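
One way to check a candidate expression before using it is to test whether it matches a sample first line in full. The sketch below assumes Python's `re` module as a stand-in for the Collector's regex engine, which may differ in details; the helper name `matches_entire_line` and the sample line are hypothetical, and the bracketed expression is written with both brackets escaped.

```python
import re


def matches_entire_line(expression: str, first_line: str) -> bool:
    """Return True if the expression matches the whole first line.

    The line is tried both without and with a trailing line feed, since a
    boundary expression may or may not include it.
    """
    bare = first_line.rstrip("\r\n")
    candidates = (bare, bare + "\n")
    return any(re.fullmatch(expression, c) for c in candidates)


# Hypothetical first line of a multiline message.
LINE = "2024-01-15 10:00:00 INFO [CPU-ResourceMonitor-1] usage=42%"

# Expressions that cover the entire first line.
print(matches_entire_line(r"^.*\[CPU-ResourceMonitor-1\].*", LINE))
print(matches_entire_line(r".*\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*\n", LINE))

# Expressions that match only part of the first line.
print(matches_entire_line(r"^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}", LINE))
print(matches_entire_line(r"\[CPU-ResourceMonitor-1\]", LINE))
```

The first two calls succeed because the leading and trailing `.*` absorb the rest of the line; the last two fail because each expression stops before the end of the line.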
@@ -64,12 +65,12 @@ do not match the entire first line:
6465

6566
### How Does Multiline Work With Syslog Sources?
6667

67-
Sumo Logic does not provide any options for multiline detection within Syslog Sources. For Syslog messages received over UDP Sumo Logic will treat all content contained within a single syslog request as a single message.
68+
Sumo Logic does not provide any options for multiline detection within Syslog Sources. For Syslog messages received over UDP, Sumo Logic will treat all content contained within a single syslog request as a single message.
6869

69-
When syslog messages are received over TCP Sumo Logic will treat each line within a request as a new message. This is because TCP is received as a data stream and the Collector will flush a message whenever a line feed is detected.
70+
When syslog messages are received over TCP, Sumo Logic will treat each line within a request as a new message. This is because TCP is received as a data stream, and the Collector will flush a message whenever a line feed is detected.
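
The difference between the two transports can be illustrated with a small Python sketch. It is a simplified model of the behavior described above, not Collector code: the function names are hypothetical, and dropping empty segments after the final line feed is an assumption made for the example.

```python
from typing import List


def messages_from_udp(datagram: bytes) -> List[bytes]:
    """UDP: the whole content of one syslog request is a single message."""
    return [datagram]


def messages_from_tcp(stream: bytes) -> List[bytes]:
    """TCP: the data is a stream, so each line feed ends a message."""
    # Assumption: empty segments (such as after a trailing line feed) are dropped.
    return [line for line in stream.split(b"\n") if line]


# Hypothetical payload containing a two-line event.
payload = b"<34>app: error occurred\n  at step 2\n"

print(len(messages_from_udp(payload)))
print(len(messages_from_tcp(payload)))
```

Under this model the same two-line payload stays together as one message over UDP but is split into two messages over TCP.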
7071

7172
### How Does Multiline Work With HTTP Sources?
7273

73-
Multiline detection on an HTTP source only works within the confines of a single HTTP request. If you send multiple multiline messages within a single HTTP post request the multiline options will apply to those messages. If you send a multiline message as separate POST requests the multiline options do not apply.
74+
Multiline detection on an HTTP source only works within the confines of a single HTTP request. If you send multiple multiline messages within a single HTTP post request, the multiline options will apply to those messages. If you send a multiline message as separate POST requests, the multiline options do not apply.
7475

75-
Sumo Logic cannot thread together multiple HTTP posts into a single message. This is due to there being no guarantee of the order of receipt (simply the nature of HTTP) and because there is no certainty that multiple clients are not sending to the same HTTP Source, which may cause additional issues with how the order of messages are received.
76+
Sumo Logic cannot thread together multiple HTTP posts into a single message. This is due to there being no guarantee of the order of receipt (simply the nature of HTTP) and because there is no certainty that multiple clients are not sending to the same HTTP Source, which may cause additional issues with how the order of messages is received.
