[DRAFT] Describe Trace Snapshot Profiling #343

tduncan · 2025-06-06T22:55:15Z

First (rough) draft for defining how agents should profile traces selected for snapshotting. I'm very unfamiliar with writing this type of documentation so any help steering it where it needs to go is greatly appreciated!

laurit · 2025-06-12T10:54:28Z

specification/behaviors.md

+When a trace is profiled agents MUST add the span attribute `splunk.snapshot.profiling` 
+with a value of `true` to the entry span.
+
+Agents SHOULD take an initial stack trace sample when starting to profile a trace.


As discussed before, the initial sample will not contain user code. The stack trace is likely to be identical for all requests. What use does it have?

In the case of Node.js, these stack traces will never be correlated with the given trace ID, as we only collect stack traces that are sampled during an active span.

laurit · 2025-06-12T11:21:40Z

specification/behaviors.md

+
+When a language runtime supports threading, stacks MUST be sampled only for 
+trace ids selected for snapshotting. The samples for profiled threads SHOULD be 
+taken instantaneously and MAY be taken at separate times.


The samples for profiled threads SHOULD be taken instantaneously and MAY be taken at separate times.

I don't follow the meaning of this sentence. Also, taking a stack trace is not an instantaneous operation by any means. In Java, for example, a stack trace is taken at a safepoint, which means that all threads are suspended (apparently recent VMs are able to suspend only a single thread, but I don't know whether that is used when taking a stack trace).
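To make the point under discussion concrete, here is a minimal Python sketch of sampling stacks only for threads that are working on a trace selected for snapshotting. It is not taken from any agent: the registry `_profiled_threads` and the functions `start_profiling`, `stop_profiling`, and `sample_once` are hypothetical names, and `sys._current_frames()` is a CPython-specific call. Each sample records its own timestamp, which shows why samples for different threads are not strictly simultaneous.

```python
import sys
import threading
import time
import traceback

# Hypothetical registry: thread id -> trace id selected for snapshotting.
_profiled_threads = {}
_lock = threading.Lock()


def start_profiling(trace_id, thread_id=None):
    """Mark a thread as working on a trace selected for snapshotting."""
    with _lock:
        _profiled_threads[thread_id or threading.get_ident()] = trace_id


def stop_profiling(thread_id=None):
    """Stop sampling the given (or current) thread."""
    with _lock:
        _profiled_threads.pop(thread_id or threading.get_ident(), None)


def sample_once():
    """Take one stack sample per profiled thread; all other threads are skipped."""
    with _lock:
        selected = dict(_profiled_threads)
    frames = sys._current_frames()  # CPython-specific: thread id -> topmost frame
    samples = []
    for thread_id, trace_id in selected.items():
        frame = frames.get(thread_id)
        if frame is None:
            continue  # the thread ended between registration and sampling
        samples.append({
            "trace_id": trace_id,
            "thread_id": thread_id,
            # Recorded per thread, so timestamps within one pass may differ.
            "timestamp_ns": time.time_ns(),
            "stack": [(f.filename, f.name, f.lineno)
                      for f in traceback.extract_stack(frame)],
        })
    return samples
```

A real agent would call something like `sample_once()` on a timer at the configured sampling interval; how the runtime pauses threads while walking their stacks is runtime-specific and is not modeled here.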


laurit · 2025-06-12T11:35:56Z

specification/behaviors.md

+It is RECOMMENDED to export stack traces in batches to take advantage of the pprof 
+data format.
+
+Agents SHOULD attempt to export any remaining stack traces during the Agent shutdown phase. 


Not sure whether this requirement makes sense; I don't know whether it can be easily implemented for all languages.
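As a rough illustration of what the two quoted requirements could look like in one runtime, the Python sketch below buffers samples, exports them in batches, and makes a best-effort flush at interpreter shutdown. The class `StackTraceBatcher`, its `export` callable, and the batch size are hypothetical; encoding a batch as pprof and sending it over OTLP is left out.

```python
import atexit
import threading


class StackTraceBatcher:
    """Buffers stack trace samples and hands them to an exporter in batches."""

    def __init__(self, export, max_batch_size=100):
        self._export = export  # hypothetical callable that takes a list of samples
        self._max_batch_size = max_batch_size
        self._buffer = []
        self._lock = threading.Lock()

    def add(self, sample):
        """Buffer one sample; export when the batch is full."""
        with self._lock:
            self._buffer.append(sample)
            if len(self._buffer) < self._max_batch_size:
                return
            batch, self._buffer = self._buffer, []
        self._export(batch)  # export outside the lock

    def flush(self):
        """Export whatever is still buffered; safe to call more than once."""
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._export(batch)


exported = []  # stand-in for a real exporter: collects each batch
batcher = StackTraceBatcher(exported.append, max_batch_size=2)
atexit.register(batcher.flush)  # best-effort export of remaining samples at shutdown
```

In Python, `atexit` handlers do not run when the process is killed by an unhandled signal or exits through `os._exit()`, which is one reason a SHOULD rather than a MUST seems appropriate for the shutdown export.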

laurit · 2025-06-12T11:37:40Z

specification/behaviors.md

+The logs containing profiling data MUST be sent via OTLP. Instrumentation
+libraries SHOULD reuse persistent OTLP connections from other signals (traces,
+metrics).


Although it is copied from https://github.com/signalfx/gdi-specification/blob/main/specification/behaviors.md#call-stack-ingest, I wanted to point out that I suspect this is not true for the Java implementation.

laurit · 2025-06-12T11:39:38Z

specification/semantic_conventions.md

+
+**Status**: [Experimental](../README.md#versioning-and-status-of-the-specification)
+
+Unless stated otherwise Agents MUST follow the `Profiling `ResourceLogs` Message`


Profiling ResourceLogs Message

Is this a typo?

laurit · 2025-06-12T11:40:38Z

specification/semantic_conventions.md

+The span attribute `splunk.snapshot.profiling` with a value of `true` indicates that 
+a trace within a service has been profiled.


Maybe point out that the attribute should be set on the local root span?
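A small Python sketch of the suggestion follows, assuming a simplified span model in which `parent` is `None` whenever the span has no parent inside the current service (either it starts the trace or its parent is remote). The `Span` class and the functions `local_root` and `mark_profiled` are hypothetical and are not OpenTelemetry SDK types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span; not the OpenTelemetry SDK type."""
    name: str
    parent: Optional["Span"] = None  # None: no parent within this service
    attributes: dict = field(default_factory=dict)


def local_root(span):
    """Walk up to the first span that has no parent within this service."""
    while span.parent is not None:
        span = span.parent
    return span


def mark_profiled(span):
    """Record on the local root span that this trace was profiled in this service."""
    local_root(span).attributes["splunk.snapshot.profiling"] = True
```

Calling `mark_profiled` from any span in the service sets the attribute on the entry span only, so child spans stay unmarked.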

Co-authored-by: Lauri Tulmin <[email protected]>

breedx-splk · 2025-10-29T17:23:48Z

specification/configuration.md

 | `SPLUNK_PROFILER_MEMORY_ENABLED`       | false   | Whether memory profiling is enabled. [2] [6]                                             |
 | `SPLUNK_REALM`                         | `none`  | Which realm to send exported data. [3]                                                   |
 | `SPLUNK_TRACE_RESPONSE_HEADER_ENABLED` | true    | Whether `Server-Timing` header is added to HTTP responses. [4]                           |
+| `SPLUNK_SNAPSHOT_PROFILER_ENABLED`     | false   | Whether Trace Snapshot CPU profiling is enabled. [2] [5]                                 |


This will be covered/superseded by #353.

breedx-splk · 2025-10-29T17:24:27Z

specification/semantic_conventions.md

+| `splunk.snapshot.profiler.enabled`           | string | Enable or Disable trace snapshot profiling          | `true` or `false`            | `false` |
+| `splunk.snapshot.profiler.sampling.interval` | string | Interval in which to take trace stack trace samples | Any valid duration `string`  | `10ms`  |


Superseded in #353.
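For reference, the draft settings quoted above could be read roughly as in the Python sketch below. This reflects only the draft in this PR, which the thread says is superseded by #353. The accepted duration units (`ms`, `s`, `m`) are an assumption, since the draft only says "any valid duration string", and the function names are hypothetical.

```python
import os
import re

# Assumed unit set; the draft does not define the duration grammar.
_UNIT_TO_MS = {"ms": 1, "s": 1000, "m": 60000}


def parse_duration_ms(text):
    """Parse strings such as '10ms' or '2s' into milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return int(match.group(1)) * _UNIT_TO_MS[match.group(2)]


def snapshot_profiler_enabled(environ=os.environ):
    """Read SPLUNK_SNAPSHOT_PROFILER_ENABLED; the draft default is false."""
    value = environ.get("SPLUNK_SNAPSHOT_PROFILER_ENABLED", "false")
    return value.strip().lower() == "true"
```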

tduncan added 2 commits June 2, 2025 15:25

Add SPLUNK_SNAPSHOT_PROFILER_ENABLED configuration property.

c62857d

Add first draft spec for trace snapshot profiling.

e170a24

tduncan requested review from a team as code owners June 6, 2025 22:55

seemk mentioned this pull request Jun 12, 2025

Snapshot profiling signalfx/splunk-otel-js#1023

Merged

laurit reviewed Jun 12, 2025


specification/behaviors.md (outdated)

t2t2 reviewed Jun 12, 2025


specification/behaviors.md (outdated)

laurit reviewed Jun 12, 2025


tduncan and others added 2 commits June 12, 2025 08:19

Update specification/behaviors.md

3970d95

Co-authored-by: Lauri Tulmin <[email protected]>

Correct baggage entry key name.

ea126c8

breedx-splk reviewed Oct 29, 2025



seemk Jun 26, 2025

5 participants

