Skip to content

Distributed Tracing for Entities (Isolated) #3076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 94 commits into
base: dev
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
94 commits
Select commit Hold shift + click to select a range
f76b7d5
added tracing for entities
Mar 6, 2025
24642de
slight update based on change in how the distributed trace context wi…
Mar 10, 2025
096057d
some stylistic cleanup
Mar 10, 2025
1987309
more stylistic cleanup
Mar 10, 2025
9dd6237
another style error
Mar 11, 2025
bf37ede
addressing some PR comments
Mar 12, 2025
ba9a7b6
missed a few comments
Mar 12, 2025
6357034
got them all i think
Mar 12, 2025
5384b27
reverting to an old version since having ActivityContext as a field o…
Mar 12, 2025
115ae6c
added some json properties to RequestMessage - the lack of them was c…
Mar 12, 2025
d4916bb
added some null safety checks
Mar 13, 2025
e2b1c9d
added some more tags to the entities activities
Mar 27, 2025
b6b7771
forgot one new file
Mar 28, 2025
69ddb1b
first commit!
Mar 28, 2025
032ad86
one small proto change
Mar 28, 2025
917d4d9
testing to see if i can get that large diff to be removed
Mar 31, 2025
9d16105
trying again
Mar 31, 2025
6496d4f
finally found the issue
Mar 31, 2025
6374e6e
found another file with the wrong line endings
Mar 31, 2025
8a5511c
anddd one more:
Mar 31, 2025
3de4240
adding blank lines to the end of files for which they were removed
Mar 31, 2025
298e532
removing two file changes
Mar 31, 2025
3b11e78
had the wrong line endings
Mar 31, 2025
505b4ec
trying again
Mar 31, 2025
ef5c211
slight style updates, and fixed ending the activity before waiting fo…
Apr 8, 2025
dd4c190
refactored most of the trace activities into durabletask.core to more…
Apr 11, 2025
e644c6b
missed some changes
Apr 11, 2025
2d6729e
one more tiny thing apparently
Apr 11, 2025
27fbae6
trying to fix spacing again
Apr 11, 2025
eef53c1
another try
Apr 11, 2025
64cb668
yet another attempt
Apr 11, 2025
924580e
slight comment update
Apr 11, 2025
0cd1598
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 12, 2025
63a838d
pushing what i have so far
Apr 14, 2025
ac5359d
aligning this repo with the new durabletask changes
Apr 15, 2025
e0828b1
addressing PR comments
Apr 15, 2025
919827b
changed line endings
Apr 15, 2025
bcc0cb7
changed line endings
Apr 15, 2025
8d5cff5
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 15, 2025
7df80d1
missed a null check
Apr 15, 2025
9f3da39
missed a null check
Apr 15, 2025
5ea76cd
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 16, 2025
3077f1b
fixing line endings
Apr 16, 2025
977a6ad
reverting to the old design
Apr 18, 2025
3646b65
moved entity starting an orchestration activity back here
Apr 22, 2025
40fcc96
missed some stuff
Apr 22, 2025
46b4363
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 22, 2025
7c8ce0d
returned client signaling an entity in the isolated case back to this…
Apr 22, 2025
8813ec3
fixing a small bug
Apr 23, 2025
24b0b05
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 23, 2025
c85aec6
added supports for out of proc entities
Apr 25, 2025
4e81559
added support for an entities enabled flag on the durabletask.core side
Apr 25, 2025
84bad49
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Apr 25, 2025
2acf846
added a start time to OperationResult
Apr 25, 2025
6bb3c62
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
May 2, 2025
c57b4ec
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
May 2, 2025
12d55f0
added OOProc support for signaling entities
May 2, 2025
b7394d9
removed unnecessary using
May 2, 2025
2e460ee
removed unused field
May 2, 2025
f49af64
fixing line endings
May 7, 2025
4554fd8
removed unnecessary usings
May 7, 2025
a394161
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
May 7, 2025
4ea1ea5
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 8, 2025
1b91a29
changing our logic for a client signaling an entity in the OOProc cas…
May 9, 2025
f92b7d7
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
May 10, 2025
10e2dae
style updates
May 10, 2025
855401e
slight style update... decided not to unnecessarily check for a inval…
May 12, 2025
0b60a19
decided to remove the ID check for the Activity for a new orchestrati…
May 12, 2025
734663e
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 12, 2025
e930f3d
fixing warnings in DiagnosticActivityExtensions class
May 12, 2025
e6ed24b
suppressing remaining warnings
May 12, 2025
bf93e6a
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 12, 2025
7a52e6b
removed unnecessary manual setting of ExecutionStartedEvent.ParentTra…
May 12, 2025
3eb34b6
addressing a PR comment
May 13, 2025
0baab80
added the request time to a call to an entity in DurableOrchestration…
May 13, 2025
7eec60b
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 13, 2025
bfe8de8
making the start time nullable for the call we have in localgrpcliste…
May 13, 2025
d31ffd4
one more tiny change: since the start time is now nullable, we don't …
May 13, 2025
ea9f27b
one small fix
May 14, 2025
068a163
added nullable enable to DiagnosticActivityExtensions, addressing a P…
May 14, 2025
84eea53
changed startTime/endTime to startTimeUtc/endTimeUtc and made the req…
May 16, 2025
fcc1802
removed redundant setting of request time
May 21, 2025
f935504
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
sophiatev May 21, 2025
2b01328
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 31, 2025
c0d429e
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
May 31, 2025
9e00193
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
May 31, 2025
08d9f2d
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities
Jun 3, 2025
06c96a0
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Jun 3, 2025
86d06b7
first commit (#3125)
sophiatev Jun 5, 2025
e291176
Merge branch 'stevosyan/distributed-tracing-for-entities' into stevos…
Jun 5, 2025
ca6ebcb
adding protobuf changes
Jun 5, 2025
39c8f4a
Merge branch 'dev' into stevosyan/distributed-tracing-for-entities-is…
Jun 6, 2025
27655ca
missed one file
Jun 6, 2025
1b13f19
updating references to dotnet packages
Jun 6, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions Directory.Packages.props
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,8 @@
<PackageVersion Include="Microsoft.CodeAnalysis.CSharp" Version="3.9.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.CSharp.Workspaces" Version="3.9.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.Workspaces.Common" Version="3.9.0" />
<PackageVersion Include="Microsoft.DurableTask.Client.Grpc" Version="1.10.0" />
<PackageVersion Include="Microsoft.DurableTask.Worker.Grpc" Version="1.10.0" />
<PackageVersion Include="Microsoft.DurableTask.Client.Grpc" Version="1.11.0" />
<PackageVersion Include="Microsoft.DurableTask.Worker.Grpc" Version="1.11.0" />
<PackageVersion Include="Microsoft.Extensions.Azure" Version="1.7.0" />
<PackageVersion Include="Microsoft.Extensions.Http" Version="6.0.0" />
<PackageVersion Include="Microsoft.NET.Sdk.Functions" Version="4.2.0" />
Expand Down Expand Up @@ -65,7 +65,7 @@
<PackageVersion Include="Microsoft.CodeAnalysis" Version="3.9.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.Workspaces.MSBuild" Version="3.9.0" />
<PackageVersion Include="Microsoft.Diagnostics.Tracing.TraceEvent" Version="2.0.65" />
<PackageVersion Include="Microsoft.DurableTask.Abstractions" Version="1.10.0" />
<PackageVersion Include="Microsoft.DurableTask.Abstractions" Version="1.11.0" />
<PackageVersion Include="Microsoft.DurableTask.Generators" Version="1.0.0-preview.1" />
<PackageVersion Include="Microsoft.DurableTask.SqlServer.AzureFunctions" Version="1.5.0" />
<PackageVersion Include="Microsoft.Extensions.Configuration" Version="9.0.0" />
Expand Down
17 changes: 9 additions & 8 deletions src/WebJobs.Extensions.DurableTask/Correlation/TraceHelper.cs
Original file line number Diff line number Diff line change
Expand Up @@ -17,14 +17,15 @@ internal class TraceHelper
private const string Source = "WebJobs.Extensions.DurableTask";

private static readonly ActivitySource ActivityTraceSource = new ActivitySource(Source);
internal static Activity? StartActivityForNewOrchestration(ExecutionStartedEvent startEvent, ActivityContext parentTraceContext)

internal static Activity? StartActivityForNewOrchestration(ExecutionStartedEvent startEvent, ActivityContext parentTraceContext, DateTimeOffset? startTime = default)
{
// Start the new activity to represent scheduling the orchestration
Activity? newActivity = ActivityTraceSource.StartActivity(
Schema.SpanNames.CreateOrchestration(startEvent.Name, startEvent.Version),
kind: ActivityKind.Producer,
parentContext: parentTraceContext);
kind: ActivityKind.Producer,
parentContext: parentTraceContext,
startTime: startTime ?? default);

if (newActivity == null)
{
Expand All @@ -46,8 +47,8 @@ internal class TraceHelper

return newActivity;
}
internal static Activity? StartActivityForCallingOrSignalingEntity(string targetEntityId, string entityName, string operationName, bool signalEntity, DateTime? scheduledTime, ActivityContext? parentTraceContext, DateTimeOffset startTime = default, string? entityId = null)

internal static Activity? StartActivityForCallingOrSignalingEntity(string targetEntityId, string entityName, string operationName, bool signalEntity, DateTime? scheduledTime, ActivityContext? parentTraceContext, DateTimeOffset? startTime = default, string? entityId = null)
{
// We only want to create a trace activity for calling or signaling an entity in the case that we can successfully get the parent trace context of the request.
// Otherwise, we will create an unlinked trace activity with no parent.
Expand All @@ -59,8 +60,8 @@ internal class TraceHelper
Activity? newActivity = ActivityTraceSource.StartActivity(
Schema.SpanNames.CallOrSignalEntity(entityName, operationName),
kind: signalEntity ? ActivityKind.Producer : ActivityKind.Client,
parentContext: parentTraceContext.Value,
startTime: startTime);
parentContext: parentTraceContext.Value,
startTime: startTime ?? default);

if (newActivity == null)
{
Expand Down
81 changes: 72 additions & 9 deletions src/WebJobs.Extensions.DurableTask/Grpc/orchestrator_service.proto
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/wrappers.proto";
import "google/protobuf/empty.proto";
import "google/protobuf/struct.proto";

message OrchestrationInstance {
string instanceId = 1;
Expand Down Expand Up @@ -75,6 +76,7 @@ message ExecutionStartedEvent {
google.protobuf.Timestamp scheduledStartTimestamp = 6;
TraceContext parentTraceContext = 7;
google.protobuf.StringValue orchestrationSpanID = 8;
map<string, string> tags = 9;
}

message ExecutionCompletedEvent {
Expand All @@ -93,6 +95,7 @@ message TaskScheduledEvent {
google.protobuf.StringValue version = 2;
google.protobuf.StringValue input = 3;
TraceContext parentTraceContext = 4;
map<string, string> tags = 5;
}

message TaskCompletedEvent {
Expand Down Expand Up @@ -254,6 +257,7 @@ message ScheduleTaskAction {
string name = 1;
google.protobuf.StringValue version = 2;
google.protobuf.StringValue input = 3;
map<string, string> tags = 4;
}

message CreateSubOrchestrationAction {
Expand Down Expand Up @@ -288,6 +292,15 @@ message TerminateOrchestrationAction {
bool recurse = 3;
}

message SendEntityMessageAction {
oneof EntityMessageType {
EntityOperationSignaledEvent entityOperationSignaled = 1;
EntityOperationCalledEvent entityOperationCalled = 2;
EntityLockRequestedEvent entityLockRequested = 3;
EntityUnlockSentEvent entityUnlockSent = 4;
}
}

message OrchestratorAction {
int32 id = 1;
oneof orchestratorActionType {
Expand All @@ -297,6 +310,7 @@ message OrchestratorAction {
SendEventAction sendEvent = 5;
CompleteOrchestrationAction completeOrchestration = 6;
TerminateOrchestrationAction terminateOrchestration = 7;
SendEntityMessageAction sendEntityMessage = 8;
}
}

Expand All @@ -307,6 +321,7 @@ message OrchestratorRequest {
repeated HistoryEvent newEvents = 4;
OrchestratorEntityParameters entityParameters = 5;
bool requiresHistoryStreaming = 6;
map<string, google.protobuf.Value> properties = 7;
}

message OrchestratorResponse {
Expand All @@ -330,17 +345,12 @@ message CreateInstanceRequest {
google.protobuf.StringValue executionId = 7;
map<string, string> tags = 8;
TraceContext parentTraceContext = 9;
google.protobuf.Timestamp requestTime = 10;
}

message OrchestrationIdReusePolicy {
repeated OrchestrationStatus operationStatus = 1;
CreateOrchestrationAction action = 2;
}

enum CreateOrchestrationAction {
ERROR = 0;
IGNORE = 1;
TERMINATE = 2;
repeated OrchestrationStatus replaceableStatus = 1;
reserved 2;
}

message CreateInstanceResponse {
Expand Down Expand Up @@ -381,6 +391,7 @@ message OrchestrationState {
google.protobuf.StringValue executionId = 12;
google.protobuf.Timestamp completedTimestamp = 13;
google.protobuf.StringValue parentInstanceId = 14;
map<string, string> tags = 15;
}

message RaiseEventRequest {
Expand Down Expand Up @@ -457,6 +468,7 @@ message PurgeInstanceFilter {

message PurgeInstancesResponse {
int32 deletedInstanceCount = 1;
google.protobuf.BoolValue isComplete = 2;
}

message CreateTaskHubRequest {
Expand All @@ -481,6 +493,8 @@ message SignalEntityRequest {
google.protobuf.StringValue input = 3;
string requestId = 4;
google.protobuf.Timestamp scheduledTime = 5;
TraceContext parentTraceContext = 6;
google.protobuf.Timestamp requestTime = 7;
}

message SignalEntityResponse {
Expand Down Expand Up @@ -551,6 +565,8 @@ message EntityBatchResult {
repeated OperationAction actions = 2;
google.protobuf.StringValue entityState = 3;
TaskFailureDetails failureDetails = 4;
string completionToken = 5;
repeated OperationInfo operationInfos = 6; // used only with DTS
}

message EntityRequest {
Expand All @@ -564,6 +580,7 @@ message OperationRequest {
string operation = 1;
string requestId = 2;
google.protobuf.StringValue input = 3;
TraceContext traceContext = 4;
}

message OperationResult {
Expand All @@ -573,12 +590,21 @@ message OperationResult {
}
}

message OperationInfo {
string requestId = 1;
OrchestrationInstance responseDestination = 2; // null for signals
}

message OperationResultSuccess {
google.protobuf.StringValue result = 1;
google.protobuf.Timestamp startTimeUtc = 2;
google.protobuf.Timestamp endTimeUtc = 3;
}

message OperationResultFailure {
TaskFailureDetails failureDetails = 1;
google.protobuf.Timestamp startTimeUtc = 2;
google.protobuf.Timestamp endTimeUtc = 3;
}

message OperationAction {
Expand All @@ -594,6 +620,8 @@ message SendSignalAction {
string name = 2;
google.protobuf.StringValue input = 3;
google.protobuf.Timestamp scheduledTime = 4;
google.protobuf.Timestamp requestTime = 5;
TraceContext parentTraceContext = 6;
}

message StartNewOrchestrationAction {
Expand All @@ -602,6 +630,32 @@ message StartNewOrchestrationAction {
google.protobuf.StringValue version = 3;
google.protobuf.StringValue input = 4;
google.protobuf.Timestamp scheduledTime = 5;
google.protobuf.Timestamp requestTime = 6;
TraceContext parentTraceContext = 7;
}

message AbandonActivityTaskRequest {
string completionToken = 1;
}

message AbandonActivityTaskResponse {
// Empty.
}

message AbandonOrchestrationTaskRequest {
string completionToken = 1;
}

message AbandonOrchestrationTaskResponse {
// Empty.
}

message AbandonEntityTaskRequest {
string completionToken = 1;
}

message AbandonEntityTaskResponse {
// Empty.
}

service TaskHubSidecarService {
Expand Down Expand Up @@ -665,6 +719,15 @@ service TaskHubSidecarService {

// clean entity storage
rpc CleanEntityStorage(CleanEntityStorageRequest) returns (CleanEntityStorageResponse);

// Abandons a single work item
rpc AbandonTaskActivityWorkItem(AbandonActivityTaskRequest) returns (AbandonActivityTaskResponse);

// Abandon an orchestration work item
rpc AbandonTaskOrchestratorWorkItem(AbandonOrchestrationTaskRequest) returns (AbandonOrchestrationTaskResponse);

// Abandon an entity work item
rpc AbandonTaskEntityWorkItem(AbandonEntityTaskRequest) returns (AbandonEntityTaskResponse);
}

message GetWorkItemsRequest {
Expand Down Expand Up @@ -714,4 +777,4 @@ message StreamInstanceHistoryRequest {

message HistoryChunk {
repeated HistoryEvent events = 1;
}
}
4 changes: 2 additions & 2 deletions src/WebJobs.Extensions.DurableTask/Grpc/versions.txt
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
# The following files were downloaded from branch main at 2025-02-07 22:22:03 UTC
https://raw.githubusercontent.com/microsoft/durabletask-protobuf/6000187e90d79fba297dfd75d42095abc1462eba/protos/orchestrator_service.proto
# The following files were downloaded from branch main at 2025-06-05 21:25:17 UTC
https://raw.githubusercontent.com/microsoft/durabletask-protobuf/fd9369c6a03d6af4e95285e432b7c4e943c06970/protos/orchestrator_service.proto
Original file line number Diff line number Diff line change
Expand Up @@ -755,7 +755,7 @@ private async Task ExecuteOutOfProcBatch()
request.IsSignal,
request.ScheduledTime,
parentTraceContext,
request.RequestTime.Value);
request.RequestTime);
parentTraceContext = callEntityActivity.Context;
}

Expand Down
38 changes: 28 additions & 10 deletions src/WebJobs.Extensions.DurableTask/LocalGrpcListener.cs
Original file line number Diff line number Diff line change
Expand Up @@ -236,7 +236,7 @@ public override Task<Empty> Hello(Empty request, ServerCallContext context)

// Create a new activity with the parent context
ActivityContext.TryParse(traceParent, traceState, out ActivityContext parentActivityContext);
using Activity? scheduleOrchestrationActivity = TraceHelper.StartActivityForNewOrchestration(executionStartedEvent, parentActivityContext);
using Activity? scheduleOrchestrationActivity = TraceHelper.StartActivityForNewOrchestration(executionStartedEvent, parentActivityContext, request.RequestTime?.ToDateTimeOffset());

// Schedule the orchestration
await this.GetDurabilityProvider(context).CreateTaskOrchestrationAsync(
Expand Down Expand Up @@ -319,17 +319,35 @@ private OrchestrationStatus[] GetStatusesNotToOverride()
{
this.CheckEntitySupport(context, out var durabilityProvider, out var entityOrchestrationService);

EntityMessageEvent eventToSend = ClientEntityHelpers.EmitOperationSignal(
new OrchestrationInstance() { InstanceId = request.InstanceId },
Guid.Parse(request.RequestId),
request.Name,
request.Input,
EntityMessageEvent.GetCappedScheduledTime(
DateTime.UtcNow,
entityOrchestrationService.EntityBackendProperties!.MaximumSignalDelayTime,
request.ScheduledTime?.ToDateTime()));
Activity? signalEntityActivity = null;

// We only want to create a trace activity for signaling the entity in the case that we can successfully parse the trace context of the signal entity request.
// Otherwise, we will create an unlinked trace activity with no parent.
if (ActivityContext.TryParse(request.ParentTraceContext?.TraceParent, request.ParentTraceContext?.TraceState, out ActivityContext parentTraceContext))
{
signalEntityActivity = TraceHelper.StartActivityForCallingOrSignalingEntity(
request.InstanceId,
EntityId.GetEntityIdFromSchedulerId(request.InstanceId).EntityName,
request.Name,
signalEntity: true,
request.ScheduledTime?.ToDateTime(),
parentTraceContext,
request.RequestTime?.ToDateTime());
}

EntityMessageEvent eventToSend = ClientEntityHelpers.EmitOperationSignal(
new OrchestrationInstance() { InstanceId = request.InstanceId },
Guid.Parse(request.RequestId),
request.Name,
request.Input,
EntityMessageEvent.GetCappedScheduledTime(
DateTime.UtcNow,
entityOrchestrationService.EntityBackendProperties!.MaximumSignalDelayTime,
request.ScheduledTime?.ToDateTime()),
signalEntityActivity != null ? new DTCore.Tracing.DistributedTraceContext(signalEntityActivity.Id!, signalEntityActivity.TraceStateString) : null);

await durabilityProvider.SendTaskOrchestrationMessageAsync(eventToSend.AsTaskMessage());
signalEntityActivity?.Dispose();

// No fields in the response
return new P.SignalEntityResponse();
Expand Down
Loading
Loading