@Yopi Yopi commented Nov 5, 2025

The idea and proposal behind groups of events and sequences.


A subset of an event hierarchy (or the full hierarchy) can be thought of as a sequence of events.

A sequence groups related events via parent-child hierarchies. Each root event that matches the creation criteria creates its own sequence. Descendant events (via `parent_id`) are automatically added to the same sequence. Cost and revenue are aggregated on each sequence.
Contributor

Descendant events (via parent_id) are automatically added to the same sequence.

Recursively?

Contributor

I see you answered that below.

Author

Recursively, in the sense that:

  • Event A is ingested: it is added to the event_sequence.
  • Event B with parent_id A is ingested: it is added to the event_sequence because A is already in it.
  • Event C with parent_id B is ingested: it is added to the event_sequence because B is already in it.
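
As a rough sketch of the ingestion rule above, not the actual implementation: the `Event` and `SequenceIndex` classes, the `is_sequence_root` predicate (standing in for the creation criteria), and the choice to use the root event's id as the sequence id are all assumptions made for illustration. It also assumes a parent is always ingested before its children, and that parent membership takes precedence over the creation criteria.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Event:
    id: str
    name: str
    parent_id: Optional[str] = None


@dataclass
class SequenceIndex:
    # Hypothetical predicate standing in for the "creation criteria".
    is_sequence_root: Callable[[Event], bool]
    # Maps event id -> sequence id (assumed here to be the root event's id).
    sequence_of: dict[str, str] = field(default_factory=dict)

    def ingest(self, event: Event) -> Optional[str]:
        """Return the sequence the event joins, or None if it joins none."""
        if event.parent_id is not None and event.parent_id in self.sequence_of:
            # The parent is already a member, so the child joins its sequence.
            seq = self.sequence_of[event.parent_id]
        elif self.is_sequence_root(event):
            # The event matches the creation criteria: start a new sequence.
            seq = event.id
        else:
            return None
        self.sequence_of[event.id] = seq
        return seq
```

No tree walk is needed at ingestion time: each event only looks up its direct parent, and membership propagates down the hierarchy one level per ingested event.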


Contributor

Cost and revenue are aggregated on each sequence

I'd leave revenue out of it for now to simplify the mental model.

This is the first time we make cost a first-class citizen versus metadata._cost. Should we maybe be a bit more flexible in how we name and store this, so that we can also store other aggregates if we were to explore that (e.g. total input / output token count from metadata._llm)?

Author

Absolutely, I'm thinking more that we would want to highlight that we could aggregate anything in the events into the sequence aggregation. We could highlight costs + other, or just say that it's possible to aggregate values into the sequence?

Contributor

"Aggregate anything": sure, yes. I was talking specifically about not designing the data model around a "total_cost" column.

Author

Ah, yes, sorry, that's what I meant as well. I updated the data model in the entity graph but it's a bit fuzzy in this text.
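
To make the "named aggregates instead of a total_cost column" idea concrete, here is a minimal sketch. It assumes events are nested dicts and that the aggregated values are plain numbers. `metadata._cost` and `metadata._llm` come from the discussion above; the `AGGREGATES` mapping, the helper names, and the `input_tokens` / `output_tokens` keys are hypothetical.

```python
from collections import defaultdict
from typing import Any, Optional

# Hypothetical aggregate definitions: aggregate name -> path into the event.
# The token field names under `_llm` are assumptions for illustration.
AGGREGATES = {
    "cost": ("metadata", "_cost"),
    "input_tokens": ("metadata", "_llm", "input_tokens"),
    "output_tokens": ("metadata", "_llm", "output_tokens"),
}


def read_path(event: dict[str, Any], path: tuple[str, ...]) -> Optional[float]:
    """Follow `path` through nested dicts; return None if a key is missing."""
    node: Any = event
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, (int, float)) else None


def aggregate(events: list[dict[str, Any]]) -> dict[str, float]:
    """Sum every configured aggregate over the events of one sequence."""
    totals: dict[str, float] = defaultdict(float)
    for event in events:
        for name, path in AGGREGATES.items():
            value = read_path(event, path)
            if value is not None:
                totals[name] += value
    return dict(totals)
```

Adding another aggregate then means adding one entry to the mapping rather than a new column on the sequence.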


This simplifies the creation and understanding of what a single sequence (or sequence definition) is.

It does not solve the comparison between multiple sequences with different definitions.
Contributor

True, but I'm having trouble thinking of a good reason to do this.

Why would you want to compare e.g. a single sequence "workout_tracking" with "nutrition_tracking" in a fitness app? I could see how, globally, you want to know the average cost of tracking a workout and of tracking food intake, but you'd never compare an instance of each side by side.

In other words you want to compare apples to apples, oranges to oranges, and additionally are interested in both the average apple and average orange.
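
A small sketch of the "average apple and average orange" view, assuming each sequence can be reduced to a `(definition_name, cost)` pair; the function name and that input shape are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_cost_by_definition(
    sequences: list[tuple[str, float]],
) -> dict[str, float]:
    """Group sequences by definition name and average their cost."""
    costs: dict[str, list[float]] = defaultdict(list)
    for definition, cost in sequences:
        costs[definition].append(cost)
    return {definition: mean(values) for definition, values in costs.items()}
```

Averages are only computed within a definition; nothing here compares a workout sequence against a nutrition sequence.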

Author

I was thinking about when you want to track e.g.

  • nutrition_tracking (outcome: ate balanced)
  • nutrition_tracking (outcome: ate too much candy)

How often do users eat balanced vs. too much candy? If you have multiple sequences that track the same events, you might want to start comparing the outcomes of those sequences.

When it comes to fully disparate sequences, I fully agree that it does not make sense to compare them.

Contributor

  • Nutrition Tracked (no success)
  • Nutrition Tracked: Balanced (success: ate_balanced)
  • Nutrition Tracked: Cheat Day (success: too_much_candy)

But yes, I get what you're after, eventually you'll want to run numbers balanced <> cheat day, maybe.

But that's solvable outside the scope of this design, no? It's more dashboarding than anything else.
