Refactor tracking data model #377

Draft
wants to merge 12 commits into
base: master

probberechts
Contributor

@probberechts probberechts commented Dec 18, 2024

This pull request proposes a new domain model for tracking data and updates various serializers to align with it.

The idea is that a tracking dataset is a collection of object detections. There are two ways to organize this collection:

  1. As a sequence of frames where each frame contains the coordinates of all objects that were detected at a point in time.
  2. As a set of trajectories where each trajectory contains the coordinates of a single object over a period of time.

uml_tracking

The deserializers still create the frame-based representation but you can now also compute the trajectory-based representation for any trackable object.
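
To make the relationship between the two representations concrete, here is a minimal sketch of regrouping frames into trajectories. The `Detection`, `Frame`, and `to_trajectories` names and their fields are hypothetical simplifications for illustration, not the classes or methods introduced in this PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Detection:
    """One object observed at one point in time (hypothetical, simplified)."""
    object_id: str
    timestamp: float  # seconds
    x: float
    y: float


@dataclass
class Frame:
    """Frame-based view: every object detected at a single timestamp."""
    timestamp: float
    detections: list[Detection] = field(default_factory=list)


def to_trajectories(frames: list[Frame]) -> dict[str, list[Detection]]:
    """Trajectory-based view: regroup detections per object, ordered by time."""
    trajectories: dict[str, list[Detection]] = defaultdict(list)
    for frame in sorted(frames, key=lambda f: f.timestamp):
        for detection in frame.detections:
            trajectories[detection.object_id].append(detection)
    return dict(trajectories)


frames = [
    Frame(0.0, [Detection("ball", 0.0, 0.50, 0.50), Detection("home_9", 0.0, 0.70, 0.50)]),
    Frame(0.1, [Detection("ball", 0.1, 0.56, 0.47), Detection("home_9", 0.1, 0.75, 0.52)]),
]
trajectories = to_trajectories(frames)
```

Both views hold the same detections; only the grouping key changes (timestamp versus object).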

Why this is better

I think this has a few advantages:

  1. Which representation is most convenient depends on the use case. A frame-based representation makes it easy to analyze player interactions, while a trajectory-based representation makes it easy to analyze a player's kinematics. Now we support both representations.
  2. Previously we had different attributes and logic for handling ball coordinates and player coordinates. Now both the ball and the players are handled as "trackable objects". It also becomes easier to implement support for tracking other objects (e.g., the referee).
  3. I need this for Compute speed and acceleration #298
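
As an illustration of why a trajectory suits kinematics, the sketch below estimates speed with simple finite differences over one object's positions. The `speeds` helper and the `(timestamp, x, y)` tuple layout are assumptions made for this example; this is not the method proposed in #298, and it assumes coordinates in meters and timestamps in seconds.

```python
import math


def speeds(trajectory):
    """Estimate speed between consecutive samples of a single object.

    `trajectory` is a time-ordered list of (timestamp_s, x_m, y_m) tuples
    (a hypothetical layout chosen for this sketch).
    """
    result = []
    for (t0, x0, y0), (t1, x1, y1) in zip(trajectory, trajectory[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Distance covered divided by elapsed time.
        result.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return result


trajectory = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 4.0)]
player_speeds = speeds(trajectory)
```

With a frame-based layout the same computation would first require collecting one object's samples across all frames.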

Breaking changes

  • the PlayerData entity was replaced by a Detection entity

Still thinking about

  • How can we best support joint tracking data? A subclass of Detection, a Detection.joint_coordinates attribute, ...?
  • Maybe Detection is too ambiguous for an entity name? Edit: Now using TrackedObjectState

@probberechts probberechts requested review from koenvo and JanVanHaaren and removed request for koenvo December 18, 2024 13:16
@UnravelSports
Contributor

@probberechts when considering joint tracking, I would propose not adding it directly as Detection.joint_coordinates but rather as Detection.joints. The joints attribute could hold not only the coordinates but also a hierarchical description of the joint connections (perhaps as a graph) and potentially derived values like joint angles, or even inverse kinematics (although this is a bit far-fetched).

Additionally, we should probably consider that joint data does not always come in the form of world coordinates, but sometimes it is provided as "angles" (e.g. "head_angle", "hip_angle" etc.) without the inclusion of coordinates.
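
A rough sketch of what such a joints container could look like is given below. Every name here (`Joint`, `Joints`, `connections`, `neighbors`) is hypothetical and only meant to illustrate the idea of optional coordinates, optional angles, and a skeleton graph; none of it exists in kloppy.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Joint:
    """A single joint; position and angle are both optional because
    providers may supply either world coordinates or angles."""
    name: str
    position: Optional[tuple[float, float, float]] = None
    angle: Optional[float] = None  # degrees (assumed unit)


@dataclass
class Joints:
    """Joints of one tracked object plus the skeleton as an edge list."""
    joints: dict[str, Joint] = field(default_factory=dict)
    connections: list[tuple[str, str]] = field(default_factory=list)

    def neighbors(self, name: str) -> list[str]:
        """Return the names of joints directly connected to `name`."""
        found = set()
        for a, b in self.connections:
            if a == name:
                found.add(b)
            elif b == name:
                found.add(a)
        return sorted(found)


skeleton = Joints(
    joints={
        "head": Joint("head", angle=12.5),
        "neck": Joint("neck", position=(0.5, 0.5, 1.6)),
        "hip": Joint("hip", position=(0.5, 0.5, 1.0)),
    },
    connections=[("head", "neck"), ("neck", "hip")],
)
neck_neighbors = skeleton.neighbors("neck")
```

Keeping the graph as a plain edge list leaves room for derived values such as joint angles to be computed later without changing the container.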

@probberechts probberechts marked this pull request as draft May 17, 2025 11:55
@jan-swiatek

Hi, I came across this PR and wanted to suggest an approach using a pandas MultiIndex for the columns. This makes it convenient to work with:

  • entire frames (the first representation) by querying rows
  • selected players or the ball (the second representation) by querying columns

Here is an example:

import numpy as np
import pandas as pd

data = [
    {
        ("frame_id", "", ""):     1,
        ("phase", "", ""):        1,
        ("home", 9, "position"):  (0.7, 0.5),
        ("away", 7, "position"):  (0.3, 0.35),
        ("ball", "position", ""): np.array([0.5, 0.5, 0.0]),
        ("ball", "status", ""):   "dead"
    },
    {
        ("frame_id", "", ""):     2,
        ("phase", "", ""):        1,
        ("home", 9, "position"):  (0.75, 0.52),
        ("away", 7, "position"):  (0.31, 0.36),
        ("ball", "position", ""): np.array([0.56, 0.47, 0.0]),
        ("ball", "status", ""):   "alive"
    }
]

multi_columns = pd.MultiIndex.from_tuples(data[0].keys())
tracking_df   = pd.DataFrame(data, columns=multi_columns)

# Fetch all frame ids
print(tracking_df["frame_id"])

# Fetch all info about frame with given frame id
print(tracking_df[tracking_df["frame_id"] == 2])

# Fetch all info about ball
print(tracking_df["ball"])

# Fetch all info about home team
print(tracking_df["home"])

# Fetch ball status from given frame
print(tracking_df.loc[0, ("ball", "status")])

# Fetch positions from all frames for home players only
print(tracking_df.loc[:, pd.IndexSlice["home", :, "position"]])

Since I haven't used this package much yet, I can't tell whether this could cover 100% of the current features. I'm aware that kloppy's approach is more object-oriented, but maybe it's worth exploring.


3 participants