Frame By Frame Linkers

Base class for frame by frame linkers. The current implementation follows the tracks handling strategy proposed in KOFT [9].

class byotrack.implementation.linker.frame_by_frame.base.AssociationMethod(*values)

Bases: Enum

Association methods (Greedy or Jonker-Volgenant).

GREEDY
Select the best match between tracks and detections iteratively until no match can be selected below the cost limit eta. It is usually not optimal for tracking but it is much faster.
OPT_HARD | SPARSE_OPT_HARD
Solve the linear association problem (see pylapy). Hard threshold the association matrix with the cost limit eta. Use the sparse version to increase speed with numerous targets.
OPT_SMOOTH | SPARSE_OPT_SMOOTH
Solve a cost_limit extended association problem (see pylapy). It relaxes the linear association problem by allowing a node to stay unlinked at the cost limit eta. Use the sparse version to increase speed with numerous targets.
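
The remark that greedy association is usually not optimal can be checked on a toy example. The sketch below is not the byotrack or pylapy implementation: the cost values are made up, the greedy links are written out by hand, and the optimal assignment is found by brute force, which only works for tiny matrices.

```python
from itertools import permutations

# Toy cost matrix: rows are tracks, columns are detections (made-up values).
cost = [
    [1.0, 2.0],
    [2.0, 100.0],
]

# Greedy selects the global minimum first, (0, 0) with cost 1.0, which then
# forces track 1 onto detection 1.
greedy_links = [(0, 0), (1, 1)]
greedy_total = sum(cost[i][j] for i, j in greedy_links)

# Optimal assignment by brute force over all column permutations.
best_perm = min(
    permutations(range(2)),
    key=lambda perm: sum(cost[i][perm[i]] for i in range(2)),
)
optimal_links = [(i, best_perm[i]) for i in range(2)]
optimal_total = sum(cost[i][j] for i, j in optimal_links)

print(greedy_total, optimal_total)  # 101.0 4.0
```

With a finite cost limit such as eta = 10, greedy would stop after its first link and leave track 1 unlinked, since its only remaining candidate costs 100.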

solve(cost: Tensor, eta: float = inf) → Tensor

Solve tracks-to-detections association.

Parameters:

cost (torch.Tensor) – Cost matrix Shape: (N, M), dtype: float
eta (float) – Cost limit Default: inf (No thresholding)

Returns:

Links (i, j): Shape: (L, 2), dtype: int32

Return type:

torch.Tensor

class byotrack.implementation.linker.frame_by_frame.base.TrackHandler(n_valid: int, n_gap: int, start: int, identifier: int)

Bases: object

Handle a track during the tracking procedure.

It accumulates the track data at each new association and stores the optional motion model data.

A TrackHandler is created for each unlinked detection in the linking process and then updated with the detections associated afterwards. At the beginning, the track is considered HYPOTHETICAL. For a track to be considered valid, it requires n_valid consecutive associated detections after the track creation (state: HYPOTHETICAL => VALID). If a missed detection occurs during this time interval, then the track is deleted and considered invalid (state: HYPOTHETICAL => INVALID).

Once confirmed, a VALID track is resilient to missed detections, waiting n_gap frames before ending the track (VALID => FINISHED).
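
The state transitions described above can be sketched as a small state machine. This is an illustrative stand-in, not the TrackHandler code: the State enum and run_track function are hypothetical names, and two details are assumptions, namely that the detection creating the track counts toward n_valid and that a VALID track finishes once it has gone more than n_gap frames without an association.

```python
from enum import IntEnum


class State(IntEnum):  # Hypothetical stand-in for TrackHandler.TrackState
    HYPOTHETICAL = 0
    VALID = 1
    FINISHED = 2
    INVALID = 3


def run_track(associated, n_valid=3, n_gap=3):
    """Replay per-frame association flags and return the state after each frame."""
    state = State.HYPOTHETICAL
    hits = 0    # Associations since the track creation
    misses = 0  # Frames since the last association
    history = []
    for has_detection in associated:
        if state in (State.FINISHED, State.INVALID):
            break  # The track is no longer active
        if has_detection:
            hits += 1
            misses = 0
            if state is State.HYPOTHETICAL and hits >= n_valid:
                state = State.VALID
        else:
            misses += 1
            if state is State.HYPOTHETICAL:
                state = State.INVALID  # A miss before validation drops the track
            elif misses > n_gap:
                state = State.FINISHED  # Assumed rule: more than n_gap misses
        history.append(state)
    return history


states = run_track([True, True, True, False, False, False, False])
print([s.name for s in states])
```

Under these assumptions, the example track becomes VALID at its third associated frame, survives three missed frames, and is FINISHED at the fourth.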

n_valid

Number of frames with a correct association required to validate the track at its creation.

Type:: int

n_gap

Number of frames with no association before the track termination.

Type:: int

start

Starting frame of the track

Type:: int

identifier

Identifier of the track handler (and of the track)

Type:: int

track_state

Current state of the handler

Type:: TrackState

last_association

Number of frames since the last association

Type:: int

detection_ids

Identifiers of the associated detections (-1 if None)

Type:: list[int]

track_ids

Index of the track at each frame in the linker.active_tracks list. It allows the linker to store data as tensor and be able to rebuild tracks at the end.

Type:: list[int]

merge_id

Identifier of an optional merged track handler (See Tracks.merge_id)

Type:: int

parent_id

Identifier of an optional parent track handler (See Tracks.parent_id)

Type:: int

is_split

Whether the track splits

Type:: bool

class TrackState(*values)

Bases: IntEnum

TrackState of a TrackHandler.

HYPOTHETICAL
Initial state before validation of the track.
VALID
The track has been validated and is still active
FINISHED
The track is valid and finished
INVALID
The track is not valid and deleted

is_active() → bool: Check whether the track is still active (Hypothetical or Valid).

update(frame_id: int, detection_id: int) → None

Update track handler. It stores the detection_id and updates the track state.

It should be called for each time frame and each active track.

Parameters:

frame_id (int) – The current frame. This is given for safety checks to ensure that the Linker and TrackHandler agree.
detection_id (int) – Detection id in the Detections object. -1 if not associated to a particular detection.

register_track_id(track_id: int) → None

For still active tracks, it registers the track id after the update step.

Parameters:: track_id (int) – The index of the track in linker.active_tracks at this time frame.

class byotrack.implementation.linker.frame_by_frame.base.OnlineFlowExtractor(optflow: OpticalFlow)

Bases: object

Extract optical flow maps online from a video.

reset() → None: Reset the flow extractor.

update(frame: ndarray) → None

Extract the flow for a new given frame.

It will compute and register the flow map between the last given frame and the current frame.

Parameters:: frame (np.ndarray) – Current frame of the video

class byotrack.implementation.linker.frame_by_frame.base.FrameByFrameLinkerParameters(association_threshold: float = -1.0, *, n_valid=3, n_gap=3, association_method: str | AssociationMethod = AssociationMethod.SPARSE_OPT_SMOOTH, anisotropy: tuple[float, float, float] = (1.0, 1.0, 1.0), split_factor: float = 0.0, merge_factor: float = 0.0)

Bases: object

Parameters of the abstract FrameByFrameLinker.

Note

Most parameters can be estimated automatically from the detections using estimate.

association_threshold

This is the main hyperparameter. It defines the distance threshold above which a track is not linked to a detection. A low threshold typically reduces wrong assignments and ID switches, but may increase track fragmentation. Higher values reduce track fragmentation, but a track whose detection was missed may be linked to a wrong detection. Default: -1.0 (to be estimated, see estimate).

Type:: float

n_valid

Number of detections required to validate the track after its creation. If a track misses a detection during its first n_valid frames, it is dropped. This provides robustness to false positive detections. With no false positives, it can be set to 1 (a detection always belongs to a track). Higher values remove false positives that are not consistent over time, but may prune real tracks whose detections were missed. Default: 3

Type:: int

n_gap

Number of consecutive frames without any association (missed detections) before the track is terminated. This provides robustness to false negative detections. Without any false negatives, it can be set to 0. Higher values support larger gaps in the track, but may lead to wrong assignments. Default: 3

Type:: int

association_method

The frame-by-frame association to use. See AssociationMethod. It can be provided as a string. (Choice: GREEDY, OPT_HARD, OPT_SMOOTH, SPARSE_OPT_HARD, SPARSE_OPT_SMOOTH) Default: SPARSE_OPT_SMOOTH

Type:: AssociationMethod

anisotropy

Anisotropy of images (Ratio of the pixel sizes for each axis, depth first). This will be used to scale distances. Default: (1., 1., 1.)

Type:: tuple[float, float, float]

split_factor

Allow splitting of tracks, using a second association step. The association threshold in this case is split_factor * association_threshold. Default: 0.0 (No splits)

Type:: float

merge_factor

Allow merging of tracks, using a second association step. The association threshold in this case is merge_factor * association_threshold. Default: 0.0 (No merges)

Type:: float
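
As an illustration of how the anisotropy attribute could scale distances, the sketch below multiplies each axis difference by its factor before taking the Euclidean norm. The function name scaled_distance is hypothetical, and both the depth-first (z, y, x) coordinate order and the per-axis multiplication are assumptions about how the scaling is applied.

```python
import math


def scaled_distance(a, b, anisotropy=(1.0, 1.0, 1.0)):
    """Euclidean distance between two (z, y, x) points with per-axis scaling."""
    return math.sqrt(
        sum((scale * (x - y)) ** 2 for scale, x, y in zip(anisotropy, a, b))
    )


# One voxel apart in depth and four pixels apart in width.
print(scaled_distance((0, 0, 0), (1, 0, 4)))                   # Isotropic
print(scaled_distance((0, 0, 0), (1, 0, 4), (3.0, 1.0, 1.0)))  # Depth 3x coarser
```

With a depth step three times larger than the in-plane pixel size, the same pair of points is measured as 5.0 apart instead of about 4.12.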

check() → None: Check the specification for invalid values.

estimate(detections_sequence: Sequence[byotrack.Detections]) → FrameByFrameLinkerParameters

Estimate parameters from the given detections.

Estimation is triggered by providing negative dummy values for positive parameters. The dummy values are then replaced by their estimate.

Estimators:

association_threshold: max(3 * statistics.average_radius, statistics.average_min_dist)
anisotropy: Computed from statistics.anisotropy.
split_factor: 1.0 if the number of detections increases by more than 30% over the full sequence.
merge_factor: 1.0 if the number of detections decreases by more than 30% over the full sequence.

Parameters:: detections_sequence (Sequence[byotrack.Detections]) – Detections for the current sequence.
Returns:: self with updated parameters.
Return type:: FrameByFrameLinkerParameters
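
The estimators listed above can be sketched in plain Python. The helper names are hypothetical and this is not the library code. Two points are assumptions: the 30% rule is read here as a comparison between the detection counts of the first and last frames, and the factors are taken to be 0.0 when the rule does not trigger.

```python
def estimate_threshold(association_threshold, average_radius, average_min_dist):
    """Replace a negative dummy value with the documented estimate."""
    if association_threshold >= 0:
        return association_threshold  # A user-provided value is kept as is
    return max(3 * average_radius, average_min_dist)


def split_merge_factors(n_first, n_last, ratio=0.3):
    """Assumed reading of the 30% rule, using first and last frame counts."""
    split = 1.0 if n_last > (1 + ratio) * n_first else 0.0
    merge = 1.0 if n_last < (1 - ratio) * n_first else 0.0
    return split, merge


print(estimate_threshold(-1.0, average_radius=2.0, average_min_dist=5.0))  # 6.0
print(split_merge_factors(100, 140))  # (1.0, 0.0)
```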

class byotrack.implementation.linker.frame_by_frame.base.FrameByFrameLinker(specs: FrameByFrameLinkerParameters, optflow: OpticalFlow | None = None, features_extractor: FeaturesExtractor | None = None, *, save_all=False)

Bases: OnlineLinker

Links detections online using frame-by-frame association.

Abstract class for frame-by-frame linkers. It decomposes the update step into six parts:

Optional optical flow computations (Handled by this class with the optflow given)
Motion modeling to predict track positions (motion_model)
Features extraction (handled by this class with the features_extractor given)
Track-to-detection cost computation (cost)
Solving the linear association problem (handled in associate)
Post matching update to handle tracks (post_association)

The association relies on the AssociationMethod enum and tracks handling is done with TrackHandler.

It follows the track handling strategy described in KOFT [9].
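
The six-step decomposition can be pictured as a template method: a base class fixes the order of the steps and subclasses fill in the abstract ones. The skeleton below is a hypothetical, heavily simplified stand-in (SketchLinker and Nearest1D are not byotrack classes). Positions are 1D floats, the association is a placeholder that keeps every pair under the threshold, and post_association receives the links directly rather than an active mask.

```python
from abc import ABC, abstractmethod


class SketchLinker(ABC):
    """Hypothetical skeleton of the six-step update (not the byotrack class)."""

    def __init__(self):
        self.log = []

    def update(self, frame, detections):
        self.log.append("1. optical flow")  # Handled by the base class
        self.motion_model()                 # 2. Implemented by subclasses
        self.log.append("3. features")      # Handled by the base class
        links = self.associate(frame, detections)  # 4. cost, 5. association
        self.post_association(frame, detections, links)  # 6.

    def associate(self, frame, detections):
        cost, threshold = self.cost(frame, detections)
        self.log.append("5. association")
        # Placeholder: keep every pair under the threshold (no real solver).
        return [
            (i, j)
            for i, row in enumerate(cost)
            for j, value in enumerate(row)
            if value <= threshold
        ]

    @abstractmethod
    def motion_model(self): ...

    @abstractmethod
    def cost(self, frame, detections): ...

    @abstractmethod
    def post_association(self, frame, detections, links): ...


class Nearest1D(SketchLinker):
    def __init__(self, positions, threshold):
        super().__init__()
        self.positions = positions
        self.threshold = threshold

    def motion_model(self):
        self.log.append("2. motion model")  # No motion: positions are kept

    def cost(self, frame, detections):
        self.log.append("4. cost")
        matrix = [[abs(p - d) for d in detections] for p in self.positions]
        return matrix, self.threshold

    def post_association(self, frame, detections, links):
        self.log.append("6. post association")
        for i, j in links:
            self.positions[i] = detections[j]


linker = Nearest1D([0.0, 10.0], threshold=2.0)
linker.update(None, [0.5, 9.0])
print(linker.log)
print(linker.positions)  # [0.5, 9.0]
```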

specs

Parameters specifications of the algorithm. See FrameByFrameLinkerParameters.

Type:: FrameByFrameLinkerParameters

optflow

Optional wrapper around the given optional OpticalFlow that will extract flow maps of the video online. (The underlying OpticalFlow object is accessible in self.optflow.optflow) Default: None

Type:: OnlineFlowExtractor | None

features_extractor

Optional features extractor that will extract features for the detections, which could be useful for tracking. Default: None

Type:: FeaturesExtractor | None

save_all

Save metadata that is not needed to build the final tracks but that could be useful for analysis. For instance, it keeps invalid tracks and the computed features inside the Detections objects.

Type:: bool

frame_id

Current frame id of the linking process

Type:: int

inactive_tracks

Terminated tracks

Type:: list[TrackHandler]

active_tracks

Current track handlers

Type:: list[TrackHandler]

all_positions

Positions of the active tracks at each seen frame. Using the valid track handlers track_ids, it allows the reconstruction of tracks.

Type:: list[torch.Tensor]

split_links

Current split_links shape: (L’, 2), dtype: int32

Type:: torch.Tensor

merge_links

Current merge_links shape: (L’’, 2), dtype: int32

Type:: torch.Tensor

reset(dim=2) → None

Reset the linking algorithm.

Flush all data stored from a previous linking and prepare a new linking.

Parameters:: dim (int) – The dimension of the data. Default: 2

collect() → list[Track]

Collect and build all the tracks up to the last given frame.

Returns:: Tracks from the last reset to the last given frame.
Return type:: Collection[byotrack.Track]
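
To illustrate how tracks could be rebuilt from all_positions and each handler's track_ids, here is a minimal pure-Python sketch. It is not the collect implementation: rebuild_track is a hypothetical helper, plain lists stand in for tensors, and it assumes that track_ids[t] refers to frame start + t and that all_positions is indexed by frame id starting at 0.

```python
def rebuild_track(start, track_ids, all_positions):
    """Gather one track's positions from the per-frame position tables."""
    return [
        all_positions[start + t][track_id]
        for t, track_id in enumerate(track_ids)
    ]


# Per-frame positions of the active tracks (made-up 2D points).
all_positions = [
    [(0, 0)],          # Frame 0: one active track
    [(1, 1), (5, 5)],  # Frame 1: a second track appears
    [(5, 6), (2, 2)],  # Frame 2: the order in the active list has changed
]

print(rebuild_track(0, [0, 0, 1], all_positions))  # [(0, 0), (1, 1), (2, 2)]
print(rebuild_track(1, [1, 0], all_positions))     # [(5, 5), (5, 6)]
```

Storing the per-frame index lets positions be kept as one table per frame even when the order of the active tracks changes between frames.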

abstractmethod motion_model() → None

Optional modeling of motion for tracks.

It can be used to update some internal state of the tracker after the optical flow computation and before the distance computation.

abstractmethod cost(frame: ndarray | None, detections: Detections) → tuple[Tensor, float]

Compute the association cost between active tracks and detections.

It also returns the threshold to use. Depending on the distance used, association_threshold may be expressed as a more meaningful quantity than the cost itself. For instance, when using a squared Euclidean distance, the association threshold could be expressed as a distance in pixels, and this function could square it. For likelihood association, you could provide the association threshold as a probability and use -log(threshold) as the true threshold. (See KalmanLinker and NearestNeighborLinker)

Parameters:

frame (np.ndarray | None) – The current optional frame of the video Shape: (H, W, C), dtype: float
detections (byotrack.Detections) – Detections for the given frame

Returns:

The cost matrix between active tracks and detections: Shape: (n_tracks, n_dets), dtype: float
float: The association threshold to use. It can differ from self.association_threshold depending on the distance built here.

Return type:

tuple[torch.Tensor, float]
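
The two threshold conversions mentioned in the description of cost can be written out as follows. The helper names are hypothetical and only illustrate the arithmetic; they are not taken from KalmanLinker or NearestNeighborLinker.

```python
import math


def squared_distance_threshold(max_distance):
    """Threshold to compare against squared Euclidean costs."""
    return max_distance**2


def likelihood_threshold(min_probability):
    """Threshold to compare against negative log-likelihood costs."""
    return -math.log(min_probability)


# A user-facing threshold of 5 pixels becomes 25 in squared-distance units,
# so a pair 4 pixels apart (cost 16) can still be linked.
print(squared_distance_threshold(5.0))  # 25.0

# A minimum association probability of 1% becomes a cost limit of about 4.6.
print(likelihood_threshold(0.01))
```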

abstractmethod post_association(frame: ndarray | None, detections: Detections, active_mask: Tensor) → None

Update the internal state of the tracker after update_active_tracks.

It should update any internal model/data. It is also responsible to register the position of each active track in all_positions for the current time frame.

Parameters:

frame (np.ndarray | None) – The optional current frame of the video Shape: (H, W, C), dtype: float
detections (byotrack.Detections) – Detections for the given frame
active_mask (torch.Tensor) – Boolean tensor indicating True for still active tracks Shape: (N_tracks), dtype: bool

update_active_tracks(detections: Detections) → Tensor

Updates track handlers and creates new ones for extra detections.

Tracks that are terminated are stored inside inactive_tracks and dropped from active_tracks. It is called by update before post_association.

It also handles merges and splits. In the case of some specific merges, it may change a few links. The updated links are returned, with a still_active mask for tracks and a new_track mask for detections.

Parameters:

detections (byotrack.Detections) – Detections for the given frame

Returns:

Boolean tensor indicating True for still active tracks: Shape: (N_tracks), dtype: bool

Return type:

torch.Tensor

update_detections(detections: Detections) → Detections

Optional modification of the current detections based on the current state.

This is called by update after the motion modeling but before the cost/association.

By default, it does not change anything.

Parameters:: detections (byotrack.Detections) – Detections at the current frame
Returns:: The (optionally modified) detections to use at this current frame
Return type:: byotrack.Detections

associate(frame: ndarray | None, detections: Detections) → Tensor

Produces links between the current tracks and detections.

Optionally it handles merges and splits by associating a second time.

Parameters:

frame (np.ndarray | None) – Current frame (Optional)
detections (byotrack.Detections) – Current detections

Returns:

Links (i, j): Shape: (L, 2), dtype: int32

Return type:

torch.Tensor

update(frame: ndarray | None, detections: Detections) → None

Progress in the linking step by one frame.

Will update the internal algorithm by a single frame and its detections.

Parameters:

frame (np.ndarray | None) – Optional frame of the video. Shape: ([D, ]H, W, C), dtype: float
detections (byotrack.Detections) – Detections for the given frame.

byotrack.implementation.linker.frame_by_frame.greedy_lap.greedy_assignment_solver(dist: ndarray, eta: float = inf) → ndarray

Solve assignment problem in a greedy way.

Iteratively select the minimum cost, then delete its row and column.

Stops when the cost matrix is empty (no more rows or columns) or if the selected cost is higher than eta.

Parameters:

dist (np.ndarray) – Distance matrix Shape: (N, M), dtype: float
eta (float) – Hard thresholding Default: inf (No thresholding)

Returns:

Links (i, j): Shape: (L, 2), dtype: uint16

Return type:

np.ndarray
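
The procedure described for greedy_assignment_solver can be sketched in pure Python. This is an illustrative reimplementation on lists of lists, not the library function: the name greedy_assignment is hypothetical, it returns a list of (i, j) tuples rather than a NumPy array, and the tie-breaking rule (smallest (i, j) first) is an assumption.

```python
import math


def greedy_assignment(dist, eta=math.inf):
    """Greedy assignment on a list-of-lists cost matrix (pure-Python sketch)."""
    free_rows = set(range(len(dist)))
    free_cols = set(range(len(dist[0]))) if dist else set()
    links = []
    while free_rows and free_cols:
        # Cheapest remaining pair; ties go to the smallest (i, j).
        cost, i, j = min(
            (dist[i][j], i, j) for i in free_rows for j in free_cols
        )
        if cost > eta:
            break  # Every remaining pair is above the cost limit
        links.append((i, j))
        free_rows.remove(i)  # "Delete" the selected row and column
        free_cols.remove(j)
    return links


dist = [
    [1.0, 2.0],
    [2.0, 100.0],
]
print(greedy_assignment(dist))            # [(0, 0), (1, 1)]
print(greedy_assignment(dist, eta=10.0))  # [(0, 0)]
```

Rescanning the remaining pairs at every step keeps the sketch short but is slower than sorting the costs once; it is only meant to show the selection and stopping rules.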