Kalman Linker

Implementation of SKT from [9]

class byotrack.implementation.linker.frame_by_frame.kalman_linker.Cost(value)

Bases: Enum

The cost to use for association

  • LIKELIHOOD

    Build the cost matrix to maximize the likelihood of the association. The cost is defined as the negative log-likelihood that detection j comes from track i: C[i, j] = -log P(det_j | track_i). The cost limit is expected to be the smallest probability to accept.

  • MAHALANOBIS

    Mahalanobis distance, using the track uncertainty to correct the Euclidean distance. The cost limit is expected to be the largest Mahalanobis distance to accept.

  • MAHALANOBIS_SQ

    Squared Mahalanobis distance. The cost limit is expected to be the largest Mahalanobis distance to accept (given unsquared).

  • EUCLIDEAN

    Euclidean distance. The cost limit is expected to be the largest Euclidean distance to accept.

  • EUCLIDEAN_SQ

    Squared Euclidean distance. The cost limit is expected to be the largest Euclidean distance to accept (given unsquared).
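
To make the difference between these costs concrete, here is a minimal sketch in plain Python that evaluates each of them for a single track/detection pair in 2D. It assumes a diagonal covariance for the projected track state, and the helper names (euclidean, mahalanobis, neg_log_likelihood) are hypothetical: they are not part of byotrack and only illustrate the definitions above.

```python
import math

def euclidean(track_pos, det_pos):
    """Euclidean distance between a predicted track position and a detection."""
    return math.sqrt(sum((t - d) ** 2 for t, d in zip(track_pos, det_pos)))

def mahalanobis(track_pos, det_pos, variances):
    """Mahalanobis distance, assuming a diagonal covariance (one variance per axis)."""
    return math.sqrt(
        sum((t - d) ** 2 / v for t, d, v in zip(track_pos, det_pos, variances))
    )

def neg_log_likelihood(track_pos, det_pos, variances):
    """-log N(det | track, diag(variances)): a LIKELIHOOD-style cost."""
    maha_sq = mahalanobis(track_pos, det_pos, variances) ** 2
    log_det = sum(math.log(2.0 * math.pi * v) for v in variances)
    return 0.5 * (maha_sq + log_det)

track, det = (10.0, 20.0), (13.0, 24.0)
variances = (4.0, 16.0)  # Assumed std of 2 pixels on the first axis, 4 on the second

print(euclidean(track, det))                      # 5.0
print(mahalanobis(track, det, variances))         # Smaller: the offset is scaled by the std
print(neg_log_likelihood(track, det, variances))  # Lower cost means a more likely pair
```

The squared variants are simply the squares of the first two values; they avoid a square root while ranking pairs identically.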

class byotrack.implementation.linker.frame_by_frame.kalman_linker.TrackBuilding(value)

Bases: Enum

How to build the final tracks

  • DETECTION

    Build tracks from detections, without filtering or filling gaps

  • FILTERED

    Build tracks from the Kalman filter outputs

  • SMOOTHED

    Add a backward run of the Kalman filter (RTS) to smooth track positions
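
The difference between FILTERED and SMOOTHED can be illustrated with a scalar example. The sketch below is a textbook Kalman filter followed by a Rauch-Tung-Striebel (RTS) backward pass for a 1D random-walk model; it is not the torch_kf implementation, and the function names and the numeric values are assumptions chosen for illustration.

```python
def kalman_filter_1d(measurements, process_var, measurement_var, x0, p0):
    """Forward pass of a scalar Kalman filter with a random-walk motion model."""
    filtered, predicted = [], []
    x, p = x0, p0
    for z in measurements:
        # Predict: the position is carried over, the uncertainty grows
        x_pred, p_pred = x, p + process_var
        # Update with the measurement z
        gain = p_pred / (p_pred + measurement_var)
        x = x_pred + gain * (z - x_pred)
        p = (1.0 - gain) * p_pred
        predicted.append((x_pred, p_pred))
        filtered.append((x, p))
    return filtered, predicted

def rts_smooth_1d(filtered, predicted):
    """Backward (RTS) pass: refine each estimate using the later frames."""
    smoothed = list(filtered)
    for t in range(len(filtered) - 2, -1, -1):
        x_f, p_f = filtered[t]
        x_pred, p_pred = predicted[t + 1]
        x_s, p_s = smoothed[t + 1]
        gain = p_f / p_pred  # Smoother gain (the transition matrix is 1 here)
        smoothed[t] = (x_f + gain * (x_s - x_pred), p_f + gain**2 * (p_s - p_pred))
    return smoothed

measurements = [0.0, 1.2, 1.9, 3.1, 3.8]
filtered, predicted = kalman_filter_1d(
    measurements, process_var=1.5**2, measurement_var=3.0**2, x0=0.0, p0=10.0
)
smoothed = rts_smooth_1d(filtered, predicted)
for (x_f, p_f), (x_s, p_s) in zip(filtered, smoothed):
    print(f"filtered={x_f:.3f} (var {p_f:.3f})  smoothed={x_s:.3f} (var {p_s:.3f})")
```

The last frame is identical in both outputs, while earlier frames get a smaller variance after smoothing because they also benefit from future measurements.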

class byotrack.implementation.linker.frame_by_frame.kalman_linker.KalmanLinkerParameters(association_threshold: float = 5.0, *, detection_std: float | Tensor = 3.0, process_std: float | Tensor = 1.5, kalman_order: int = 1, n_valid=3, n_gap=3, association_method: str | AssociationMethod = AssociationMethod.OPT_SMOOTH, anisotropy: Tuple[float, float, float] = (1.0, 1.0, 1.0), cost: str | Cost = Cost.EUCLIDEAN, track_building: str | TrackBuilding = TrackBuilding.FILTERED, split_factor: float = 0.0, merge_factor: float = 0.0)

Bases: FrameByFrameLinkerParameters

Parameters of KalmanLinker

Note

The merging and splitting features are still experimental.

association_threshold

This is the main hyperparameter. It defines the threshold on the distance above which tracks are not linked with detections, which prevents linking with false positive detections. Default: 5 pixels

Type:

float

detection_std

Expected measurement noise on the detection process. The detection process is modeled with a Gaussian noise with this given std. (You can provide a different noise for each dimension). See torch_kf.ckf.constant_kalman_filter. Default: 3.0 pixels

Type:

Union[float, torch.Tensor]

process_std

Expected process noise. See torch_kf.ckf.constant_kalman_filter: the process is modeled as constant order-th derivative motion. This quantifies how much the supposedly “constant” order-th derivative can change between two consecutive frames. A common rule of thumb is to use 3 * process_std ~= max_t(| x^(order)(t) - x^(order)(t+1)|). (It can be provided for each dimension.) Default: 1.5 pixels / frame^order

Type:

Union[float, torch.Tensor]
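
The rule of thumb given for process_std can be applied to an example trajectory. The sketch below approximates the order-th derivative with frame-to-frame finite differences on a single axis; the helper names and the trajectory are hypothetical, and this is only one possible way to turn the rule into a number.

```python
def finite_difference(values, order):
    """Apply `order` successive frame-to-frame differences to a 1D sequence."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

def rule_of_thumb_process_std(positions, order):
    """Solve 3 * process_std ~= max_t |x^(order)(t) - x^(order)(t+1)| for process_std."""
    derivative = finite_difference(positions, order)
    changes = finite_difference(derivative, 1)
    return max(abs(c) for c in changes) / 3.0

# Positions of one particle along one axis, in pixels, one value per frame
positions = [0.0, 2.0, 4.5, 6.5, 9.5, 11.0]

print(rule_of_thumb_process_std(positions, order=1))  # 0.5
print(rule_of_thumb_process_std(positions, order=0))  # 1.0
```

With order=1 the velocity changes by at most 1.5 pixels / frame between consecutive frames, hence 0.5; with order=0 the position itself moves by at most 3 pixels, hence 1.0.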

kalman_order

Order of the Kalman filter to use: 0 for Brownian motion, 1 for directed Brownian motion, 2 for accelerated Brownian motion, etc. Default: 1

Type:

int
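
To see what the order changes in practice, the sketch below builds the usual transition matrix of a constant order-th derivative model for one spatial axis, with a time step of one frame. This is the standard textbook form; how torch_kf actually lays out the state is not described here, so treat the layout (position, velocity, acceleration, ...) and both function names as assumptions.

```python
import math

def transition_matrix_1d(order, dt=1.0):
    """Transition matrix of a constant order-th derivative model for one axis.

    The state holds order + 1 entries: [position, velocity, acceleration, ...].
    """
    size = order + 1
    return [
        [dt ** (j - i) / math.factorial(j - i) if j >= i else 0.0 for j in range(size)]
        for i in range(size)
    ]

def state_dimension(n_spatial_dims, order):
    """Size of the full state: D * (order + 1)."""
    return n_spatial_dims * (order + 1)

print(transition_matrix_1d(0))  # [[1.0]]
print(transition_matrix_1d(1))  # [[1.0, 1.0], [0.0, 1.0]]
print(transition_matrix_1d(2))  # [[1.0, 1.0, 0.5], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
print(state_dimension(3, 1))    # 6
```

Order 0 only tracks positions, while each additional order adds one derivative per spatial dimension to the state.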

n_valid

Number of associated detections required to validate the track after its creation. Default: 3

Type:

int

n_gap

Number of consecutive frames without association before the track termination. Default: 3

Type:

int
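
The interplay of n_valid and n_gap can be pictured as a small state machine per track. The class below is a hypothetical, simplified bookkeeping written from the two descriptions above; the exact rules in byotrack (for instance whether the track ends on the n_gap-th missed frame or on the next one, or how unvalidated tracks are handled) may differ.

```python
class TrackLifecycle:
    """Hypothetical bookkeeping for one track, following n_valid and n_gap."""

    def __init__(self, n_valid=3, n_gap=3):
        self.n_valid = n_valid
        self.n_gap = n_gap
        self.n_associated = 0  # Detections associated since creation
        self.n_missed = 0      # Consecutive frames without association
        self.terminated = False

    @property
    def validated(self):
        return self.n_associated >= self.n_valid

    def step(self, associated):
        """Register one frame; `associated` tells whether a detection was linked."""
        if self.terminated:
            return
        if associated:
            self.n_associated += 1
            self.n_missed = 0  # A new association closes the current gap
        else:
            self.n_missed += 1
            if self.n_missed >= self.n_gap:  # Assumption: ends on the n_gap-th miss
                self.terminated = True

track = TrackLifecycle(n_valid=3, n_gap=3)
for associated in [True, True, True, False, False, True, False, False, False]:
    track.step(associated)
print(track.validated, track.terminated)  # True True
```

The gap of two frames in the middle is bridged because it is shorter than n_gap, whereas the final run of three missed frames ends the track.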

association_method

The frame-by-frame association to use. See AssociationMethod. It can be provided as a string. (Choice: GREEDY, OPT_HARD, OPT_SMOOTH) Default: OPT_SMOOTH

Type:

AssociationMethod
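
As an illustration of the simplest choice, the sketch below shows one common way to implement a greedy association on a cost matrix with a threshold. It is written in plain Python from the general idea only: the function name, the inclusive comparison with the threshold, and the tie-breaking order are assumptions, and it does not reproduce the AssociationMethod implementation.

```python
def greedy_association(cost, threshold):
    """Link tracks to detections by repeatedly taking the cheapest remaining pair."""
    pairs = sorted(
        (cost[i][j], i, j)
        for i in range(len(cost))
        for j in range(len(cost[i]))
        if cost[i][j] <= threshold  # Pairs above the threshold are never linked
    )
    used_tracks, used_dets, links = set(), set(), []
    for _, i, j in pairs:
        if i not in used_tracks and j not in used_dets:
            links.append((i, j))
            used_tracks.add(i)
            used_dets.add(j)
    return links

# Rows are tracks, columns are detections
cost = [
    [1.0, 4.0, 9.0],
    [2.0, 1.5, 8.0],
]
print(greedy_association(cost, threshold=5.0))  # [(0, 0), (1, 1)]
```

A greedy choice is fast but can block a better global solution: with cost [[1.0, 2.0], [1.5, 10.0]] and the same threshold it links only track 0 to detection 0, whereas an optimal assignment could link both tracks.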

anisotropy

Anisotropy of images (ratio of the pixel sizes for each axis, depth first). This is used to scale distances. It only impacts EUCLIDEAN[_SQ] costs. For probabilistic costs, anisotropy should already be integrated in the stds of the Kalman filter (by providing one std for each dimension). Default: (1., 1., 1.)

Type:

Tuple[float, float, float]
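
A minimal sketch of what scaling distances by the anisotropy can look like for a Euclidean cost is given below. It assumes that each axis offset is multiplied by its ratio before taking the norm, with axes ordered depth first (z, y, x); the function name is hypothetical and the exact scaling used by byotrack is not shown in this page.

```python
import math

def anisotropic_euclidean(pos_a, pos_b, anisotropy=(1.0, 1.0, 1.0)):
    """Euclidean distance after scaling each axis offset by its anisotropy ratio.

    Positions and anisotropy are given depth first: (z, y, x).
    """
    return math.sqrt(
        sum((r * (a - b)) ** 2 for a, b, r in zip(pos_a, pos_b, anisotropy))
    )

a, b = (2.0, 10.0, 10.0), (3.0, 13.0, 14.0)
print(anisotropic_euclidean(a, b))                   # Isotropic: sqrt(1 + 9 + 16)
print(anisotropic_euclidean(a, b, (3.0, 1.0, 1.0)))  # Depth 3x coarser: sqrt(9 + 9 + 16)
```

With a depth ratio of 3, a one-slice offset counts as three pixels, so the same pair is further apart and more likely to exceed association_threshold.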

cost

The cost to use for association. It can be provided as a string. See Cost. It also indicates the unit of association_threshold. Default: EUCLIDEAN

Type:

Cost

track_building

Tells the linker how to build the final tracks. Either from detections, or from filtered/smoothed positions computed by the Kalman filter. See TrackBuilding. It can be provided as a string. Default: FILTERED

Type:

TrackBuilding

split_factor

Allow splitting of tracks, using a second association step. The association threshold in this case is split_factor * association_threshold. Default: 0.0 (No splits)

Type:

float

merge_factor

Allow merging of tracks, using a second association step. The association threshold in this case is merge_factor * association_threshold. Default: 0.0 (No merges)

Type:

float

class byotrack.implementation.linker.frame_by_frame.kalman_linker.KalmanLinker(specs: KalmanLinkerParameters, optflow: OpticalFlow | None = None, features_extractor: FeaturesExtractor | None = None, save_all=False)

Bases: FrameByFrameLinker

Frame by frame linker using Kalman filters

Motion is modeled with a Kalman filter of a specified order (see torch_kf.ckf). Matching is done to optimize the given cost. If optical flow is provided, it is used online to warp the predicted state positions of the Kalman filter. This works, but it is sub-optimal: consider using KOFTLinker, which exploits optical flow in a finer way inside Kalman filters.

This is an implementation of Simple Kalman Tracking (SKT) from KOFT [9].

Note

This implementation requires torch-kf. (pip install torch-kf)

See FrameByFrameLinker for the other attributes.

specs

Parameters specifications of the algorithm. See KalmanLinkerParameters.

Type:

KalmanLinkerParameters

kalman_filter

The Kalman filter. (Built once tracking starts, which allows the dimension to be adapted on the fly)

Type:

Optional[torch_kf.KalmanFilter]

active_states

The Kalman filter estimation for each track. Shape: mean=(N, D * (order + 1), 1), covariance=(N, D * (order + 1), D * (order + 1)) dtype: float32

Type:

Optional[torch_kf.GaussianState]

projections

The Kalman filter projection for each track. Shape: mean=(N, D, 1), covariance=(N, D, D), precision=(N, D, D) dtype: float32

Type:

Optional[torch_kf.GaussianState]

all_states

The Kalman filter estimation for each track at each seen frame. States are only registered when save_all=True or if you build tracks from RTS smoothing. Shape: mean=(N, D * (order + 1), 1), covariance=(N, D * (order + 1), D * (order + 1)) dtype: float32

Type:

List[torch_kf.GaussianState]

reset() None

Reset the linking algorithm

Flush all data stored from a previous linking and prepare a new linking

collect() List[Track]

Collect and build all the tracks up to the last given frame

Returns:

Tracks from the last reset to the last given frame.

Return type:

Collection[byotrack.Track]

motion_model() None

Optional modeling of motion for tracks

It can be used to update some internal state of the tracker after the optical flow computation and before the distance computation.

cost(_: ndarray, detections: Detections) Tuple[Tensor, float]

Compute the association cost between active tracks and detections

It also returns the threshold to use. (Depending on the distance you use, association_threshold could be related to a more meaningful quantity than the cost itself.) For instance, when using a squared Euclidean distance, the association threshold could be expressed as a distance in pixels, and this function could square it. For likelihood association, you could provide the association threshold as a probability and use -log(threshold) as the true threshold. (See KalmanLinker and NearestNeighborLinker)

Parameters:
  • frame (np.ndarray) – The current frame of the video Shape: (H, W, C), dtype: float

  • detections (byotrack.Detections) – Detections for the given frame

Returns:

The cost matrix between active tracks and detections

Shape: (n_tracks, n_dets), dtype: float32

float: The association threshold to use.

It can be different from self.association_threshold depending on the distance built here.

Return type:

torch.Tensor
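
The threshold conversion described for this method can be summarized in a few lines. The sketch below follows the descriptions of the Cost members (unsquared limits for the squared costs, a probability for the likelihood cost); the function name and the use of plain strings instead of the Cost enum are simplifications made for this example, not the actual implementation.

```python
import math

def cost_threshold(association_threshold, cost):
    """Convert the user-facing association_threshold into a threshold on cost values."""
    if cost in ("EUCLIDEAN", "MAHALANOBIS"):
        return association_threshold  # Already expressed in the unit of the cost
    if cost in ("EUCLIDEAN_SQ", "MAHALANOBIS_SQ"):
        return association_threshold**2  # Given unsquared, compared to squared costs
    if cost == "LIKELIHOOD":
        return -math.log(association_threshold)  # Given as the smallest probability
    raise ValueError(f"Unknown cost: {cost}")

print(cost_threshold(5.0, "EUCLIDEAN"))     # 5.0
print(cost_threshold(5.0, "EUCLIDEAN_SQ"))  # 25.0
print(cost_threshold(0.001, "LIKELIHOOD"))  # About 6.9: costs above this are rejected
```

This keeps association_threshold in a unit that is easy to reason about (pixels, Mahalanobis units or a probability) whatever cost is selected.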

post_association(_: ndarray, detections: Detections, active_mask: Tensor)

Update the internal state of the tracker after update_active_tracks

It should update any internal model/data. It is also responsible for registering the position of each active track in all_positions for the current time frame.

Parameters:
  • frame (np.ndarray) – The current frame of the video Shape: (H, W, C), dtype: float

  • detections (byotrack.Detections) – Detections for the given frame

  • active_mask (torch.Tensor) – Boolean tensor indicating True for still active tracks Shape: (N_tracks), dtype: bool