Kalman and Optical Flow Tracking

Implementation of KOFT from [9]. It uses optical-flow-enhanced Kalman filters to link detections through time. It usually outperforms the Kalman linker with optical flow.

class byotrack.implementation.linker.frame_by_frame.koft.KOFTLinkerParameters(association_threshold: float, *, detection_std: float | Tensor = 3.0, flow_std: float | Tensor = 1.0, process_std: float | Tensor = 1.5, kalman_order: int = 1, n_valid=3, n_gap=3, association_method: str | AssociationMethod = AssociationMethod.OPT_SMOOTH, cost: str | Cost = Cost.EUCLIDEAN, track_building: str | TrackBuilding = TrackBuilding.FILTERED, extract_flows_on_detections=False, always_measure_velocity=True)

Bases: KalmanLinkerParameters

Parameters of KOFTLinker

association_threshold

This is the main hyperparameter. It defines the threshold on the distance above which a track is not linked to a detection. This prevents linking tracks to false-positive detections.

Type:: float

detection_std

Expected measurement noise on the detection process. The detection process is modeled with Gaussian noise of this standard deviation (you can provide a different value for each dimension). See torch_kf.ckf.constant_kalman_filter. Default: 3.0 pixels

Type:: Union[float, torch.Tensor]

flow_std

Expected measurement noise on the optical flow process. The optical flow process is modeled with Gaussian noise of this standard deviation (you can provide a different value for each dimension). Default: 1.0 pixels

Type:: Union[float, torch.Tensor]

process_std

Expected process noise. See torch_kf.ckf.constant_kalman_filter: the process is modeled as a constant order-th derivative motion. This quantifies how much the supposedly “constant” order-th derivative can change between two consecutive frames. A common rule of thumb is to use 3 * process_std ~= max_t(| x^(order)(t) - x^(order)(t+1)|). (It can be provided for each dimension.) Default: 1.5 pixels / frame^order

Type:: Union[float, torch.Tensor]
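
The rule of thumb for process_std can be turned into a quick estimate when a reference trajectory is available. The sketch below is not part of byotrack: estimate_process_std is a hypothetical helper that assumes 1D positions sampled once per frame, differentiates them order times with finite differences, and divides the largest frame-to-frame change of that derivative by 3.

```python
from typing import List


def finite_difference(values: List[float]) -> List[float]:
    """Discrete derivative: difference between consecutive frames."""
    return [b - a for a, b in zip(values, values[1:])]


def estimate_process_std(positions: List[float], order: int = 1) -> float:
    """Hypothetical helper: 3 * process_std ~= max_t |x^(order)(t) - x^(order)(t+1)|."""
    if order < 1:
        raise ValueError("order must be >= 1")
    derivative = positions
    for _ in range(order):
        # After the loop: x^(order), in pixels / frame^order
        derivative = finite_difference(derivative)
    # Frame-to-frame change of the supposedly constant derivative
    changes = finite_difference(derivative)
    if not changes:
        raise ValueError("need at least order + 2 positions")
    return max(abs(c) for c in changes) / 3.0


# Positions (in pixels) of one particle over six frames (made-up values)
positions = [0.0, 1.0, 2.5, 3.0, 6.0, 7.0]
process_std = estimate_process_std(positions, order=1)
```

For a multi-dimensional trajectory, the same estimate could be computed per axis and passed as one value per dimension.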

kalman_order

Order of the Kalman filter to use. 0 is not supported. Use 1 for directed Brownian motion, 2 for accelerated Brownian motion, etc. Default: 1

Type:: int

n_valid

Number of frames with a correct association required to validate the track at its creation. Default: 3

Type:: int

n_gap

Number of frames with no association before the track termination. Default: 3

Type:: int
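
To make the roles of n_valid and n_gap concrete, here is a minimal sketch of a track life cycle. It is an illustration under assumptions, not the byotrack implementation: track_lifecycle and the status names are hypothetical, and the sketch assumes that validation requires n_valid consecutive associations right after creation and that termination happens after n_gap consecutive frames without association.

```python
from typing import List


def track_lifecycle(associated: List[bool], n_valid: int = 3, n_gap: int = 3) -> str:
    """Return the final status of one track given per-frame association flags.

    Assumed rules (illustrative only): a track needs n_valid consecutive
    associations right after its creation to be validated, and a validated
    track is terminated after n_gap consecutive frames without association.
    """
    status = "initializing"
    hits, misses = 0, 0
    for linked in associated:
        if status == "initializing":
            if not linked:
                return "invalid"  # Missed an association before validation
            hits += 1
            if hits >= n_valid:
                status = "valid"
        else:  # status == "valid"
            misses = 0 if linked else misses + 1
            if misses >= n_gap:
                return "finished"  # Too many consecutive frames without a link
    return status


# Validated after three links, then lost for three frames in a row
status = track_lifecycle([True, True, True, False, False, False])
```

Larger n_valid values filter out more spurious tracks at creation, while larger n_gap values let a track survive longer runs of missed detections.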

association_method

The frame-by-frame association to use. See AssociationMethod. It can be provided as a string. (Choice: GREEDY, OPT_HARD, OPT_SMOOTH) Default: OPT_SMOOTH

Type:: AssociationMethod
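
As a point of comparison for the association choices, the sketch below shows a generic greedy association on a cost matrix. It assumes GREEDY refers to the usual scheme (take the cheapest remaining pair first, never link above the threshold); greedy_association is a hypothetical function written for illustration and does not reproduce the library's code or its OPT_HARD and OPT_SMOOTH variants.

```python
from typing import List, Tuple


def greedy_association(cost: List[List[float]], threshold: float) -> List[Tuple[int, int]]:
    """Link tracks (rows) to detections (columns), cheapest pairs first."""
    candidates = sorted(
        (c, i, j)
        for i, row in enumerate(cost)
        for j, c in enumerate(row)
        if c < threshold  # Pairs at or above the threshold are never linked
    )
    used_tracks, used_dets, links = set(), set(), []
    for _, i, j in candidates:
        # Each track and each detection can be linked at most once
        if i not in used_tracks and j not in used_dets:
            used_tracks.add(i)
            used_dets.add(j)
            links.append((i, j))
    return links


# Two tracks, three detections (made-up costs, e.g. distances in pixels)
cost = [
    [1.0, 4.0, 9.0],
    [2.0, 1.5, 8.0],
]
links = greedy_association(cost, threshold=5.0)
```

Here the third detection stays unlinked because all of its costs exceed the threshold; in a frame-by-frame linker such a detection would typically start a new track.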

cost

The cost method to use. It can be provided as a string. See Cost. It also indicates the unit in which association_threshold is expressed. Default: EUCLIDEAN

Type:: Cost

track_building

Tells the linker how to build the final tracks. Either from detections, or from filtered/smoothed positions computed by the Kalman filter. See TrackBuilding. It can be provided as a string. Default: FILTERED

Type:: TrackBuilding

extract_flows_on_detections

If True, it extracts the optical flow at the detection location when possible. Otherwise, it extracts the flow at the current estimate of the track position. Default: False

Type:: bool

always_measure_velocity

Update the velocity of all tracks, even those not linked to a detection. If set to False, it implements KOFT-- from the paper. This is sub-optimal; you should keep it True. Default: True

Type:: bool

class byotrack.implementation.linker.frame_by_frame.koft.KOFTLinker(specs: KOFTLinkerParameters, optflow: OpticalFlow | None = None, features_extractor: FeaturesExtractor | None = None, save_all=False)

Bases: KalmanLinker

Kalman and Optical Flow Tracking [9]

Motion is modeled with a Kalman filter of a specified order >= 1 (see torch_kf.ckf). Positions are measured through the detection process. A second update step is performed to measure the velocity of all tracks using optical flow.

Matching is done to optimize the given cost.
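
The two-update scheme described above can be sketched for a single 1D track with a constant-velocity (order 1) model. This is a simplified illustration in plain Python, not the torch-kf or byotrack code: predict and update are hypothetical helpers, the process noise is assumed to act on the velocity only, and all numeric values are made up.

```python
from typing import List, Tuple

Mean = List[float]  # [position, velocity]
Cov = List[List[float]]  # 2x2 covariance


def predict(mean: Mean, cov: Cov, process_std: float) -> Tuple[Mean, Cov]:
    """Constant-velocity prediction over one frame: x <- x + v."""
    x, v = mean
    (pxx, pxv), (pvx, pvv) = cov
    # P <- F P F^T + Q with F = [[1, 1], [0, 1]] and noise on the velocity only
    new_cov = [
        [pxx + pxv + pvx + pvv, pxv + pvv],
        [pvx + pvv, pvv + process_std**2],
    ]
    return [x + v, v], new_cov


def update(mean: Mean, cov: Cov, index: int, measurement: float, std: float) -> Tuple[Mean, Cov]:
    """Scalar Kalman update measuring state component `index` (0: position, 1: velocity)."""
    innovation = measurement - mean[index]
    s = cov[index][index] + std**2  # Innovation variance H P H^T + R
    gain = [cov[0][index] / s, cov[1][index] / s]  # K = P H^T / S
    new_mean = [mean[k] + gain[k] * innovation for k in range(2)]
    # P <- (I - K H) P
    new_cov = [[cov[r][c] - gain[r] * cov[index][c] for c in range(2)] for r in range(2)]
    return new_mean, new_cov


mean, cov = [0.0, 0.0], [[9.0, 0.0], [0.0, 4.0]]
mean, cov = predict(mean, cov, process_std=1.5)
# First update: the detection measures the position (noise: detection_std)
mean, cov = update(mean, cov, 0, measurement=2.0, std=3.0)
# Second update: the optical flow measures the velocity (noise: flow_std)
mean, cov = update(mean, cov, 1, measurement=1.0, std=1.0)
```

Because the velocity update does not need a detection, it can also be applied to tracks that were not linked in the current frame, which is what always_measure_velocity controls.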

Note

This implementation requires torch-kf. (pip install torch-kf)

See KalmanLinker for the other attributes.

specs

Parameters specifications of the algorithm. See KOFTLinkerParameters.

Type:: KOFTLinkerParameters

last_detections

The last detections used in update. Optionally used to extract flows at the detection positions rather than at the track states. Required for motion_model.

Type:: byotrack.Detections

n_initial

Number of tracks newly started in the last update. Used to correctly initialize these tracks with the velocity update.

Type:: int

reset() → None

Reset the linking algorithm

Flush all data stored from a previous linking and prepare a new linking

motion_model() → None

Optional modeling of the motion of tracks

It can be used to update some internal state of the tracker after the optical flow computation and before the distance computation.

cost(_: ndarray, detections: Detections) → Tuple[Tensor, float]

Compute the association cost between active tracks and detections

It also returns the threshold to use. Depending on the distance you use, association_threshold may be expressed as a more meaningful quantity than the cost itself. For instance, when using a squared Euclidean distance, the association threshold could be expressed as a distance in pixels, and this function could square it. For likelihood association, you could provide the association threshold as a probability and use -log(threshold) as the true threshold. (See KalmanLinker and NearestNeighborLinker.)

Parameters:

frame (np.ndarray) – The current frame of the video Shape: (H, W, C), dtype: float
detections (byotrack.Detections) – Detections for the given frame

Returns:

The cost matrix between active tracks and detections: Shape: (n_tracks, n_dets), dtype: float32
float: The association threshold to use. It can differ from self.association_threshold depending on the distance built here

Return type:

torch.Tensor
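
The threshold conversions mentioned for cost can be sketched as follows. This is an illustration only: effective_threshold and the cost labels "euclidean", "euclidean_sq" and "likelihood" are hypothetical names chosen for this example and are not taken from the Cost enumeration or from the linker's code.

```python
import math


def effective_threshold(association_threshold: float, cost_name: str) -> float:
    """Hypothetical conversion of a user-facing threshold into cost units."""
    if cost_name == "euclidean":
        return association_threshold  # Already a distance in pixels
    if cost_name == "euclidean_sq":
        return association_threshold**2  # Compare against squared distances
    if cost_name == "likelihood":
        # Probability threshold -> negative log-likelihood threshold
        return -math.log(association_threshold)
    raise ValueError(f"Unknown cost: {cost_name}")


# A 5-pixel threshold becomes 25 when costs are squared distances
squared = effective_threshold(5.0, "euclidean_sq")
# A 1% probability threshold becomes -log(0.01) for a negative log-likelihood cost
neg_log = effective_threshold(0.01, "likelihood")
```

Keeping the user-facing threshold in pixels or probabilities makes it easier to choose, while the returned value stays comparable to the entries of the cost matrix.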

post_association(_: ndarray, detections: Detections, links: Tensor)

Update the tracks and the internal variables of the tracker

It should call the update method of each active track and update any internal model/data. It should also create new track handlers for each extra detection. Finally, it is also responsible for registering the position of each active track in all_positions for the current time frame.

Parameters:

frame (np.ndarray) – The current frame of the video Shape: (H, W, C), dtype: float
detections (byotrack.Detections) – Detections for the given frame
links (torch.Tensor) – The links made between active tracks and the detections Shape: (L, 2), dtype: int32