Detections
- byotrack.api.detector.detections.relabel_consecutive(segmentation: Tensor, *, inplace=True) → Tensor
- byotrack.api.detector.detections.relabel_consecutive(segmentation: ndarray, *, inplace=True) → ndarray
Relabel a segmentation mask so that labels are consecutive.
For N instances, the background is labeled 0 and the instances are labeled 1 to N.
- Parameters:
segmentation (torch.Tensor | np.ndarray) – Segmentation mask. Shape: ([D, ]H, W), dtype: int
inplace (bool) – Modify the segmentation mask in place. Default: True
- Returns:
The same segmentation mask where labels are consecutive (from 1 to N)
- Return type:
torch.Tensor | np.ndarray
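To make the relabeling concrete, here is a minimal NumPy sketch of the idea. It is not ByoTrack's implementation: the helper name `relabel_consecutive_sketch` is hypothetical, it always returns a copy (unlike the `inplace=True` default above), and it assumes that keeping the original label order is acceptable.

```python
import numpy as np


def relabel_consecutive_sketch(segmentation: np.ndarray) -> np.ndarray:
    """Map arbitrary non-negative labels to 0 (background) and 1..N."""
    # np.unique returns the sorted labels; `inverse` holds, for each pixel,
    # the rank of its label among them.
    labels, inverse = np.unique(segmentation, return_inverse=True)
    relabeled = inverse.reshape(segmentation.shape)
    if labels[0] != 0:
        # No background pixel in the mask: shift so instances start at 1
        relabeled = relabeled + 1
    return relabeled.astype(segmentation.dtype)


# Labels 3, 7 and 12 become 1, 2 and 3; the background stays 0
mask = np.array([[0, 7, 7], [3, 0, 12]], dtype=np.int32)
relabeled = relabel_consecutive_sketch(mask)
```

Here `relabeled` holds `[[0, 2, 2], [1, 0, 3]]`, while `mask` itself is left untouched.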
- byotrack.api.detector.detections.compress(tensor: Tensor, level=3) → Tensor
Compress a tensor using zstd (Experimental).
- byotrack.api.detector.detections.decompress(tensor: Tensor, dtype=torch.int32) → Tensor
Decompress a tensor using zstd (Experimental).
- class byotrack.api.detector.detections.Detections(data: dict[str, Tensor], frame_id: int = -1, *, use_median_position=True)
Bases: object
Detections for a given frame.
Built from a data dict. The data must contain at least one of the “position”, “bbox” or “segmentation” keys, which respectively define the positions of the instance centers ([k, ]i, j), the bounding boxes of the instances ([front, ]top, left, [depth, ]height, width) or the instance segmentation of the image ([D, ]H, W).
It supports 2D and 3D detections. In 2D, the depth axis (index k) is missing.
Note
All positions and bounding boxes use the index coordinate system (not xyz). In ByoTrack, the Z axis (or depth D) comes before the height and the width. We usually use the following nomenclature for indices: k (stack), i (row) and j (column).
Positions are stored as floats (index coordinates): (k, i, j).
Bounding boxes are stored as ints (index coordinates): (k_0, i_0, j_0, dk, di, dj). A bounding box defines all the pixels (k, i, j) such that k_0 <= k < k_0 + dk, i_0 <= i < i_0 + di, j_0 <= j < j_0 + dj.
The segmentation mask is stored as a 2D or 3D integer tensor, where labels are consecutive from 1 to N. 0 is reserved for the background.
Note
The i_th detection in the Detections has the label i+1 in the segmentation mask.
Optional data can also be given, such as “confidence” or “shape”, which respectively define the confidence of each detection and the shape of the image ([D, ]H, W).
Any additional meta information on the detections can also be given.
Defines position, bbox, segmentation, and confidence properties. Each of them is either taken from the data or extrapolated if missing (see _extrapolate_xxx).
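The conventions above (index coordinates, bounding boxes stored as a start plus a size, label i+1 for the i-th detection) can be illustrated with a short NumPy sketch that derives positions, bounding boxes and masses from a 2D or 3D segmentation mask. It only mirrors what the documentation describes and is not ByoTrack's `_extrapolate_xxx` code. The helper name `describe_instances` is hypothetical, and the mask is assumed to have consecutive labels.

```python
import numpy as np


def describe_instances(segmentation: np.ndarray):
    """Return (positions, bboxes, masses) for labels 1..N of a consecutive mask."""
    dim = segmentation.ndim
    n = int(segmentation.max())
    positions = np.zeros((n, dim), dtype=np.float32)
    bboxes = np.zeros((n, 2 * dim), dtype=np.int32)
    masses = np.zeros(n, dtype=np.int32)
    for i in range(n):
        # The i-th detection carries label i + 1 in the mask
        coords = np.argwhere(segmentation == i + 1)  # shape (n_pixels, dim)
        # Median of the pixel coordinates (cf. use_median_position=True)
        positions[i] = np.median(coords, axis=0)
        start = coords.min(axis=0)
        size = coords.max(axis=0) - start + 1  # number of indices covered per axis
        bboxes[i] = np.concatenate([start, size])
        masses[i] = len(coords)
    return positions, bboxes, masses


mask = np.array(
    [
        [1, 1, 0, 0],
        [1, 0, 0, 2],
        [0, 0, 2, 2],
    ],
    dtype=np.int32,
)
positions, bboxes, masses = describe_instances(mask)
```

For this mask, the first detection (label 1) gets the bounding box (0, 0, 2, 2) and the second (label 2) gets (1, 2, 2, 2), each with a mass of 3 pixels.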
- frame_id
Optional frame id in the original video (-1 if no video). In ByoTrack, the linking of detections does not rely on this frame_id, but rather on the position inside the detections_sequence. It should only be used for debugging/visualization. Default: -1
- Type:
int
- shape
Shape of the image ([D, ]H, W). Extrapolated if not given.
- Type:
tuple[int, …]
- position
Positions ([k, ]i, j) of the instance centers, inferred from the data. Shape: (N, dim), dtype: float32
- Type:
torch.Tensor
- bbox
Bounding boxes of instances ([front, ]top, left, [depth, ]height, width), inferred from the data. Shape: (N, 2*dim), dtype: int32
- Type:
torch.Tensor
- segmentation
Segmentation inferred from the data. Shape: ([D, ]H, W), dtype: int32
- Type:
torch.Tensor
- confidence
Confidence for each instance. Shape: (N,), dtype: float32
- Type:
torch.Tensor
- mass
Size of each object in pixels, inferred from the data. Shape: (N,), dtype: int32
- Type:
torch.Tensor
- use_median_position
Use the median instead of the mean to compute positions from the segmentation (usually more robust). Default: True
- Type:
bool
- save(path: str | os.PathLike) → None
Save detections to a file using torch.save.
- Parameters:
path (str | os.PathLike) – Output path
- static load(path: str | os.PathLike) → Detections
Load the detections for a given frame using torch.load.
- Parameters:
path (str | os.PathLike) – Input path
- static save_multi_frames_detections(detections_sequence: Sequence[Detections], path: str | os.PathLike) → None
Save detections for a sequence of frames.
It will save the detections as:
path/{frame_id}.pt ...
- Parameters:
detections_sequence (Sequence[Detections]) – Detections for each frame. Each Detections object should have a different frame_id.
path (str | os.PathLike) – Output folder
- static load_multi_frames_detections(path: str | os.PathLike) → list[Detections]
Load detections for a sequence of frames.
Expect the following file structure:
path/{0}.pt ... {i}.pt ... {n}.pt
- Parameters:
path (str | os.PathLike) – Input folder
- Returns:
Detections for each frame (sorted by frame id)
- Return type:
list[Detections]
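Since files are named `{frame_id}.pt`, sorting by frame id means sorting on the integer value of the file stem rather than on the file name (alphabetically, "10.pt" sorts before "2.pt"). The stdlib-only sketch below shows one way to enumerate such a folder in frame order. The helper `frame_files_sorted` is hypothetical and this is not necessarily how ByoTrack lists the files.

```python
import pathlib
import tempfile


def frame_files_sorted(folder):
    """List `{frame_id}.pt` files in `folder`, ordered by integer frame id."""
    files = [p for p in pathlib.Path(folder).glob("*.pt") if p.stem.isdigit()]
    return sorted(files, key=lambda p: int(p.stem))


# Build a throwaway folder with empty placeholder files to show the ordering
with tempfile.TemporaryDirectory() as tmp:
    for frame_id in (10, 2, 0):
        (pathlib.Path(tmp) / f"{frame_id}.pt").touch()
    ordered = [p.name for p in frame_files_sorted(tmp)]
```

With the three placeholder files above, `ordered` is `["0.pt", "2.pt", "10.pt"]`.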