Detections

byotrack.api.detector.detections.relabel_consecutive(segmentation: Tensor, *, inplace=True) → Tensor
byotrack.api.detector.detections.relabel_consecutive(segmentation: ndarray, *, inplace=True) → ndarray

Relabel a segmentation mask so that labels are consecutive.

For N instances, the background keeps label 0 and the instances are labeled from 1 to N.

Parameters:
  • segmentation (torch.Tensor | np.ndarray) – Segmentation mask. Shape: ([D, ]H, W), dtype: int

  • inplace (bool) – Modify the segmentation mask in place. Default: True

Returns:

The same segmentation mask where labels are consecutive (from 1 to N)

Return type:

torch.Tensor | np.ndarray
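The relabeling described above can be illustrated with a minimal NumPy sketch. It is not ByoTrack's implementation: the function name relabel_consecutive_sketch is hypothetical, and unlike the documented function it always returns a new array rather than modifying the input in place.

```python
import numpy as np


def relabel_consecutive_sketch(segmentation: np.ndarray) -> np.ndarray:
    """Hypothetical helper: map the non-zero labels of a mask to 1..N.

    Illustration only, not ByoTrack's code. Always returns a new array.
    """
    # Sorted distinct instance labels; 0 is reserved for the background.
    labels = np.unique(segmentation)
    labels = labels[labels != 0]

    relabeled = np.zeros_like(segmentation)
    foreground = segmentation != 0
    # The rank of each label among the sorted labels (plus one) is its new label.
    relabeled[foreground] = np.searchsorted(labels, segmentation[foreground]) + 1
    return relabeled


mask = np.array([[0, 5, 5], [2, 0, 9]])
relabeled = relabel_consecutive_sketch(mask)
```

The relative order of the labels is preserved: the smallest original label becomes 1 and the largest becomes N.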

byotrack.api.detector.detections.compress(tensor: Tensor, level=3) → Tensor

Compress a tensor using zstd (Experimental).

byotrack.api.detector.detections.decompress(tensor: Tensor, dtype=torch.int32) → Tensor

Decompress a tensor using zstd (Experimental).

class byotrack.api.detector.detections.Detections(data: dict[str, Tensor], frame_id: int = -1, *, use_median_position=True)

Bases: object

Detections for a given frame.

Built from a data dict. The data must contain one of the “position”, “bbox” or “segmentation” keys, which respectively define the positions of the instance centers ([k, ]i, j), the bounding boxes of the instances ([front, ]top, left, [depth, ]height, width) or the instance segmentation of the image ([D, ]H, W).

It supports 2D and 3D detections. In 2D, the depth axis (index k) is missing.

Note

All positions and bounding boxes use the index coordinate system (not xyz). In ByoTrack, the Z axis (or depth D) comes before the height and the width. We usually use the following nomenclature for indices: k (stack), i (row) and j (column).

Positions are stored as floats (index coordinates): (k, i, j).

Bounding boxes are stored as ints (index coordinates): (k_0, i_0, j_0, dk, di, dj). A bounding box defines all the pixels (k, i, j) such that k_0 <= k < k_0 + dk, i_0 <= i < i_0 + di, j_0 <= j < j_0 + dj.
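As an illustration of this bounding box layout, the following standard-library sketch enumerates the pixels covered by a box. It assumes the half-open convention start <= index < start + size on every axis, and the helper name bbox_pixels is hypothetical (it is not part of ByoTrack).

```python
from itertools import product


def bbox_pixels(bbox):
    """Hypothetical helper: list the index coordinates covered by a bbox.

    bbox is (i_0, j_0, di, dj) in 2D or (k_0, i_0, j_0, dk, di, dj) in 3D.
    Assumes half-open intervals: start <= index < start + size.
    """
    dim = len(bbox) // 2
    starts, sizes = bbox[:dim], bbox[dim:]
    ranges = [range(start, start + size) for start, size in zip(starts, sizes)]
    return list(product(*ranges))


pixels_2d = bbox_pixels((2, 3, 2, 2))        # top=2, left=3, height=2, width=2
pixels_3d = bbox_pixels((0, 1, 2, 1, 2, 3))  # front=0, top=1, left=2, depth=1, height=2, width=3
```

Under this convention the number of covered pixels is the product of the sizes (dk * di * dj in 3D).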

The segmentation mask is stored as a 2D or 3D integer tensor, where labels are consecutive from 1 to N. 0 is for the background.

Note

The i_th detection in the Detections has the label i+1 in the segmentation mask.

Additional optional data can also be given, such as “confidence” or “shape”, which respectively define the confidence of each detection and the shape of the image ([D, ]H, W).

Any additional meta information on the detections can also be given.

Defines position, bbox, segmentation, and confidence properties. Each of them is either taken from the data or extrapolated if missing (see _extrapolate_xxx).
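To give an idea of what extrapolating missing properties can look like, here is a minimal NumPy sketch that derives bounding boxes and masses from a segmentation mask whose labels are consecutive from 1 to N. It is an assumption-laden illustration, not the code behind _extrapolate_xxx, and the function name bbox_and_mass_sketch is hypothetical.

```python
import numpy as np


def bbox_and_mass_sketch(segmentation: np.ndarray):
    """Hypothetical helper: bounding boxes and masses from a label mask.

    Illustration only. Assumes a non-empty mask with consecutive labels 1..N,
    so that detection idx carries label idx + 1.
    """
    n_instances = int(segmentation.max())
    dim = segmentation.ndim
    bbox = np.zeros((n_instances, 2 * dim), dtype=np.int32)
    mass = np.zeros(n_instances, dtype=np.int32)

    for idx in range(n_instances):
        # Index coordinates ([k, ]i, j) of every pixel of this instance.
        coords = np.argwhere(segmentation == idx + 1)
        start = coords.min(axis=0)
        size = coords.max(axis=0) - start + 1  # half-open box: start + size is excluded
        bbox[idx] = np.concatenate([start, size])
        mass[idx] = len(coords)

    return bbox, mass


segmentation = np.array(
    [
        [0, 1, 1, 0],
        [0, 1, 0, 2],
        [0, 0, 0, 2],
    ]
)
bbox, mass = bbox_and_mass_sketch(segmentation)
```

Row idx of bbox and entry idx of mass describe the instance labeled idx + 1 in the mask.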

data

Detections data.

Type:

dict[str, torch.Tensor]

length

Number of detections

Type:

int

dim

Dimension of the detections: 2 (2D) or 3 (3D).

Type:

int

frame_id

Optional frame id in the original video (-1 if no video). In ByoTrack, detection linking does not rely on this frame_id, but rather on the position inside the detections_sequence. It should only be used for debugging/visualization. Default: -1

Type:

int

shape

Shape of the image ([D, ]H, W). Extrapolated if not given.

Type:

tuple[int, …]

position

Positions ([k, ]i, j) of instances (center), inferred from the data. Shape: (N, dim), dtype: float32

Type:

torch.Tensor

bbox

Bounding boxes of instances ([front, ]top, left, [depth, ]height, width), inferred from the data. Shape: (N, 2*dim), dtype: int32

Type:

torch.Tensor

segmentation

Segmentation, inferred from the data. Shape: ([D, ]H, W), dtype: int32

Type:

torch.Tensor

confidence

Confidence for each instance. Shape: (N,), dtype: float32

Type:

torch.Tensor

mass

Size of each object in pixels, inferred from the data. Shape: (N,), dtype: int32

Type:

torch.Tensor

use_median_position

Use the median instead of the mean to compute positions from the segmentation. Default: True (usually more robust)

Type:

bool
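The difference between the two options can be seen on a toy instance made of a short row of pixels plus one stray pixel far away. The sketch below uses only the standard library; the helper name center and its use_median argument are hypothetical stand-ins for the behavior described above, not ByoTrack code.

```python
from statistics import mean, median


def center(pixels, use_median=True):
    """Hypothetical helper: per-axis center of a list of (i, j) pixel indices."""
    reduce_axis = median if use_median else mean
    # zip(*pixels) groups the coordinates axis by axis: all i values, then all j values.
    return tuple(float(reduce_axis(axis)) for axis in zip(*pixels))


# Five contiguous pixels on row 0, plus one stray pixel at column 20.
pixels = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 20)]

median_center = center(pixels, use_median=True)   # stays near the main blob
mean_center = center(pixels, use_median=False)    # pulled towards the stray pixel
```

On this example the median stays inside the main group of pixels, while the mean lands outside it because of the single outlier.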

save(path: str | os.PathLike) → None

Save detections to a file using torch.save.

Parameters:

path (str | os.PathLike) – Output path

static load(path: str | os.PathLike) → Detections

Load the detections of a given frame using torch.load.

Parameters:

path (str | os.PathLike) – Input path

static save_multi_frames_detections(detections_sequence: Sequence[Detections], path: str | os.PathLike) → None

Save detections for a sequence of frames.

It will save the detections as:

path/{frame_id}.pt
     ...
Parameters:
  • detections_sequence (Sequence[Detections]) – Detections for each frame. Each Detections object should have a different frame_id.

  • path (str | os.PathLike) – Output folder

static load_multi_frames_detections(path: str | os.PathLike) → list[Detections]

Load detections for a sequence of frames.

Expect the following file structure:

path/{0}.pt
     ...
     {i}.pt
     ...
     {n}.pt
Parameters:

path (str | os.PathLike) – Input folder

Returns:

Detections for each frame (sorted by frame id)

Return type:

list[Detections]
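The folder layout above names each file after its frame id, so ordering the files by frame id requires a numeric sort rather than an alphabetical one. The standard-library sketch below illustrates this point on empty placeholder files; the helper name list_frame_files is hypothetical, and how ByoTrack sorts the files internally is not shown in this documentation.

```python
import tempfile
from pathlib import Path


def list_frame_files(folder):
    """Hypothetical helper: return the {frame_id}.pt files sorted by frame id."""
    # Sort on the integer value of the file stem, so that 10.pt comes after 2.pt.
    return sorted(Path(folder).glob("*.pt"), key=lambda path: int(path.stem))


with tempfile.TemporaryDirectory() as folder:
    # Empty placeholder files standing in for saved detections.
    for frame_id in (10, 0, 2):
        (Path(folder) / f"{frame_id}.pt").touch()

    frame_ids = [int(path.stem) for path in list_frame_files(folder)]
    alphabetical = sorted(path.name for path in Path(folder).glob("*.pt"))
```

An alphabetical sort would place 10.pt before 2.pt, which is why the sort key converts the file stem to an integer.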