Video

class byotrack.video.video.VideoTransformConfig(aggregate: bool = False, normalize: bool = False, selected_channel: int | None = None, q_min: float = 0.0, q_max: float = 1.0)

Bases: object

Configuration for video transformations

aggregate

Aggregate channels

Type:: bool

normalize

Scale and Normalize the video in [0, 1]

Type:: bool

selected_channel

Channel to use for aggregation. If None, the channels are averaged. Otherwise, the given channel is selected.

Type:: Optional[int]

q_min

Minimum quantile to use when scaling the video

Type:: float

q_max

Maximum quantile to use when scaling the video

Type:: float
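
How these fields might combine can be sketched as follows. This is a minimal illustration, not byotrack's implementation: TransformConfigSketch and plan_transforms are hypothetical names, and running aggregation before normalization is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TransformConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    aggregate: bool = False
    normalize: bool = False
    selected_channel: Optional[int] = None
    q_min: float = 0.0
    q_max: float = 1.0


def plan_transforms(config: TransformConfigSketch) -> List[str]:
    """List the steps the config asks for (assumed order: aggregate, then normalize)."""
    steps = []
    if config.aggregate:
        if config.selected_channel is None:
            steps.append("average channels")
        else:
            steps.append(f"select channel {config.selected_channel}")
    if config.normalize:
        steps.append(f"scale to [0, 1] using quantiles {config.q_min} and {config.q_max}")
    return steps
```

With the defaults, no step is planned. Note that selected_channel only matters when aggregate is True in this sketch.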

class byotrack.video.video.Video(data_source: str | PathLike | VideoReader)

Bases: Sequence[ndarray]

Video: Iterable, indexable and sliceable sequence of frames wrapping a VideoReader

It wraps a VideoReader to add video transformations (channel aggregation, scaling, normalization) and useful Pythonic protocols (sliceable, indexable, iterable).

Returns images in BGR by default, like OpenCV, since frames are mostly displayed with OpenCV afterwards. Can also return grayscale images of shape (H, W, 1).

Example

import byotrack

# Read a video (usually BGR). video_path is the path to the video file
video = byotrack.Video(video_path)

# Add a transform that aggregates the channels and normalizes the intensities to [0, 1]
transform_config = byotrack.VideoTransformConfig(aggregate=True, normalize=True, q_min=0.01, q_max=0.999)
video.set_transform(transform_config)

# Iterate through the video
for frame in video:
    pass

# Temporal slicing
sliced = video[10:50:3]  # Take one frame every three from frame 10 to frame 50.

shape

Shape of the video (Time, Height, Width, Channel)

Type:: Tuple[int, int, int, int]

reader

Underlying video reader

Type:: byotrack.VideoReader

set_transform(transform_config: VideoTransformConfig) → None

Set the transform (channel selector and normalizer)

Parameters:: transform_config (VideoTransformConfig) – Configuration of the transformations

transform(frame: ndarray) → ndarray

Transform a frame using channel aggregation and normalization

__getitem__(index: int) → ndarray

__getitem__(slice_: slice) → Video

Indexing and slicing operations

When indexed, it returns the ith frame in the slice. When sliced, it duplicates the video (wrapper) with the right slice.

Parameters:: key (int | slice) – index or slice of the video
Returns:: Frame at index or a shallow copy of the video with the right slice
Return type:: np.ndarray | Video
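
One way to get this "shallow copy with the right slice" behaviour is to share the underlying frames and keep only a range of indices per wrapper. The sketch below is an assumption about the general technique, not byotrack's code: SlicedFrames is a hypothetical class and a plain list stands in for the reader.

```python
from typing import Optional, Sequence, Union


class SlicedFrames:
    """Hypothetical wrapper: shares `frames`, owns only a range of indices."""

    def __init__(self, frames: Sequence, indices: Optional[range] = None):
        self._frames = frames  # shared between all slices, never copied
        self._indices = range(len(frames)) if indices is None else indices

    def __len__(self) -> int:
        return len(self._indices)

    def __getitem__(self, key: Union[int, slice]):
        if isinstance(key, slice):
            # Slicing a range yields a range, so nested slices compose for free
            return SlicedFrames(self._frames, self._indices[key])
        # Integer indexing is relative to the current slice
        return self._frames[self._indices[key]]


frames = [f"frame-{i}" for i in range(100)]  # stand-in for decoded frames
video = SlicedFrames(frames)
sliced = video[10:50:3]  # frames 10, 13, ..., 49
```

Because only the range changes, slicing costs the same whatever the video length, and sliced[0] refers to frame 10 of the original video.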

Video Reader

class byotrack.video.reader.MetaVideoReader(cls_name: str, bases: tuple, attributes: dict)

Bases: type

MetaClass for Video Readers

Each VideoReader has to define a list of supported extensions. The last constructed VideoReader to claim an extension will be used to open the video. If no one has claimed an extension the default OpenCVVideoReader is used.
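
The "last one to claim an extension wins" rule can be sketched with a small registry metaclass. All names below (ReaderRegistryMeta, BaseReader, reader_for, and the toy readers) are hypothetical, and storing extensions with a leading dot is an assumption; the real MetaVideoReader may differ.

```python
import os
from typing import Dict


class ReaderRegistryMeta(type):
    """Hypothetical metaclass: records which class claimed each extension."""

    registry: Dict[str, type] = {}

    def __init__(cls, cls_name: str, bases: tuple, attributes: dict):
        super().__init__(cls_name, bases, attributes)
        # Only extensions declared in this class body count as claims
        for extension in attributes.get("supported_extensions", []):
            ReaderRegistryMeta.registry[extension] = cls  # later classes overwrite earlier claims


class BaseReader(metaclass=ReaderRegistryMeta):
    supported_extensions: list = []


class DefaultReader(BaseReader):
    """Fallback used when no class claimed the extension."""


class TiffReader(BaseReader):
    supported_extensions = [".tif", ".tiff"]


class OtherTiffReader(BaseReader):
    supported_extensions = [".tif"]  # defined later, so it takes over ".tif"


def reader_for(path: str) -> type:
    """Pick the reader class from the file extension, falling back to the default."""
    extension = os.path.splitext(path)[1].lower()
    return ReaderRegistryMeta.registry.get(extension, DefaultReader)
```

Here reader_for("movie.tif") resolves to OtherTiffReader, "movie.tiff" still resolves to TiffReader, and an unclaimed extension such as "movie.avi" falls back to DefaultReader.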

class byotrack.video.reader.VideoReader(path: str | PathLike, **kwargs)

Bases: object

Unified video reader API

Close to the OpenCV API, with a few key differences:

- There is always a frame loaded
- Frame ids go from 0 to length - 1
- The read method is very different:
  - It retrieves the current frame, then grabs the next one (the other way around in OpenCV)
  - It therefore returns a ndarray and a bool rather than a bool and a ndarray
  - The returned boolean indicates whether reading can continue, not whether the read operation failed
- Main attributes are easy to check:
  - frame_id
  - length
  - shape
  - fps if known (-1 otherwise)

Returns images in BGR by default, like OpenCV, since frames are mostly displayed with OpenCV afterwards. Can also return grayscale images of shape (H, W, 1).
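
The retrieve-then-grab contract is easiest to see on a toy reader. The sketch below uses a hypothetical ListReader over an in-memory list of strings instead of real frames. It assumes that a failed grab leaves the last frame loaded, which this documentation does not state.

```python
from typing import List, Tuple


class ListReader:
    """Hypothetical reader over an in-memory list, following the documented read contract."""

    def __init__(self, frames: List[str]):
        if not frames:
            raise ValueError("At least one frame is required: a frame is always loaded")
        self._frames = frames
        self.length = len(frames)
        self.frame_id = 0  # a frame is loaded right after construction

    def retrieve(self) -> str:
        return self._frames[self.frame_id]

    def grab(self) -> bool:
        if self.frame_id + 1 >= self.length:
            return False  # already on the last frame (assumed to stay loaded)
        self.frame_id += 1
        return True

    def read(self) -> Tuple[str, bool]:
        frame = self.retrieve()  # first return the current frame...
        return frame, self.grab()  # ...then try to move to the next one

    def seek(self, frame_id: int) -> None:
        if not 0 <= frame_id < self.length:
            raise EOFError(f"Invalid frame id {frame_id}")
        self.frame_id = frame_id


reader = ListReader(["a", "b", "c"])
consumed = []
has_next = True
while has_next:
    frame, has_next = reader.read()  # the frame is always valid, even when has_next is False
    consumed.append(frame)
```

The loop appends the frame before checking the boolean, so the last frame is not dropped. With the OpenCV convention, the boolean would instead be checked before using the frame.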

supported_extensions

Static attribute used by open method to automatically choose which VideoReader to use.

Type:: List[str]

path

Path of the current video

Type:: str | bytes | os.PathLike

released

True when release has been called (close and release memory)

Type:: bool

fps

Frame rate (-1 if unknown)

Type:: int

shape

Spatial dimensions of frames

Type:: Tuple[int, int]

channels

Number of channels

Type:: int

length

Number of frames

Type:: int

frame_id

Current frame id

Type:: int

release() → None: Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:: True if able to grab next frame
Return type:: bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame: Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

read() → Tuple[ndarray, bool]

Consume a frame. Equivalent to retrieve followed by grab.

In this implementation there is always a current frame, so the order is reversed compared to OpenCV: it first retrieves the current frame, then grabs the next one.

Returns:

The current frame: Shape: (H, W, 3) or (H, W, 1) (Grayscale)

bool: Whether there is a next frame to read

Return type:

Tuple[np.ndarray, bool]

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:: EOFError – if seeking an invalid frame

tell() → int

Returns self.frame_id

Returns:: current frame_id
Return type:: int

static open(path: str | PathLike, **kwargs) → VideoReader

Open a video file

Use the extension to know which VideoReader to use

Parameters:

path (str | os.PathLike) – File to open
kwargs – Any additional args for the underlying video reader

Returns:

VideoReader

static ensure_3d(frame: ndarray) → ndarray

Ensure that frame is a 3 dimensional array

Parameters:: frame (np.ndarray) – Frame to check
Returns:: Valid frame with 3 dimensions (H, W, C)
Return type:: np.ndarray
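
A minimal sketch of what such a check could look like, assuming that a 2D frame is a grayscale image missing its channel axis. ensure_3d_sketch is a hypothetical name and the error raised for other dimensionalities is an assumption.

```python
import numpy as np


def ensure_3d_sketch(frame: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return the frame as (H, W, C)."""
    if frame.ndim == 2:
        return frame[..., None]  # (H, W) -> (H, W, 1), no data copied
    if frame.ndim == 3:
        return frame  # already (H, W, C)
    raise ValueError(f"Expected a 2D or 3D frame, got {frame.ndim} dimensions")
```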

class byotrack.video.reader.OpenCVVideoReader(path: str | PathLike, **kwargs)

Bases: VideoReader

Wrapper around opencv VideoCapture

Default VideoReader when opening a file.

release() → None: Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:: True if able to grab next frame
Return type:: bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame: Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:: EOFError – if seeking an invalid frame

class byotrack.video.reader.TiffVideoReader(path: str | PathLike, **kwargs)

Bases: VideoReader

Special VideoReader for multi-image TIFF files, which OpenCV does not support

Uses PIL as backend

release() → None: Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:: True if able to grab next frame
Return type:: bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame: Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:: EOFError – if seeking an invalid frame

Video transforms

class byotrack.video.transforms.ChannelSelect(channel: int)

Bases: object

Select a given channel

channel

Channel to keep (0, 1 or 2)

Type:: int

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Filtered frame with a single channel: Shape: (…, H, W, 1)

Return type:

np.ndarray
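
Keeping the trailing axis while selecting one channel can be done with a length-one slice. The function below is a hypothetical sketch of that idea in NumPy, assuming a non-negative channel index; it is not the class's actual code.

```python
import numpy as np


def select_channel(frame: np.ndarray, channel: int) -> np.ndarray:
    """Hypothetical helper: keep one channel, (..., H, W, C) -> (..., H, W, 1)."""
    # A slice of length one keeps the channel axis, unlike frame[..., channel]
    return frame[..., channel : channel + 1]
```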

class byotrack.video.transforms.ChannelAvg

Bases: object

Average channels into a single one

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Average of channels: Shape: (…, H, W, 1)

Return type:

np.ndarray
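
In NumPy terms, averaging over the last axis with keepdims=True produces the documented output shape. average_channels is a hypothetical name and the snippet is only a sketch of the operation.

```python
import numpy as np


def average_channels(frame: np.ndarray) -> np.ndarray:
    """Hypothetical helper: average channels, (..., H, W, C) -> (..., H, W, 1)."""
    return frame.mean(axis=-1, keepdims=True)  # keepdims preserves the channel axis
```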

class byotrack.video.transforms.ScaleAndNormalize(q_min: float, q_max: float)

Bases: object

Scale and Normalize each channel into [0, 1]

The min and max values are computed from quantiles of the video, which makes the scaling more stable than using the raw extrema.

q_min

Quantile of the minimum value to consider

Type:: float

q_max

Quantile of the maximum value to consider

Type:: float

mini

Minimum value kept (one for each channel) Shape: (C, )

Type:: np.ndarray

maxi

Maximum value kept (one for each channel) Shape: (C, )

Type:: np.ndarray

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Normalized version of the frame in [0, 1]: Shape: (…, H, W, C)

Return type:

np.ndarray

update_stats(frames: ndarray) → None

Update mini and maxi values based on the given frames

Parameters:: frames (np.ndarray) – Several frames of the same video, used to compute the stats. Shape: (N, H, W, C)
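
Quantile-based scaling can be sketched as below. QuantileScaler is a hypothetical class that reuses the documented attribute names (q_min, q_max, mini, maxi). Clipping values outside the quantiles to [0, 1] and guarding against flat channels are assumptions, not documented behaviour.

```python
import numpy as np


class QuantileScaler:
    """Hypothetical per-channel scaler driven by quantiles."""

    def __init__(self, q_min: float, q_max: float):
        self.q_min = q_min
        self.q_max = q_max
        self.mini = np.zeros(1)  # placeholder until update_stats is called
        self.maxi = np.ones(1)

    def update_stats(self, frames: np.ndarray) -> None:
        # Flatten everything but the channel axis: (N, H, W, C) -> (N * H * W, C)
        pixels = frames.reshape(-1, frames.shape[-1])
        self.mini = np.quantile(pixels, self.q_min, axis=0)  # Shape: (C, )
        self.maxi = np.quantile(pixels, self.q_max, axis=0)  # Shape: (C, )

    def __call__(self, frame: np.ndarray) -> np.ndarray:
        span = np.maximum(self.maxi - self.mini, 1e-12)  # guard against flat channels
        scaled = (frame.astype(np.float64) - self.mini) / span
        return np.clip(scaled, 0.0, 1.0)  # values outside the quantiles saturate
```

With q_min=0.01 and q_max=0.999, a handful of very dark or very bright pixels no longer set the scale; they saturate at 0 or 1 instead.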