Video

Video

class byotrack.video.video.VideoTransformConfig(aggregate: bool = False, normalize: bool = False, selected_channel: int | None = None, q_min: float = 0.0, q_max: float = 1.0)

Bases: object

Configuration for video transformations

aggregate

Aggregate channels

Type:

bool

normalize

Scale and normalize the video into [0, 1]

Type:

bool

selected_channel

Channel to use for aggregation. If None, the channels are averaged; otherwise, the given channel is selected.

Type:

Optional[int]

q_min

Minimum quantile to use when scaling the video

Type:

float

q_max

Maximum quantile to use when scaling the video

Type:

float

class byotrack.video.video.Video(data_source: str | PathLike | VideoReader)

Bases: Sequence[ndarray]

Video: Iterable, indexable and sliceable sequence of frames wrapping a VideoReader

It wraps VideoReader in order to add video transformation (Channel Aggregation, Scaling, Normalization) and to add useful pythonic protocols (Sliceable, Indexable, Iterable).

Returns images in BGR by default, like OpenCV, since frames are mostly used with OpenCV afterwards for display. Can also return grayscale images of shape (H, W, 1).

Example

import byotrack

# Read a video (usually BGR)
video_path = "path/to/video.avi"  # Placeholder path: replace with your own file
video = byotrack.Video(video_path)

# Add a transform that aggregates the channels and normalizes the intensities into [0, 1]
transform_config = byotrack.VideoTransformConfig(aggregate=True, normalize=True, q_min=0.01, q_max=0.999)
video.set_transform(transform_config)

# Iterate through the video
for frame in video:
    pass

# Temporal slicing
sliced = video[10:50:3]  # Take one frame every three, from frame 10 (included) to frame 50 (excluded)

shape

Shape of the video (Time, Height, Width, Channel)

Type:

Tuple[int, int, int, int]

reader

Underlying video reader

Type:

byotrack.VideoReader

set_transform(transform_config: VideoTransformConfig) → None

Set the transform (channel_selector and normalizer)

Parameters:

transform_config (VideoTransformConfig) – Configuration of the transformations

transform(frame: ndarray) → ndarray

Transform a frame using channel aggregation and normalization

__getitem__(index: int) → ndarray
__getitem__(slice_: slice) → Video

Indexing and slicing operations

When indexed, it returns the ith frame in the slice. When sliced, it duplicates the video (wrapper) with the right slice.

Parameters:

key (int | slice) – index or slice of the video

Returns:

Frame at index or a shallow copy of the video with the right slice

Return type:

np.ndarray | Video
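
The indexing and slicing behavior described above can be sketched with a small stand-in class. This is only an illustration of the pattern (a shared frame source plus a sliced index view), not byotrack's implementation; SlicedFrames and its attributes are hypothetical names.

```python
from collections.abc import Sequence


class SlicedFrames(Sequence):
    """Hypothetical stand-in for a sliceable video wrapper (not byotrack code)."""

    def __init__(self, frames, indices=None):
        self._frames = frames  # Shared frame source, never copied
        self._indices = range(len(frames)) if indices is None else indices

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing a range yields another range: only the index view changes
            return SlicedFrames(self._frames, self._indices[key])
        # Integer index: map it through the index view to the underlying frame
        return self._frames[self._indices[key]]


frames = [f"frame-{i}" for i in range(100)]
video = SlicedFrames(frames)
sliced = video[10:50:3]  # Frames 10, 13, ..., 49
```

Because a slice only stores a new index range, slicing a slice (for instance sliced[1:3]) stays cheap and still reads from the same underlying frames.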

Video Reader

class byotrack.video.reader.MetaVideoReader(cls_name: str, bases: tuple, attributes: dict)

Bases: type

MetaClass for Video Readers

Each VideoReader has to define a list of supported extensions. The last constructed VideoReader to claim an extension will be used to open the video. If no reader has claimed an extension, the default OpenCVVideoReader is used.
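
The registration rule above (last class to claim an extension wins, with a default fallback) can be sketched with a metaclass. This is a minimal sketch under assumptions: ReaderMeta, the reader classes, pick_reader and the dotted lowercase extension format are all hypothetical, and byotrack's actual bookkeeping may differ.

```python
import os


class ReaderMeta(type):
    """Hypothetical registry metaclass; names are illustrative, not byotrack's."""

    registry = {}  # Maps an extension to the reader class that claimed it last

    def __new__(mcs, cls_name, bases, attributes):
        cls = super().__new__(mcs, cls_name, bases, attributes)
        # Only extensions declared on this class itself are registered
        for ext in attributes.get("supported_extensions", []):
            mcs.registry[ext.lower()] = cls  # A later class overrides an earlier one
        return cls


class BaseReader(metaclass=ReaderMeta):
    supported_extensions = []


class DefaultReader(BaseReader):
    supported_extensions = []  # Claims nothing; used as the fallback


class TiffReader(BaseReader):
    supported_extensions = [".tif", ".tiff"]


class OtherTiffReader(BaseReader):
    supported_extensions = [".tif"]  # Constructed later, so it takes over ".tif"


def pick_reader(path):
    """Return the reader class registered for the extension of path."""
    ext = os.path.splitext(path)[1].lower()
    return ReaderMeta.registry.get(ext, DefaultReader)
```

With these definitions, a ".tif" path resolves to OtherTiffReader, a ".tiff" path to TiffReader, and any unclaimed extension to DefaultReader.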

class byotrack.video.reader.VideoReader(path: str | PathLike, **kwargs)

Bases: object

Unified video reader api

Close to the OpenCV API, but with a few key differences:

  • There is always a frame loaded

  • Frame ids go from 0 to length - 1

  • The read method is very different:

    • It retrieves the current frame, then grabs the next one (the other way around in OpenCV)

    • It therefore returns an ndarray and a bool, rather than a bool and an ndarray

    • The returned boolean indicates whether reading can continue, not whether the read operation has failed

  • Easy to check main attributes like:

    • frame_id

    • length

    • shape

    • fps if known (-1 otherwise)

Returns images in BGR by default, like OpenCV, since frames are mostly used with OpenCV afterwards for display. Can also return grayscale images of shape (H, W, 1).

supported_extensions

Static attribute used by open method to automatically choose which VideoReader to use.

Type:

List[str]

path

Path of the current video

Type:

str | bytes | os.PathLike

released

True when release has been called (close and release memory)

Type:

bool

fps

Frame rate (-1 if unknown)

Type:

int

shape

Spatial dimensions of frames

Type:

Tuple[int, int]

channels

Number of channels

Type:

int

length

Number of frames

Type:

int

frame_id

Current frame id

Type:

int

release() → None

Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:

True if able to grab next frame

Return type:

bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame

Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

read() → Tuple[ndarray, bool]

Consume a frame. Is equivalent to retrieve + grab

As there is always a current frame in this implementation, it reverses the OpenCV behavior: it first retrieves the current frame, then grabs the next one.

Returns:

The current frame

Shape: (H, W, 3) or (H, W, 1) (Grayscale)

bool: Whether there is a next frame to read

Return type:

Tuple[np.ndarray, bool]
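
The retrieve-then-grab semantics of read can be illustrated with a toy in-memory reader. This is a sketch, not byotrack code: ListReader is a hypothetical class, and it assumes that a failed grab leaves the reader on the last frame.

```python
class ListReader:
    """Toy in-memory reader mimicking the documented read semantics (illustrative only)."""

    def __init__(self, frames):
        if not frames:
            raise ValueError("at least one frame is required")
        self._frames = frames
        self.length = len(frames)
        self.frame_id = 0  # A frame is always loaded

    def retrieve(self):
        return self._frames[self.frame_id]

    def grab(self):
        if self.frame_id + 1 >= self.length:
            return False  # Assumption: no next frame, stay on the last one
        self.frame_id += 1
        return True

    def read(self):
        frame = self.retrieve()  # First retrieve the current frame...
        return frame, self.grab()  # ...then try to advance to the next one


reader = ListReader(["f0", "f1", "f2"])
seen = []
while True:
    frame, has_next = reader.read()
    seen.append(frame)  # The returned frame is valid even when has_next is False
    if not has_next:
        break
```

Note that the loop handles the frame before testing the boolean: with these semantics, stopping as soon as the flag is False without using the frame would drop the last frame.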

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:

EOFError if seeking an invalid frame

tell() → int

Returns self.frame_id

Returns:

current frame_id

Return type:

int

static open(path: str | PathLike, **kwargs) → VideoReader

Open a video file

Use the extension to know which VideoReader to use

Parameters:
  • path (str | os.PathLike) – File to open

  • kwargs – Any additional args for the underlying video reader

Returns:

VideoReader

static ensure_3d(frame: ndarray) → ndarray

Ensure that frame is a 3 dimensional array

Parameters:

frame (np.ndarray) – Frame to check

Returns:

Valid frame with 3 dimensions (H, W, C)

Return type:

np.ndarray
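
A minimal NumPy sketch of what such a check could look like: a 2D grayscale frame (H, W) gets a trailing channel axis, and a 3D frame passes through unchanged. The function name ensure_3d_sketch and the ValueError for other dimensionalities are assumptions, not byotrack's documented behavior.

```python
import numpy as np


def ensure_3d_sketch(frame: np.ndarray) -> np.ndarray:
    """Return a frame of shape (H, W, C); illustrative stand-in only."""
    if frame.ndim == 2:
        return frame[..., None]  # (H, W) -> (H, W, 1), a view without copy
    if frame.ndim == 3:
        return frame  # Already (H, W, C)
    # Assumption: any other dimensionality is rejected
    raise ValueError(f"expected a 2D or 3D frame, got {frame.ndim} dimensions")


gray = np.zeros((4, 5), dtype=np.uint8)
color = np.zeros((4, 5, 3), dtype=np.uint8)
gray_3d = ensure_3d_sketch(gray)
color_3d = ensure_3d_sketch(color)
```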

class byotrack.video.reader.OpenCVVideoReader(path: str | PathLike, **kwargs)

Bases: VideoReader

Wrapper around opencv VideoCapture

Default VideoReader when opening a file.

release() → None

Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:

True if able to grab next frame

Return type:

bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame

Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:

EOFError if seeking an invalid frame

class byotrack.video.reader.TiffVideoReader(path: str | PathLike, **kwargs)

Bases: VideoReader

Special VideoReader for multi-image TIFF files, which are not supported by OpenCV

Uses PIL as backend

release() → None

Close the file and free memory

grab() → bool

Grab the next frame

Can be faster than self.seek(self.frame_id + 1)

Returns:

True if able to grab next frame

Return type:

bool

retrieve() → ndarray

Retrieve the current frame

Returns:

The current frame

Shape: (H, W, 3) or (H, W, 1) (Grayscale)

Return type:

np.ndarray

seek(frame_id: int) → None

Seek frame_id (will update the current frame)

Valid frame ids from 0 to length - 1

Raises:

EOFError if seeking an invalid frame

Video transforms

class byotrack.video.transforms.ChannelSelect(channel: int)

Bases: object

Select a given channel

channel

Channel to keep (0, 1 or 2)

Type:

int

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Filtered frame with a single channel

Shape: (…, H, W, 1)

Return type:

np.ndarray

class byotrack.video.transforms.ChannelAvg

Bases: object

Average channels into a single one

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Average of channels

Shape: (…, H, W, 1)

Return type:

np.ndarray
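
The two aggregation modes (channel selection and channel averaging) both reduce the last axis to size 1 while keeping it. The following NumPy sketch shows one way to obtain the documented shapes; select_channel and average_channels are hypothetical helper names, not byotrack functions.

```python
import numpy as np


def select_channel(frame: np.ndarray, channel: int) -> np.ndarray:
    """Keep a single channel: (..., H, W, C) -> (..., H, W, 1)."""
    # Slicing with channel:channel + 1 keeps the channel axis
    return frame[..., channel:channel + 1]


def average_channels(frame: np.ndarray) -> np.ndarray:
    """Average all channels: (..., H, W, C) -> (..., H, W, 1)."""
    return frame.mean(axis=-1, keepdims=True)


# A tiny (H=2, W=3, C=3) frame whose pixel (0, 0) holds the values [0, 1, 2]
frame = np.arange(2 * 3 * 3, dtype=np.float64).reshape(2, 3, 3)
selected = select_channel(frame, 1)
averaged = average_channels(frame)
```

Keeping the trailing axis means later steps can treat single-channel and multi-channel frames uniformly.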

class byotrack.video.transforms.ScaleAndNormalize(q_min: float, q_max: float)

Bases: object

Scale and Normalize each channel into [0, 1]

The min and max values are computed from quantiles of the video, rather than from its raw extrema, to improve stability.

q_min

Quantile of the minimum value to consider

Type:

float

q_max

Quantile of the maximum value to consider

Type:

float

mini

Minimum value kept (one for each channel). Shape: (C, )

Type:

np.ndarray

maxi

Maximum value kept (one for each channel). Shape: (C, )

Type:

np.ndarray

Parameters:

frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)

Returns:

Normalized version of the frame in [0, 1]

Shape: (…, H, W, C)

Return type:

np.ndarray

update_stats(frames: ndarray) → None

Update mini and maxi values based on the given frames

Parameters:

frames (np.ndarray) – Several frames of the same video to compute the stats. Shape: (N, H, W, C)
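
Quantile-based scaling can be sketched in NumPy as follows. This is an illustration of the idea, not byotrack's implementation: QuantileScaler is a hypothetical class, and clipping values outside the quantile range as well as guarding against a zero range are assumptions.

```python
import numpy as np


class QuantileScaler:
    """Illustrative stand-in for a quantile-based scaler (not byotrack's code)."""

    def __init__(self, q_min: float, q_max: float) -> None:
        self.q_min = q_min
        self.q_max = q_max
        self.mini = None  # Per-channel lower bound, shape (C,)
        self.maxi = None  # Per-channel upper bound, shape (C,)

    def update_stats(self, frames: np.ndarray) -> None:
        # Reduce over (N, H, W) so that one quantile per channel remains
        self.mini = np.quantile(frames, self.q_min, axis=(0, 1, 2))
        self.maxi = np.quantile(frames, self.q_max, axis=(0, 1, 2))

    def __call__(self, frame: np.ndarray) -> np.ndarray:
        if self.mini is None or self.maxi is None:
            raise RuntimeError("call update_stats before scaling a frame")
        span = np.maximum(self.maxi - self.mini, 1e-12)  # Avoid division by zero
        # Assumption: values outside the quantile range are clipped into [0, 1]
        return np.clip((frame - self.mini) / span, 0.0, 1.0)


rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(8, 16, 16, 3)).astype(np.float64)

scaler = QuantileScaler(q_min=0.01, q_max=0.999)
scaler.update_stats(frames)
normalized = scaler(frames[0])  # Shape (16, 16, 3), values in [0, 1]
```

Using quantiles such as 0.01 and 0.999 instead of the raw minimum and maximum keeps a few saturated or dead pixels from compressing the useful intensity range.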