Video
Video
- class byotrack.video.video.VideoTransformConfig(aggregate: bool = False, normalize: bool = False, selected_channel: int | None = None, q_min: float = 0.0, q_max: float = 1.0, smooth_clip: float = 0.0, compute_stats_on: int = 50)
Bases:
object
Configuration for video transformations
- selected_channel
Channel to use for aggregation. If None, the channels are averaged; otherwise, the given channel is selected.
- Type:
Optional[int]
- smooth_clip
Smoothness of the clipping process (log clipping). See ScaleAndNormalize: the highest values, above q_max, are log clipped. If 0.0, hard clipping is done.
- Type:
float
- class byotrack.video.video.Video(data_source: str | PathLike | VideoReader)
Bases:
Sequence[ndarray]
Video: Iterable, indexable and sliceable sequence of frames wrapping a VideoReader
It wraps VideoReader in order to add video transformation (Channel Aggregation, Scaling, Normalization) and to add useful pythonic protocols (Sliceable, Indexable, Iterable).
Returns images in BGR by default, like OpenCV, as frames are mostly used with OpenCV afterwards for display. It can also return grayscale images (H, W, 1).
Example
    import byotrack

    # Read a video (usually BGR)
    video = byotrack.Video(video_path)

    # Add a transform that will aggregate channels and normalize the intensities in [0, 1]
    transform_config = byotrack.VideoTransformConfig(aggregate=True, normalize=True, q_min=0.01, q_max=0.999)
    video.set_transform(transform_config)

    # Iterate through the video
    for frame in video:
        pass

    # Temporal slicing
    sliced = video[10:50:3]  # Take one frame every three from frame 10 to frame 50.

    # Spatial slicing
    sliced = video[:, 100:200, 150:250]  # All frames on the roi (100:200 x 150:250)
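To make the temporal slicing in the example concrete, the sketch below lists which frame ids a slice such as `10:50:3` covers. It assumes that `Video` resolves slices like any standard Python sequence, which the comment in the example suggests but which has not been checked against the implementation. The helper `selected_frame_ids` is hypothetical and not part of byotrack.

```python
def selected_frame_ids(length: int, temporal_slice: slice) -> list:
    """Frame ids covered by a slice on a sequence of `length` frames.

    Hypothetical helper, not part of byotrack: it only mirrors how a
    standard Python sequence resolves a slice.
    """
    return list(range(*temporal_slice.indices(length)))


ids = selected_frame_ids(100, slice(10, 50, 3))
print(ids[:4], ids[-1], len(ids))
```

Under this assumption, a slice that runs past the end of a shorter video is simply truncated rather than raising an error.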
- reader
Underlying video reader
- Type:
byotrack.VideoReader
- set_transform(transform_config: VideoTransformConfig) → None
Set the transform (channel_selector and normalizer)
- Parameters:
transform_config (byotrack.VideoTransformConfig) – Configuration of the transformations
- transform(frame: ndarray) → ndarray
Transform a frame using channel aggregation and normalization
Video Reader
- class byotrack.video.reader.MetaVideoReader(cls_name: str, bases: tuple, attributes: dict)
Bases:
type
MetaClass for Video Readers
Each VideoReader has to define a list of supported extensions. The last constructed VideoReader to claim an extension will be used to open the video. If no reader has claimed an extension, the default OpenCVVideoReader is used.
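The registration rule described above can be illustrated with a small metaclass. This is a minimal sketch of the general pattern, not byotrack's actual code: `MetaReader`, `DefaultReader`, `TiffReader`, `OtherTiffReader`, the `registry` dictionary and `reader_class_for` are all hypothetical names, and the case-insensitive extension matching is an assumption.

```python
import os


class MetaReader(type):
    """Toy metaclass: records which reader class claims which extension."""

    registry = {}  # extension -> reader class (hypothetical name)

    def __new__(mcs, cls_name, bases, attributes):
        cls = super().__new__(mcs, cls_name, bases, attributes)
        # A class defined later overwrites earlier claims on the same extension
        for extension in attributes.get("supported_extensions", []):
            mcs.registry[extension] = cls
        return cls


class DefaultReader(metaclass=MetaReader):
    """Fallback used when no reader has claimed the extension."""

    supported_extensions = []


class TiffReader(DefaultReader):
    supported_extensions = [".tif", ".tiff"]


class OtherTiffReader(DefaultReader):
    supported_extensions = [".tif"]


def reader_class_for(path):
    """Pick the reader class from the file extension (case-insensitive here)."""
    extension = os.path.splitext(path)[1].lower()
    return MetaReader.registry.get(extension, DefaultReader)


print(reader_class_for("cells.tif").__name__)   # last class to claim ".tif"
print(reader_class_for("cells.tiff").__name__)  # only one class claims ".tiff"
print(reader_class_for("cells.avi").__name__)   # unclaimed: falls back to default
```

Because registration happens when the class body is executed, defining a reader is enough to make it available; no explicit registration call is needed.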
- class byotrack.video.reader.VideoReader(path: str | PathLike, **kwargs)
Bases:
object
Unified video reader API
Close to the OpenCV API, with a few key differences:
- There is always a frame loaded
- Frame ids go from 0 to length - 1
- The read method is very different:
  - It retrieves the current frame then grabs the next one (the other way around in OpenCV)
  - It therefore returns an ndarray and a bool rather than a bool and an ndarray
  - The returned boolean indicates whether we can continue to read, not whether the read operation has failed
- Main attributes are easy to check:
  - frame_id
  - length
  - shape
  - fps if known (-1 otherwise)
Returns images in BGR by default, like OpenCV, as frames are mostly used with OpenCV afterwards for display. It can also return grayscale images (H, W, 1).
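The retrieve-then-grab semantics listed above are easiest to see on a toy reader. The sketch below is a hypothetical in-memory stand-in (`ListReader` is not a byotrack class) that follows the documented contract: a frame is always loaded, `read` returns `(frame, bool)`, and `seek` raises `EOFError` on an invalid frame id.

```python
class ListReader:
    """Toy reader over an in-memory list of frames (hypothetical, not byotrack)."""

    def __init__(self, frames):
        if not frames:
            raise ValueError("At least one frame is required")
        self.frames = list(frames)
        self.frame_id = 0  # There is always a current frame
        self.length = len(self.frames)

    def retrieve(self):
        """Return the current frame without moving."""
        return self.frames[self.frame_id]

    def grab(self):
        """Move to the next frame; False if already on the last one."""
        if self.frame_id + 1 >= self.length:
            return False
        self.frame_id += 1
        return True

    def read(self):
        """Retrieve the current frame, then grab the next one."""
        frame = self.retrieve()
        return frame, self.grab()

    def seek(self, frame_id):
        """Jump to frame_id, valid from 0 to length - 1."""
        if not 0 <= frame_id < self.length:
            raise EOFError(f"Invalid frame id: {frame_id}")
        self.frame_id = frame_id


reader = ListReader(["frame0", "frame1", "frame2"])
consumed = []
has_next = True
while has_next:
    # The frame is valid even when has_next is False (it is the last one)
    frame, has_next = reader.read()
    consumed.append(frame)
print(consumed)
```

Note the loop shape: unlike the usual OpenCV pattern, the frame returned together with `False` must still be processed, since the boolean only says whether another frame follows.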
- supported_extensions
Static attribute used by open method to automatically choose which VideoReader to use.
- Type:
List[str]
- path
Path of the current video
- Type:
str | bytes | os.PathLike
- grab() → bool
Grab the next frame
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() → ndarray
Retrieve the current frame
- Returns:
- The current frame
Shape: (H, W, 3) or (H, W, 1) (Grayscale)
- Return type:
np.ndarray
- read() → Tuple[ndarray, bool]
Consume a frame. Equivalent to retrieve + grab
Because there is always a current frame in this implementation, it reverses the OpenCV order: it first retrieves the current frame, then grabs the next one.
- Returns:
- The current frame
Shape: (H, W, 3) or (H, W, 1) (Grayscale)
bool: Whether there is a next frame to read
- Return type:
Tuple[np.ndarray, bool]
- seek(frame_id: int) → None
Seek frame_id (will update the current frame)
Valid frame ids from 0 to length - 1
- Raises:
EOFError – If seeking an invalid frame
- static open(path: str | PathLike, **kwargs) → VideoReader
Open a video file
Use the extension to know which VideoReader to use
- Parameters:
path (str | os.PathLike) – File to open
kwargs – Any additional args for the underlying video reader
- Returns:
VideoReader
- static ensure_3d(frame: ndarray) → ndarray
Ensure that frame is a 3 dimensional array
- Parameters:
frame (np.ndarray) – Frame to check
- Returns:
Valid frame with 3 dimensions (H, W, C)
- Return type:
np.ndarray
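As an illustration of what guaranteeing a (H, W, C) frame can look like, here is a minimal sketch in NumPy. It is an assumption about the behavior, not the byotrack implementation: `ensure_3d_sketch` is a hypothetical name, and raising `ValueError` for other dimensionalities is a choice made for this example only.

```python
import numpy as np


def ensure_3d_sketch(frame: np.ndarray) -> np.ndarray:
    """Return a (H, W, C) view of a 2D or 3D frame.

    Illustrative only; the actual byotrack implementation may differ.
    """
    if frame.ndim == 2:
        return frame[..., None]  # (H, W) -> (H, W, 1)
    if frame.ndim == 3:
        return frame
    raise ValueError(f"Expected a 2D or 3D frame, got {frame.ndim} dimensions")


print(ensure_3d_sketch(np.zeros((4, 5))).shape)
print(ensure_3d_sketch(np.zeros((4, 5, 3))).shape)
```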
- class byotrack.video.reader.OpenCVVideoReader(path: str | PathLike, **kwargs)
Bases:
VideoReader
Wrapper around OpenCV VideoCapture
Default VideoReader when opening a file.
- grab() → bool
Grab the next frame
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() → ndarray
Retrieve the current frame
- Returns:
- The current frame
Shape: (H, W, 3) or (H, W, 1) (Grayscale)
- Return type:
np.ndarray
- class byotrack.video.reader.TiffVideoReader(path: str | PathLike, **kwargs)
Bases:
VideoReader
Special VideoReader for multi-image TIFF files, which are not supported by OpenCV
Uses PIL as backend
- grab() → bool
Grab the next frame
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() → ndarray
Retrieve the current frame
- Returns:
- The current frame
Shape: (H, W, 3) or (H, W, 1) (Grayscale)
- Return type:
np.ndarray
Video transforms
- class byotrack.video.transforms.ChannelSelect(channel: int)
Bases:
object
Select a given channel
- Parameters:
frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)
- Returns:
- Filtered frame with a single channel
Shape: (…, H, W, 1)
- Return type:
np.ndarray
- class byotrack.video.transforms.ChannelAvg
Bases:
object
Average channels into a single one
- Parameters:
frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)
- Returns:
- Average of channels
Shape: (…, H, W, 1)
- Return type:
np.ndarray
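Both channel transforms map a (…, H, W, C) frame to a (…, H, W, 1) frame. The NumPy sketch below shows one way to obtain these shapes; it illustrates the documented input and output shapes and is not taken from the byotrack source.

```python
import numpy as np

# A small (H, W, C) frame: pixel (0, 0) holds the channel values [0, 1, 2]
frame = np.arange(2 * 3 * 3, dtype=np.float64).reshape(2, 3, 3)

# Channel selection: indexing with a list keeps the trailing channel axis
selected = frame[..., [1]]

# Channel average: keepdims=True keeps the trailing channel axis
averaged = frame.mean(axis=-1, keepdims=True)

print(selected.shape, averaged.shape)
```

Keeping the trailing axis of size 1 means downstream code can treat color and grayscale frames uniformly.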
- class byotrack.video.transforms.ScaleAndNormalize(q_min: float, q_max: float, smooth_clip: float = 0)
Bases:
object
Scale and Normalize each channel into [0, 1]
The min and max values are computed from quantiles of the video to improve stability
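To see why quantile bounds are more stable than the raw min and max, consider the pure-Python sketch below on a single list of intensities. It is a simplified illustration under stated assumptions: the helpers `quantile` and `scale` are hypothetical, channels are ignored, the interpolation rule is a choice made here, and the bounds are assumed to differ (no guard against `maxi == mini`).

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    weight = position - low
    return sorted_values[low] * (1 - weight) + sorted_values[high] * weight


def scale(values, q_min, q_max):
    """Map values to [0, 1] using quantile bounds, with hard clipping."""
    ordered = sorted(values)
    mini, maxi = quantile(ordered, q_min), quantile(ordered, q_max)
    return [min(max((v - mini) / (maxi - mini), 0.0), 1.0) for v in values]


# One outlier (1000.0) among intensities between 0 and 10
intensities = [float(i) for i in range(11)] + [1000.0]
print(scale(intensities, 0.0, 1.0)[5])  # true max: the outlier squashes the rest
print(scale(intensities, 0.0, 0.9)[5])  # quantile bound: the outlier is clipped
```

With the true maximum as upper bound, a single bright pixel compresses every other intensity towards 0; with a quantile bound, that pixel is clipped to 1 and the rest of the range is preserved.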
- mini
Minimum value kept (one for each channel). Shape: (C, )
- Type:
np.ndarray
- maxi
Maximum value kept (one for each channel). Shape: (C, )
- Type:
np.ndarray
- smooth_clip
Smoothness of the clipping process. If 0, values are clipped on mini/maxi. Otherwise, values above maxi are log clipped: v = 1 + a log((v - 1)/a + 1) for v > 1, with a the smooth_clip factor. Typical values are between 0 and 1. Default: 0 (hard clipping)
- Type:
float
- max
True maximum values (one for each channel) when using smooth clipping. Shape: (C, )
- Type:
np.ndarray
- Parameters:
frame (np.ndarray) – Frame of the video. Shape: (…, H, W, C)
- Returns:
- Normalized version of the frame in [0, 1]
Shape: (…, H, W, C)
- Return type:
np.ndarray
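The log clipping formula given for smooth_clip, v = 1 + a log((v - 1)/a + 1) for v > 1, can be sketched on a single scaled value as follows. The function `smooth_clip_value` is a hypothetical helper that operates on one float already scaled by mini/maxi; it only implements the documented formula and the hard-clipping case, not the full transform.

```python
import math


def smooth_clip_value(v: float, smooth_clip: float) -> float:
    """Clip a scaled value at 1, softly if smooth_clip > 0 (hypothetical helper)."""
    if v <= 1.0:
        return v
    if smooth_clip == 0.0:
        return 1.0  # Hard clipping
    # Log clipping: v = 1 + a * log((v - 1) / a + 1) for v > 1
    return 1.0 + smooth_clip * math.log((v - 1.0) / smooth_clip + 1.0)


for v in (0.5, 1.0, 2.0, 10.0):
    print(v, smooth_clip_value(v, 0.0), smooth_clip_value(v, 0.5))
```

The log branch is continuous at v = 1 and grows slowly above it, so bright values keep their ordering instead of all collapsing to 1. Its output still exceeds 1; how the result is brought back into [0, 1] is not covered by this sketch.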