Video
- class byotrack.video.video.VideoTransformConfig(aggregate: bool = False, normalize: bool = False, selected_channel: int | None = None, q_min: float = 0.0, q_max: float = 1.0, smooth_clip: float = 0.0, compute_stats_on: int = 50)
Bases: object
Configuration for video transformations.
- selected_channel
Channel to use for aggregation. If None, the channels are averaged; otherwise the given channel is selected.
- Type:
int | None
- smooth_clip
Smoothness of the clipping process (log clipping). See ScaleAndNormalize: values above the q_max quantile are log clipped. If 0.0, hard clipping is done.
- Type:
float
- class byotrack.video.video.Video(data_source: str | PathLike | VideoReader, **kwargs: Any)
Bases: Sequence[ndarray]
Video: Iterable, indexable and sliceable sequence of frames wrapping a VideoReader.
It wraps a VideoReader to add video transformations (channel aggregation, scaling, normalization) and useful Pythonic protocols (sliceable, indexable, iterable).
Frames are 2D or 3D with a channel axis. It behaves similarly to a 4D/5D numpy array of shape (T[, D], H, W, C).
Example
    import byotrack

    # Read a video (usually 2D RGB)
    video = byotrack.Video(video_path)

    # Add a transform that will aggregate channels and normalize the intensities in [0, 1]
    transform_config = byotrack.VideoTransformConfig(aggregate=True, normalize=True, q_min=0.01, q_max=0.999)
    video.set_transform(transform_config)

    # Iterate through the video
    for frame in video:
        pass

    # Temporal slicing
    sliced = video[10:50:3]  # Take one frame every three from frame 10 to frame 50.

    # Spatial slicing
    sliced = video[:, 100:200, 150:250]  # All frames on the roi (100:200 x 150:250)
- reader
Underlying video reader
- Type:
byotrack.VideoReader
- set_transform(transform_config: VideoTransformConfig) -> None
Set the transform (channel_selector and normalizer).
- Parameters:
transform_config (byotrack.VideoTransformConfig) – Configuration of the transformations
- transform(frame: ndarray) -> ndarray
Transform a frame using channel aggregation and normalization.
Video Reader
- byotrack.video.reader.slice_length(slice_: slice, shape: int) -> int
Compute the number of elements in a slice.
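As an illustration of what counting the elements of a slice involves, here is a minimal sketch using only the standard library. The function name `count_slice_elements` is hypothetical, and this is not necessarily how byotrack implements `slice_length`.

```python
def count_slice_elements(slice_: slice, size: int) -> int:
    """Number of indices that `slice_` selects on a sequence of `size` elements.

    Hypothetical helper for illustration, not the library function.
    """
    # slice.indices resolves None and negative values against `size`
    # and returns a concrete (start, stop, step) triple.
    return len(range(*slice_.indices(size)))


# One frame every three, from frame 10 up to (not including) frame 50
print(count_slice_elements(slice(10, 50, 3), 100))  # 14
```

Relying on `slice.indices` avoids handling open-ended, negative, or out-of-bounds slices by hand.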
- class byotrack.video.reader.MetaVideoReader(cls_name: str, bases: tuple, attributes: dict)
Bases: type
MetaClass for Video Readers.
Each VideoReader has to define a list of supported extensions. The last constructed VideoReader to claim an extension will be used to open the video. If no reader has claimed an extension, the default OpenCVVideoReader is used.
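The "last class to claim an extension wins" rule can be sketched with a small metaclass registry. All names below (`ReaderMeta`, `BaseReader`, `reader_for`, and the toy reader classes) are hypothetical; this only illustrates the mechanism described above under those assumptions and is not byotrack's code.

```python
import os


class ReaderMeta(type):
    """Hypothetical metaclass that records which class claims which extension."""

    registry: dict = {}

    def __init__(cls, name, bases, attributes):
        super().__init__(name, bases, attributes)
        # Only extensions declared directly on the new class are registered.
        for extension in attributes.get("supported_extensions", []):
            ReaderMeta.registry[extension] = cls  # a later class overrides an earlier one


class BaseReader(metaclass=ReaderMeta):
    supported_extensions: list = []

    @staticmethod
    def reader_for(path: str) -> type:
        """Pick the reader class from the file extension, with a default fallback."""
        extension = os.path.splitext(path)[1].lower()
        return ReaderMeta.registry.get(extension, DefaultReader)


class DefaultReader(BaseReader):
    pass


class TiffReaderA(BaseReader):
    supported_extensions = [".tif", ".tiff"]


class TiffReaderB(BaseReader):
    supported_extensions = [".tif"]  # defined last, so it takes over ".tif"


print(BaseReader.reader_for("movie.tif").__name__)
print(BaseReader.reader_for("movie.tiff").__name__)
print(BaseReader.reader_for("movie.mp4").__name__)
```

With this design, adding support for a new format only requires defining a subclass; no central dispatch table has to be edited.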
- class byotrack.video.reader.VideoReader(path: str | os.PathLike, **kwargs: Any)
Bases: object
Unified video reader API.
Close to the OpenCV API, with a few key differences:
  - There is always a frame loaded.
  - Frame ids go from 0 to length - 1.
  - Frames are loaded in RGB.
  - It supports any number of channels and 2D/3D frames.
  - The read method is very different:
    - It retrieves the current frame, then grabs the next one (the other way around in OpenCV).
    - It therefore returns an ndarray and a bool rather than a bool and an ndarray.
    - The returned boolean indicates whether reading can continue, not whether the read operation has failed.
  - Main attributes are easy to check:
    - frame_id
    - length
    - channels
    - shape
    - fps if known (-1 otherwise)
- supported_extensions
Static attribute used by open method to automatically choose which VideoReader to use.
- path
Path of the current video
- Type:
str | os.PathLike
- grab() -> bool
Grab the next frame.
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() -> ndarray
Retrieve the current frame.
- Returns:
- The current frame
Shape: ([D, ]H, W, C)
- Return type:
np.ndarray
- read() -> tuple[ndarray, bool]
Consume a frame. Equivalent to retrieve followed by grab.
Because there is always a current frame in this implementation, the order is reversed compared to OpenCV: it first retrieves the current frame, then grabs the next one.
- Returns:
np.ndarray: The current frame. Shape: ([D, ]H, W, C)
bool: Whether there is a next frame to read
- Return type:
tuple[np.ndarray, bool]
- seek(frame_id: int) -> None
Seek frame_id (will update the current frame).
Valid frame ids range from 0 to length - 1.
- Raises:
EOFError – if seeking an invalid frame
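To make the retrieve/grab/read contract concrete, here is a toy in-memory reader. The class `ListReader` is hypothetical and is not part of byotrack; in particular, staying on the last frame when grab fails is an assumption of this sketch.

```python
class ListReader:
    """Hypothetical in-memory reader mimicking the retrieve/grab/read/seek contract."""

    def __init__(self, frames):
        self.frames = list(frames)
        if not self.frames:
            raise ValueError("At least one frame is required")
        self.length = len(self.frames)
        self.frame_id = 0  # there is always a current frame

    def retrieve(self):
        """Return the current frame without moving."""
        return self.frames[self.frame_id]

    def grab(self) -> bool:
        """Move to the next frame; return False when already on the last one."""
        if self.frame_id + 1 >= self.length:
            return False  # assumption: the reader stays on the last frame
        self.frame_id += 1
        return True

    def read(self):
        """Retrieve the current frame first, then grab the next one."""
        frame = self.retrieve()
        return frame, self.grab()

    def seek(self, frame_id: int) -> None:
        if not 0 <= frame_id < self.length:
            raise EOFError(f"Invalid frame id: {frame_id}")
        self.frame_id = frame_id


reader = ListReader(["frame0", "frame1", "frame2"])
collected = []
more = True
while more:
    # The last frame is still returned; only the flag turns False.
    frame, more = reader.read()
    collected.append(frame)
print(collected)
```

Because the boolean says whether reading can continue rather than whether the read succeeded, the loop appends every frame, including the last one, before stopping.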
- static open(path: str | os.PathLike, **kwargs: Any) -> VideoReader
Open a video file.
Use the extension to know which VideoReader to use
- Parameters:
path (str | os.PathLike) – File to open
kwargs – Any additional args for the underlying video reader
- Returns:
VideoReader
- class byotrack.video.reader.OpenCVVideoReader(path: str | os.PathLike, **kwargs: Any)
Bases: VideoReader
Wrapper around OpenCV VideoCapture.
Default VideoReader when opening a file.
It only supports 2D images (grayscale or RGB).
- video
VideoCapture from opencv
- Type:
cv2.VideoCapture
- grab() -> bool
Grab the next frame.
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() -> ndarray
Retrieve the current frame.
- Returns:
- The current frame
Shape: ([D, ]H, W, C)
- Return type:
np.ndarray
- class byotrack.video.reader.PILVideoReader(path: str | os.PathLike, **kwargs: Any)
Bases: VideoReader
Old PIL video reader. Works well for 2D multi-frame TIFF files that are not supported by OpenCV.
It only supports 2D videos. TiffVideoReader should offer broader support for TIFF files.
See VideoReader for inherited attributes.
- video
PIL image (animated)
- Type:
PIL.Image.Image
- grab() -> bool
Grab the next frame.
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- retrieve() -> ndarray
Retrieve the current frame.
- Returns:
- The current frame
Shape: ([D, ]H, W, C)
- Return type:
np.ndarray
- class byotrack.video.reader.TiffVideoReader(path: str | os.PathLike, level=0, axes: str | None = None, ax_slice: dict[str, slice] | None = None, **kwargs: Any)
Bases: VideoReader
Tiff video reader with tifffile. Handles 2D and 3D videos with any number of channels.
Axes are inferred from the tifffile metadata and converted into (T, [D, ]H, W, C) (<=> T[Z]YXC). Not all formats may be supported, and your specific metadata can be wrong or missing. In this case, you can provide the expected axes of the tifffile using an ordered string.
For example: “TYX” for 2D videos without channel, “TCZYX” for 3D videos with channels (ordered by time, channel then stack), “ZTYX” for 3D videos without channels (ordered by stack then time).
Note
With tifffile syntax, we use X for width, Y for height, Z for depth, C/S for channels (C and S are not supported together) and T for time. Any other letter (I, O, Q, …) is first interpreted as T if T is missing, then as Z if Z is missing; otherwise it yields an error.
It also supports reading the tiff at a specific resolution level.
See VideoReader for inherited attributes.
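The reordering from a user-provided axes string to (T, [Z, ]Y, X, C) can be sketched with a plain numpy transpose. The helper `to_tzyxc` is hypothetical and only handles the letters T, Z, Y, X and C; it does not reproduce the fallback rules for S or other letters, nor the metadata inference done by the reader.

```python
import numpy as np


def to_tzyxc(data: np.ndarray, axes: str) -> np.ndarray:
    """Reorder `data`, whose dimensions are named by `axes`, into (T, [Z, ]Y, X, C).

    Hypothetical helper for illustration: one letter per dimension,
    restricted to T, Z, Y, X and C.
    """
    if len(axes) != data.ndim:
        raise ValueError(f"Expected {data.ndim} axes, got '{axes}'")
    if "C" not in axes:
        # Videos without channels get a trailing channel axis of size 1.
        data = data[..., None]
        axes = axes + "C"
    target = [axis for axis in "TZYXC" if axis in axes]
    if len(target) != len(axes):
        raise ValueError(f"Unsupported or duplicated axes in '{axes}'")
    # For each target axis, find where it currently sits in the data.
    return np.transpose(data, [axes.index(axis) for axis in target])


stack = np.zeros((2, 3, 4, 5, 6))  # ordered by time, channel, then stack
print(to_tzyxc(stack, "TCZYX").shape)  # (2, 4, 5, 6, 3)
```

The same helper maps a "ZTYX" array of shape (4, 2, 5, 6) to (2, 4, 5, 6, 1).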
- grab() -> bool
Grab the next frame.
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- seek(frame_id: int) -> None
Seek frame_id (will update the current frame).
Valid frame ids range from 0 to length - 1.
- Raises:
EOFError – if seeking an invalid frame
- retrieve() -> ndarray
Retrieve the current frame.
- Returns:
- The current frame
Shape: ([D, ]H, W, C)
- Return type:
np.ndarray
- class byotrack.video.reader.FrameTiffLoader(level=0, axes: str | None = None, ax_slice: dict[str, slice] | None = None)
Bases: object
Load a single frame stored in a TiffFile with tifffile.
It handles 2D and 3D videos with any number of channels. Axes are inferred from the tifffile metadata and converted into ([D, ]H, W, C) (<=> [Z]YXC).
We may not support all formats, or your specific metadata can be wrong/missing. In this case, you can also provide the expected axes of the tifffile using an ordered string.
For example: “YX” for 2D videos without channel, “CZYX” for 3D videos with channels (ordered by channel then stack).
Note
With tifffile syntax, we use X for width, Y for height, Z for depth and C/S for channels (C and S are not supported together). Any other letter (T, I, O, Q, …) is either interpreted as Z if it is missing, or it will yield an error.
It also supports reading the tiff at a specific resolution level.
- byotrack.video.reader.pil_loader(path: str | os.PathLike) -> np.ndarray
Load an image with PIL. Shape: (H, W, C).
It only supports 2D images
- class byotrack.video.reader.MultiFrameReader(path: str | os.PathLike, paths: list[str | os.PathLike] | None = None, extension: str | None = None, frame_loader: Callable[[str | os.PathLike], np.ndarray] | None = None, **kwargs: Any)
Bases: VideoReader
Read video from a list of files inside a folder.
By default, it finds the alphanumerically sorted list of paths that share the most common extension in the folder. The extension may be provided by the user.
You can provide your own list of paths (absolute paths). The folder path is then ignored.
Finally, you may also provide your own loading function to load each frame as a numpy array.
See VideoReader for inherited attributes.
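The default path discovery described above (keep the files with the most common extension, then sort them) can be sketched with the standard library. The function `list_frame_paths` is hypothetical, uses a plain lexicographic sort (so zero-padded file names are assumed), and does not resolve ties between equally common extensions in any particular way.

```python
import tempfile
from collections import Counter
from pathlib import Path


def list_frame_paths(folder, extension=None):
    """Sorted paths of the files in `folder` sharing `extension`.

    Hypothetical helper: when `extension` is None, the most common
    extension found in the folder is used.
    """
    files = [path for path in Path(folder).iterdir() if path.is_file()]
    if extension is None:
        counts = Counter(path.suffix.lower() for path in files)
        if not counts:
            raise FileNotFoundError(f"No file found in {folder}")
        extension = counts.most_common(1)[0][0]
    return sorted(path for path in files if path.suffix.lower() == extension.lower())


# Demo on a temporary folder with two frames and an unrelated file
with tempfile.TemporaryDirectory() as folder:
    for name in ("frame_001.png", "frame_000.png", "notes.txt"):
        (Path(folder) / name).touch()
    names = [path.name for path in list_frame_paths(folder)]
print(names)
```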
- paths
Sorted list of Paths to each frame of the video.
- Type:
list[str | os.PathLike]
- frame_loader
Loads frames from their associated files.
- Type:
Callable[[str | os.PathLike], np.ndarray]
- grab() -> bool
Grab the next frame.
Can be faster than self.seek(self.frame_id + 1)
- Returns:
True if able to grab next frame
- Return type:
bool
- seek(frame_id: int) -> None
Seek frame_id (will update the current frame).
Valid frame ids range from 0 to length - 1.
- Raises:
EOFError – if seeking an invalid frame
- retrieve() -> ndarray
Retrieve the current frame.
- Returns:
- The current frame
Shape: ([D, ]H, W, C)
- Return type:
np.ndarray
Video transforms
- class byotrack.video.transforms.ChannelSelect(channel: int)
Bases: object
Select a given channel.
- Parameters:
frame (np.ndarray) – Frame of a video Shape: (…, C)
- Returns:
- Filtered frame with a single channel
Shape: (…, 1)
- Return type:
np.ndarray
- class byotrack.video.transforms.ChannelAvg
Bases: object
Average channels into a single one.
- Parameters:
frame (np.ndarray) – Frame of a video Shape: (…, C)
- Returns:
- Average of channels
Shape: (…, 1)
- Return type:
np.ndarray
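Both aggregations map a frame of shape (…, C) to shape (…, 1). A minimal numpy sketch of that behavior follows; the function names `select_channel` and `average_channels` are hypothetical and this is not the library's implementation.

```python
import numpy as np


def select_channel(frame: np.ndarray, channel: int) -> np.ndarray:
    """Keep a single channel while preserving the trailing channel axis."""
    # Indexing with a list keeps the last axis (size 1) and supports negative ids.
    return frame[..., [channel]]


def average_channels(frame: np.ndarray) -> np.ndarray:
    """Average all channels into one, keeping a trailing axis of size 1."""
    return frame.mean(axis=-1, keepdims=True)


frame = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)  # (H, W, C)
print(select_channel(frame, 1).shape)  # (2, 3, 1)
print(average_channels(frame).shape)  # (2, 3, 1)
```

Keeping the channel axis means downstream code can treat aggregated and non-aggregated frames uniformly.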
- class byotrack.video.transforms.ScaleAndNormalize(q_min: float, q_max: float, smooth_clip: float = 0, compute_stats_on: int = 50)
Bases: object
Scale and Normalize each channel into [0, 1].
The min and max values are computed from quantiles of the video intensities rather than the raw extrema, which improves stability.
- mini
Minimum value kept (one for each channel) Shape: (C, ), dtype: float32
- Type:
np.ndarray
- maxi
Maximum value kept (one for each channel) Shape: (C, ), dtype: float32
- Type:
np.ndarray
- smooth_clip
Smoothness of the clipping process.
If 0, values are clipped on mini/maxi.
Else, values above maxi are log clipped: v = 1 + a log((v - 1)/a + 1) for v > 1, with a the smooth_clip factor.
Typical values are between 0 and 1. Default: 0 (hard clipping)
- Type:
float
- max
True maximum values (one for each channel) when using smooth clipping Shape: (C, ), dtype: float32
- Type:
np.ndarray
- compute_stats_on
Max number of frames to compute stats on. It prevents heavy computations that can occur with large videos. Default: 50
- Type:
int
- Parameters:
frame (np.ndarray) – Frame of the video Shape: (…, C)
- Returns:
- Normalized version of the frame in [0, 1]
Shape: (…, C), dtype: float32
- Return type:
np.ndarray
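The quantile-based scaling and the log clipping formula given for smooth_clip can be sketched in numpy as follows. The functions `fit_bounds` and `scale_and_clip` are hypothetical. The sketch stops after applying v = 1 + a log((v - 1)/a + 1), so with smooth clipping its output can exceed 1; any final rescaling by the true maximum is left out, and constant channels (maxi equal to mini) are not handled.

```python
import numpy as np


def fit_bounds(frames: np.ndarray, q_min: float, q_max: float):
    """Per-channel quantiles over a stack of frames of shape (T, ..., C)."""
    flat = frames.reshape(-1, frames.shape[-1]).astype(np.float32)
    mini = np.quantile(flat, q_min, axis=0)
    maxi = np.quantile(flat, q_max, axis=0)
    return mini, maxi


def scale_and_clip(frame, mini, maxi, smooth_clip: float = 0.0):
    """Scale with the fitted bounds, then hard or log clip values above 1."""
    scaled = (frame.astype(np.float32) - mini) / (maxi - mini)
    scaled = np.maximum(scaled, 0.0)  # values below mini are clipped to 0
    if smooth_clip <= 0:
        return np.minimum(scaled, 1.0)  # hard clipping
    # Log clipping for v > 1: v = 1 + a * log((v - 1) / a + 1)
    excess = np.maximum(scaled - 1.0, 0.0)
    return np.where(scaled > 1.0, 1.0 + smooth_clip * np.log1p(excess / smooth_clip), scaled)


# 101 single-pixel, single-channel frames with intensities 0, 1, ..., 100
frames = np.linspace(0, 100, 101, dtype=np.float32).reshape(101, 1, 1, 1)
mini, maxi = fit_bounds(frames, q_min=0.0, q_max=0.5)
hard = scale_and_clip(frames[100], mini, maxi)
soft = scale_and_clip(frames[100], mini, maxi, smooth_clip=1.0)
print(hard.ravel(), soft.ravel())
```

In this toy example the brightest frame scales to 2 before clipping: hard clipping maps it to 1, while log clipping with a = 1 maps it to 1 + log(2), so bright outliers are compressed but stay distinguishable.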