ScanImageTiff

ScanImageTiffImagingExtractors

Specialized extractor for reading TIFF files produced via ScanImage.

Classes

ScanImageLegacyImagingExtractor

Specialized extractor for reading TIFF files produced via ScanImage.

class ScanImageImagingExtractor(file_path: str | Path | None = None, channel_name: str | None = None, file_paths: list[str | Path] | None = None, slice_sample: int | None = None, plane_index: int | None = None, interleave_slice_samples: bool = False)[source]#

Bases: ImagingExtractor

Specialized extractor for reading TIFF files produced via ScanImage software.

This extractor is designed to handle the structure of ScanImage TIFF files, which can contain multi-channel data and either planar or volumetric data. It also supports both single-file and multi-file datasets generated by ScanImage in various acquisition modes (grab, focus, loop).

The extractor creates a mapping between each frame in the dataset and its corresponding physical file and IFD (Image File Directory) location. This mapping enables efficient retrieval of specific frames without loading the entire dataset into memory, making it suitable for large datasets.

For datasets with multiple frames per slice, either a slice_sample parameter must be provided or interleave_slice_samples must be set to True to explicitly opt into interleaving behavior.

Key features:

  • Handles multi-channel data with channel selection

  • Supports volumetric (multi-plane) imaging data

  • Automatically detects and loads multi-file datasets based on ScanImage naming conventions

  • Extracts and provides access to ScanImage metadata

  • Efficiently retrieves frames using lazy loading

  • Handles flyback frames in volumetric data by ignoring them in the mapping
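The frame mapping described above can be sketched in plain Python. This is a minimal illustration, not the extractor's actual implementation: `FrameLocation` and `build_frame_table` are hypothetical names, and the sketch assumes that channels vary fastest, then planes, then time, with flyback frames at the end of each volume and one frame per slice.

```python
from typing import NamedTuple


class FrameLocation(NamedTuple):
    """Hypothetical record: where one frame lives on disk."""

    file_index: int  # position of the file in the series
    ifd_index: int  # IFD position of the frame inside that file


def build_frame_table(ifds_per_file, num_channels, num_planes, flyback_frames, channel_index):
    """Map each (sample, plane) pair to a FrameLocation for one channel."""
    # Flatten the series so a global IFD index resolves to (file, local IFD).
    locations = [
        FrameLocation(file_index, ifd_index)
        for file_index, count in enumerate(ifds_per_file)
        for ifd_index in range(count)
    ]
    frames_per_volume = (num_planes + flyback_frames) * num_channels
    num_volumes = len(locations) // frames_per_volume  # ignore a trailing partial volume

    table = []
    for volume in range(num_volumes):
        row = []
        for plane in range(num_planes):  # flyback slots are never visited
            global_ifd = volume * frames_per_volume + plane * num_channels + channel_index
            row.append(locations[global_ifd])
        table.append(row)
    return table


# Two files (10 and 6 IFDs), 2 channels, 3 planes plus 1 flyback frame per volume.
table = build_frame_table(
    ifds_per_file=[10, 6], num_channels=2, num_planes=3, flyback_frames=1, channel_index=1
)
```

In this toy layout the second volume straddles both files, so a lookup table of this kind lets a reader open only the file and IFD that a requested sample needs.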

Initialize the ScanImageImagingExtractor.

Parameters:
  • file_path (PathType, optional) – Path to the ScanImage TIFF file. If this is part of a multi-file series, this should be the first file. Either file_path or file_paths must be provided.

  • channel_name (str, optional) – Name of the channel to extract (e.g., “Channel 1”, “Channel 2”).
    • If None and only one channel is available, that channel will be used.

    • If None and multiple channels are available, an error will be raised.

    • Use get_available_channel_names(file_path) to see available channels before creating the extractor.

  • file_paths (list[PathType], optional) – List of file paths to use. If provided, this overrides the automatic file detection heuristics. The file paths must be provided in the temporal order of the frames in the dataset. Use this parameter when:
    • Automatic detection doesn’t work correctly

    • You need to specify a custom subset of files

    • You need to control the exact order of files

  • slice_sample (int, optional) – Controls how to handle multiple frames per slice in volumetric data:
    • If an integer (0 to frames_per_slice-1): Uses only that specific frame for each slice, effectively selecting a single sample from each acquisition.

    • If None (default): Requires interleave_slice_samples=True when frames_per_slice > 1.

    • This parameter has no effect when frames_per_slice = 1.

    • Use get_frames_per_slice(file_path) to check the number of frames per slice.

  • interleave_slice_samples (bool, optional) – Controls whether to interleave all slice samples as separate time points when frames_per_slice > 1:
    • If True: Interleaves all slice samples as separate time points, increasing the effective number of samples by a factor of frames_per_slice. This treats each slice_sample as a distinct sample.

    • If False (default): Requires a specific slice_sample to be provided when frames_per_slice > 1.

    • This parameter has no effect when frames_per_slice = 1 or when slice_sample is provided.

  • plane_index (int, optional) – Must be between 0 and num_planes-1. Used to extract a specific plane from volumetric data. This parameter has no effect on planar (non-volumetric) data. When provided:
    • The resulting extractor will be planar (is_volumetric = False)

    • Each sample will contain only data for the specified plane

    • The shape of returned data will be (samples, height, width) instead of (samples, height, width, planes)
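The interaction between slice_sample and interleave_slice_samples can be summarized with a small decision function. This is a sketch under stated assumptions, not the extractor's code: `count_samples` is a hypothetical helper, and it only models how the two parameters affect the number of samples.

```python
def count_samples(num_volumes, frames_per_slice, slice_sample=None, interleave_slice_samples=False):
    """Return the number of samples implied by the slice-sampling parameters."""
    if frames_per_slice == 1:
        # Both parameters have no effect when there is one frame per slice.
        return num_volumes
    if slice_sample is not None:
        if not 0 <= slice_sample < frames_per_slice:
            raise ValueError(f"slice_sample must be between 0 and {frames_per_slice - 1}")
        # One frame is kept per slice, so the sample count is unchanged.
        return num_volumes
    if interleave_slice_samples:
        # Every slice sample becomes its own time point.
        return num_volumes * frames_per_slice
    raise ValueError("Provide slice_sample or set interleave_slice_samples=True")


selected = count_samples(num_volumes=100, frames_per_slice=4, slice_sample=2)
interleaved = count_samples(num_volumes=100, frames_per_slice=4, interleave_slice_samples=True)
```

With four frames per slice, selecting one slice sample keeps 100 samples while interleaving yields 400; passing neither option is an error, which mirrors the explicit opt-in described in the parameter list.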

Examples

# Basic usage with a single file, single channel
>>> extractor = ScanImageImagingExtractor(file_path='path/to/file.tif')

# Multi-channel data, selecting a specific channel
>>> channel_names = ScanImageImagingExtractor.get_available_channel_names('path/to/file.tif')
>>> extractor = ScanImageImagingExtractor(file_path='path/to/file.tif', channel_name=channel_names[0])

# Volumetric data with multiple frames per slice, selecting a specific slice sample
>>> frames_per_slice = ScanImageImagingExtractor.get_frames_per_slice('path/to/file.tif')
>>> extractor = ScanImageImagingExtractor(file_path='path/to/file.tif', slice_sample=0)

# Volumetric data, extracting a specific plane
>>> extractor = ScanImageImagingExtractor(file_path='path/to/file.tif', plane_index=2)

# Explicitly specifying multiple files
>>> extractor = ScanImageImagingExtractor(
...     file_paths=['path/to/file1.tif', 'path/to/file2.tif', 'path/to/file3.tif'],
...     channel_name='Channel 1'
... )

get_series(start_sample: int | None = None, end_sample: int | None = None) ndarray[source]#

Get data as a time series from start_sample to end_sample.

This method retrieves frames at the specified range from the ScanImage TIFF file(s). It uses the mapping created during initialization to efficiently locate and load only the requested frames, without loading the entire dataset into memory.

For volumetric data (multiple planes), the returned array will have an additional dimension for the planes. For planar data (single plane), the plane dimension is squeezed out.

Parameters:
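  • start_sample (int, optional) – The starting sample index. If None, starts from the beginning.

  • end_sample (int, optional) – The ending sample index. If None, goes to the end.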

Returns:

Array of data with shape (num_samples, height, width) if num_planes is 1, or (num_samples, height, width, num_planes) if num_planes > 1.

For example, for a non-volumetric dataset with 512x512 frames, requesting 3 samples would return an array with shape (3, 512, 512).

For a volumetric dataset with 5 planes and 512x512 frames, requesting 3 samples would return an array with shape (3, 512, 512, 5).

Return type:

numpy.ndarray
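The shape rule for get_series can be written out as a short helper. This is an illustrative sketch only: `expected_series_shape` is a hypothetical function, and it assumes end_sample is exclusive, which this section does not state explicitly.

```python
def expected_series_shape(num_samples, image_shape, num_planes, start_sample=None, end_sample=None):
    """Compute the shape of the array a get_series-style call would return."""
    start = 0 if start_sample is None else start_sample
    end = num_samples if end_sample is None else end_sample  # assumed exclusive
    height, width = image_shape
    shape = (end - start, height, width)
    # The plane axis is only present for volumetric data.
    return shape if num_planes == 1 else shape + (num_planes,)


planar_shape = expected_series_shape(1000, (512, 512), num_planes=1, start_sample=0, end_sample=3)
volumetric_shape = expected_series_shape(1000, (512, 512), num_planes=5, start_sample=0, end_sample=3)
```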

get_image_shape() tuple[int, int][source]#

Get the shape of the video frame (num_rows, num_columns).

Returns:

Shape of the video frame (num_rows, num_columns).

Return type:

tuple

get_frame_shape() tuple[int, int][source]#

Get the shape of a single frame (num_rows, num_columns).

Returns:

Shape of a single frame (num_rows, num_columns).

Return type:

tuple

get_sample_shape()[source]#

Get the shape of a sample.

Returns:

Shape of a single sample. If the data is volumetric, the shape is (num_rows, num_columns, num_planes). Otherwise, the shape is (num_rows, num_columns).

Return type:

tuple of int

get_volume_shape() tuple[int, int, int][source]#

Get the shape of a single volume (num_rows, num_columns, num_planes).

Returns:

Shape of a single volume (num_rows, num_columns, num_planes).

Return type:

tuple

get_num_samples() int[source]#

Get the number of samples in the video.

Returns:

Number of samples in the video.

Return type:

int

get_sampling_frequency() float[source]#

Get the sampling frequency in Hz.

Returns:

Sampling frequency in Hz.

Return type:

float

get_num_planes() int[source]#

Get the number of depth planes.

For volumetric data, this returns the number of Z-planes in each volume. For planar data, this returns 1.

Returns:

Number of depth planes.

Return type:

int

static get_available_channel_names(file_path: str | Path) list[source]#

Get the channel names available in a ScanImage TIFF file.

This static method extracts the channel names from a ScanImage TIFF file without needing to create an extractor instance. This is useful for determining which channels are available before creating an extractor.

Parameters:

file_path (PathType) – Path to the ScanImage TIFF file.

Returns:

list of channel names available in the file.

Return type:

list

Examples

>>> channel_names = ScanImageImagingExtractor.get_available_channel_names('path/to/file.tif')
>>> print(f"Available channels: {channel_names}")

get_dtype() dtype[source]#

Get the data type of the video.

Returns:

Data type of the video.

Return type:

dtype

get_times() ndarray[source]#

Get the timestamps for each frame.

Returns:

Array of timestamps in seconds for each frame.

Return type:

numpy.ndarray

Notes

This method extracts timestamps from the ScanImage TIFF file(s) for the selected channel. It uses the mapping created during initialization to efficiently locate and extract timestamps for each sample.

get_native_timestamps(start_sample: int | None = None, end_sample: int | None = None) ndarray | None[source]#

Retrieve the original unaltered timestamps for the data in this interface.

Parameters:
  • start_sample (int, optional) – The starting sample index. If None, starts from the beginning.

  • end_sample (int, optional) – The ending sample index. If None, goes to the end.

Returns:

timestamps – The timestamps for the data stream, or None if native timestamps are not available.

Return type:

numpy.ndarray or None

static extract_timestamp_from_page(page) float[source]#

Extract timestamp from a ScanImage TIFF page.

Parameters:

page (tifffile.TiffPage) – The TIFF page to extract the timestamp from.

Returns:

The timestamp in seconds or None if no timestamp is found.

Return type:

float or None
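A per-frame timestamp lookup of this kind can be sketched with a regular expression. The sketch rests on an assumption that this page does not confirm: that the frame's description text contains a line of the form `frameTimestamps_sec = <value>`. The name `timestamp_from_description` is hypothetical, and the function works on a plain string rather than a tifffile.TiffPage.

```python
import re

# Assumed layout: one "key = value" pair per line in the frame description.
_TIMESTAMP_PATTERN = re.compile(
    r"frameTimestamps_sec\s*=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def timestamp_from_description(description):
    """Return the timestamp in seconds, or None if the key is missing."""
    match = _TIMESTAMP_PATTERN.search(description)
    return float(match.group(1)) if match else None


description = "frameNumbers = 2\nframeTimestamps_sec = 0.033312\n"
timestamp = timestamp_from_description(description)
missing = timestamp_from_description("frameNumbers = 2\n")
```

Returning None instead of raising keeps the caller in control of how to treat frames without a timestamp.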

static get_available_num_planes(file_path: str | Path) int[source]#

Get the number of depth planes from a ScanImage TIFF file.

For volumetric data, this returns the number of Z-planes in each volume. For planar data, this returns 1.

Parameters:

file_path (PathType) – Path to the ScanImage TIFF file.

Returns:

Number of depth planes.

Return type:

int

static get_frames_per_slice(file_path: str | Path) int[source]#

Get the number of frames per slice from a ScanImage TIFF file.

ScanImage can acquire multiple frames for each slice.

Parameters:

file_path (PathType) – Path to the ScanImage TIFF file.

Returns:

Number of frames per slice.

Return type:

int

get_original_frame_indices(plane_index: int | None = None) ndarray[source]#

Map each extractor sample back to its corresponding raw frame index in the TIFF file(s).

The extractor presents imaging data as a sequence of samples, abstracting away the underlying file structure (channel interleaving, flyback frames, multi-file splits, volumetric plane ordering). This method reverses that abstraction, returning the raw Image File Directory (IFD) index for each sample.

This is primarily useful for temporal alignment with external acquisition systems. When an external device (e.g., a DAQ) records one sync pulse per raw frame, these indices let you look up the corresponding sync timestamp for each extractor sample.

Parameters:

plane_index (int, optional) – For volumetric data, which Z-plane’s frame index to return for each volume. Defaults to the last plane, as acquisition systems commonly assign the volume timestamp at the end of the volume scan. Set to 0 if your system timestamps at the start of each volume.

Returns:

Array of shape (num_samples,) with dtype int64. Each element is a global IFD index across all files in the dataset.

Return type:

np.ndarray

Notes

The returned indices account for:

  • Channel interleaving (CZT frame ordering in ScanImage)

  • Flyback frame exclusion

  • Multi-file IFD offsets (indices are global, not per-file)

  • Plane selection in volumetric data

For multi-channel data, note that the raw frame indices include the channel dimension. If your sync system fires once per plane (not once per channel per plane), divide the returned indices by the number of channels to get the sync pulse index.
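The arithmetic behind this adjustment can be checked on a toy layout. The sketch below is not the extractor's implementation: `raw_frame_indices` is a hypothetical helper, and it assumes channels vary fastest, flyback frames sit at the end of each volume, and the external device emits one pulse per plane, flyback included.

```python
def raw_frame_indices(num_volumes, num_channels, num_planes, flyback_frames, channel_index, plane_index=None):
    """Return the raw frame index of one plane and channel for every volume."""
    if plane_index is None:
        plane_index = num_planes - 1  # default to the last plane of each volume
    frames_per_volume = (num_planes + flyback_frames) * num_channels
    return [
        volume * frames_per_volume + plane_index * num_channels + channel_index
        for volume in range(num_volumes)
    ]


num_channels = 2
frame_indices = raw_frame_indices(
    num_volumes=3, num_channels=num_channels, num_planes=3, flyback_frames=1, channel_index=0
)

# One sync pulse per plane: 3 volumes * (3 planes + 1 flyback) = 12 pulses.
sync_timestamps = [pulse / 30.0 for pulse in range(12)]
sync_indices = [index // num_channels for index in frame_indices]
aligned_timestamps = [sync_timestamps[index] for index in sync_indices]
```

Integer division collapses the channel dimension, so both channels of the same plane resolve to the same sync pulse.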

Examples

Aligning with sync pulses from an external DAQ:

>>> frame_indices = extractor.get_original_frame_indices()
>>> # If sync fires once per plane (not per channel), adjust:
>>> sync_indices = frame_indices // num_channels
>>> aligned_timestamps = sync_timestamps[sync_indices]

__del__()[source]#

Close file handles when the extractor is garbage collected.

class ScanImageLegacyImagingExtractor(file_path: str | Path, sampling_frequency: float)[source]#

Bases: ImagingExtractor

Specialized extractor for reading TIFF files produced via ScanImage.

This implementation is for legacy purposes and is not recommended for use. Please use ScanImageTiffSinglePlaneImagingExtractor or ScanImageTiffMultiPlaneImagingExtractor instead.

Create a ScanImageLegacyImagingExtractor instance from a TIFF file produced by ScanImage.

This extractor allows for lazy accessing of slices, unlike TiffImagingExtractor. However, direct slicing of the underlying data structure is not equivalent to a numpy memory map.

Parameters:
  • file_path (PathType) – Path to the TIFF file.

  • sampling_frequency (float) – The frequency at which the frames were sampled, in Hz.

get_series(start_sample=None, end_sample=None) ndarray[source]#

Get the series of samples.

Parameters:
  • start_sample (int, optional) – Start sample index (inclusive).

  • end_sample (int, optional) – End sample index (exclusive).

Returns:

series – The series of samples.

Return type:

numpy.ndarray

Notes

Importantly, we follow the convention that the dimensions of the array are returned in their matrix order. More specifically: (time, height, width)

Which is equivalent to: (samples, rows, columns)

For volumetric data, the dimensions are: (time, height, width, planes)

Which is equivalent to: (samples, rows, columns, planes)

Note that this does not match the cartesian convention: (t, x, y)

where x is the column (width) axis and y is the row (height) axis.
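The difference between matrix order and the cartesian convention is easy to get wrong when indexing, so here is a small sketch using nested lists in place of a numpy array. The helper `pixel_at` is hypothetical and exists only to show the axis swap.

```python
num_samples, num_rows, num_columns = 2, 3, 4

# Each pixel stores its own (time, row, column) position for easy inspection.
series = [
    [[(t, row, col) for col in range(num_columns)] for row in range(num_rows)]
    for t in range(num_samples)
]


def pixel_at(series, t, x, y):
    """Look up a pixel from cartesian coordinates in a (time, rows, columns) series."""
    return series[t][y][x]  # rows (y) come before columns (x)


pixel = pixel_at(series, t=1, x=3, y=2)
```

Here x=3 is valid because there are four columns, while using it as a row index would fail, since there are only three rows.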

get_image_shape() tuple[int, int][source]#

Get the shape of the video frame (num_rows, num_columns).

Returns:

image_shape – Shape of the video frame (num_rows, num_columns).

Return type:

tuple

get_num_samples() int[source]#

Get the number of samples in the video.

Returns:

num_samples – Number of samples in the video.

Return type:

int

get_sampling_frequency() float[source]#

Get the sampling frequency in Hz.

Returns:

sampling_frequency – Sampling frequency in Hz.

Return type:

float

get_native_timestamps(start_sample: int | None = None, end_sample: int | None = None) ndarray | None[source]#

Retrieve the original unaltered timestamps for the data in this interface.

This function should retrieve the data on-demand by re-initializing the IO. Can be overridden to return None if the extractor does not have native timestamps.

Parameters:
  • start_sample (int, optional) – The starting sample index. If None, starts from the beginning.

  • end_sample (int, optional) – The ending sample index. If None, goes to the end.

Returns:

timestamps – The timestamps for the data stream, or None if native timestamps are not available.

Return type:

numpy.ndarray or None

ScanImageTiffUtils

Utility functions for ScanImage TIFF Extractors.

extract_extra_metadata(file_path: str | Path) dict[source]#

Extract metadata from a ScanImage TIFF file.

Parameters:

file_path (PathType) – Path to the TIFF file.

Returns:

extra_metadata – Dictionary of metadata extracted from the TIFF file.

Return type:

dict

Notes

Known to work on SI versions v3.8.0, v2019bR0, v2022.0.0, and v2023.0.0

parse_matlab_vector(matlab_vector: str) list[source]#

Parse a MATLAB vector string into a list of integer values.

Parameters:

matlab_vector (str) – MATLAB vector string.

Returns:

vector – List of integer values.

Return type:

list of int

Raises:

ValueError – If the MATLAB vector string cannot be parsed.

Notes

MATLAB vector string is of the form “[1 2 3 … N]” or “[1,2,3,…,N]” or “[1;2;3;…;N]”. There may or may not be whitespace between the values. Ex. “[1, 2, 3]” or “[1,2,3]”.
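The accepted input forms can be illustrated with a short parser. This is a minimal sketch of the idea, not the library's implementation of parse_matlab_vector: the name `parse_matlab_vector_sketch` is hypothetical, and the handling of edge cases such as an empty vector is an assumption.

```python
import re


def parse_matlab_vector_sketch(matlab_vector):
    """Parse strings such as "[1 2 3]", "[1,2,3]" or "[1;2;3]" into a list of ints."""
    text = matlab_vector.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"Could not parse MATLAB vector string: {matlab_vector!r}")
    # Values may be separated by whitespace, commas, semicolons, or a mix of them.
    tokens = [token for token in re.split(r"[\s,;]+", text[1:-1]) if token]
    try:
        return [int(token) for token in tokens]
    except ValueError as error:
        raise ValueError(f"Could not parse MATLAB vector string: {matlab_vector!r}") from error


parsed = [parse_matlab_vector_sketch(s) for s in ("[1 2 3]", "[1,2,3]", "[1;2;3]", "[1, 2, 3]")]
```

All four spellings yield the same list, and anything that is not bracketed or contains non-integer tokens raises ValueError.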

read_scanimage_metadata(file_path: str | Path) dict[source]#

Read and parse metadata from a ScanImage TIFF file.

This function extracts both the non-varying frame metadata and ROI group metadata (if available) from a ScanImage TIFF file and processes them to extract key imaging parameters.

The function returns a Python dictionary whose fields are already parsed into Python objects (as opposed to raw strings).

Parameters:

file_path (PathType) – Path to the ScanImage TIFF file.

Returns:

metadata_dict – Dictionary containing three nested dictionaries:

  • scan_image_non_varying_frame_metadata: Raw non-varying frame metadata

  • scan_image_roi_group_metadata: Raw ROI group metadata (if present)

  • roiextractors_parsed_metadata: Parsed metadata with standardized keys:
    • sampling_frequency: Frame or volume scan rate in Hz

    • num_channels: Number of available imaging channels

    • num_planes: Number of imaging planes (slices)

    • frames_per_slice: Number of frames per Z slice

    • channel_names: List of available channel names

    • roi_metadata: ROI definitions (if present)

Return type:

dict

Notes

The ScanImage TIFF format includes:

  1. TIFF Header Section: Defines byte order and offsets

  2. ScanImage Static Metadata Section: Contains metadata applicable to all frames
    • Non-Varying Frame Data: System configuration for all frames

    • ROI Group Data: Defined regions of interest (if available)

  3. Frame Sections: One per image frame, containing:
    • IFD Header: Image File Directory with tags and values

    • Frame-specific data: Timestamps and other frame metadata

    • Image data: The actual image pixels

The function uses the tifffile module’s read_scanimage_metadata function to extract the raw metadata, then processes it to standardize key imaging parameters.

parse_metadata(metadata: dict) dict[source]#

Parse a metadata dictionary to extract relevant information and store it under standard keys for ImagingExtractors.

Currently supports:

  • sampling_frequency

  • num_planes

  • frames_per_slice

  • channel_names

  • num_channels

Parameters:

metadata (dict) – Dictionary of metadata extracted from the TIFF file.

Returns:

metadata_parsed – Dictionary of parsed metadata.

Return type:

dict

Notes

Known to work on SI versions v2019bR0, v2022.0.0, and v2023.0.0. Fails on v3.8.0.

SI.hChannels.channelsActive = string of MATLAB-style vector with channel integers (see parse_matlab_vector).

SI.hChannels.channelName = “{‘channel_name_1’ ‘channel_name_2’ … ‘channel_name_M’}”, where M is the number of channels (active or not).
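The two channel fields described in these notes could be combined roughly as follows. This is a hedged sketch rather than the library's parsing code: `parse_channel_names` and `active_channel_names` are hypothetical helpers, and the sketch assumes the integers in channelsActive are 1-based positions into the channel name list, following MATLAB indexing.

```python
import re


def parse_channel_names(channel_name_field):
    """Extract the quoted names from a MATLAB cell-array string."""
    return re.findall(r"'([^']*)'", channel_name_field)


def active_channel_names(channel_name_field, channels_active):
    """Keep only the names whose 1-based position appears in channels_active."""
    names = parse_channel_names(channel_name_field)
    return [names[channel - 1] for channel in channels_active]  # MATLAB is 1-based


channel_name_field = "{'Channel 1' 'Channel 2' 'Channel 3'}"
all_names = parse_channel_names(channel_name_field)
active_names = active_channel_names(channel_name_field, channels_active=[1, 3])
```

Because the name list covers every channel, active or not, filtering by the active indices is what produces the names a user can actually select.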

parse_metadata_v3_8(metadata: dict) dict[source]#

Parse a metadata dictionary to extract relevant information and store it under standard keys for ImagingExtractors.

Requires the old metadata format (v3.8). Currently supports:

  • sampling_frequency

  • num_channels

  • num_planes

Parameters:

metadata (dict) – Dictionary of metadata extracted from the TIFF file.

Returns:

metadata_parsed – Dictionary of parsed metadata.

Return type:

dict

extract_timestamps_from_file(file_path: str | Path) ndarray[source]#

Extract the frame timestamps from a ScanImage TIFF file.

Parameters:

file_path (PathType) – Path to the TIFF file.

Returns:

timestamps – Array of frame timestamps in seconds.

Return type:

numpy.ndarray

Raises:

AssertionError – If the frame timestamps are not found in the TIFF file.

Notes

Known to work on SI versions v2019bR0, v2022.0.0, and v2023.0.0. Fails on v3.8.0.