Use a custom reader
TorchIO reads NIfTI files with NiBabel and everything else with
SimpleITK. If your data is in a format neither supports (e.g., a
custom binary format or a NumPy .npy file), you can pass a custom
reader.
Write a reader function
A reader is a callable that takes a Path and returns a tuple of
(tensor, affine_array):
from pathlib import Path

import numpy as np
import torch

def npy_reader(path: Path) -> tuple[torch.Tensor, np.ndarray]:
    data = np.load(path)
    tensor = torch.from_numpy(data).unsqueeze(0)  # add channel dim
    affine = np.eye(4)  # identity affine (1 mm isotropic)
    return tensor, affine
The tensor must be 4D with shape (C, I, J, K), where C is the number of
channels and I, J, K are the spatial dimensions.
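The shape requirement can be checked before the array is converted to a tensor. The following is a minimal sketch, assuming the file stores either a single-channel (I, J, K) volume or a channels-first (C, I, J, K) array; `ensure_4d` is a hypothetical helper name and is not part of TorchIO.

```python
import numpy as np


def ensure_4d(array: np.ndarray) -> np.ndarray:
    """Return ``array`` with shape (C, I, J, K).

    Hypothetical helper for illustration, not part of TorchIO.
    """
    if array.ndim == 3:
        # Single-channel volume (I, J, K): prepend a channel axis.
        return array[np.newaxis]
    if array.ndim == 4:
        # Assumed to be channels-first already: (C, I, J, K).
        return array
    raise ValueError(f"Expected a 3D or 4D array, got shape {array.shape}")


volume = np.zeros((4, 5, 6), dtype=np.float32)
print(ensure_4d(volume).shape)  # (1, 4, 5, 6)
```

A reader could call `ensure_4d(np.load(path))` and pass the result to `torch.from_numpy`, rather than always calling `unsqueeze(0)`, so that malformed files fail early with a clear message.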
Use it
import torchio as tio
image = tio.ScalarImage("brain.npy", reader=npy_reader)
print(image.shape) # triggers the reader
print(image.spacing) # (1.0, 1.0, 1.0) from the identity affine
Simple readers load eagerly
A reader that only returns (tensor, affine) cannot read metadata or
regions lazily, so accessing .shape or .dtype, or slicing the image, triggers
a full load through your reader. This behavior is unchanged, so existing
simple readers keep working as before. To opt into lazy access, make your
reader a lazy reader (below).
Lazy custom readers
Advanced, rarely needed
Most custom readers can stay simple and eager. Only reach for a lazy reader when your format can cheaply read metadata or sub-regions and you actually care about avoiding full loads (for example, very large volumes).
If your format supports reading the shape, affine, dtype, or sub-regions
without loading everything, implement create_backend so TorchIO can access
it lazily. A lazy reader is any object that has a create_backend method
returning an object implementing the
ImageDataBackend protocol.
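To make "reading the shape without loading everything" concrete, here is a minimal sketch for a made-up binary format. The `VOL0` layout (a 4-byte magic string, three little-endian uint32 sizes, then raw voxels) and the `read_shape` helper are assumptions invented for illustration; they do not correspond to any real format or TorchIO API.

```python
import struct
import tempfile
from pathlib import Path

# Hypothetical header layout: 4-byte magic, then three little-endian uint32
# sizes (I, J, K). The raw voxel data follows the header.
HEADER = struct.Struct("<4s3I")


def read_shape(path: Path) -> tuple[int, int, int, int]:
    """Read (C, I, J, K) from the header without touching the voxel data."""
    with open(path, "rb") as f:
        magic, i, j, k = HEADER.unpack(f.read(HEADER.size))
    if magic != b"VOL0":
        raise ValueError(f"Not a VOL0 file: {path}")
    return (1, i, j, k)  # a single channel is assumed


# Write a small example file, then read back only its 16-byte header.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.vol"
    path.write_bytes(HEADER.pack(b"VOL0", 4, 5, 6) + bytes(4 * 5 * 6))
    print(read_shape(path))  # (1, 4, 5, 6)
```

A backend's `shape` property for such a format could delegate to a function like this, so that querying the shape costs one small read regardless of the volume size.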
Building on the .npy reader above, this version is lazy: it returns a
memory-mapped backend, so .shape reads only the header and slicing reads only
the requested block.
from pathlib import Path

import numpy as np
import torch
import torchio as tio
from torchio.data.backends import BackendRequest, ImageDataBackend, normalize_index

class NpyBackend:
    """Lazy, memory-mapped backend for single-channel ``.npy`` volumes."""

    def __init__(self, path: Path):
        self._memmap = np.load(path, mmap_mode="r")  # shape (I, J, K), unread

    @property
    def shape(self):
        i, j, k = self._memmap.shape
        return (1, i, j, k)

    @property
    def affine(self):
        return torch.eye(4, dtype=torch.float64)

    @property
    def dtype(self):
        return self._memmap.dtype

    def __getitem__(self, index):
        sc, si, sj, sk = normalize_index(index)
        # Copy only the requested block out of the memory map.
        return torch.from_numpy(np.array(self._memmap[si, sj, sk]))[None][sc]

    def to_tensor(self):
        # Copy the whole volume into memory and add the channel dimension.
        return torch.from_numpy(np.array(self._memmap))[None]

class LazyNpyReader:
    """A custom reader for ``.npy`` that supports lazy access."""

    def __call__(self, path: Path, **kwargs) -> tuple:
        # Eager fallback, used only if create_backend is unavailable.
        backend = self.create_backend(BackendRequest(path=path))
        return backend.to_tensor(), backend.affine.numpy()

    def create_backend(self, request: BackendRequest) -> ImageDataBackend:
        return NpyBackend(request.path)

image = tio.ScalarImage("volume.npy", reader=LazyNpyReader())
print(image.shape)  # read from the header, no full load
With a lazy reader, .shape, .affine, .dtype, and image.dataobj[...]
slicing all go through your backend without materializing the full tensor.
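The memory mapping that the backend relies on can be seen with NumPy alone. This is a small sketch that writes a temporary `.npy` file (the file name and array sizes are arbitrary choices for the demonstration) and shows that `np.load(..., mmap_mode="r")` exposes the shape straight away, while only the indexed block is copied into memory.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "volume.npy"
    np.save(path, np.arange(4 * 5 * 6, dtype=np.float32).reshape(4, 5, 6))

    mapped = np.load(path, mmap_mode="r")  # parses the header, maps the data
    is_memmap = isinstance(mapped, np.memmap)
    shape = mapped.shape  # known from the header alone
    block = np.array(mapped[1:3, :, 2])  # copies only the requested block
    del mapped  # release the mapping before the temporary file is removed

print(is_memmap, shape, block.shape)  # True (4, 5, 6) (2, 5)
```

Wrapping the slice in `np.array(...)` makes an ordinary, writable in-memory copy, which is the same step the backend performs before calling `torch.from_numpy`.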
Passing reader=... is per image. If instead you want every .npy file to
use this backend, register it once globally with
register_backend; see
Lazy loading and backends for the backend
contract and that registry-based alternative.