Stream a remote NIfTI-Zarr
When a .nii.zarr volume lives on cloud storage (Azure Blob, S3, GCS),
TorchIO can open it without downloading the entire file. Only the
metadata and the chunks you actually read are fetched over the network.
This is especially useful for large volumes (e.g., 10 GB whole-brain microscopy) where you only need a small region of interest.
Prerequisites
Install TorchIO with the zarr extra, plus the fsspec storage backend you need (adlfs for Azure, s3fs for S3, gcsfs for GCS).
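Before opening a remote volume, it can help to confirm that the required packages are importable. The following is a minimal sketch, not part of TorchIO: `missing_packages` and the `BACKENDS` mapping are hypothetical names introduced here, and the scheme-to-package pairs are assumptions based on the backends named on this page.

```python
from importlib.util import find_spec

# Assumed mapping from URI scheme to the fsspec backend package it needs.
BACKENDS = {"az": "adlfs", "s3": "s3fs", "gs": "gcsfs"}


def missing_packages(scheme: str) -> list[str]:
    """Return the packages that are not importable for a given URI scheme."""
    required = ["zarr", "fsspec", BACKENDS[scheme]]
    return [name for name in required if find_spec(name) is None]


if __name__ == "__main__":
    for scheme in BACKENDS:
        print(f"{scheme}://", "missing:", missing_packages(scheme) or "nothing")
```

An empty list for your scheme means the imports should succeed; it says nothing about whether your credentials are valid.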
How it works
sequenceDiagram
participant User
participant TorchIO
participant ZarrBackend
participant Cloud as Cloud Storage
User->>TorchIO: ScalarImage("az://…/brain.nii.zarr")
Note over TorchIO: URI stored, no download
User->>TorchIO: image.shape
TorchIO->>ZarrBackend: open remote store
ZarrBackend->>Cloud: fetch header + metadata
Cloud-->>ZarrBackend: ~KB
ZarrBackend-->>TorchIO: (1, 512, 512, 512)
User->>TorchIO: image[:, 100:200, 100:200, 100:200]
TorchIO->>ZarrBackend: slice
ZarrBackend->>Cloud: fetch overlapping chunks
Cloud-->>ZarrBackend: ~MB
ZarrBackend-->>TorchIO: Tensor (1, 100, 100, 100)
When you pass a remote .nii.zarr URI, TorchIO:
- Stores the URI; no bytes are downloaded yet.
- On first metadata access (.shape, .affine, …), opens a remote zarr store via fsspec and reads only the header.
- On slicing, fetches only the chunks that overlap with your region of interest.
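To get a feel for how little data the last step touches, the sketch below counts the chunks that overlap the slice from the diagram. It is plain Python arithmetic rather than TorchIO or zarr code. The 64×64×64 chunk shape is an assumption chosen for illustration, and `overlapping_chunks` is a hypothetical helper.

```python
from itertools import product


def overlapping_chunks(roi, chunk_shape):
    """Return grid indices of the chunks touched by a half-open ROI.

    roi: one (start, stop) pair per axis.
    chunk_shape: chunk size along each axis.
    """
    per_axis = [
        range(start // size, (stop - 1) // size + 1)
        for (start, stop), size in zip(roi, chunk_shape)
    ]
    return list(product(*per_axis))


shape = (512, 512, 512)     # spatial shape from the diagram
chunk_shape = (64, 64, 64)  # assumed chunking, not read from a real store
roi = [(100, 200)] * 3      # image[:, 100:200, 100:200, 100:200]

touched = overlapping_chunks(roi, chunk_shape)

total = 1
for dim, size in zip(shape, chunk_shape):
    total *= -(-dim // size)  # ceiling division: chunks along this axis

print(f"{len(touched)} of {total} chunks overlap the ROI")
# → 27 of 512 chunks overlap the ROI
```

Under these assumptions, the ROI spans three chunks per axis, so roughly 5% of the volume is transferred. Smaller chunks reduce the overfetch at the ROI boundary but increase the number of requests.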
Authenticate to Azure and crop a region
import os
import torchio as tio
# Option 1: Authenticate via environment variables (recommended for CI/HPC).
# adlfs picks these up automatically.
os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "myaccount"
os.environ["AZURE_STORAGE_ACCOUNT_KEY"] = "my-secret-key" # or use SAS, etc.
image = tio.ScalarImage("az://mycontainer/dataset/brain.nii.zarr")
# Nothing has been downloaded yet
print(image.shape) # e.g. (1, 512, 512, 512), only metadata fetched
print(image.spacing) # from the NIfTI header stored in the zarr
# Crop a 100×100×100 ROI, only the overlapping chunks are fetched
roi = image[:, 200:300, 200:300, 200:300]
print(roi.shape) # (1, 100, 100, 100)
print(roi.data.mean())
# Option 2: Pass credentials via reader_kwargs.
# These are forwarded to niizarr → zarr → fsspec → adlfs.
image = tio.ScalarImage(
    "az://mycontainer/dataset/brain.nii.zarr",
    reader_kwargs={"account_name": "myaccount", "account_key": "my-key"},
)
# Apply a TorchIO transform to the ROI.
# CropOrPad(128) pads the 100×100×100 ROI up to 128×128×128.
pad = tio.CropOrPad(128)
padded = pad(roi)
print(padded.shape) # (1, 128, 128, 128)
Azure authentication methods
The adlfs library supports several authentication methods.
Set the environment variables for the method you use, or pass the
equivalent credentials via reader_kwargs:
| Method | Environment variables |
|---|---|
| Account key | AZURE_STORAGE_ACCOUNT_NAME, AZURE_STORAGE_ACCOUNT_KEY |
| SAS token | AZURE_STORAGE_ACCOUNT_NAME, AZURE_STORAGE_SAS_TOKEN |
| Connection string | AZURE_STORAGE_CONNECTION_STRING |
| Default credential (Azure CLI / Managed Identity) | AZURE_STORAGE_ACCOUNT_NAME |
See the adlfs documentation for the full list.
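If you prefer to pass credentials explicitly and fail early when they are missing, a small helper can translate the variables in the table into keyword arguments. This is a sketch under assumptions: `azure_reader_kwargs` is a hypothetical function, not a TorchIO or adlfs API, and it assumes the keyword names `account_name`, `account_key`, `sas_token`, and `connection_string` are accepted by the Azure filesystem backend.

```python
import os


def azure_reader_kwargs(env=os.environ):
    """Build credential keyword arguments from the variables in the table."""
    connection_string = env.get("AZURE_STORAGE_CONNECTION_STRING")
    if connection_string:
        return {"connection_string": connection_string}

    account_name = env.get("AZURE_STORAGE_ACCOUNT_NAME")
    if not account_name:
        raise RuntimeError(
            "Set AZURE_STORAGE_ACCOUNT_NAME or AZURE_STORAGE_CONNECTION_STRING"
        )

    kwargs = {"account_name": account_name}
    if env.get("AZURE_STORAGE_ACCOUNT_KEY"):
        kwargs["account_key"] = env["AZURE_STORAGE_ACCOUNT_KEY"]
    elif env.get("AZURE_STORAGE_SAS_TOKEN"):
        kwargs["sas_token"] = env["AZURE_STORAGE_SAS_TOKEN"]
    # With only the account name set, the backend's default credential applies.
    return kwargs
```

The result could then be passed as `reader_kwargs=azure_reader_kwargs()` when constructing the image. Reading secrets from the environment this way also keeps them out of source code.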
Other cloud providers
Amazon S3: authentication is handled by s3fs, which reads ~/.aws/credentials
or the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment
variables.
Google Cloud Storage: authentication is handled by gcsfs, which uses Application Default
Credentials or the credentials file named by GOOGLE_APPLICATION_CREDENTIALS.
Comparison with non-Zarr remote files
For non-Zarr remote files (e.g., az://…/brain.nii.gz), TorchIO
downloads the entire file to a temporary local path before reading.
This is the expected behavior because formats like .nii.gz do not
support partial reads over the network.
| Source | What happens |
|---|---|
| az://…/brain.nii.gz | Full download, then local read |
| az://…/brain.nii.zarr | Streaming: only metadata + requested chunks |
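The distinction in the table can be expressed as a simple suffix check, which is handy when auditing a list of remote paths before a training run. This sketch is illustrative only: `access_mode` is a hypothetical helper, the URIs are placeholders, and TorchIO's own detection logic is not shown on this page.

```python
def access_mode(uri: str) -> str:
    """Classify a URI as 'stream' (NIfTI-Zarr) or 'download' (anything else)."""
    path = uri.rstrip("/").lower()
    return "stream" if path.endswith(".nii.zarr") else "download"


# Placeholder URIs for illustration.
for uri in (
    "az://mycontainer/dataset/brain.nii.zarr",
    "az://mycontainer/dataset/brain.nii.gz",
):
    print(uri, "->", access_mode(uri))
```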
If you are working with large remote volumes, converting to .nii.zarr
first is strongly recommended. See Save as NIfTI-Zarr.