CLI Reference

submit-aml

Submit a job to be run on Azure Machine Learning.

Arguments that submit-aml does not recognize are not treated as errors; they are passed through to the script unchanged.

submit-aml \
    --script run.py \
    --experiment-name "my-experiment" \
    --mount "vindr_dir=VINDR-CXR-V2" \
    --my-script-arg "hello"
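
On the script side, the pass-through argument from the command above arrives as an ordinary command-line argument. The following is a minimal sketch of what a run.py could look like, not the project's actual script. The `parse_args` helper and the `--data-dir` option are hypothetical; `--data-dir` only illustrates where a value such as '${{inputs.vindr_dir}}' could be received.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that submit-aml forwards to the script.

    With argv=None, argparse reads sys.argv, which is what run.py would do
    when it is launched as a job.
    """
    parser = argparse.ArgumentParser(description="Hypothetical run.py")
    # Matches the pass-through argument used in the example command.
    parser.add_argument("--my-script-arg", default="")
    # Hypothetical option: a folder that could be filled in with
    # '${{inputs.vindr_dir}}' on the submit-aml command line.
    parser.add_argument("--data-dir", default=None)
    return parser.parse_args(argv)


# Example: the script arguments forwarded by the command above.
args = parse_args(["--my-script-arg", "hello"])
print(args.my_script_arg)
```

Passing an explicit list to `parse_args` keeps the sketch easy to try locally before submitting anything.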

Usage:

submit-aml [OPTIONS]

Options:

  -e, --experiment-name TEXT      Name of the Azure ML experiment to which the
                                  job will be submitted. If not provided, the
                                  name of the current directory will be used.
  -r, --run-name TEXT             Display name of the Azure ML run.
  --workspace TEXT                Name of the Azure ML workspace.
  -g, --resource-group TEXT       Name of the Azure ML resource group.
  --subscription TEXT             Subscription ID of the workspace.
  --credential [azcli|msi]        Credential type to use for Azure
                                  authentication. Use "azcli" for Azure CLI or
                                  "msi" for managed identity.  [default:
                                  azcli]
  --description TEXT              Description for the Azure ML job. If not
                                  provided, the local command will be used.
  -c, --compute-target TEXT       Name of the Azure ML compute target to run
                                  the job on.
  -i, --docker-image TEXT         Base Docker image to use for the job.
                                  [default: mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.8-cudnn8-ubuntu22.04]
  --build-context / --no-build-context
                                  Whether to build a Docker context from the
                                  project directory.  [default: build-context]
  --docker-run TEXT               Extra command to run in Docker build before
                                  syncing the environment.
  --aml-environment TEXT          Name of an existing Azure ML environment to
                                  use for the job. If provided, the Docker
                                  image and build context arguments will be
                                  ignored.
  --shared-memory INTEGER         Amount of shared memory for the Docker
                                  container, in GB.  [default: 256]
  -n, --num-nodes INTEGER         Number of nodes to use for the job.
                                  [default: 1]
  -d, --download TEXT             [DEPRECATED] Use --download-asset,
                                  --download-datastore or --download-job
                                  instead. Azure ML dataset, datastore folder
                                  or job output folder to download. To
                                  download an Azure ML dataset, the argument
                                  should take the form: alias, name and
                                  version of the dataset; for example:
                                  'vindr_dir=VINDR-CXR-V2:1'. If the version
                                  is omitted, the latest one will be used. To
                                  download a datastore folder, use
                                  'alias=datastore/folder'. To download the
                                  output folder of a previous job, prefer
                                  --download-job; with this deprecated flag,
                                  use the 'alias=job_dir:<job_id>:<path>'
                                  form, for example
                                  'checkpoint=job_dir:crusty_hat_43s6lmvb25:outputs/checkpoint-10000'.
                                  The bare 'alias=<job_id>:<path>' form is
                                  only recognised as a job when <path>
                                  contains a '/'; otherwise it is read as a
                                  data asset.
                                  The alias can be used to pass input datasets
                                  to the script, e.g., '${{inputs.vindr_dir}}'
                                  or '${{inputs.checkpoint}}'. This option can
                                  be used multiple times.
  -m, --mount TEXT                [DEPRECATED] Use --mount-asset,
                                  --mount-datastore or --mount-job instead.
                                  Azure ML dataset, datastore folder or job
                                  output folder to mount. For an Azure ML
                                  dataset, the alias, name and version should
                                  be provided; for a datastore folder, use
                                  'alias=datastore/folder'; while for a job
                                  output folder, the alias, job ID and path in
                                  the job outputs should be provided. See the
                                  --download option for more information.
  --mount-asset TEXT              Registered Azure ML data asset to mount,
                                  expressed as "alias=name[:version]". For
                                  example: "vindr_dir=VINDR-CXR-V2:1". If the
                                  version is omitted, the latest one is used.
                                  Pass it to the script with
                                  '${{inputs.vindr_dir}}'. This option can be
                                  used multiple times.
  --download-asset TEXT           Registered Azure ML data asset to download.
                                  Same format as --mount-asset. This option
                                  can be used multiple times.
  --mount-datastore TEXT          Datastore folder to mount, expressed as
                                  "alias=datastore/path/to/folder". For
                                  example: "ref=mystore/exports/reference".
                                  Pass it to the script with
                                  '${{inputs.ref}}'. This option can be used
                                  multiple times.
  --download-datastore TEXT       Datastore folder to download. Same format as
                                  --mount-datastore. This option can be used
                                  multiple times.
  --mount-job TEXT                Output of a previous job to mount, expressed
                                  as "alias=<job_id>:<path/in/run/artifacts>".
                                  The path may point at any run artifact, not
                                  just files under outputs/. For example:
                                  "checkpoint=crusty_hat_43s6lmvb25:models/best.pth".
                                  Pass it to the script with
                                  '${{inputs.checkpoint}}'. This option can be
                                  used multiple times.
  --download-job TEXT             Output of a previous job to download. Same
                                  format as --mount-job. This option can be
                                  used multiple times.
  -o, --output TEXT               [DEPRECATED] Use --output-datastore or
                                  --output-asset instead. Alias, datastore and
                                  path to folder into which outputs will be
                                  written, expressed as
                                  "alias=datastore/path/to/dir". For example:
                                  "out_dir=mydatastore/my_dataset". The alias
                                  can be used to pass outputs to the script,
                                  e.g., "${{outputs.out_dir}}". See the
                                  example for more information. This option
                                  can be used multiple times.
  --output-datastore TEXT         Datastore folder into which outputs will be
                                  written, expressed as
                                  "alias=datastore/path/to/dir". For example:
                                  "out_dir=mydatastore/my_dataset". Pass it to
                                  the script with '${{outputs.out_dir}}'. This
                                  option can be used multiple times.
  --output-asset TEXT             Register the outputs as an Azure ML data
                                  asset, expressed as "alias=name[:version]".
                                  For example: "out_dir=my-results". The blobs
                                  are written to the workspace's default
                                  datastore and registered as a data asset; if
                                  the version is omitted, Azure ML
                                  auto-increments it. Pass it to the script
                                  with '${{outputs.out_dir}}'. This option can
                                  be used multiple times.
  --command-prefix TEXT           Prefix to prepend to the command. For
                                  example, `uv run`.
                                  [default: uv run --no-default-groups]
  --executable TEXT               The executable, e.g., `python`, `'torchrun
                                  --nproc-per-node auto'`, `bash`, or
                                  `nvidia-smi`.  [default: python]
  -s, --script PATH               Path to the script that will be run on Azure
                                  ML.
  --sweep TEXT                    Azure ML hyperparameter for sweep jobs.
                                  Examples: "seed=[0, 1, 2]",
                                  "model/unet=['tiny', 'small']",
                                  "+trainer.max_epochs=[10, 20]",
                                  "model.learning_rate=[1.0e-4, 2.0e-4]". If a
                                  `--sweep-prefix` is passed, the sweep
                                  arguments will be added to the command with
                                  the prefix. The keys are adapted to be
                                  compatible with Azure ML Inputs and will be
                                  available as environment variables in the
                                  job. For the examples above, the environment
                                  variables will be `AZUREML_SWEEP_seed`,
                                  `AZUREML_SWEEP_model_unet`,
                                  `AZUREML_SWEEP_trainer_max_epochs`, and
                                  `AZUREML_SWEEP_model_learning_rate`.
  --sweep-prefix TEXT             Prefix to prepend to the sweep arguments in
                                  the command. If not provided, the sweep
                                  arguments will not be added to the command.
  --max-concurrent-trials INTEGER
                                  Maximum number of concurrent trials for the
                                  sweep job.
  -l, --stream-logs               Wait for completion and stream the logs of
                                  the job.
  --source-dir PATH               Path to the directory containing the source
                                  code for the job. If not provided, the
                                  current directory is used.
  -P, --project-dir PATH          Directory containing the pyproject.toml,
                                  uv.lock and .python-version files. These
                                  files will be used to build the Docker
                                  image. If not provided, the current
                                  directory is used.
  --num-gpus INTEGER              Number of requested GPUs per node. This
                                  should typically match the number of GPUs in
                                  the compute target. If provided, the
                                  `PyTorchDistribution` will be selected.
                                  Otherwise, the `MpiDistribution` will be
                                  used and `--executable` should be set to
                                  `'torchrun --nproc-per-node auto'` for
                                  multi-GPU PyTorch runs. Must not be set for
                                  Lightning jobs. More information at
                                  https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-distributed-gpu?view=azureml-api-2.
  --debug / --no-debug            Install debugpy on AML and run the command
                                  using debugpy. The job will not start until
                                  a remote debugger is attached. More
                                  information at
                                  https://learn.microsoft.com/en-us/azure/machine-learning/how-to-interactive-jobs?view=azureml-api-2&tabs=ui#attach-a-debugger-to-a-job.
                                  [default: no-debug]
  --tensorboard / --no-tensorboard
                                  Enable a TensorBoard interactive service for
                                  the job.  [default: tensorboard]
  --tensorboard-dir PATH          Directory in which the TensorBoard logs are
                                  expected to be stored.  [default:
                                  logs/tensorboard]
  --profiler / --no-profiler      Enable profiling on Azure ML. Needs CUDA >=
                                  12 and PyTorch >= 2.  [default: no-profiler]
  -G, --dependency-group TEXT     Dependency groups to install in the Docker
                                  image. If not provided, no dependency groups
                                  are installed. The groups are defined in the
                                  pyproject.toml file. This option can be used
                                  multiple times.
  --extra TEXT                    Optional dependency groups (extras) to
                                  install in the Docker image. If not
                                  provided, no extras are installed. The
                                  optional groups are defined in the
                                  pyproject.toml file. This option can be used
                                  multiple times.
  --conda-env-file PATH           Path to a conda environment YAML file (e.g.,
                                  environment.yml). If provided, a conda
                                  environment will be used instead of a Docker
                                  build context. Cannot be used together with
                                  --build-context, --aml-environment, or
                                  uv-specific options.
  --only-env                      Exit after instantiating the environment.
                                  This is useful during development so that
                                  the AML environment build runs immediately
                                  and the job starts faster once the script is
                                  ready to be submitted.
  -E, --set TEXT                  Environment variables to set on the job. The
                                  format is `KEY=VALUE`. This option can be
                                  used multiple times.
  -D, --dry-run                   Exit before submitting the job.
  -t, --tag TEXT                  Tags to set on the Azure ML job. The format
                                  is `KEY=VALUE`. This option can be used
                                  multiple times.
  --install-completion            Install completion for the current shell.
  --show-completion               Show completion for the current shell, to
                                  copy it or customize the installation.
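
The --sweep option says that keys are adapted for Azure ML Inputs and exposed to the job as `AZUREML_SWEEP_*` environment variables. The sketch below shows one way a script might look them up. The mapping rule (drop a leading `+`, replace every other character that is not a letter, digit or underscore with `_`) is an assumption inferred only from the four examples listed under --sweep, not from the tool's source. The helper names `sweep_env_name` and `read_sweep_value` are hypothetical.

```python
import os
import re


def sweep_env_name(key: str) -> str:
    """Guess the environment variable name for a --sweep key.

    Assumption: a leading '+' is dropped and any character other than a
    letter, digit or underscore becomes '_'. This reproduces the four
    documented examples but is not taken from submit-aml itself.
    """
    sanitized = re.sub(r"[^0-9A-Za-z_]", "_", key.lstrip("+"))
    return f"AZUREML_SWEEP_{sanitized}"


def read_sweep_value(key: str, default=None, environ=os.environ):
    """Return the raw string value of a sweep key, or `default` if unset."""
    return environ.get(sweep_env_name(key), default)
```

Inside a job, `read_sweep_value("model.learning_rate")` would then return the trial's value as a string, which the script still has to convert to the type it needs.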
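
Several options above (--mount-asset, --download-asset, --output-asset) take values of the form "alias=name[:version]", and the alias is then referenced as '${{inputs.<alias>}}' or '${{outputs.<alias>}}'. As an illustration of that format only, here is a minimal sketch of how such a value splits into its parts. `AssetSpec`, `parse_asset_spec` and `input_placeholder` are hypothetical names, and the error handling is an assumption rather than the tool's behavior.

```python
from typing import NamedTuple, Optional


class AssetSpec(NamedTuple):
    alias: str
    name: str
    version: Optional[str]  # None means "let Azure ML pick the version"


def parse_asset_spec(spec: str) -> AssetSpec:
    """Split 'alias=name[:version]' into alias, name and optional version."""
    alias, sep, target = spec.partition("=")
    if not sep or not alias or not target:
        raise ValueError(f"Expected 'alias=name[:version]', got {spec!r}")
    name, _, version = target.partition(":")
    return AssetSpec(alias, name, version or None)


def input_placeholder(alias: str) -> str:
    """Build the expression used to pass an input to the script."""
    return "${{inputs." + alias + "}}"
```

For example, `parse_asset_spec("vindr_dir=VINDR-CXR-V2:1")` yields the alias `vindr_dir`, the name `VINDR-CXR-V2` and the version `"1"`, and `input_placeholder("vindr_dir")` gives the string to put on the command line.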
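
The deprecated -d/--download option accepts three kinds of value, and its description explains how a bare 'alias=<job_id>:<path>' is told apart from a data asset. The sketch below restates those rules as code to make the ambiguity concrete. It is an interpretation of the help text, not the tool's parser: `classify_download_value` is a hypothetical name, and treating a colon-free value containing '/' as a datastore folder is an assumption.

```python
def classify_download_value(value: str) -> str:
    """Return 'job', 'asset' or 'datastore' for a deprecated --download value."""
    alias, sep, target = value.partition("=")
    if not sep or not alias or not target:
        raise ValueError(f"Expected 'alias=...', got {value!r}")
    # Explicit form: alias=job_dir:<job_id>:<path>
    if target.startswith("job_dir:"):
        return "job"
    _, colon, tail = target.partition(":")
    if colon:
        # Bare alias=<job_id>:<path> counts as a job only if <path> has a '/';
        # otherwise it is read as name:version of a data asset.
        return "job" if "/" in tail else "asset"
    # Assumption: without a colon, a '/' marks a datastore/folder path,
    # and anything else is a data asset name without a version.
    return "datastore" if "/" in target else "asset"
```

The case `checkpoint=crusty_hat_43s6lmvb25:best.pth` is classified as an asset here, which is exactly the pitfall the help text warns about and a good reason to prefer --download-job.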