app-mlperf-training-nvidia

Automatically generated README for this automation recipe: app-mlperf-training-nvidia

Category: Modular MLPerf training benchmark pipeline

License: Apache 2.0

  • CM meta description for this script: _cm.yaml
  • Output cached? False

Reuse this script in your project

Install MLCommons CM automation meta-framework

Pull CM repository with this automation recipe (CM script)

cm pull repo mlcommons@cm4mlops

cmr "app vision language mlcommons mlperf training nvidia" --help

Run this script

Run this script via CLI
cm run script --tags=app,vision,language,mlcommons,mlperf,training,nvidia[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "app vision language mlcommons mlperf training nvidia [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action':'run',
              'automation':'script',
              'tags':'app,vision,language,mlcommons,mlperf,training,nvidia',
              'out':'con',
              # ...
              # (other input keys for this script)
              # ...
             })

if r['return']>0:
    print (r['error'])
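
The CLI form above takes variations appended to the tag list and script flags after it. As a minimal sketch, the helper below assembles such a command line. `build_cm_command` is a hypothetical helper and not part of cmind, the `hw_name` value is a placeholder, and the sketch assumes variations are appended as extra comma-separated tags, as the `[,variations]` placeholder suggests.

```python
import shlex

# Tag list taken from the CLI examples in this README.
BASE_TAGS = "app,vision,language,mlcommons,mlperf,training,nvidia"


def build_cm_command(variations=(), flags=None):
    """Assemble argv for `cm run script` (hypothetical helper, not a cmind API)."""
    # Assumption: variations such as "_bert" are appended to the tag list.
    tags = ",".join([BASE_TAGS, *variations])
    argv = ["cm", "run", "script", f"--tags={tags}"]
    # Each script flag is rendered as --name=value.
    for name, value in (flags or {}).items():
        argv.append(f"--{name}={value}")
    return argv


# "my_system" is a placeholder value for illustration only.
cmd = build_cm_command(["_bert", "_pytorch"], {"hw_name": "my_system"})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems in flag values.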
Run this script via Docker (beta)
cm docker script "app vision language mlcommons mlperf training nvidia [variations]" [--input_flags]

Variations

  • No group (any combination of variations can be selected)

    • _bert
      • ENV variables:
        • CM_MLPERF_MODEL: bert
  • Group "device"

    • _cuda (default)
      • ENV variables:
        • CM_MLPERF_DEVICE: cuda
        • USE_CUDA: True
    • _tpu
      • ENV variables:
        • CM_MLPERF_DEVICE: tpu
        • CUDA_VISIBLE_DEVICES: (empty string)
        • USE_CUDA: False
  • Group "framework"

    • _pytorch
      • ENV variables:
        • CM_MLPERF_BACKEND: pytorch
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TORCH_VERSION>>>
    • _tf
      • Aliases: _tensorflow
      • ENV variables:
        • CM_MLPERF_BACKEND: tf
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TENSORFLOW_VERSION>>>
Default variations

_cuda
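
The variation list above can be read as a lookup from variation name to environment variables. The following is a rough sketch of that lookup, not CM's actual resolver. It assumes that at most one variation per named group may be selected (suggested by the "No group" wording) and that the default `_cuda` applies when no device variation is chosen. Values are kept as strings, and the `<<<...>>>` placeholders are left unresolved.

```python
# Variation table copied from the list above; "group" is None for ungrouped ones.
VARIATIONS = {
    "_bert": {"group": None, "env": {"CM_MLPERF_MODEL": "bert"}},
    "_cuda": {"group": "device",
              "env": {"CM_MLPERF_DEVICE": "cuda", "USE_CUDA": "True"}},
    "_tpu": {"group": "device",
             "env": {"CM_MLPERF_DEVICE": "tpu", "CUDA_VISIBLE_DEVICES": "",
                     "USE_CUDA": "False"}},
    "_pytorch": {"group": "framework",
                 "env": {"CM_MLPERF_BACKEND": "pytorch",
                         "CM_MLPERF_BACKEND_VERSION": "<<<CM_TORCH_VERSION>>>"}},
    "_tf": {"group": "framework",
            "env": {"CM_MLPERF_BACKEND": "tf",
                    "CM_MLPERF_BACKEND_VERSION": "<<<CM_TENSORFLOW_VERSION>>>"}},
}
ALIASES = {"_tensorflow": "_tf"}
DEFAULTS = {"device": "_cuda"}


def resolve_env(selected):
    """Return the env dict implied by the selected variations (illustrative only)."""
    chosen = {}      # group name -> variation picked for that group
    ungrouped = []   # variations that belong to no group
    for name in selected:
        name = ALIASES.get(name, name)
        group = VARIATIONS[name]["group"]
        if group is None:
            ungrouped.append(name)
        elif chosen.get(group, name) != name:
            # Assumption: variations within one group are mutually exclusive.
            raise ValueError(f"conflicting variations in group {group!r}")
        else:
            chosen[group] = name
    # Fill in group defaults that were not explicitly selected.
    for group, default in DEFAULTS.items():
        chosen.setdefault(group, default)
    env = {}
    for name in [*ungrouped, *chosen.values()]:
        env.update(VARIATIONS[name]["env"])
    return env


print(resolve_env(["_bert", "_tensorflow"]))
```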

Script flags mapped to environment

  • --clean=value → CM_MLPERF_CLEAN_SUBMISSION_DIR=value
  • --docker=value → CM_RUN_DOCKER_CONTAINER=value
  • --hw_name=value → CM_HW_NAME=value
  • --model=value → CM_MLPERF_CUSTOM_MODEL_PATH=value
  • --num_threads=value → CM_NUM_THREADS=value
  • --output_dir=value → OUTPUT_BASE_DIR=value
  • --rerun=value → CM_RERUN=value
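
The flag list above is a one-to-one mapping from script flags to environment variables. As an illustration only, the sketch below translates a dictionary of flags into the corresponding environment dictionary. `flags_to_env` is a hypothetical name and does not reflect how CM implements the mapping internally.

```python
# Flag-to-environment mapping copied from the list above.
FLAG_TO_ENV = {
    "clean": "CM_MLPERF_CLEAN_SUBMISSION_DIR",
    "docker": "CM_RUN_DOCKER_CONTAINER",
    "hw_name": "CM_HW_NAME",
    "model": "CM_MLPERF_CUSTOM_MODEL_PATH",
    "num_threads": "CM_NUM_THREADS",
    "output_dir": "OUTPUT_BASE_DIR",
    "rerun": "CM_RERUN",
}


def flags_to_env(flags):
    """Translate {flag: value} into {ENV_KEY: value} (hypothetical helper)."""
    env = {}
    for flag, value in flags.items():
        if flag not in FLAG_TO_ENV:
            raise KeyError(f"unknown script flag: --{flag}")
        env[FLAG_TO_ENV[flag]] = value
    return env


# Example values are placeholders.
print(flags_to_env({"num_threads": "8", "output_dir": "/tmp/results"}))
```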

Default environment

These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or via script flags.

  • CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX: nvidia
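
To make the `--env.KEY=VALUE` form concrete, here is a small sketch that layers such arguments over the default environment. It is not CM's own argument parser. `apply_env_overrides` is a hypothetical helper, the override value is a placeholder, and the actual precedence between flags, `@input.json`, and `--env` is not stated in this README.

```python
# Default environment copied from the list above.
DEFAULT_ENV = {"CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX": "nvidia"}


def apply_env_overrides(argv, defaults=DEFAULT_ENV):
    """Return defaults updated by any --env.KEY=VALUE arguments (illustrative)."""
    env = dict(defaults)  # copy so the defaults stay untouched
    prefix = "--env."
    for arg in argv:
        if arg.startswith(prefix) and "=" in arg:
            # Split only on the first "=" so values may contain "=" themselves.
            key, value = arg[len(prefix):].split("=", 1)
            env[key] = value
    return env


# "custom" is a placeholder override value.
print(apply_env_overrides(["--env.CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX=custom"]))
```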

Native script being run

No run file exists for Windows


Script output

cmr "app vision language mlcommons mlperf training nvidia [variations]" [--input_flags] -j