app-mlperf-inference-mlcommons-python
Automatically generated README for this automation recipe: app-mlperf-inference-mlcommons-python
Category: Modular MLPerf inference benchmark pipeline
License: Apache 2.0
Developers: Arjun Suresh, Thomas Zhu, Grigori Fursin
- Notes from the authors, contributors and users: README-extra
This portable CM script is developed by the MLCommons taskforce on automation and reproducibility to modularize the Python reference implementations of the MLPerf inference benchmark using the MLCommons CM automation meta-framework. The goal is to make it easier to run, optimize and reproduce MLPerf benchmarks across diverse platforms with continuously changing software and hardware.
See the current coverage of different models, devices and backends here.
- CM meta description for this script: _cm.yaml
- Output cached? False
Reuse this script in your project
Install MLCommons CM automation meta-framework
Pull CM repository with this automation recipe (CM script)
cm pull repo mlcommons@cm4mlops
Print CM help from the command line
cmr "app vision language mlcommons mlperf inference reference ref" --help
Run this script
Run this script via CLI
cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action': 'run',
                  'automation': 'script',
                  'tags': 'app,vision,language,mlcommons,mlperf,inference,reference,ref',
                  'out': 'con',
                  # ... (other input keys for this script) ...
                  })
if r['return'] > 0:
    print(r['error'])
Run this script via Docker (beta)
cm docker script "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags]
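For instance, a sketch of a short Docker test run (assuming the Docker mode accepts the same variation syntax as the CLI form above and that the image can be built on your host):

cm docker script "app vision language mlcommons mlperf inference reference ref _resnet50,_onnxruntime" --test_query_count=10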
Variations
No group (any combination of variations can be selected)
_3d-unet
- ENV variables:
  - CM_TMP_IGNORE_MLPERF_QUERY_COUNT: `True`
  - CM_MLPERF_MODEL_SKIP_BATCHING: `True`
_beam_size.#
- ENV variables:
  - GPTJ_BEAM_SIZE: `#`
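The trailing `#` is a placeholder replaced by a user-supplied value. For example (illustrative beam size):

cmr "app vision language mlcommons mlperf inference reference ref _gptj-99,_beam_size.4"

should set GPTJ_BEAM_SIZE=4.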
_bert
- ENV variables:
  - CM_MLPERF_MODEL_SKIP_BATCHING: `True`
_dlrm
- ENV variables:
  - CM_MLPERF_MODEL_SKIP_BATCHING: `True`
_multistream
- ENV variables:
  - CM_MLPERF_LOADGEN_SCENARIO: `MultiStream`
_offline
- ENV variables:
  - CM_MLPERF_LOADGEN_SCENARIO: `Offline`
_r2.1_default
- ENV variables:
  - CM_RERUN: `yes`
  - CM_SKIP_SYS_UTILS: `yes`
  - CM_TEST_QUERY_COUNT: `100`
_server
- ENV variables:
  - CM_MLPERF_LOADGEN_SCENARIO: `Server`
_singlestream
- ENV variables:
  - CM_MLPERF_LOADGEN_SCENARIO: `SingleStream`
Group "batch-size"
_batch_size.#
- ENV variables:
  - CM_MLPERF_LOADGEN_MAX_BATCHSIZE: `#`
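As with `_beam_size.#`, the `#` is replaced by a concrete value. For example (illustrative batch size):

cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_batch_size.32"

should set CM_MLPERF_LOADGEN_MAX_BATCHSIZE=32.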
Group "device"
_cpu (default)
- ENV variables:
  - CM_MLPERF_DEVICE: `cpu`
  - CUDA_VISIBLE_DEVICES: ``
  - USE_CUDA: `False`
  - USE_GPU: `False`
_cuda
- ENV variables:
  - CM_MLPERF_DEVICE: `gpu`
  - USE_CUDA: `True`
  - USE_GPU: `True`
_rocm
- ENV variables:
  - CM_MLPERF_DEVICE: `rocm`
  - USE_GPU: `True`
_tpu
- ENV variables:
  - CM_MLPERF_DEVICE: `tpu`
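Unlike the "no group" variations above, only one variation per named group can be selected. For example, a sketch of a GPU run (assumes a working CUDA setup on the host):

cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_pytorch,_cuda"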
Group "framework"
_deepsparse
- ENV variables:
  - CM_MLPERF_BACKEND: `deepsparse`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_DEEPSPARSE_VERSION>>>`
_ncnn
- ENV variables:
  - CM_MLPERF_BACKEND: `ncnn`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_NCNN_VERSION>>>`
  - CM_MLPERF_VISION_DATASET_OPTION: `imagenet_pytorch`
_onnxruntime (default)
- ENV variables:
  - CM_MLPERF_BACKEND: `onnxruntime`
_pytorch
- ENV variables:
  - CM_MLPERF_BACKEND: `pytorch`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TORCH_VERSION>>>`
_ray
- ENV variables:
  - CM_MLPERF_BACKEND: `ray`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TORCH_VERSION>>>`
_tf
- Aliases: `_tensorflow`
- ENV variables:
  - CM_MLPERF_BACKEND: `tf`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TENSORFLOW_VERSION>>>`
_tflite
- ENV variables:
  - CM_MLPERF_BACKEND: `tflite`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TFLITE_VERSION>>>`
  - CM_MLPERF_VISION_DATASET_OPTION: `imagenet_tflite_tpu`
_tvm-onnx
- ENV variables:
  - CM_MLPERF_BACKEND: `tvm-onnx`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_ONNXRUNTIME_VERSION>>>`
_tvm-pytorch
- ENV variables:
  - CM_MLPERF_BACKEND: `tvm-pytorch`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TORCH_VERSION>>>`
  - CM_PREPROCESS_PYTORCH: `yes`
  - MLPERF_TVM_TORCH_QUANTIZED_ENGINE: `qnnpack`
_tvm-tflite
- ENV variables:
  - CM_MLPERF_BACKEND: `tvm-tflite`
  - CM_MLPERF_BACKEND_VERSION: `<<<CM_TVM-TFLITE_VERSION>>>`
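For example, selecting TensorFlow via its alias (both forms below should be equivalent, per the alias list above):

cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_tf"
cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_tensorflow"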
Group "implementation"
_python (default)
- ENV variables:
  - CM_MLPERF_PYTHON: `yes`
  - CM_MLPERF_IMPLEMENTATION: `reference`
Group "models"
_3d-unet-99
- ENV variables:
  - CM_MODEL: `3d-unet-99`
_3d-unet-99.9
- ENV variables:
  - CM_MODEL: `3d-unet-99.9`
_bert-99
- ENV variables:
  - CM_MODEL: `bert-99`
_bert-99.9
- ENV variables:
  - CM_MODEL: `bert-99.9`
_dlrm-99
- ENV variables:
  - CM_MODEL: `dlrm-99`
_dlrm-99.9
- ENV variables:
  - CM_MODEL: `dlrm-99.9`
_gptj-99
- ENV variables:
  - CM_MODEL: `gptj-99`
_gptj-99.9
- ENV variables:
  - CM_MODEL: `gptj-99.9`
_llama2-70b-99
- ENV variables:
  - CM_MODEL: `llama2-70b-99`
_llama2-70b-99.9
- ENV variables:
  - CM_MODEL: `llama2-70b-99.9`
_resnet50 (default)
- ENV variables:
  - CM_MODEL: `resnet50`
  - CM_MLPERF_USE_MLCOMMONS_RUN_SCRIPT: `yes`
_retinanet
- ENV variables:
  - CM_MODEL: `retinanet`
  - CM_MLPERF_USE_MLCOMMONS_RUN_SCRIPT: `yes`
  - CM_MLPERF_LOADGEN_MAX_BATCHSIZE: `1`
_rnnt
- ENV variables:
  - CM_MODEL: `rnnt`
  - CM_MLPERF_MODEL_SKIP_BATCHING: `True`
  - CM_TMP_IGNORE_MLPERF_QUERY_COUNT: `True`
_sdxl
- ENV variables:
  - CM_MODEL: `stable-diffusion-xl`
  - CM_NUM_THREADS: `1`
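For example, a sketch of a BERT run combining a model variation with a framework and a device variation:

cmr "app vision language mlcommons mlperf inference reference ref _bert-99,_pytorch,_cpu"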
Group "network"
_network-lon
- ENV variables:
  - CM_NETWORK_LOADGEN: `lon`
  - CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX1: `network_loadgen`
_network-sut
- ENV variables:
  - CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX1: `network_sut`
  - CM_NETWORK_LOADGEN: `sut`
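For example, a hedged sketch of the loadgen-over-network (LON) side pointing at a remote SUT (the address is a made-up example, and the expected format of --sut_servers is an assumption):

cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_network-lon" --sut_servers=http://192.168.1.2:8000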
Group "precision"
_bfloat16
- ENV variables:
  - CM_MLPERF_QUANTIZATION: `False`
  - CM_MLPERF_MODEL_PRECISION: `bfloat16`
_float16
- ENV variables:
  - CM_MLPERF_QUANTIZATION: `False`
  - CM_MLPERF_MODEL_PRECISION: `float16`
_fp32 (default)
- ENV variables:
  - CM_MLPERF_QUANTIZATION: `False`
  - CM_MLPERF_MODEL_PRECISION: `float32`
_int8
- Aliases: `_quantized`
- ENV variables:
  - CM_MLPERF_QUANTIZATION: `True`
  - CM_MLPERF_MODEL_PRECISION: `int8`
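For example, an int8 run, which can also be requested via the `_quantized` alias:

cmr "app vision language mlcommons mlperf inference reference ref _resnet50,_int8"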
Default variations
_cpu,_fp32,_onnxruntime,_python,_resnet50
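In other words, running the script with no variations should be equivalent to spelling the defaults out explicitly:

cmr "app vision language mlcommons mlperf inference reference ref _cpu,_fp32,_onnxruntime,_python,_resnet50"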
Script flags mapped to environment
- --clean=value → CM_MLPERF_CLEAN_SUBMISSION_DIR=value
- --count=value → CM_MLPERF_LOADGEN_QUERY_COUNT=value
- --dataset=value → CM_MLPERF_VISION_DATASET_OPTION=value
- --dataset_args=value → CM_MLPERF_EXTRA_DATASET_ARGS=value
- --docker=value → CM_RUN_DOCKER_CONTAINER=value
- --hw_name=value → CM_HW_NAME=value
- --imagenet_path=value → IMAGENET_PATH=value
- --max_amps=value → CM_MLPERF_POWER_MAX_AMPS=value
- --max_batchsize=value → CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value
- --max_volts=value → CM_MLPERF_POWER_MAX_VOLTS=value
- --mode=value → CM_MLPERF_LOADGEN_MODE=value
- --model=value → CM_MLPERF_CUSTOM_MODEL_PATH=value
- --multistream_target_latency=value → CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value
- --network=value → CM_NETWORK_LOADGEN=value
- --ntp_server=value → CM_MLPERF_POWER_NTP_SERVER=value
- --num_threads=value → CM_NUM_THREADS=value
- --offline_target_qps=value → CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value
- --output_dir=value → OUTPUT_BASE_DIR=value
- --power=value → CM_MLPERF_POWER=value
- --power_server=value → CM_MLPERF_POWER_SERVER_ADDRESS=value
- --regenerate_files=value → CM_REGENERATE_MEASURE_FILES=value
- --rerun=value → CM_RERUN=value
- --scenario=value → CM_MLPERF_LOADGEN_SCENARIO=value
- --server_target_qps=value → CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value
- --singlestream_target_latency=value → CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value
- --sut_servers=value → CM_NETWORK_LOADGEN_SUT_SERVERS=value
- --target_latency=value → CM_MLPERF_LOADGEN_TARGET_LATENCY=value
- --target_qps=value → CM_MLPERF_LOADGEN_TARGET_QPS=value
- --test_query_count=value → CM_TEST_QUERY_COUNT=value
- --threads=value → CM_NUM_THREADS=value
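For example, a sketch of a Server-scenario performance run combining several of these flags (the QPS and thread count are arbitrary example values):

cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref,_resnet50 --scenario=Server --mode=performance --server_target_qps=100 --threads=8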
Default environment
These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or using the script flags above.
- CM_MLPERF_LOADGEN_MODE: `accuracy`
- CM_MLPERF_LOADGEN_SCENARIO: `Offline`
- CM_OUTPUT_FOLDER_NAME: `test_results`
- CM_MLPERF_RUN_STYLE: `test`
- CM_TEST_QUERY_COUNT: `10`
- CM_MLPERF_QUANTIZATION: `False`
- CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX: `reference`
- CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX: ``
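For example, to raise the test query count and switch the scenario by overriding these defaults directly (illustrative values):

cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref --env.CM_TEST_QUERY_COUNT=100 --env.CM_MLPERF_LOADGEN_SCENARIO=SingleStream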
Script output
cmr "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags] -j