Submission Generation

Click here to view the proposal slide for Common Automation for MLPerf Inference Submission Generation through CM.

If you have not followed the cm run commands under the individual model pages in the benchmarks directory, please make sure that your results directory is structured as shown below. You can see real examples of the expected folder structure here.

└── System description ID (SUT Name)
    ├── system_meta.json
    └── Benchmark
        └── Scenario
            ├── Performance
            |   └── run_1 # 1 run for all scenarios
            |       ├── mlperf_log_summary.txt
            |       └── mlperf_log_detail.txt
            ├── Accuracy
            |   ├── mlperf_log_summary.txt
            |   ├── mlperf_log_detail.txt
            |   ├── mlperf_log_accuracy.json
            |   └── accuracy.txt
            ├── Compliance_Test_ID
            |   ├── Performance
            |   |   └── run_x # 1 run for all scenarios
            |   |       ├── mlperf_log_summary.txt
            |   |       └── mlperf_log_detail.txt
            |   ├── Accuracy # for TEST01 only
            |   |   ├── baseline_accuracy.txt (if test fails in deterministic mode)
            |   |   ├── compliance_accuracy.txt (if test fails in deterministic mode)
            |   |   ├── mlperf_log_accuracy.json
            |   |   └── accuracy.txt
            |   ├── verify_performance.txt
            |   └── verify_accuracy.txt # for TEST01 only
            ├── user.conf
            └── measurements.json
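
For manually collected results, a minimal sketch of how this layout could be created for a hypothetical SUT named my_sut and the resnet50 Offline scenario (all names and paths below are placeholders) is:

mkdir -p my_sut/resnet50/Offline/Performance/run_1
mkdir -p my_sut/resnet50/Offline/Accuracy
# Place system_meta.json at the SUT root, and user.conf plus measurements.json
# under the scenario directory, then copy the loadgen output files
# (mlperf_log_summary.txt, mlperf_log_detail.txt, mlperf_log_accuracy.json, ...)
# into the Performance/run_1 and Accuracy directories shown above.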

Click here if you are submitting in the open division

  • The model_mapping.json file should be included inside the SUT folder; it is used to map each custom model's full name to the official model name. The format of the JSON file is:

    {
        "custom_model_name_for_model1":"official_model_name_for_model1",
        "custom_model_name_for_model2":"official_model_name_for_model2"
    }
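
    For example, a hypothetical open-division submission that benchmarks a pruned ResNet50 and a quantized BERT model (the custom names below are made up for illustration) could map them to the official model names like this:

    {
        "resnet50_pruned_int8":"resnet50",
        "bert_large_quantized_int8":"bert-99"
    }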

If you have followed the cm run commands under the individual model pages in the benchmarks directory, all the valid results will get aggregated in the CM cache folder. The following command can be used to browse the structure of the inference results folder generated by CM.

Get results folder structure

cm find cache --tags=get,mlperf,inference,results,dir | xargs tree

Once all the results across all the models are ready, you can use the below section to generate a valid submission tree compliant with the MLPerf requirements.

Generate submission folder

The submission generation flow is explained in the diagram below.

flowchart LR
    subgraph Generation [Submission Generation SUT1]
      direction TB
      A[populate system details] --> B[generate submission structure]
      B --> C[truncate-accuracy-logs]
      C --> D{Infer low latency results <br>and/or<br> filter out invalid results}
      D -- yes --> E[preprocess-mlperf-inference-submission]
      D -- no --> F[run-mlperf-inference-submission-checker]
      E --> F
    end
    Input((Results SUT1)) --> Generation
    Generation --> Output((Submission Folder <br> SUT1))

Command to generate submission folder

cm run script --tags=generate,inference,submission \
  --clean \
  --preprocess_submission=yes \
  --run-checker=yes \
  --submitter=MLCommons \
  --division=closed \
  --env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
  --quiet

Tip

  • Use --hw_name="My system name" to give a meaningful system name. Examples can be seen here

  • Use --submitter=<Your name> if your organization is an official MLCommons member and would like to submit under your organization

  • Use --hw_notes_extra option to add additional notes like --hw_notes_extra="Result taken by NAME"

  • Use --results_dir option to specify the results folder. It is automatically taken from CM cache for MLPerf automation based runs

  • Use --submission_dir option to specify the submission folder. (You can avoid this if you are pushing to GitHub or only running a single SUT; CM will then use its cache folder)

  • Use --division=open for open division submission

  • Use --category option to specify the category for which the submission is generated (datacenter/edge). By default, the category is taken from the system_meta.json file located in the SUT root directory.

  • Use --submission_base_dir to specify the directory to which the outputs from the preprocess submission script and the final submission are added. There is no need to provide --submission_dir along with this. For docker runs, use --submission_base_dir instead of --submission_dir. A combined example using several of these options is shown after this list.
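
A sketch combining several of the above options (the organization name, system name, notes, and paths are hypothetical placeholders):

cm run script --tags=generate,inference,submission \
  --clean \
  --preprocess_submission=yes \
  --run-checker=yes \
  --submitter=MyOrg \
  --division=open \
  --category=edge \
  --hw_name="My custom SUT" \
  --hw_notes_extra="Result taken by John Doe" \
  --results_dir=$HOME/mlperf_results \
  --submission_base_dir=$HOME/mlperf_submission \
  --quiet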

If MLPerf results are collected on multiple systems, the same process needs to be repeated on each of them. Once we have submission folders on all the SUTs, we need to sync them to make a single submission folder.

If you have results on multiple systems, you need to merge them onto one system. You can use rsync for this. For example, the below command will sync the submission folder from SUT2 to the one on SUT1.

rsync -avz username@host2:<path_to_submission_folder2>/ <path_to_submission_folder1>/

The same needs to be repeated for all the other SUTs so that the full submission tree is available on SUT1.
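
If there are more than two SUTs, a small shell loop can pull each of them in turn (a sketch only; the username, hostnames, and paths below are hypothetical placeholders):

# Run on SUT1: pull the submission folder from each remote SUT into the local one
for host in host2 host3 host4; do
    rsync -avz username@${host}:<path_to_remote_submission_folder>/ <path_to_submission_folder1>/
done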

    flowchart LR
        subgraph SUT1 [Submission Generation SUT1]
          A[Submission Folder SUT1]
        end
        subgraph SUT2 [Submission Generation SUT2]
          B[Submission Folder SUT2]
        end
        subgraph SUT3 [Submission Generation SUT3]
          C[Submission Folder SUT3]
        end
        subgraph SUTN [Submission Generation SUTN]
          D[Submission Folder SUTN]
        end
        SUT2 --> SUT1
        SUT3 --> SUT1
        SUTN --> SUT1

If you are collecting results across multiple systems, you can generate separate submissions on each of them, aggregate all of them in a GitHub repository (which can be private), and use it to generate a single tarball that can be uploaded to the MLCommons Submission UI.

Run the following command after replacing --repo_url with your GitHub repository URL.

cm run script --tags=push,github,mlperf,inference,submission \
   --repo_url=https://github.com/mlcommons/mlperf_inference_submissions_v5.0 \
   --commit_message="Results on <HW name> added by <Name>" \
   --quiet
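
To double-check what was pushed, you can clone the repository on any machine and inspect it (the local directory name below is just an example):

git clone https://github.com/mlcommons/mlperf_inference_submissions_v5.0 my_submissions
tree my_submissions
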
    flowchart LR
        subgraph SUT1 [Submission Generation SUT1]
          A[Submission Folder SUT1]
        end
        subgraph SUT2 [Submission Generation SUT2]
          B[Submission Folder SUT2]
        end
        subgraph SUT3 [Submission Generation SUT3]
          C[Submission Folder SUT3]
        end
        subgraph SUTN [Submission Generation SUTN]
          D[Submission Folder SUTN]
        end
    SUT2 -- git sync and push --> G[Github Repo]
    SUT3 -- git sync and push --> G[Github Repo]
    SUTN -- git sync and push --> G[Github Repo]
    SUT1 -- git sync and push --> G[Github Repo]

Upload the final submission

Warning

If you are using GitHub to consolidate your results, make sure that you have run the push-to-github command on the same system so that the results on the GitHub repository are fully in sync.

Once you have all the results on the system, you can upload them to the MLCommons submission server as follows:

You can run the following command, which will run the submission checker and upload the results to the MLCommons submission server:

cm run script --tags=run,submission,checker \
   --submitter_id=<> \
   --submission_dir=<Path to the submission folder>

Alternatively, you can run the following command to generate the final submission tar file and then upload it to the MLCommons Submission UI.

cm run script --tags=run,submission,checker \
   --submission_dir=<Path to the submission folder> \
   --tar=yes \
   --submission_tar_file=mysubmission.tar.gz
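
Before uploading, you can quickly verify the contents of the generated tar file:

tar -tzf mysubmission.tar.gz | head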

        flowchart LR
            subgraph SUT [Combined Submissions]
              A[Combined Submission Folder in SUT1]
            end
        SUT --> B[Run submission checker]
        B --> C[Upload to MLC Submission server]
        C --> D[Receive validation email]