get-preprocessed-dataset-openorca
Automatically generated README for this automation recipe: get-preprocessed-dataset-openorca
Category: AI/ML datasets
License: Apache 2.0
- CM meta description for this script: _cm.json
- Output cached? True
Reuse this script in your project
Install MLCommons CM automation meta-framework
Pull CM repository with this automation recipe (CM script)
cm pull repo mlcommons@cm4mlops
Print CM help from the command line
cmr "get dataset openorca language-processing preprocessed" --help
Run this script
Run this script via CLI
cm run script --tags=get,dataset,openorca,language-processing,preprocessed[,variations]
Run this script via CLI (alternative)
cmr "get dataset openorca language-processing preprocessed [variations]"
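The `[,variations]` placeholder in the commands above stands for optional variation names appended to the tag list. A minimal sketch of how such a tag string could be assembled, assuming variation names are added as extra comma-separated tags; `BASE_TAGS` and `build_tags` are hypothetical helpers for illustration, not part of CM:

```python
# Hypothetical helper: not part of the CM API, only illustrates the tag layout.
BASE_TAGS = "get,dataset,openorca,language-processing,preprocessed"


def build_tags(variations=()):
    """Append variation names (e.g. "_calibration") to the base tag list."""
    for name in variations:
        # Variations in this README are all written with a leading underscore.
        if not name.startswith("_"):
            raise ValueError(f"variation names start with '_': {name!r}")
    return ",".join([BASE_TAGS, *variations])


tags_default = build_tags()
tags_calibration = build_tags(["_calibration", "_full"])
print(tags_calibration)
```

The resulting string can be passed as the value of `--tags=` on the command line or as the `tags` key when calling from Python.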
Run this script from Python
import cmind

r = cmind.access({'action': 'run',
                  'automation': 'script',
                  'tags': 'get,dataset,openorca,language-processing,preprocessed',
                  'out': 'con',
                  # ... (other input keys for this script) ...
                  })

# A non-zero 'return' code signals failure; 'error' holds the message.
if r['return'] > 0:
    print(r['error'])
Run this script via Docker (beta)
cm docker script "get dataset openorca language-processing preprocessed [variations]"
Variations
Group "dataset-type"
_calibration
- ENV variables:
  - CM_DATASET_CALIBRATION: yes
_validation (default)
- ENV variables:
  - CM_DATASET_CALIBRATION: no
Group "size"
_60 (default)
_full
_size.#
Default variations
_60,_validation
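The defaults listed above can be read as "one variation per group, with the group default filled in when none is selected". The following is a rough sketch of that reading, not CM's actual resolution logic. It assumes that groups are mutually exclusive and that the `#` in `_size.#` is a placeholder for a user-supplied value; `GROUPS`, `group_of`, and `resolve` are hypothetical names.

```python
# Variation groups and defaults as listed in this README.
GROUPS = {
    "dataset-type": {"default": "_validation",
                     "members": ["_calibration", "_validation"]},
    "size": {"default": "_60",
             "members": ["_60", "_full", "_size.#"]},
}


def group_of(variation):
    """Return the group a variation belongs to, or None if it is unknown."""
    for name, group in GROUPS.items():
        for member in group["members"]:
            if member.endswith("#"):
                # Assumption: "#" stands for any non-empty user-supplied value.
                prefix = member[:-1]
                if variation.startswith(prefix) and len(variation) > len(prefix):
                    return name
            elif variation == member:
                return name
    return None


def resolve(selected):
    """Fill in the group default for every group without a selection."""
    chosen = {}
    for variation in selected:
        group = group_of(variation)
        if group is None:
            raise ValueError(f"unknown variation: {variation!r}")
        if group in chosen:
            # Assumption: at most one variation per group.
            raise ValueError(f"more than one variation from group {group!r}")
        chosen[group] = variation
    return [chosen.get(name, group["default"]) for name, group in GROUPS.items()]


print(resolve([]))
print(resolve(["_calibration", "_size.500"]))
```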
Default environment
These keys can be updated via --env.KEY=VALUE, the env dictionary in @input.json, or script flags.
- CM_DATASET_CALIBRATION: no
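When running from Python, the same override can be sketched by adding an `env` dictionary to the input passed to `cmind.access`. This assumes the Python input accepts the same `env` key as `@input.json`; `make_input` and `run` are hypothetical helpers for illustration.

```python
def make_input(env_overrides=None):
    """Build the input dictionary, starting from the default environment."""
    env = {"CM_DATASET_CALIBRATION": "no"}  # default listed in this README
    env.update(env_overrides or {})
    return {
        "action": "run",
        "automation": "script",
        "tags": "get,dataset,openorca,language-processing,preprocessed",
        "out": "con",
        "env": env,  # assumption: same 'env' key as in @input.json
    }


def run(env_overrides=None):
    """Run the script through CM; requires the cmind package to be installed."""
    import cmind

    r = cmind.access(make_input(env_overrides))
    if r["return"] > 0:
        raise RuntimeError(r["error"])
    return r


# Building the request does not need CM installed; call run(...) to execute it.
request = make_input({"CM_DATASET_CALIBRATION": "yes"})
print(request["env"])
```

Selecting the `_calibration` variation sets the same key, so an explicit override like this is mainly useful when the variation tags are left at their defaults.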
Native script being run
No run file exists for Windows
Script output
cmr "get dataset openorca language-processing preprocessed [variations]" -j