get-preprocessed-dataset-criteo
Automatically generated README for this automation recipe: get-preprocessed-dataset-criteo
Category: AI/ML datasets
License: Apache 2.0
Notes from the authors, contributors and users: README-extra
CM meta description for this script: _cm.json
- Output cached? True
Reuse this script in your project
Install MLCommons CM automation meta-framework
Pull CM repository with this automation recipe (CM script)
cm pull repo mlcommons@cm4mlops
Print CM help from the command line
cmr "get dataset criteo recommendation dlrm preprocessed" --help
Run this script
Run this script via CLI
cm run script --tags=get,dataset,criteo,recommendation,dlrm,preprocessed[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action': 'run',
                  'automation': 'script',
                  'tags': 'get,dataset,criteo,recommendation,dlrm,preprocessed',
                  'out': 'con',
                  # ... (other input keys for this script) ...
                  })

if r['return'] > 0:
    print(r['error'])
Run this script via Docker (beta)
cm docker script "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags]
Variations
No group (any combination of variations can be selected)
_1
- ENV variables:
  - CM_DATASET_SIZE: 1
_50
- ENV variables:
  - CM_DATASET_SIZE: 50
_fake
- ENV variables:
  - CM_CRITEO_FAKE: yes
_full
_validation
Group "type"
_multihot (default)
- ENV variables:
  - CM_DATASET_CRITEO_MULTIHOT: yes
Default variations
_multihot
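Variations are passed as extra comma-separated entries after the base tags, as in the `--tags=...[,variations]` form shown earlier. The sketch below is a hypothetical helper (`build_tags` is not part of CM) that assembles such a tag string and rejects names that are not listed in this README. It does not check whether a combination makes sense (for example `_1` together with `_50`); how CM resolves that is not covered here.

```python
# Hypothetical helper, not part of the CM API: builds the --tags value
# for this script from the variations listed in this README.
BASE_TAGS = "get,dataset,criteo,recommendation,dlrm,preprocessed"
KNOWN_VARIATIONS = {"_1", "_50", "_fake", "_full", "_validation", "_multihot"}


def build_tags(variations=()):
    """Return the base tags followed by the requested variations."""
    unknown = [v for v in variations if v not in KNOWN_VARIATIONS]
    if unknown:
        raise ValueError(f"unknown variation(s): {', '.join(unknown)}")
    return ",".join([BASE_TAGS, *variations])


if __name__ == "__main__":
    # Prints the value to pass as --tags=... or as the 'tags' key in Python.
    print(build_tags(["_fake", "_1"]))
```

The returned string can be used as the `--tags` value on the command line or as the `'tags'` entry of the dictionary passed to `cmind.access`.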
Script flags mapped to environment
--dir=value
→CM_DATASET_PREPROCESSED_PATH=value
--output_dir=value
→CM_DATASET_PREPROCESSED_OUTPUT_PATH=value
--threads=value
→CM_NUM_PREPROCESS_THREADS=value
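To make the mapping above concrete, here is a minimal sketch that translates flag values into the environment variable names listed in this section. The table is copied from this README; the function `flags_to_env` is a hypothetical illustration and does not call CM or reproduce its internal logic.

```python
# Flag-to-variable table copied from this README. The helper is
# hypothetical and only illustrates the mapping; it does not call CM.
FLAG_TO_ENV = {
    "dir": "CM_DATASET_PREPROCESSED_PATH",
    "output_dir": "CM_DATASET_PREPROCESSED_OUTPUT_PATH",
    "threads": "CM_NUM_PREPROCESS_THREADS",
}


def flags_to_env(**flags):
    """Map script flags (without the leading --) to environment variables."""
    env = {}
    for name, value in flags.items():
        if name not in FLAG_TO_ENV:
            raise KeyError(f"flag --{name} is not mapped by this script")
        # Environment values are strings, so convert numbers such as threads.
        env[FLAG_TO_ENV[name]] = str(value)
    return env


if __name__ == "__main__":
    # The path and thread count are example values only.
    print(flags_to_env(dir="/data/criteo", threads=8))
```

For instance, `--threads=8` corresponds to `CM_NUM_PREPROCESS_THREADS=8` in the script's environment.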
Native script being run
No run file exists for Windows
Script output
cmr "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags] -j
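The command above combines the quoted tags, optional variations, optional input flags and a trailing `-j`, which appears to request the script output as JSON. As a sketch of how those pieces fit together, the hypothetical helper below (`build_cmr_argv` is not part of CM) assembles the argument list without running anything. The variation and flag values in the example are illustrative.

```python
import shlex

# Base tags of this script in the space-separated form used by cmr.
BASE = "get dataset criteo recommendation dlrm preprocessed"


def build_cmr_argv(variations=(), flags=None, json_output=True):
    """Hypothetical helper: assemble the cmr argument list for this script."""
    # Variations go inside the same quoted argument as the tags.
    argv = ["cmr", " ".join([BASE, *variations])]
    for name, value in (flags or {}).items():
        argv.append(f"--{name}={value}")
    if json_output:
        argv.append("-j")
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection; nothing is executed.
    print(shlex.join(build_cmr_argv(["_fake"], {"threads": 4})))
```

The resulting list could be handed to `subprocess.run` on a machine where CM is installed; that step is left out here because it depends on the local setup.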