Croissant 🥐

Croissant specifications as implemented by mlcroissant.
For the actual specifications, please refer to the Croissant 1.0 standard.
To add new properties, refer to the documentation.

Metadata

Nodes that describe a dataset's metadata.
| Property | Expected type | Cardinality | Description |
| --- | --- | --- | --- |
| sc:name | sc:Text | ONE | The name of the dataset. |
| cr:citeAs | sc:Text | ONE | A citation for a publication that describes the dataset. Ideally, citations should be expressed using the bibtex format. Note that this is different from schema.org/citation, which is used to make a citation to another publication from this dataset. |
| sc:creator | sc:Organization, sc:Person | MANY | The creator(s) of the dataset. |
| sc:dateCreated | sc:Date, sc:DateTime | ONE | The date the dataset was initially created. |
| sc:dateModified | sc:Date, sc:DateTime | ONE | The date the dataset was last modified. |
| sc:datePublished | sc:Date, sc:DateTime | ONE | The date the dataset was published. |
| sc:description | sc:Text | ONE | Description of the dataset. |
| cr:isLiveDataset | sc:Boolean | ONE | Whether the dataset is a live dataset. |
| sc:keywords | sc:Text | MANY | A set of keywords associated with the dataset, either as free text, or a DefinedTerm with a formal definition. |
| sc:inLanguage | sc:Language, sc:Text | MANY | The language(s) of the content of the dataset. |
| sc:license | sc:CreativeWork, sc:Text, sc:URL | MANY | The license of the dataset. Croissant recommends using the URL of a known license, e.g., one of the licenses listed at https://spdx.org/licenses/. |
| sc:publisher | sc:Organization, sc:Person | MANY | The publisher of the dataset, which may be distinct from its creator. |
| sc:sameAs | sc:URL | MANY | The URL of another Web resource that represents the same dataset as this one. |
| sc:sdLicense | sc:CreativeWork, sc:Text, sc:URL | MANY | A license document that applies to this structured data, typically indicated by URL. |
| sc:url | sc:URL | ONE | The URL of the dataset. This generally corresponds to the Web page for the dataset. |
| sc:version | sc:Integer, sc:Number, sc:Text | ONE | The version of the dataset following the requirements below. |
| sc:distribution | cr:FileObject, cr:FileSet | MANY | |
| cr:recordSet | cr:RecordSet | MANY | |
| rai:dataCollection | sc:Text | ONE | |
| rai:dataCollectionType | sc:Text | ONE | |
| rai:dataCollectionTypeOthers | sc:Text | ONE | |
| rai:dataCollectionMissing | sc:Text | ONE | |
| rai:dataCollectionRaw | sc:Text | ONE | |
| rai:dataCollectionTimeFrameStart | sc:Date, sc:DateTime | ONE | |
| rai:dataCollectionTimeFrameEnd | sc:Date, sc:DateTime | ONE | |
| rai:dataPreprocessingImputation | sc:Text | ONE | |
| rai:dataPreprocessingProtocol | sc:Text | ONE | |
| rai:dataPreprocessingManipulation | sc:Text | ONE | |
| rai:dataAnnotationProtocol | sc:Text | ONE | |
| rai:dataAnnotationPlatform | sc:Text | ONE | |
| rai:dataAnnotationAnalysis | sc:Text | ONE | |
| rai:dataAnnotationPerItem | sc:Text | ONE | |
| rai:dataAnnotationDemographics | sc:Text | ONE | |
| rai:dataAnnotationTools | sc:Text | ONE | |
| rai:dataBiases | sc:Text | MANY | |
| rai:dataUseCases | sc:Text | ONE | |
| rai:dataLimitation | sc:Text | ONE | |
| rai:dataSocialImpact | sc:Text | ONE | |
| rai:dataSensitive | sc:Text | ONE | |
| rai:dataMaintenance | sc:Text | ONE | |
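
To make the Cardinality column concrete, here is a minimal sketch in plain Python (standard library only, not the mlcroissant API). It treats a metadata node as a dictionary keyed by the prefixed property names from the table and flags any ONE property that carries a list. The `CARDINALITY` mapping, the `check_cardinality` helper, and the `toy_metadata` values are all hypothetical and exist only for illustration. The sketch also omits the JSON-LD `@context` that a real Croissant file needs, and it covers only a handful of rows.

```python
import json

# Cardinalities copied from a few rows of the table above (not the full list).
CARDINALITY = {
    "sc:name": "ONE",
    "cr:citeAs": "ONE",
    "sc:creator": "MANY",
    "sc:description": "ONE",
    "sc:keywords": "MANY",
    "sc:license": "MANY",
    "sc:url": "ONE",
    "sc:version": "ONE",
    "rai:dataBiases": "MANY",
}


def check_cardinality(metadata):
    """Return one message per property whose value breaks its cardinality."""
    errors = []
    for prop, value in metadata.items():
        expected = CARDINALITY.get(prop)
        if expected is None:
            continue  # JSON-LD keywords and properties outside this sketch
        # A MANY property may hold a single value or a list, so only ONE
        # properties are checked here.
        if expected == "ONE" and isinstance(value, list):
            errors.append(
                f"{prop}: expected a single value, got a list of {len(value)}"
            )
    return errors


# Hypothetical dataset; every value below is made up for illustration.
toy_metadata = {
    "@type": "sc:Dataset",
    "sc:name": "toy_dataset",
    "sc:description": "A made-up dataset used only to illustrate the table.",
    "sc:url": "https://example.com/toy_dataset",
    "sc:version": "1.0.0",
    "sc:license": ["https://spdx.org/licenses/MIT.html"],
    "sc:keywords": ["toy", "example"],
    "rai:dataBiases": ["Hypothetical note about a known bias."],
}

print(json.dumps(toy_metadata, indent=2))
print(check_cardinality(toy_metadata))
```

Passing a list for a ONE property such as `sc:name` makes `check_cardinality` return a message for that property. For real files, the validation built into mlcroissant is the authoritative check; this sketch only mirrors the table.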