Template - Data Manifest

The data manifest lists every file in the primary project that will be submitted to BDC for formal ingestion into the ecosystem. It makes it possible to identify all data files associated with a specific research participant. An automated script can catalog the files in the primary project and populate most of the manifest.


The data manifest also provides the information needed to verify the integrity of ingested files. It has the following columns:

| Column Header | Column Description |
| --- | --- |
| DOCFILE*† | File name. The DOCFILE must not contain backslashes (\). |
| FILEDESC* | File description. Describe the contents of the file and, when applicable, how they were generated. May reference additional study documentation or collection instruments. |
| FILESIZE*† | Size of the file object in bytes. |
| DIRECTORY*† | Name of the directory that contains the file. |
| FORMAT*† | File extension of the file. |
| CATEGORY*† | File category, either “raw” or “processed”. |
| METADATA*† | Path to the metadata file associated with the data file. |
| MD5*† | MD5 checksum calculated for the file upon ingest. |
| URLS* | Full path to the file in the cloud bucket (including bucket name), determined upon ingest. |

* Indicates a required field.

† Populated by the automated script.
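
As a rough illustration of how the †-marked columns could be filled in, here is a minimal Python sketch. It is not the actual BDC script. The function names, the CSV output, and the idea of reading CATEGORY from a top-level "raw" or "processed" directory are all assumptions made for this example. How the real script determines CATEGORY and METADATA is not described here, so METADATA is left empty.

```python
import csv
import hashlib
from pathlib import Path

# Column order taken from the table above.
COLUMNS = ["DOCFILE", "FILEDESC", "FILESIZE", "DIRECTORY", "FORMAT",
           "CATEGORY", "METADATA", "MD5", "URLS"]


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest_rows(project_root):
    """Walk project_root and return one dict per file, keyed by COLUMNS."""
    root = Path(project_root)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root)
        top_level = relative.parts[0]
        rows.append({
            "DOCFILE": path.name,
            "FILEDESC": "",  # not script-populated; written by hand
            "FILESIZE": path.stat().st_size,
            # Files directly under the root get "." as their directory.
            "DIRECTORY": relative.parent.as_posix(),
            "FORMAT": path.suffix.lstrip("."),
            # Assumption: category is implied by a top-level raw/ or processed/ folder.
            "CATEGORY": top_level if top_level in ("raw", "processed") else "",
            "METADATA": "",  # left empty: the rule for locating metadata is not given
            "MD5": md5_of(path),
            "URLS": "",  # determined upon ingest
        })
    return rows


def write_manifest_csv(rows, out_path):
    """Write the rows to a CSV file (hypothetical output format for this sketch)."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling build_manifest_rows("path/to/project") returns the rows, and write_manifest_csv(rows, "manifest.csv") saves them. FILEDESC would still need to be filled in by hand.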


A Microsoft Excel template for the data manifest, populated with example data, is attached and available for download.
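
Before submitting a completed manifest, it may help to check each row against the column rules above. The sketch below is one possible way to do that in Python and is not an official BDC validator. The function names are hypothetical, and it assumes the MD5 value is stored as a 32-character hexadecimal string and that FILESIZE is a non-negative integer. If URLS is only filled in at ingest, pass a shorter tuple of required columns.

```python
import hashlib
import re

# All nine columns are marked required (*) in the table above.
REQUIRED = ("DOCFILE", "FILEDESC", "FILESIZE", "DIRECTORY", "FORMAT",
            "CATEGORY", "METADATA", "MD5", "URLS")
MD5_HEX = re.compile(r"^[0-9a-fA-F]{32}$")  # assumption: hex-encoded digest


def validate_row(row, required=REQUIRED):
    """Return a list of problems found in one manifest row (empty if none)."""
    problems = []
    values = {key: str(row.get(key, "") or "").strip() for key in REQUIRED}
    for column in required:
        if values[column] == "":
            problems.append(f"{column} is required but empty")
    # Format checks apply only to values that are present.
    if "\\" in values["DOCFILE"]:
        problems.append("DOCFILE contains a backslash")
    if values["FILESIZE"] and not values["FILESIZE"].isdigit():
        problems.append("FILESIZE is not a non-negative integer")
    if values["CATEGORY"] and values["CATEGORY"] not in ("raw", "processed"):
        problems.append("CATEGORY is not 'raw' or 'processed'")
    if values["MD5"] and not MD5_HEX.match(values["MD5"]):
        problems.append("MD5 is not a 32-character hex string")
    return problems


def file_matches_md5(path, expected_md5, chunk_size=1 << 20):
    """Recompute a file's MD5 and compare it with the manifest value."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.strip().lower()
```

validate_row returns an empty list for a row that passes every check. file_matches_md5 can be used to confirm that a file still matches the checksum recorded in the manifest.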
