Page Last Updated: October 20, 2025
Data Structure Overview
The HBCD dataset follows NBDC data structure standards established as part of the ABCD Study (see details), which incorporates the Brain Imaging Data Structure (BIDS) wherever possible for cross-study consistency. At a high level, data are organized into two categories:
Tabulated Data
Study instrument data (behavior, biology, and environment), demographics, and select file-based data, organized in a standardized tabular format in which each table contains the data for all participants.
Go to Tabulated Data documentation
File-Based Data
Unlike tabulated data, file-based data come in a variety of formats, often modality-specific. File-based data include:
| Data type | Description |
| --- | --- |
| Raw BIDS | Raw imaging, EEG, and biosensor data converted to the BIDS standard with unaltered signal content |
| Derivatives | Processed imaging, EEG, and biosensor datasets generated by standardized pipelines |
Go to File-Based Data documentation
When possible, tabulated data tables are derived from file-based data (e.g., MRS, MRI, EEG, wearable sensor data) to provide a single file with rows across participants/sessions. Users may choose either the original file-based data or the combined tabulated version, depending on their needs.
Not all processed data are available in tabulated form. Tabulated datasets have one row per participant/session, so only derivatives that can be summarized into a single row/column structure are included. If no tabulated file exists for the derivatives you need, you will need to use the file-based data.
- Tabulated data: one row per participant/session with summary fields.
- File-based data: required for complex, multidimensional, or non-row-summarizable outputs.
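The one-row-per-participant/session shape described above can be checked when a tabulated file is read. The following is a minimal sketch using only the Python standard library. The column names `participant_id`, `session_id`, and `summary_value`, and the toy values, are assumptions made for illustration and are not taken from this page; consult the data dictionary of the table you are reading.

```python
import csv
import io


def index_by_participant_session(tsv_text, id_col="participant_id", ses_col="session_id"):
    """Index the rows of a tabulated TSV by (participant, session).

    The default column names are hypothetical; adjust them to the table at hand.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    index = {}
    for row in reader:
        key = (row[id_col], row[ses_col])
        if key in index:
            # A tabulated table is expected to hold one row per participant/session.
            raise ValueError(f"duplicate row for {key}")
        index[key] = row
    return index


# Toy table with made-up values, not real HBCD data.
toy = (
    "participant_id\tsession_id\tsummary_value\n"
    "sub-0001\tses-V02\t0.12\n"
    "sub-0001\tses-V03\t0.15\n"
    "sub-0002\tses-V02\t0.09\n"
)
rows = index_by_participant_session(toy)
print(len(rows))
print(rows[("sub-0001", "ses-V03")]["summary_value"])
```

Raising on a duplicate key is a design choice for this sketch: it surfaces tables that do not fit the single-row summary shape instead of silently overwriting rows.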
Note that tabulated file names closely mirror the names of their source derivative files for easy cross-reference. For example, the following subject/session-level XCP-D derivatives are combined into a single tabulated file:
| Data type | File name |
| --- | --- |
| File-based derivatives | sub-{ID}_ses-{V0X}_task-rest_dir-PA_run-{X}_space-fsLR_seg_Gordon_stat-alff_bold.tsv |
| Tabulated file | img_xcpd_space-fsLR_seg_Gordon_stat-alff_bold.tsv |
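The naming correspondence in this example can be sketched in a few lines of Python. This is an illustration inferred from the single example above, not a documented rule: it assumes that the per-file entities (`sub-`, `ses-`, `task-`, `dir-`, `run-`) are dropped and a pipeline prefix (`img_xcpd` here) is prepended. The function name `tabulated_name` and the placeholder IDs are hypothetical, and other pipelines may follow different conventions.

```python
# Entities that identify one subject/session/run and are assumed to be
# dropped when files are combined into a single tabulated file.
PER_FILE_ENTITIES = ("sub-", "ses-", "task-", "dir-", "run-")


def tabulated_name(derivative_name, prefix="img_xcpd"):
    """Guess the tabulated file name that mirrors a derivative file name."""
    parts = derivative_name.split("_")
    # str.startswith accepts a tuple of prefixes.
    kept = [p for p in parts if not p.startswith(PER_FILE_ENTITIES)]
    return "_".join([prefix] + kept)


example = "sub-0001_ses-V02_task-rest_dir-PA_run-1_space-fsLR_seg_Gordon_stat-alff_bold.tsv"
print(tabulated_name(example))
```

For the example file, this returns `img_xcpd_space-fsLR_seg_Gordon_stat-alff_bold.tsv`, matching the tabulated file name in the table.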
    hbcd/
    |__ derivatives/      # Derivatives
    |
    |__ rawdata/
        |__ phenotype/    # Tabulated Data
        |__ sub-{ID}/     # Raw BIDS

    hbcd/
    |__ derivatives/      # Processed pipeline derivatives
    |   |__ bibsnet/
    |   |__ hbcd_motion/
    |   |__ made/
    |   |__ mriqc/
    |   |__ nibabies/
    |   |__ osprey/
    |   |__ qmri_postproc/
    |   |__ qsiprep/
    |   |__ qsirecon/
    |   |__ symri/
    |   |__ xcp_d/
    |
    |__ rawdata/
        |__ phenotype/    # Tabulated data (demographics, visit info, behavior, etc.)
        |   |__ par_visit_data.*
        |   |__ sed_basic_demographics.*
        |   |__ {instrument_name}.*
        |
        |__ sub-{ID}/     # Raw BIDS formatted data (MRI, MRS, EEG, biosensors)
        |   |__ sub-{ID}_sessions.tsv
        |   |__ sub-{ID}_sessions.json
        |   |__ ses-<V0X>/
        |       |__ anat/
        |       |__ dwi/
        |       |__ eeg/
        |       |__ fmap/
        |       |__ func/
        |       |__ motion/
        |       |__ mrs/
        |       |__ sub-{ID}_ses-<V0X>_scans.tsv
        |       |__ sub-{ID}_ses-<V0X>_scans.json
        |
        |__ dataset_description.json
        |__ participants.tsv
        |__ participants.json
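As a rough illustration of navigating this layout, the sketch below builds paths with Python's `pathlib`. It assumes that `phenotype/` and `sub-{ID}/` sit under `rawdata/`, as in the tree above. The function names and the placeholder IDs (`0001`, `V02`) are hypothetical, so verify the paths against your local copy of the release.

```python
from pathlib import Path


def session_dir(root, subject_id, session):
    """Path to one session of raw BIDS data, e.g. rawdata/sub-0001/ses-V02."""
    return Path(root) / "rawdata" / f"sub-{subject_id}" / f"ses-{session}"


def phenotype_files(root, table_name):
    """All files for one tabulated table (for example, data plus sidecar)."""
    return sorted((Path(root) / "rawdata" / "phenotype").glob(f"{table_name}.*"))


def modalities(root, subject_id, session):
    """Names of the modality folders (anat, dwi, eeg, ...) present for a session."""
    ses = session_dir(root, subject_id, session)
    if not ses.is_dir():
        return []
    return sorted(p.name for p in ses.iterdir() if p.is_dir())


print(session_dir("hbcd", "0001", "V02").as_posix())
```

Returning an empty list for a missing session keeps the helper usable in loops over participants who did not complete every visit.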