Page Last Updated: May 22, 2026
Naming Conventions
The instrument table and variable names used for tabulated HBCD study data largely follow standardized naming conventions adapted from the ABCD Study. This ensures consistency across instruments and derived datasets, allowing for intuitive parsing of variable meaning and structure.
Convention Logic & Rules
The standard variable naming format comprises 4 or 5 main components separated by single underscores ( _ ). The scale component is present only in the subset of instruments that contain multiple scales:
domain_source_table_{scale}_item
Variable names may also include subcomponents, separated by double underscores ( __ ), to indicate nested components of table, scale, and/or item. Subcomponents distinguish finer details such as subscales, versions, or counter types. Finally, multiselect fields are preceded by triple underscores ( ___ ); this is mainly relevant for Adult & Child Demographics table variables.
Let's break down the following example: ncl_cg_spm2__inf_soc_001
- ncl: Neurocognition & Language (domain)
- cg: Caregiver (source)
- spm2__inf: nested table name
  - spm2: the SPM-2 instrument (table)
  - inf: Infant version of SPM-2 (table subcomponent)
- soc: scale for metrics of socialization (scale)
- 001: item number (item)
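The breakdown above can be sketched in code. The following is a minimal, unofficial illustration, not part of any HBCD tooling: the function name `parse_variable_name` and the returned dictionary layout are hypothetical, and treating everything after a triple underscore as a multiselect suffix is an assumption about how those fields are written. It only handles names in the standard 4- or 5-component format, so the exceptions described later on this page (for example, admin fields that contain extra single underscores) are not handled correctly.

```python
import re

# Matches a lone underscore: not preceded or followed by another underscore.
SINGLE_UNDERSCORE = re.compile(r"(?<!_)_(?!_)")


def parse_variable_name(name):
    """Split a standard-format variable name into its naming components.

    Hypothetical helper for illustration only.
    """
    # Assumption: a triple underscore, if present, introduces a multiselect suffix.
    base, _, multiselect = name.partition("___")

    # Split on single underscores only, leaving double underscores intact.
    parts = SINGLE_UNDERSCORE.split(base)
    if len(parts) == 5:
        domain, source, table, scale, item = parts
    elif len(parts) == 4:
        domain, source, table, item = parts
        scale = None
    else:
        raise ValueError(f"{name!r} does not have 4 or 5 main components")

    # Double underscores separate nested subcomponents within a component.
    return {
        "domain": domain,
        "source": source,
        "table": table.split("__"),
        "scale": scale.split("__") if scale else None,
        "item": item.split("__"),
        "multiselect": multiselect or None,
    }


parsed = parse_variable_name("ncl_cg_spm2__inf_soc_001")
print(parsed["table"])  # ['spm2', 'inf']
```

Splitting with a lookaround pattern rather than `str.split("_")` keeps nested names such as `spm2__inf` together as one component, so the table and its subcomponent can be separated in a second step.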
Naming Component Definitions
Details of individual naming components are as follows:
| Component | Description |
|---|---|
| domain | Data domain, e.g. bio (Biospecimens), img (Imaging) (see values key) |
| source | Either the subject (who the protocol element is about) or the respondent (who completed the assessment). Examples include cg (Caregiver), ch (Child), etc. (see values key) |
| table | Instrument/protocol element name |
| {scale} | Name of the scale within the instrument/protocol element, for instruments with multiple scales (not including administrative/summary score variables). For example, the IBQ-R (VSF)+BI includes 4 scales, each indicated by a separate scale component (e.g. the Behavioral Inhibition scale is indicated by beh in the variable name mh_cg_ibqr_beh_001). |
| item | Either an item number corresponding to an individual question in a scale (e.g. 001) or an admin field/score label for administrative/summary score variables (see details) |
| Domain Values | Description |
|---|---|
| bio | BioSpecimens |
| mh | Behavior/Child-Caregiver Interaction |
| eeg | Tabular EEG |
| img | Tabular Imaging |
| ncl | Neurocognition and Language |
| nt | Novel Tech |
| pex | Pregnancy/Exposure Including Substance |
| ph | Physical Health |
| sed | Social and Environmental Determinants |
| Source Values | Description |
|---|---|
| bm | Biological Mother |
| cg | Caregiver (Responsible Adult) |
| ch | Child |
| ld | Linked Data |
| ra | RA (research assistant) |
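The two value keys above lend themselves to a simple lookup. As a rough sketch, the snippet below copies the domain and source codes from the tables into dictionaries and resolves the first two components of a variable name. The function name `describe_prefix` is hypothetical, and the check assumes the name follows the standard format; names covered by the exceptions listed later on this page will be rejected.

```python
# Codes copied from the Domain Values and Source Values tables.
DOMAINS = {
    "bio": "BioSpecimens",
    "mh": "Behavior/Child-Caregiver Interaction",
    "eeg": "Tabular EEG",
    "img": "Tabular Imaging",
    "ncl": "Neurocognition and Language",
    "nt": "Novel Tech",
    "pex": "Pregnancy/Exposure Including Substance",
    "ph": "Physical Health",
    "sed": "Social and Environmental Determinants",
}

SOURCES = {
    "bm": "Biological Mother",
    "cg": "Caregiver (Responsible Adult)",
    "ch": "Child",
    "ld": "Linked Data",
    "ra": "RA (research assistant)",
}


def describe_prefix(name):
    """Return (domain description, source description) for a variable name.

    Hypothetical helper: raises ValueError if either code is not in the tables.
    """
    # The first two underscore-separated components are domain and source.
    domain, source = name.split("_")[:2]
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain code {domain!r} in {name!r}")
    if source not in SOURCES:
        raise ValueError(f"unknown source code {source!r} in {name!r}")
    return DOMAINS[domain], SOURCES[source]


print(describe_prefix("mh_cg_ibqr_beh_001"))
```

A failed lookup is a cheap way to flag names that do not follow the standard format, such as `sed_basic_demographics`, where the second component is not a source code.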
Administrative and summary score variables use an administrative field or a score label, respectively, in place of the item naming component. Possible values include:
| Variable type | Possible values |
|---|---|
| Admin fields | administration; location; lang; date_taken; candidate_age; gestational_age; adjusted_age |
| Score labels | score; summary_score; total_score; etc. |
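To illustrate how the final component distinguishes variable types, here is a small sketch that classifies a variable name by its ending. It rests on several assumptions: the function name `classify_variable` is hypothetical, the admin field list is copied from the table above, any name ending in `score` is treated as a score label (the table's list ends with "etc.", so the real set is open-ended), and the example names `mh_cg_ibqr_date_taken` and `mh_cg_ibqr_summary_score` are made up for illustration rather than taken from a released table.

```python
# Admin field labels copied from the table above.
ADMIN_FIELDS = (
    "administration",
    "location",
    "lang",
    "date_taken",
    "candidate_age",
    "gestational_age",
    "adjusted_age",
)


def classify_variable(name):
    """Classify a variable name as 'admin', 'score', 'item', or 'unknown'.

    Hypothetical helper for illustration only.
    """
    # Admin fields may themselves contain single underscores (e.g. date_taken),
    # so match on the end of the name instead of splitting on underscores.
    if any(name.endswith("_" + field) for field in ADMIN_FIELDS):
        return "admin"
    # Assumption: score labels all end in "score".
    if name.endswith("score"):
        return "score"
    # Item numbers are the digits after the last single underscore.
    if name.rsplit("_", 1)[-1].isdigit():
        return "item"
    return "unknown"


print(classify_variable("mh_cg_ibqr_beh_001"))
print(classify_variable("mh_cg_ibqr_date_taken"))
```

Matching on the suffix sidesteps the fact that labels such as `date_taken` and `summary_score` contain single underscores and therefore cannot be recovered by splitting the name on `_`.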
Exceptions
Some table/variable names deviate from the standard naming conventions. These exceptions are temporary and will be standardized in future releases. Main exceptions include:
- Demographics domain tables (sed_basic_demographics and par_visit_data)
- Biospecimen domain tables, e.g. bio_bm_biosample_nails_results
- Administrative and summary score variables (e.g. date_taken, summary_score) often contain additional single underscores (see infobox for details)
Tabulated Pipeline Derivatives
Tabulated derivatives from imaging and EEG processing pipelines follow a standardized naming convention:
domain_pipeline_derivative

| Component | Description |
|---|---|
| domain | img (imaging) or eeg (EEG) |
| pipeline | Name of the processing pipeline (e.g. xcpd) |
| derivative | Basename of output files aggregated across participants |
For example, the table name below corresponds to participant data aggregated across the XCP-D derivatives:
- Table: img_xcpd_space-fsLR_seg_Gordon_stat-alff_bold.tsv
- Derivatives: sub-[ID]_ses-[V0X]_task-rest_dir-PA_run-{X}_space-fsLR_seg_Gordon_stat-alff_bold.tsv
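The relationship between the derivative filename and the table name in this example can be sketched as follows. This is an inference from the single example above, not a documented rule: the set of entities dropped (`sub-`, `ses-`, `task-`, `dir-`, `run-`) is an assumption, the function name `table_name` is hypothetical, and the subject ID, session, and run values in the sample filename are placeholders.

```python
# Assumption: these entities are dropped when files are aggregated into one
# table. The set is inferred from the XCP-D example, not from a documented rule.
DROPPED_ENTITIES = ("sub-", "ses-", "task-", "dir-", "run-")


def table_name(domain, pipeline, derivative_filename):
    """Build a domain_pipeline_derivative table name from a derivative filename.

    Hypothetical helper for illustration only.
    """
    tokens = derivative_filename.split("_")
    # Keep only the tokens that describe the derivative itself.
    kept = [t for t in tokens if not t.startswith(DROPPED_ENTITIES)]
    return "_".join([domain, pipeline, *kept])


# Placeholder subject, session, and run values.
example = "sub-001_ses-V02_task-rest_dir-PA_run-1_space-fsLR_seg_Gordon_stat-alff_bold.tsv"
print(table_name("img", "xcpd", example))
# img_xcpd_space-fsLR_seg_Gordon_stat-alff_bold.tsv
```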