This repository contains data, code, and output for a manuscript
entitled “Cognitive deficits and enhancements in youth from adverse
conditions: An integrative assessment using Drift Diffusion Modeling in
the ABCD study”. The manuscript is currently in press at
Developmental Science. To see the original Stage 1 Protocol, go
to https://osf.io/qja9p/.
Supplement:
a supplemental text (to be submitted with the manuscript) documenting
all secondary analyses in detail.
Scripts:
R scripts that read the data, run the analyses, and produce all analysis objects.
Data:
Folder in which real ABCD data can be placed to make the analyses fully
reproducible. Note that we cannot openly share the raw ABCD data in this
repository; the folder therefore contains synthetic data to facilitate
computational reproducibility in the absence of ABCD access.
Codebooks:
lists of variable names, labels, and value labels (where
applicable).
Click on each of the folders to get more details.
Overview of project milestones
Below is an overview of all the project milestones, such as
first-time data access, submissions, and revisions. Data access events
were automatically captured using custom code, which over the course of
this project was collected in the R package projectlog
(https://stefanvermeent.github.io/projectlog/).
For more information about how tracking worked, see the Open
Science Workflow tab.
There are two ways to access and reproduce the project. The first is by
downloading all files locally. You can do this by cloning
the repository or by downloading the repository as a
.zip file. After cloning/downloading, double-click
abcd.Rproj to open the project in RStudio.
The second way is by using Docker. We supply a Docker image
that instantiates a browser-based RStudio environment with all the required
dependencies. We provide more information about how to use the Docker
image below (see 6. Using Docker). In short,
Docker provides a ‘container’—similar to a virtual machine—with all the
project dependencies (R packages, software, operating systems)
pre-loaded in their (historic) versions as they were used when the
project was created. Thus, Docker ensures that scripts will still
function as intended even into the future. Note: It is not
necessary to use Docker to reproduce our results. If you're
cloning the repository within a couple of years of publication and
installing the proper dependencies (see step 1 below), you should be able
to run all scripts without issues. However, some of our scripts might
stop functioning properly over time as dependencies get updated
and/or deprecated (e.g., future versions of RStudio). In that case,
Docker is the solution.
Below is the exact order in which these scripts should be run (regardless of
whether you are using Docker):
1. Loading Dependencies
/scripts/dependencies.R:
Loads (and installs if necessary) all required R packages as they
were on 2022-10-15 using the groundhog package. This
ensures that scripts will continue functioning as intended even after
packages have been updated. groundhog is a more
lightweight, easy-to-use alternative to package dependency managers such
as renv.
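If you are unfamiliar with groundhog, the general pattern looks roughly like the sketch below. The package names here are purely illustrative; the actual list is defined in /scripts/dependencies.R.

```r
# Minimal sketch of the groundhog pattern (illustrative package names only).
install.packages("groundhog")   # only needed the first time
library(groundhog)

pkgs <- c("tidyverse", "lavaan")        # placeholder: not the project's full list
groundhog.library(pkgs, "2022-10-15")   # loads package versions as of this date
```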
2. Data Preparation
The data preparation steps are only required when you are using real
ABCD study data. If you are using the synthetic data, you can skip to
step 3. You should have the following data files/folders in your
data folder (remember to set the data_folder
object in /scripts/dependencies.R
to "data"; a quick check is sketched after the file list below):
abcd_tbss01.txt: File containing Cognitive summary
scores (used for Participant ID verification).
abcd_tbi01.txt: File containing TBI data.
abcd_lmtlb01: Folder containing the Little Man Task
Data.
abcd_tb_tlb01: Folder containing the NIH Toolbox
Cognitive Data.
acspsw03.txt: File containing Family IDs (used for
making sure that siblings end up in the same subset).
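Before running the preprocessing, a quick check along these lines can confirm that everything is in place. The data_folder assignment mirrors the instruction above; the file and folder names are the ones listed here.

```r
# Hedged sketch: verify that the expected ABCD files/folders are present.
data_folder <- "data"   # as set in /scripts/dependencies.R

expected <- c(
  "abcd_tbss01.txt",  # cognitive summary scores
  "abcd_tbi01.txt",   # TBI data
  "abcd_lmtlb01",     # Little Man Task folder
  "abcd_tb_tlb01",    # NIH Toolbox folder
  "acspsw03.txt"      # family IDs
)

missing <- expected[!file.exists(file.path(data_folder, expected))]
if (length(missing) > 0) {
  stop("Missing from the data folder: ", paste(missing, collapse = ", "))
}
```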
After downloading the above files/folders, run the preprocessing
scripts in the following order:
1_data_prep/1_preprocessing.R:
Reads and binds individual data files, and fixes participant IDs where
necessary. Note that this script may take several hours to
complete.
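For orientation only, reading and binding the raw files typically follows a pattern like the sketch below. This is not the project's actual preprocessing code; it assumes the NDA-style layout of ABCD .txt files, i.e., tab-delimited with a second row of variable descriptions that must be dropped.

```r
library(readr)
library(dplyr)
library(purrr)

data_folder <- "data"

# Sketch: read one NDA-style ABCD text file, dropping the description row.
read_abcd <- function(file) {
  read_tsv(file, col_types = cols(.default = "c")) |>
    slice(-1)   # the second row holds variable descriptions, not data
}

# Sketch: read and bind all files in a folder (e.g., the Little Man Task folder).
lmt_files <- list.files(file.path(data_folder, "abcd_lmtlb01"),
                        pattern = "\\.txt$", full.names = TRUE)
lmt_data  <- map_dfr(lmt_files, read_abcd)
```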
3. Main Analyses
4_sem_training.R:
Iteratively fits the SEM model to the training set and saves the
results.
5_sem_test.R:
Fits the final SEM model to the test set.
6_exploratory.R:
Exploratory (i.e., not pre-registered) analyses.
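The train/test logic behind 4_sem_training.R and 5_sem_test.R can be sketched roughly as follows. The model syntax and variable names are placeholders, not the model from the manuscript; the point is only the pattern of developing the model on the training half and refitting the final model on the held-out test half (train_data and test_data stand for the two subsets created during data preparation).

```r
library(lavaan)

# Placeholder model syntax; the actual SEM is defined in the analysis scripts.
model <- "
  speed =~ v_task1 + v_task2 + v_task3
"

# Develop the model on the training subset, then refit it once on the test subset.
fit_train <- sem(model, data = train_data, missing = "fiml")
fit_test  <- sem(model, data = test_data,  missing = "fiml")

summary(fit_test, fit.measures = TRUE, standardized = TRUE)
```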
4. Manuscript
registered_report/scripts/staging.R:
Reads the relevant analysis objects generated by the analysis scripts and
combines them into a single list used by the manuscript.
registered_report/registered_report.qmd:
Loads the staged results list and knits together the manuscript with all
statistics, tables, and figures computed in previous scripts.
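The staging step follows a simple pattern, sketched below with illustrative object and file names (the actual names are defined in the scripts themselves):

```r
# Hedged sketch of a staging step: collect saved analysis objects into one list.
# File names are illustrative; the actual objects are produced by the analysis scripts.
staged_results <- list(
  sem_train = readRDS("output/sem_training.rds"),
  sem_test  = readRDS("output/sem_test.rds")
)

saveRDS(staged_results, "registered_report/staged_results.rds")
```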
5. Supplement
supplement/scripts/staging_suppl.R:
Reads the relevant analysis objects generated by the analysis scripts and
combines them into a single list used by the supplemental materials.
supplement/supplemental_materials.qmd:
Loads the staged results list and knits together the written supplement
with all tables and figures computed by the staging script above.
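If you prefer rendering from the R console rather than the RStudio Render button, the quarto package can be used; a minimal sketch:

```r
library(quarto)

# Render the manuscript and the supplement after all analysis and staging scripts have run.
quarto_render("registered_report/registered_report.qmd")
quarto_render("supplement/supplemental_materials.qmd")
```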
6. Using Docker
If you want to create a Docker container in which to run the scripts,
follow these steps:
1. If on Windows, open PowerShell. If on Mac, open the Terminal
through 'Applications > Utilities > Terminal'.
2. On the command line, type:
docker run --rm -d -e PASSWORD=my_password -p 8787:8787 stefanvermeent/abcd_ddm
3. Open a browser window and enter localhost:8787 as the URL.
4. You will be redirected to an RStudio login page. As the
username, type rstudio. As the password, type
my_password, unless you changed it in the docker run command above.
5. You should now see an RStudio environment with the required
dependencies pre-installed and loaded.