ADHD and attentional control: Impaired segregation of task positive and task negative brain networks

In children with attention deficit hyperactivity disorder (ADHD), difficulty maintaining task focus may relate to the coordinated, negatively correlated activity between brain networks that support the initiation and maintenance of task sets (task positive networks) and networks that mediate internally directed processes (i.e., the default mode network). Here, resting-state functional connectivity MRI between these networks was examined in ADHD, across development, and in relation to attention. Children with ADHD had reduced negative connectivity between task positive and task negative networks (p = 0.002). Connectivity between these networks continued to become more negative throughout development (7–15 years of age) in children with ADHD (p = 0.005). Regardless of group status, females had increased negative connectivity (p = 0.003). With regard to attentional performance, the ADHD group had poorer signal detection (d′) on the continuous performance task (CPT) (p < 0.0001), more so on easy than on difficult d′ trials (p < 0.0001). The reduced negative connectivity in children with ADHD was also related to their attention: increased negative connectivity was associated with better performance on the d′ measure of the CPT (p = 0.008). These results highlight and strengthen prior reports underscoring the role of segregated system integrity in ADHD.

To ensure data quality, we applied the following filters. First, a block was considered valid only if the child made no more than 50% false alarms on the "easy" non-target trials in that block. Next, average scores were computed across all valid blocks for hit rate, omissions, false alarms on "easy" non-target trials, and false alarms on "difficult" or "catch" trials. To be included in the final analyses, children were required to have an average of >10% correct hits and <90% false alarms on "easy" non-target trials.
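The filters above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the dictionary keys (`hit_rate`, `omissions`, `fa_easy`, `fa_catch`), the function names, and the example values are hypothetical, and rates are assumed to be stored as proportions between 0 and 1.

```python
from statistics import mean

def block_is_valid(block):
    """A block is valid when 'easy' non-target false alarms do not exceed 50%."""
    return block["fa_easy"] <= 0.50

def summarize_child(blocks):
    """Average each score over the valid blocks only; None if no block is valid."""
    valid = [b for b in blocks if block_is_valid(b)]
    if not valid:
        return None
    keys = ("hit_rate", "omissions", "fa_easy", "fa_catch")
    return {k: mean(b[k] for b in valid) for k in keys}

def child_is_included(summary):
    """Inclusion rule: average hits > 10% and average 'easy' false alarms < 90%."""
    return (
        summary is not None
        and summary["hit_rate"] > 0.10
        and summary["fa_easy"] < 0.90
    )

# Hypothetical child with three blocks; the second block fails the 50% rule.
blocks = [
    {"hit_rate": 0.80, "omissions": 0.20, "fa_easy": 0.10, "fa_catch": 0.30},
    {"hit_rate": 0.60, "omissions": 0.40, "fa_easy": 0.70, "fa_catch": 0.50},
    {"hit_rate": 0.90, "omissions": 0.10, "fa_easy": 0.05, "fa_catch": 0.20},
]
summary = summarize_child(blocks)
included = child_is_included(summary)
```

Because the second block is dropped before averaging, the summary reflects only the first and third blocks.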
Human Connectome Project (HCP) cohort: Participants for the HCP-related analyses were obtained from the HCP consortium "500 Subject release". These data are publicly available on the Human Connectome Project database (https://db.humanconnectome.org). Of these subjects, we included 61 healthy control subjects (22-35 years of age, 26 males), who were selected based on their optimal data quality and low motion (at least 800 frames remained after motion scrubbing). All included subjects were unrelated. HCP data were acquired on a 3T Siemens Skyra optimized to achieve 100 mT/m gradient strength. All data were corrected to account for the nonlinearities associated with the high gradient and the displacement of the isocenter in this optimized system. For further details see the HCP 500 Subjects + MEG2 Data Release: Reference Manual (WU-Minn, 2014) and Glasser et al. (2013). Two separate T1-weighted images were acquired and averaged, with TR = 2400 ms, TE = 2.14 ms, TI = 1000 ms, FA = 8°, and ES = 7.6 ms. Two T2-weighted images were acquired and averaged, with TR = 3200 ms and TE = 565 ms. T1-weighted and T2-weighted images were acquired with a voxel resolution of 0.7 mm (isotropic). Resting state BOLD data were acquired using a gradient-echo echo-planar imaging (EPI) sequence with 2 mm isotropic voxels, TR = 720 ms, TE = 33.1 ms, and a multiband acceleration factor of 8. The HCP dataset included two sequential days of scanning, with two resting state scans acquired on each day. These EPI scans were acquired in both the left-to-right and right-to-left acquisition directions. To maximize the amount of data, all four datasets (both sessions, day 1 and day 2, and both acquisition directions) were processed as described above, and timecourses were concatenated before construction of correlation matrices.
MRI data processing OHSU sample: Data were processed using the pipelines from the Human Connectome Project (Glasser et al., 2013), which include the use of FSL (Jenkinson, Beckmann, Behrens, Woolrich, & Smith, 2012; Smith et al., 2004) and FreeSurfer tools (Desikan et al., 2006; Fischl & Dale, 2000; Fischl, Sereno, & Dale, 1999). Briefly, gradient distortion corrected T1-weighted and T2-weighted volumes were first aligned to the MNI AC-PC axis and then non-linearly normalized to the MNI atlas. The T1w and T2w volumes were then re-registered using boundary based registration (Greve & Fischl, 2009) to improve alignment, and the brain was segmented using recon-all from FreeSurfer. Segmentations were improved by using the enhanced white matter-pial surface contrast of the T2-weighted sequence. The BOLD data were corrected for field distortions (using FSL's TOPUP) and given a preliminary 6 degrees of freedom linear registration to the first frame. After this initial alignment, the average frame was calculated and used as the final reference. Next, the BOLD data were registered to this final reference and to the T1-weighted volume in a single step, by concatenating all the individual registrations into a single registration. Strict motion correction procedures were applied to the resting state functional maps: volumes with a framewise displacement (FD) (Fair et al., 2012; Power, Barnes, Snyder, Schlaggar, & Petersen, 2012) exceeding 0.2 mm were excluded, and only subjects with more than 4 minutes of remaining motion-free data were included in this analysis. To ensure that the same amount of data was used for all subjects, 73 motion-free frames were randomly selected to construct each scan's covariance matrix.
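The censoring and frame-selection rule at the end of this paragraph can be illustrated as follows. This is a sketch under stated assumptions, not the pipeline's implementation: the function name `select_frames`, the synthetic FD trace, the random seed, and the repetition time of 2.5 s are all hypothetical (the text does not state the TR of the OHSU scans).

```python
import random

FD_THRESHOLD_MM = 0.2   # from the text: volumes exceeding 0.2 mm are excluded
MIN_SECONDS = 4 * 60    # from the text: more than 4 minutes must remain
N_FRAMES = 73           # from the text: frames randomly drawn per scan

def select_frames(fd, tr_seconds, rng):
    """Return sorted indices of N_FRAMES randomly chosen low-motion frames,
    or None when the scan does not retain enough low-motion data."""
    kept = [i for i, value in enumerate(fd) if value <= FD_THRESHOLD_MM]
    too_short = len(kept) * tr_seconds <= MIN_SECONDS
    if too_short or len(kept) < N_FRAMES:
        return None
    return sorted(rng.sample(kept, N_FRAMES))

# Synthetic FD trace (hypothetical): every fifth frame is a high-motion frame.
fd_trace = [0.5 if i % 5 == 0 else 0.1 for i in range(200)]
frames = select_frames(fd_trace, tr_seconds=2.5, rng=random.Random(0))
```

Passing an explicit random generator keeps the draw reproducible, which matters when the same 73 frames must be recoverable later.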
Surface registration. The cortical ribbon defined by the structural T1-weighted and T2-weighted volumes is used to define a high resolution mesh, which is used for surface registration of the BOLD data. This cortical ribbon is also used to quantify the partial contribution of each voxel in the BOLD data to the surface registration. Timecourses in the cortical mesh are calculated as the weighted average of the voxels neighboring each vertex within the grid, where the weights are given by the average number of voxels wholly or partially within the cortical ribbon. Voxels with a high coefficient of variation, indicating difficulty with tissue assignment or the presence of large blood vessels, are excluded. Next, the resulting timecourses in this mesh are downsampled into a standard space of anchor points (grayordinates), which were defined in the brain atlas and mapped uniquely to each participant's brain, after smoothing with a 2 mm full-width-at-half-maximum Gaussian filter. Subcortical regions are treated and registered as volumes. Two thirds of the grayordinates are vertices located in the cortical ribbon, while the remaining grayordinates are subcortical voxels.
Nuisance regression. Additional preprocessing consists of regressing out the average grey matter, ventricle, and white matter signals, together with the movement between frames estimated from the six image alignment parameters (three translations and three rotations) on the current and the previous TR and their squares, which corresponds to the Volterra series expansion of motion (Friston, Williams, Howard, Frackowiak, & Turner, 1996; Power et al., 2014, 2012). The regression coefficients (beta weights) are calculated solely from frames with low movement, but the regression is applied to all frames to preserve temporal order in the data for filtering in the time domain. Next, timecourses are filtered using a first order Butterworth band pass filter to preserve frequencies between 9 and 80 mHz.
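Two ideas in this paragraph lend themselves to a short sketch: the expansion of six motion parameters into 24 regressors, and fitting beta weights on low-motion frames while removing the fit from every frame. The code below is an illustration only. The function names are hypothetical, the previous-TR values of the first frame are assumed to be zero, and the regression is reduced to a single regressor with an intercept for brevity, whereas a full pipeline would fit all nuisance regressors jointly.

```python
def volterra_expansion(motion):
    """motion: one list of 6 alignment parameters per frame.
    Returns 24 regressors per frame: current values, previous-TR values,
    and the squares of both (previous values of frame 0 assumed zero)."""
    expanded = []
    for t, current in enumerate(motion):
        previous = motion[t - 1] if t > 0 else [0.0] * len(current)
        row = list(current) + list(previous)
        row += [value * value for value in row]
        expanded.append(row)
    return expanded

def residualize(signal, regressor, keep):
    """Fit intercept and slope on the frames listed in `keep` only,
    then subtract the fitted values from all frames."""
    xs = [regressor[i] for i in keep]
    ys = [signal[i] for i in keep]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [s - (intercept + slope * r) for s, r in zip(signal, regressor)]

# Hypothetical data: frame 3 is a high-motion frame carrying an artifact.
motion = [[0.1 * t] * 6 for t in range(5)]
regressors = volterra_expansion(motion)
nuisance = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [1.0, 3.0, 5.0, 12.0, 9.0]
cleaned = residualize(signal, nuisance, keep=[0, 1, 2, 4])
```

Excluding the high-motion frame from the fit keeps its artifact from biasing the beta weights, while the output still has one value per original frame, so temporal filtering can follow.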
HCP sample: For HCP subjects, BOLD data were denoised using ICA-FIX, a tool that uses independent component analysis (ICA) to account for nuisance signals and covariates. ICA-FIX automatically removes artifactual or "bad" components. Briefly, each voxel's timecourses from 25 HCP subjects were decomposed into 229 spatial components. Of these, on average 24 components were hand-classified as "good" and the remainder as "bad". Next, a classifier was trained to identify "good" and "bad" components. Once the classifier was optimized (by leave-one-subject-out cross validation), it was used to identify the "bad" components from each participant. Such components were removed by regressing the "bad" components (timecourses) out of the timecourses on each grayordinate. In addition to ICA-FIX, BOLD data were further denoised by regressing out the whole brain signal. All other processing techniques were identical between the HCP and OHSU cohorts.

Motion and whole brain regression
This study is the first to examine negative connectivity patterns in ADHD after recent realizations by the field of the critical importance of motion correction (Grayson et al., 2016; Power et al., 2013, 2012; T. Satterthwaite et al., 2012). To ensure that our findings were not driven by differences in head motion, we used multiple preprocessing techniques aimed at eliminating motion effects, including motion regression, motion censoring based on framewise displacement, and whole brain regression. Here, we used a motion cutoff of FD < 0.2 mm to maximize the amount of quality data while excluding motion-related artifacts in our sample. While motion is always a concern, particularly in hyperkinetic and developmental samples, the likelihood that our results are related to motion artifacts is low, given that the final sample was matched on average framewise displacement.
As an additional quality control measure, whole brain regression (WBR) was used. Outside of unique data collection circumstances, WBR has repeatedly been shown to reduce noise unlikely to be related to neural activity, remove cardiac and respiratory signals known to correlate with the global signal, and reduce motion artifact (Grayson et al., 2016; K Murphy & Fox, 2016; Power et al., 2012; T. D. Satterthwaite et al., 2013; Schölvinck, Maier, Ye, Duyn, & Leopold, 2010). That said, we recognize that there is still controversy regarding the use of WBR. While prior work has highlighted the benefits of this procedure (Grayson et al., 2016; Keller et al., 2013; Miranda-Domínguez et al., 2014; Power et al., 2014), others disagree (Gotts et al., 2013). On the one hand, it is clear that WBR reduces noise (Grayson et al., 2016; Power et al., 2012; T. D. Satterthwaite et al., 2013), and it is one of the few methods able to correct for non-spatially dependent artifacts. Further, after WBR the structure of the negatively correlated networks is maintained in its spatial distribution and cross-subject consistency (Fox, Zhang, Snyder, & Raichle, 2009). On the other hand, WBR can artificially induce some low probability negative correlations (Kevin Murphy, Birn, Handwerker, Jones, & Bandettini, 2009). However, in our prior work we confirmed that the rank order of the strongest negative and positive correlations does not overtly change with and without the use of WBR (Miranda-Domínguez et al., 2014). In addition, the strongest negative correlations have been validated with non-MR measures of brain activity (Keller et al., 2013). The current study examines only the strongest negative correlations, which have the highest probability of being true, biologically relevant negative correlations rather than correlations artificially induced by the use of WBR.
In total, our preprocessing and analyses ensure that the cleanest motion-free data were used to calculate connectivity on only the most robust negative connections between networks.

Secondary motion censoring methods
It has recently become clear that links between connectivity and behavior may be influenced by subject head motion even after stringent motion correction and censoring methods (Siegel et al., 2016). Therefore, we ensured that 1) connectivity patterns were not related to subject head motion, and 2) brain-behavior correlations were not related to motion. The first step involved ensuring that average negative connectivity was not correlated with subjects' head motion, measured by mean remaining FD. Next, we calculated the influence of motion on the functional connectivity-behavior relationship, as described in full elsewhere (Siegel et al., 2016). In brief, this required ensuring that there was no correlation between the mean FD-connectivity relationship and the observed brain-behavior relationships. This was calculated as a "correlation of correlations", that is, the correlation between (a) the correlation of mean FD with each connection within the network mask and (b) the CPT-connectivity correlation for each connection. We found modest relationships between head motion and these two parameters (see supplementary results). Therefore, to ensure that our results were not influenced by subject motion, they were tested in multiple ways: 1) with and without the removal of 30 scans whose behavior and/or connectivity patterns were related to head motion (mean FD), and 2) with and without mean FD as a covariate in the linear mixed effects models.
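The "correlation of correlations" can be sketched as below. This is a minimal illustration of the idea as described in this section, not the procedure of Siegel et al. (2016) in full: the function names and the small example arrays are hypothetical, and Pearson correlation is assumed at both levels.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

def correlation_of_correlations(conn, mean_fd, behavior):
    """conn[s][c] is the connectivity of scan s at connection c.
    For every connection, correlate connectivity with mean FD and with
    behavior across scans, then correlate the two resulting vectors."""
    n_connections = len(conn[0])
    fd_r, behavior_r = [], []
    for c in range(n_connections):
        values = [row[c] for row in conn]
        fd_r.append(pearson(values, mean_fd))
        behavior_r.append(pearson(values, behavior))
    return pearson(fd_r, behavior_r)

# Hypothetical data: 5 scans, 3 connections inside the network mask.
conn = [
    [-0.50, -0.30, -0.40],
    [-0.45, -0.35, -0.50],
    [-0.40, -0.40, -0.35],
    [-0.38, -0.42, -0.45],
    [-0.30, -0.50, -0.40],
]
mean_fd = [0.05, 0.08, 0.10, 0.12, 0.15]
d_prime = [2.1, 1.4, 1.9, 1.2, 1.6]
motion_influence = correlation_of_correlations(conn, mean_fd, d_prime)
```

A value near zero suggests that the connections most related to motion are not systematically the ones most related to behavior.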

Supplementary Results: Main effects without secondary motion correction
Our main analyses removed 30 scans whose connectivity patterns were driving a modest relationship between head motion and functional connectivity, as well as between head motion and subject age. After their removal, our analyses passed the motion criteria defined by Siegel and colleagues (Siegel et al., 2016); see Figure S1 for these scans. Without the exclusion of these subjects, our effects were conserved. ADHD subjects had less negative connectivity (p = .005), and the age relationship (p = .015) and sex effect (p = .021) remained significant. The group by age effect remained a trend (p = .09). CPT relationships were also significant with these subjects included, for both the d-prime difficult condition (t = -2.86, p = .004) and the CPT easy condition (t = -3.21, p = .001), and the relationship remained stronger in ADHD subjects than in controls, with a significant interaction between d-prime difficult and group status (p = .035).

Connection level effects
To examine whether age, group, and sex effects were seen at the level of individual connections between the task positive and task negative networks, a similar set of analyses computed a linear mixed effects model for each connection between networks. Of the 250 connections between networks, 24 were significantly under-connected in ADHD (FDR corrected p < .05), 15 showed significant age effects (greater negative connectivity with increasing age), and 20 were more negatively connected in females. No effects (FDR corrected) were in the opposite direction (i.e., greater negative connectivity in ADHD than in controls). Each of these connections can be seen in Supplementary Table 4.
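Correcting one p-value per connection for the false discovery rate can be sketched as follows. The text states only that p-values were FDR corrected; the Benjamini-Hochberg step-up procedure used here is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical p-values for the group effect at four connections.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adjusted_p) if p < 0.05]
```

In the actual analysis the list would hold one p-value per connection (250 in total) for a given effect, so the correction is applied separately to the group, age, and sex effects.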

Supplementary Figures
Supplementary Figure 1. Scans removed from the main analyses due to a relationship between motion and negative connectivity. Average negative connectivity between networks is shown on the x axis, and remaining framewise displacement (FD) for each scan is shown on the y axis. The 30 scans that contributed to the motion-connectivity relationship were removed (red). With the scans in red included, the FD-connectivity correlation was r = .13 (p < .0001); without them, it was r = .06 (p = .10). Main effects (i.e., group, age, and sex effects, and CPT relationships) are conserved with the inclusion of these scans. The scan with a negative connectivity value below -1 (Fisher-transformed correlations) was not removed in the main analyses because it did not influence this connectivity-motion relationship; however, the exclusion of this data point (this scan was from an ADHD subject) does not alter the significance of any results.

Supplementary Tables
Supplementary Table 1. Number of ROIs per task positive-task negative network mask. Column 1. The number of ROIs in each network, as defined by Gordon et al. (2016). Column 2. Number of significantly negative connections (average R < -.35) between task negative (default) and task positive network regions, defined in an independent set of adults. Note that the majority of negative connections are from the default to the cingulopercular network. Column 3. Number of significantly negative connections between the default network and all other brain regions. Note that only two additional regions outside the task positive network (visual and supplementary motor) are now included in this network mask.
Supplementary Table 2. Alternative thresholds for defining the network mask. An independent sample of adults was used to define the connections that were most negatively correlated with the default mode network. Main analyses considered connections from the default mode network with R < -.35 and excluded 30 scans whose connectivity was correlated with motion (mean FD). Alternative analyses are shown at various thresholds, with and without the removal of the 30 scans.
Supplementary Table 3. Statistical models predicting average negative connectivity (Y). Model 1. Used for the main analyses to test main effects of age, group, and gender. Model 2. Used to test the age by group interaction. Model 3. Omnibus model testing the influence of all interactions between average connectivity and demographic variables. All models included mean remaining FD at each scan.
[Model equations: symbols lost in extraction. Recoverable structure: 1) Y = an intercept plus four predictor terms plus a final term; 2) the terms of model 1 plus one additional term; 3) the terms of model 2 plus four additional terms.]
Supplementary Table 4. Group, age, and sex effects of average negative connectivity between task positive and task negative (default) brain systems. ROIs and network labels are defined by the Gordon parcellation (Gordon et al., 2014). All p-values represent the FDR corrected significance for each main effect in a linear mixed effects model including all three parameters (model 1).
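The network-mask definition referred to in Supplementary Tables 1 and 2 can be sketched as below. This is an illustration under stated assumptions: the ROI labels and correlation values are hypothetical, and averaging Fisher-transformed correlations within the mask is assumed from the mention of Fisher-transformed values in Supplementary Figure 1.

```python
from math import atanh

MASK_THRESHOLD = -0.35  # from the text: connections with average R < -.35

def build_mask(adult_mean_r, threshold=MASK_THRESHOLD):
    """adult_mean_r maps (default_roi, other_roi) to the group-average r
    in the independent adult sample; keep the most negative connections."""
    return {pair for pair, r in adult_mean_r.items() if r < threshold}

def average_negative_connectivity(scan_r, mask):
    """Mean Fisher z (atanh of r) over the masked connections of one scan."""
    return sum(atanh(scan_r[pair]) for pair in mask) / len(mask)

# Hypothetical ROI labels and values, for illustration only.
adult_mean_r = {
    ("default_1", "cingulopercular_1"): -0.40,
    ("default_1", "cingulopercular_2"): -0.20,
    ("default_2", "cingulopercular_1"): -0.36,
}
mask = build_mask(adult_mean_r)
scan_r = {
    ("default_1", "cingulopercular_1"): -0.50,
    ("default_1", "cingulopercular_2"): 0.10,
    ("default_2", "cingulopercular_1"): 0.00,
}
scan_average = average_negative_connectivity(scan_r, mask)
```

Defining the mask in an independent adult sample keeps the selection of connections separate from the child data in which group, age, and sex effects are tested.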