Evolution of brain network dynamics in neurodevelopment

Cognitive function evolves significantly over development, enabling flexible control of human behavior. Yet, how these functions are instantiated in spatially distributed and dynamically interacting networks, or graphs, that change in structure from childhood to adolescence is far from understood. Here we applied a novel machine-learning method to track continuously overlapping and time-varying subgraphs in the brain at rest within a sample of 200 healthy youth (ages 8–11 and 19–22) drawn from the Philadelphia Neurodevelopmental Cohort. We uncovered a set of subgraphs that capture surprisingly integrated and dynamically changing interactions among known cognitive systems. We observed that subgraphs that were highly expressed were especially transient, flexibly switching between high and low expression over time. This transience was particularly salient in a subgraph predominantly linking frontoparietal regions of the executive system, which increases in both expression and flexibility from childhood to young adulthood. Collectively, these results suggest that healthy development is accompanied by an increasing precedence of executive networks and a greater switching of the regions and interactions subserving these networks.

Image Acquisition All subject data were acquired on the same scanner (Siemens Tim Trio 3 Tesla, influence of subject motion 3,4 . The first 4 volumes of the functional timeseries were removed to al-often reported as the mean squared coherence between time series over a large frequency interval. 84 Similarly, the wavelet coherence is the mean squared coherence estimated over a large frequency 85 interval in the wavelet domain 37 . As such, the wavelet coherence has a minimum possible value 86 of 0 and a maximum possible value of 1, in contrast to a Pearson correlation coefficient which has 87 a minimum possible value of -1. We chose to apply a coherence measurement over a correlation 88 measurement based on prior work demonstrating its usefulness in the context of fMRI neuroimag-89 ing data 38 , and based on the parsimony of network modeling methods that deal with positive-only 90 edge weights 35,39 , necessary for the non-negative matrix decomposition procedure. 91 The fully weighted adjacency matrix A raw therefore represents the functional brain network 92 for a given subject in which network nodes represent brain regions and network edges represent 93 functional connections between those regions. Following 40 , we divide each layer of a subject's 94 functional connectivity matrix by the average value of that layer to obtain the normalized functional 95 connectivity matrix A l = A raw l <A raw l > which we use for the remainder of our analysis. 96 To further assure that individual differences in motion did not drive our functional connectiv-97 ity findings, we regress motion out of each element in the functional connectivity matrix. Finally, 98 note that following these preprocessing choices, we do not apply any arbitrary thresholds to the 99 functional connectivity matrix; instead, the structure of the full connectivity matrix is explored in 100 an effort to remain sensitive to small variation in connectivity patterns tracking with development. 101

Supplementary Results
Matrix decomposition optimization parameters Given a functional connectivity matrix X, 103 non-negative matrix factorization factors the matrix into a basis matrix of subgraphs, W , and a 104 matrix of time-dependent coefficients, H, such that X ≈ W H. In order to determine the opti-105 mal decomposition of the functional connectivity matrix X, we performed a grid-search procedure 106 over a parameter k, that sets the number of subgraphs, and β, that controls the sparsity of the time-107 coefficient matrix H. We computed the reconstruction error, defined as ||X − W H|| F , over all 108 pairs of k and β. Based on our finding that RSS is more dependent on k than β, we assessed the 109 average RSS over all β for each k. Intuitively, the addition of each additional subgraph progres-110 sively decreases the explanatory power of that subgraph, yielding an RSS curve that decays with 111 increasing k. Based on the first-order difference of RSS versus k, we selected k = 10, a cutoff 112 which has also been selected in prior literature 41,42 . Next, we selected β = 10 −2.0 by considering 113 the minimum of the curve obtained from plotting the reconstruction error against β, while fixing 114 k = 10. The minimum point ensures that we obtain a reconstruction that is as close as possible to 115 the original matrix (Fig 1).    Subgraphs obtained from matrix decomposition From the NMF decomposition of the func-117 tional connectivity matrix, we obtained a set of 10 subgraphs. In the main text, we investigate the 118 cognitive systems expressed in each subgraph. Here, we present the same subgraphs across 264 119 brain regions rather than 13 cognitive systems. We observed that the subgraphs captured varying 120 interactions among brain regions. Some subgraphs captured distributed interactions across the en-121 tire network, while other subgraphs captured fairly localized interactions. These results suggest that the brain consists of a complex landscape containing both regions with distributed activity 123 patterns and regions with more localized activity (Fig. 2). NMF to the concatenated functional connectivity matrix, ordered according to decreasing average expression. We normalized the colorbar to between 0 and 1. We also show the temporal coefficient weights for a representative child subject (in blue) and a representative young adult subject (in green).
Alternate metrics of temporal stability To measure the changes in subgraph expression using 125 the temporal coefficients, we computed the entropy of the coefficients for each subject and each 126 subgraph. Intuitively, entropy is related to the unpredictability of information context. Thus, sub-127 graphs that have a greater tendency to change in expression should have higher entropy (i.e. their 128 time courses are more unpredictable), and subgraphs that are stable in expression should have 129 low entropy. However, because the entropy measure is based on the distribution of the temporal 130 coefficients, it is not dependent on the precise ordering of the temporal coefficients. As an alterna-131 tive method of measuring the changes in subgraph expression over time, we computed a temporal 132 derivative metric, which is defined as the absolute value of the first order difference between adja-133 cent temporal coefficients. This metric directly captures the changes in subgraph expression levels 134 over time. We observed that the entropy metric was highly correlated to the temporal derivative 135 metric (Pearson's correlation coefficient r = 0.99, p < 0.001). This suggests that, while entropy is 136 purely based on the distribution of the temporal coefficients, the underlying distribution is related 137 to the fact that there is temporal structure, as measured by the temporal derivative (Fig. 3  ing to temporal derivative, and one statistic per subject corresponding to entropy. We observed 148 that the temporal derivative is significantly higher in the permuted signals than the real signals 149 (t 398 = −80.59, p < 0.001). However, because entropy is a histogram based estimator, permuting 150 the order of the signals does not affect its underlying distribution. Thus, there was no difference 151 between the permuted signals and the real signals (t 398 = 0, p = 1); Fig. 4).

152
A. ations in the temporal coefficients and random fluctuations, we constructed temporal null models by permuting temporal coefficients. We note that this leads to significant changes in the temporal derivative, but it does not change the entropy due to the distribution-based nature of the entropy metric.
Robustness of cognitive systems We next asked whether the connectivity patterns of the cognitive 153 systems we observed using 10 subgraphs would change significantly if the number of subgraphs 154 were increased or decreased. Applying the same matrix decomposition, we first selected k = 8 155 subgraphs rather than k = 10 subgraphs (Fig. 5). Similar to the main text, we ordered the sub-156 graphs in decreasing average expression. We observe that the structure of the subgraphs remains 157 largely similar to the structure of the original 10 subgraphs. Moreover, we observe similar neu-158 rodevelopment effects in the first subgraph capturing brain regions involved in executive function.

159
Next, we applied the same procedure using k = 12 (Fig. 6), and arranging subgraphs in order 160 of decreasing average expression. At this point, we observe that the executive subgraph referred 161 to in the main text breaks apart into multiple subgraphs. However, the structure of other subgraphs 162 remains consistent between k = 12 and k = 10.

163
We then selected k = 45 (Fig. 7), arranging subgraphs in order of decreasing average 164 expression. We observe some similarities in connectivity patterns, but also breakage of the original 165 10 subgraphs into multiple subgraphs. Moreover, we also observe a number of noisy and less 166 coherent subgraphs, particularly as the subgraph number increases. We note the the ordering of 167 the subgraphs differs than that of the main text: this may be due to the splitting of the original 10 168 subgraphs into multiple subgraphs, such that each of the split subgraphs has a lower weight in the 169 temporal coefficients than the original joint subgraph.

170
Lastly we aimed to the the robustness of the results to changes in the parameter β. Recall 171 that β controls the sparsity of the time-dependent coefficients matrix H. To that end, we performed the same matrix decomposition procedure as described in the main text, but set the parameter 173 β = 0 (Fig. 8). We obtain similar subgraph structure, however the ordering of the subgraphs has 174 changed. We note that modifying the parameter β redistributes the temporal weights, increasing 175 the magnitude of the temporal coefficients for the first subgraph while decreasing the magnitude 176 of temporal coefficients for the remaining subgraphs.  Figure 6: Matrix decomposition using 12 subgraphs. We computed system-wide connectivity over 12 subgraphs. We normalized the colorbar to between 0 and 1. We observe that some subgraphs retain a similar structure as the original 10 subgraphs, while some subgraphs (namely Subgraph 1 in the main text), breaks apart into multiple subgraphs.
Neurodevelopmental effects in the remaining subgraphs In the main text, we hypothesized that  Table 5. 193 We note that while computation memory limits the size of a concatenated functional connec-  Because cognitive abilities increase markedly as youth develop, it is possible that these re-206 sults could be driven simply by neurodevelopment, and not be a direct marker of individual dif-207 ferences in cognition. To determine whether the relationship between the temporal derivative of 208 the executive subgraph coefficients and the accuracy score was independent of age effects, we re-209 gressed out the effects of both motion and age on the cognitive scores and on the temporal weights. 210 We then computed a null distribution of Pearson's correlation coefficients between the temporal 211 derivative and overall accuracy scores, using 1000 permutations of the temporal weights. From 212 this non-parametric permutation approach, we observed a significant difference between the ob-  Custom Templating In order to avoid registration bias and maximize sensitivity to detect regional 227 effects that can be impacted by registration error, a custom template was created with ANTs 47 ; T1 228 images were normalized to this population-specific template space using the top-performing SyN 229 diffeomorphic registration method implemented in ANTs 48 . Structural images were then pro-230 cessed with 'antsCorticalThickness' 49 , which uses the custom template to guide brain extraction,

231
N4 bias correction 50 , and Atropos probabilistic tissue segmentation 51 . This custom template was 232 used in combination with the following motion censoring approach.

233
Motion censoring As high motions measurements may affect functional connectivity metrics, 234 we repeated the same analyses as describe in the main text after motion censoring was applied. In the main text, we note that the first subgraph obtained from matrix composition captured 244 a number of cognitive systems involved in executive function. After motion censoring, as well as 245 using a custom template, we note a similar pattern of interactions emerges in the first subgraph. 246 We observed a visually similar looking first subgraph, and a significant correlation between these 247 two patterns of interaction (Pearson's correlation coefficient r = 0.53, p < 0.001). Furthermore, 248 we observe similar trends in the energy of the first subgraph, with energy higher in the group 249 of young adults than the group of children (Wilcoxon rank-sum test z = −3.08, p = 0.002).

250
While we observe higher entropy in the group of young adults than the group of children, this 251 difference is no longer significant (Wilcoxon rank-sum test z = −1.10, p = 0.27), and may be due 252 to changes in the underlying distribution of the temporal coefficients after high-motion volumes 253 are discarded. We note that a limitation to the motion censoring procedure is the changes in the