From static to temporal network theory – applications to functional brain connectivity

Network neuroscience has become an established paradigm to tackle questions related to the functional and structural connectome of the brain. Recently, there has been a growing interest to examine the temporal dynamics of the brain's network activity. While different approaches to capture fluctuations in brain connectivity have been proposed, there have been few attempts to quantify these fluctuations using temporal network theory. Temporal network theory is an extension of network theory that has been successfully applied to the modeling of dynamic processes in economics, social sciences and engineering. The objective of this paper is twofold: (i) to present a detailed description of the central tenets and outline measures from temporal network theory; (ii) apply these measures to a resting-state fMRI dataset to illustrate their utility. Further, we discuss the interpretation of temporal network theory in the context of the dynamic functional brain connectome. All the temporal network measures and plotting functions described in this paper are freely available as a python package Teneto.


Introduction
It is well-known that the brain's large-scale activity is organized into networks. The underlying organization of brain's infrastructure into networks, at different spatial levels, has been dubbed the brain's functional and structural connectome (1,2). Functional connectivity, derived by correlating the brain's activity over a period of time, has been successfully applied in both fMRI (3,4,5,6) and MEG Although temporal network theory has been successfully applied in others fields, (e.g. social sciences), its implementation in network neuroscience is limited. In this paper, we first provide an introduction to temporal network theory by extending the definitions and logic of static network theory. Thereafter, we define temporal network measures. Finally, we apply these measures to a resting-state fMRI dataset acquired during eyes open and eyes closed conditions, revealing differences in dynamic brain connectivity between conditions.

From static networks to temporal networks
We begin the introduction to temporal network theory by expanding upon the definitions of network theory. In network theory, a graph (G) is defined as a set of nodes and edges: V is a set containing N number of nodes. E is a set of tuples that represents edges or connections between each pair of nodes (i, j) : i, j ∈ V. The graph may have binary edges (i.e. an edge is either present or absent) or it may be weighted, often normalized between 0 and 1, to represent the magnitude of connectivity.
When each edge has a weight, the definition of E is extended to a 3-tuple (i, j, w) where w denotes the weight of the edge between i and j.
E is often represented as a matrix of the tuples, which is called a connectivity matrix, A (sometimes the term "adjacency matrix" is used). An element of the connectivity matrix A i,j represents the degree of connectivity for a given edge. When G is binary, A i,j = {0, 1} and in the weighted version A i,j = w. In the case of A i,j = A j,i , for all i and j, the matrix is considered undirected and, when this is not the case, directed.
One appeal of network theory is the range of topics it can encompass. A set of nodes can be a group of people, a collection of cities, or an ensemble of brain regions. Each element in the nodal set can represent vastly different things in the world (e.g. a person: Ashely; a city: Gothenburg; or a brain region: the left thalamus). Likewise, edges can represent a range of different types of connections between their respective nodes (e.g. friendship, transportation or neural communication). Regardless of what the network is modeling, many different properties regarding the patterns of connectivity between nodes can be quantified, examples being centrality measures, hub detection, small world properties, clustering and efficiency (see 47, 2, 48 for detailed discussions).
It is important to keep in mind that a graph is only a representation of some state of the world being modeled. The correspondence between the graph model and the state of the world may decrease due to aggregations, simplifications and generalizations. Adding more information to the way nodes are connected can entail that G provides a better representation, thus increasing the likelihood that subsequently derived properties of the graph correspond with the state of the world being modeled. One simplification made in eq. 1 is that two nodes can be connected by one edge only. Such a simplification may not be appropriate for all questions. For example, if we wanted to model multiple transportation links between several cities (e.g. rail, road, and air), we will need multiple types of edges if these different transportation links are of importance. In some circumstances, the type of link is irrelevant. For example, when studying the spread of a disease, the only relevant information needed might be that people can move between cities. However, if we are managing shipping routes where different transportation routes correspond to different prices or shipping time, then this might be important information. Similarly, networks (e.g. social networks) develop and change over time. In social networks, friendships are started and can end.
To capture such additional information in the graph, edges need to be expressed along additional non-nodal dimensions. We modify eq. 1 to: where D is a set of the additional non-nodal dimension. In the case of multiple additional dimensions, D is a class of disjoint sets where each dimension is a set. Eq. 2 is sometimes referred to as a multigraph. For example, D could be a set containing three transportation types {"rail", "road", "air"} or a temporal index {"2014","2015","2016"}.
In a static graph, the edges E are elements that contains pairs (2-tuples) and thus represent connections between two nodes. In a multigraph, E consists of (|D| + 2)-tuples for binary graphs, where |D| expresses the number of sets in D.
Here, the addition of two is because of the original two node indexes, which defines the edge in eq. 1. If time is the only dimension in D, then an element in E is a triplet (i, j, t) : i, j ∈ V, t ∈ D. When G is weighted, E contains (2 × |D|) + 2)-tuples as w becomes the size of D, representing one weight per edge.
While some of the measures presented here are valid for all types of multigraphs, time is a special case since it comprises of an ordered set. There is no loss of information if {"rail", "road", "air"} becomes {"road","air","rail"}. However, when D contains temporal information, the order is crucial. Thus, a temporal network is when a multigraph has an ordered set in D that represents time.
From the description above, it is evident that multiple sets can be included in a multigraph. For example, consider a set of three cities. Let D be ({"rail","road","air"},{"2014","2015","2016"}). Here, each edge is expressed in a multigraph by a 4-tuple which indexes the two cities, transportation type, and year. Analogously, it is conceivable that a detailed network description of the human connectome may include information regarding an edge's presence in time, frequency and task context. Thus, a multigraph representation of the connectome may, for example, include information on edges that are conditional to (i) a delay of 100ms after stimulus onset (time), (ii) presence of gamma oscillations (frequency), and (iii) the presence of an n-back exercise (task context). In such complex multigraphs, the temporal network measures presented in this paper can be used to examine relationships across time, but it will require fixing or aggregating over the other dimensions. However, more complex measures that consider all non-nodal dimensions have been proposed elsewhere (e.g. 49).
For the remaining parts of this paper we will only consider the case when D contains an ordered set of temporal indexes and when G is binary and undirected. In this case, each edge is indexed by i,j and t. To facilitate readability, connectivity matrices are written as A t i,j , i.e. with the temporal index of D in the superscript.
We can illustrate the concept of temporal networks by a simple example that models the evolution of friendship between three individuals (Ashley, Blake and Casey). A friendship between two of them is represented by an edge ( Figure  1A ). We then add an additional dimension, time, to the graph which includes the following temporal indexes: D ={2014,2015,2016}. Let us now presume that in 2014, only Ashley and Blake knew each other. In 2015, Ashley became friends with Casey. In 2016, Blake and Casey also became friends. The temporal evolution of friendships among a small social network can be projected onto a slice graph representation ( Figure 1B). Here, it becomes apparent that Casey is the person in this small social network that shows the largest change in friendship over time, a property which cannot be depicted in a static graph. Figure 1: A slice graph example of friendship in a small social network evolving over three years. Each person (node) is a row with a circle at each year (the non-nodal dimension). Lines display represent friendship (edges) between the nodes.

5
The small social network in Figure 1B shows snapshots of connectivity through time. At each time point, there is a connectivity matrix which is sometimes called a graphlet (50,40,41). A graphlet is a complete two dimensional connectivity matrix, akin to static networks, but each graphlet is only a part of the entire temporal network. It useful to describe what type of graphlet that is being used. For example we call a graphlet expressing time for a time-graphlet or t-graphlet (41). Likewise, a graphlet expressing frequency information has been called a frequency graphlet or f-graphlet (40).
Instead of representing the data with multiple graphlets, E can be used to derive the contact sequence containing the nodes and temporal index (51). Unlike the graphlet representations, which must be discrete, contact sequences can also be used on continuous data and, when connections are sparse, a more memory efficient way to store the data.

Measures for temporal networks
Once the t-graphlets have been derived, various measures can be implemented in order to quantify the degree and characteristics of the temporal flow of information through the network. We begin by introducing two concepts which are used in several of the temporal network measures that will be defined later. The focus is on measures that derive temporal properties at a local level (i.e. per node or edge) or a global level (see Discussion for other approaches). We have limited our scope to describe only the case of binary, undirected and discrete tgraphlets, although many measures can be extended to continuous time, directed edges, and non-binary data.

Concept: Shortest temporal path
In static networks, the shortest path is the minimum number of edges (or sum of edge weights) that it takes for a node to reach another node. In temporal networks, a similar measure can be derived. Within temporal networks, we can quantify the time taken for one node to reach another node. This is sometimes called the "shortest temporal distance" or "waiting time".
Consider the temporal network shown in Figure 2A. Starting at time point 1, the shortest temporal path for node 1 to reach node 5 is 5 time units ( Figure 2B, red line). Here, only one edge was traveled per time point. However, deriving the shortest temporal path becomes a more complex task when considering that multiple edges can be traveled per time point. For example, node 5 at time 2 can reach node 3 in one time step, if multiple edges are allowed to be traveled ( Figure 2C, red line). If multiple edges cannot be traveled, then the shortest path for node 5 to reach node 2, starting at time point 2, is 5 time units ( Figure  2C, blue line).
When calculating the shortest temporal path, a parameter must be set that restrains how many edges per time point that can be traveled. This parameter will depend on the temporal resolution of the data and is best chosen given previously established knowledge of the dynamics of the data. For fMRI, where the temporal resolution is in seconds, it makes sense to assume that several edges can be traveled per unit of time. Contrarily, in MEG where the resolution are in the range of milliseconds, it is more reasonable to place a limit on the number of edges that can be traveled per time unit.
Regarding the shortest temporal path, it is useful to keep in mind that the path is rarely symmetric, not even so when the t-graphlets themselves are symmetric. This is illustrated by considering the network shown in Figure 2A for which it takes 5 units of time for node 1 to reach node 5 when starting at t = 1. However, for the reversed path, it only takes 3 units of time for node 5 to reach node 1 (allowing for multiple edges to be traveled per time point).

Concept: Inter-contact time
The inter-contact time between two nodes is defined as the temporal difference for two consecutive non-zero edges between those nodes. This definition differs from the shortest temporal paths in so far as it only considers direct connections between two nodes. Considering Figure 2A, the inter-contact times between nodes 4 and 5 become a list [2,2] as there are edges present at time points 2, 4 and 6. Each edge will have a list of inter-contact times. The number of inter-contact times in each list will be one minus the amount non-zero edges between the nodes. Unlike the shortest temporal paths, graphs that contain inter-contact times will always be symmetric.

Nodal measure: Temporal centrality
Akin to degree centrality in the static case, where the sum of edges for a node is calculated, a node's influence in a temporal network can be calculated in a similar way. The difference from its static counterpart is that we additionally sum the number of edges across time. Formally, temporal degree centrality, D T , for a node i is computed as where T is the number of time points, N is the number of nodes and A t i,j is a graphlet.
While providing an estimate of how active or central a node is in a temporal network, temporal degree centrality does not quantify the temporal order of the edges. This is illustrated in Figure 3, where node 3 and node 2 have identical temporal degree centrality despite having very different temporal ordering of their edges.

Nodal measure: temporal closeness centrality
A centrality measure that considers the temporal order is temporal closeness centrality (52). This is an extension of closeness centrality which is the inverse sum of the shortest temporal. Temporal closeness centrality is calculated by the average of the shortest temporal paths between two nodes d τ as where d τ i,j is the average shortest path between nodes i and j. Like its static counterpart, if a node has shorter temporal paths compared to other nodes, it will have a larger temporal closeness centrality.
Consider the example given in Figure 3 that shows a temporal sequence of connectivity among three nodes over 20 time points. Note that the temporal degree centrality is identical for both node 2 and node 3, while degree centrality for node 1 is twice as large. Node 2 has the largest temporal closeness centrality since the time between edges for node 2 are longer than for node 3, which has the lowest value of temporal closeness centrality.

Edge measure: Bursts
Bursts have been identified using temporal network theory as an important property for many processes in nature (53,54,55,56,57). A hallmark of a bursty edge is the presence of multiple edges with short inter-contact times, followed by longer and varying inter-contact times. In statistical terms such a process is characterized with a heavy tailed distribution of inter-contact time probabilities. Numerous patterns of social communication and behaviour have been successfully modeled as bursty in temporal network theory, including email communication (58,53), mobile phone communication (59), spreading of sexually transmitted diseases (60), soliciting online prostitution (61), and epidemics (62). With regard to network neuroscience, we have recently shown that bursts of brain connectivity can be detected in resting-state fMRI data (41). Furthermore, bursty temporal patterns have also been identified for the amplitude of the EEG alpha signal (63,64,65).
There are several strategies to quantify bursts. A first indication of whether a time series of brain connectivity between two nodes is bursty or not is simply to plot the distribution of inter-contact times. Thus, the complete distribution Figure 3: A slice graph representation of a simple example of a temporal network that illustrates the conceptual difference between temporal degree centrality and temporal closeness centrality.
of τ for a given edge contains information of the temporal evolution of brain connectivity. However, there are other methods available to quantify bursts. One example is the burstiness coefficient (B) first presented in ref. (66) and formulated for discrete graphs in ref. (51): where τ ij is a vector of inter-contact times between nodes i and j through time.
When B > 0, it is an indication that the temporal connectivity is bursty. This occurs when the standard deviation σ(τ ) is greater than the mean µ(τ ). In eq. 5, bursts are calculated per edge which can be problematic when having limited data. Functional imaging sessions must be long enough in order to be able to accurately establish whether a given temporal distribution is bursty or not (too few inter-contact times will entail a too poor estimation of σ to accurately estimate B). Typically, for resting state fMRI datasets acquired during rather short time spans (5-6 minutes) with low temporal resolution (typically 2-3 seconds), it might be difficult to quantify B in a single subject. A potential remedy in some situations is to compute B after concatenating inter-contact times across subjects.
Eq. 5 calculates the number of bursts per edge. This can easily be extended to a nodal measure by summing over the bursty coefficients across all edges for a given node. Alternatively, a nodal form of B can be calculated by using all the inter-contact times for all j, instead of averaging over j in B ij . Finally, if a process is known to be bursty, instead of quantifying B, it is possible to count the number of bursts present in a time series.

Global measure: Fluctuability
While centrality measures provide information about the degree of temporal connectivity and bursts describe the distribution of the temporal patterns of connectivity at a nodal level, one might additionally want to retrieve information about the global state of a temporal network. To this end, fluctuability aims to quantify the temporal variability of connectivity. We define fluctuability F as the ratio of the number of present edges in A over the grand sum of A t where U is a function that delivers a binary output as follows: U (A i,j ) is set to 1 if at least one of an edge occurs between nodes i and j across time t = 1, 2, ..., T .
If not, U (A i,j ) is set to zero. This can be expressed mathematically as: where T is the total number of time points and A has at least one non-zero edge. From the definition given in eq. 6 it follows that the maximum value of F is 1 and this value only occurs when every edge is unique and occurs only once in time.
While the above definition of fluctuability may seem counter-intuitive, it is an adequate measure to quantify the temporal diversity of edges. F reveals how connectivity patterns within the network fluctuates across time. To see this, consider the networks shown in Figures 4A and 4B, for which there are two edges present at each time point. There are only three different unique edges in Figure 4A, meaning that the sum of U is 3 for the network shown in Figure  4A. However, there is a greater fluctuation in edge configuration for the network shown in Figure 4B and all six possible edges are present (entailing that the sum of U is equal 6). Since both networks have in total 24 connections over time, it becomes easy to see that the network shown in Figure 4B has a twice as large value of F compared to the network shown in Figure 4A.
Notably, fluctuability is insensitive to the temporal order of connectivity. For example, the networks depicted in Figures 4B and 4C have the same fluctuability, despite having a considerably different temporal orders of edge connectivity. Thus, fluctuability can be used as an indicator of the overall degree of spatial diversity of connectivity over time.
The definition of fluctuability can be changed to work on a nodal level. To achieve this, the summation in eq. 6 is applied over only one of the nodal dimensions. Note that for nodes with no connections at all, the denominator will be 0 and, to circumvent this hindrance, nodal fluctuability F N i is defined as:

Global measure: Volatility
One possible global measure considering the temporal order is to quantify how much, on average, connectivity between consecutive t-graphlets changes. This indicates how volatile the temporal network is over time. Thus, volatility (V ) can be defined as: where D is a distance function and T is the total number of time points. The distance function quantifies the difference between a graphlet at t with the graphlet at t + 1. In all the following examples in this paper for volatility use Hamming distance as it is appropriate for binary data.
Whereas there was no difference in fluctuability between the networks shown in Figures 4B and 4C, there was a difference in volatility, as the network in Figure  4B has more abrupt changes in connectivity compared to the network shown in Figure 4C.
Extensions of the volatility measure are possible. Similar to fluctuability, volatility can be defined at a local level. A per edge version of volatility can be formulated as: Additionally, taking the mean over j in V L i,j would give an estimate of volatility centrality.

Global measure: Reachability Latency
Measures of reachability focus on estimating the time taken to "reach" nodes in a temporal network. In ref. 67, both reachability ratio and reachability time are used. The reachability ratio is the percentage of edges that have a temporal path connecting them. The reachability time is the average length of all temporal paths. However, when applying reachability to the brain, the two aforementioned measures are not ideal given the non-controversial assumption that any region in the brain, given sufficient time, can reach all other regions.
With this assumption in mind, we define a measure of reachability, reachability latency that quantifies the average time it takes for a temporal network to reach an a priori defined reachability ratio. This is defined as: where d t i is an ordered vector of length N of the shortest temporal paths for node i at time point t. The value k represents the rN th element in the vector, which is the rounded product of the fraction of nodes that can be reached, r, with N being the total number of nodes in the network.
In the case of r = 1, (i.e. 100% of nodes are reached), eq. 11 can be rewritten as: Eq. 12 has been referred to as the temporal diameter of the network (68). If eq. 12 is modified and calculated per node instead of averaging over nodes, it would be a temporal extension of node eccentricity.
Unless all nodes are reached at the last time point in the sequence of recorded data, there will be a percentage of time points from which all nodes can not be reached. This effectively reduces their value of R as d t i,j cannot be calculated by R is still normalized by T . If this penalization is considered too unfair, it is possible to normalize R by replacing T with T * , which is the number of time points where d T i,j has a real value.

Global measure: Temporal Efficiency
A similar concept is the idea of temporal efficiency. In the static case, efficiency is computed as the inverse of the average shortest path of all nodes. Temporal efficiency is first calculated at each time point as the inverse average shortest path length for all nodes. Subsequently, the inverse of average shortest path lengths are averaged across time points to obtain an estimate of global temporal efficiency, which is defined as Although reachability and efficiency estimate similar temporal properties, since both are based on the shortest temporal paths, the global temporal efficiency may result in different results than the reachability latency. This is because efficiency is proportional to the average shortest temporal path and reachability to the longest shortest temporal path to reach r percent of the network. Similar to the case of static graphs, temporal efficiency can also be calculated on a nodal as well as a global level.

Summary of temporal network measures
In Table 1 we provide a brief summary of the temporal network measures outlined here, accompanied with a short description. We also signify which measures that are sensitive to temporal order.

Statistical considerations of temporal network measures
When implementing temporal graph measures, it is important to perform adequate statistical tests in order to infer differences between subject groups, task conditions, or chance. For group comparisons, non-parametric permutation methods are advantageous where the group assignment of the calculated measure can be shuffled between the groups and a null-distribution can be created.
Alternatively, to justify that a measure is significantly present above chance levels, construction of null graphs is needed. There exist multiple ways to create temporal null graphs and they each have their own benefits and drawbacks.
One method is to permute the temporal order of entire time series, but this will destroy any auto-correlation present in the data. Another alternative is to permute the phase of the time series prior to thresholding the t-graphlets. A third option would be to permute blocks of time series data. We refer the reader to ref. (51)  Applying temporal network measures onto fMRI data.
In the second part of the paper we now turn our attention to the application and interpretation of temporal network measures when applied to neuroimaging data. This is done by applying the measures outlined above onto an fMRI dataset. . We used data only from the 2nd or 3rd session, which were the eyes open (EO) and second eye closed session (EC), where the order was counterbalanced across subjects. Two subjects were excluded due to incomplete data. Further details regarding the scanning procedure are given in 69.
All resting state fMRI data was pre-processed using Matlab (Version 2014b, Mathworks, Inc.), CONN (70) and SPM8 (71) Matlab toolboxes. Functional imaging data was realigned and normalized to the EPI MNI template as implemented in SPM. Spatial smoothing was applied using a Gaussian filter kernel (FWHM = 8 mm). Additional image artifact regressors attributed to head movement (72,73) were derived by using the ART toolbox for scrubbing (http://www.nitrc.org). Signal contributions from white brain matter, cerebrospinal fluid and head-movement (6 parameters), and the ART micro-movement regressors for scrubbing, were regressed out from the data using the CompCor algorithm (74, the first 5 PCA components were removed for both white matter and CSF). After regression, the data was band-passed between 0.008-0.1 Hz, as well as linearly detrended and despiked. Time series of fMRI brain activity were extracted from 264 regions of interest (spherical ROIs with a 5mm radius) using a parcellation scheme of the cortex and subcortical structures described in 11. These 264 regions of interest are further divided into 10 brain networks as described in 75.

Creating time-graphlets (t-graphlets)
While there are many proposed methods for dynamic functional connectivity (76, 37, 69, 77, 39, 41), we chose a weighted correlation strategy (described below) as it does not require fitting any parameters or clustering.
Our logic is to calculate the dynamic functional brain connectivity estimates based on a weighted Pearson correlation. To calculate the conventional Pearson correlation coefficient, all points are weighted equally. In the weighted version points contribute differently to the coefficient depending on what weight they have been assigned. This weight is then used to calculate the weighted mean and weighted covariance to estimate the correlation coefficient. Using a unique weighting vector per time point we were able to get unique connectivity estimates for each time-point.
The weighted Pearson correlation between the signals x and y is defined as r(x, y; w) = Σ x,y;w Σ x,x;w Σ y,y;w (14) where Σ is the weighted covariance matrix and w is a vector of weights that is equal in length to x and y. The weighted covariance matrix is defined as where n is the length of the time series. Note that Σ is the covariance matrix and n i is a sum over time points. µ x;w and µ y;w are the weighted means, defined as Eqs. 14-16 define the weighted Pearson coefficient with the exception of the weight vector w. If every element in w is identical, we can easily observe that the unweighted (conventional) Pearson coefficient will be calculated. Here, we instead wished to calculate a unique w for each time point, providing a connectivity estimate based on the weighted mean and weighted covariance.
Different weighting schemes could be applied. In fact, many of the different dynamic connectivity methods proposed in the literature are merely different weighting schemes (e.g. a non-tapered sliding window approach is just a binary weight vector). Broadly speaking, a weighting scheme between two ROIs can consider these two time series in isolation (local weighting) or, alternatively, consider every ROI's time series (global weighting).
We decided upon the global weighting scheme, where we calculate the multivariate distance between a time-point and all other nodes. This entails that the weights for the covariance estimates at t are larger for other time points that display a similar global spatial pattern across all nodes to the nodes at t. A new weight vector is calculated for each time-point. With a unique weight vector per time-point, there is a unique weighted Pearson correlation per time-point. This reflects the weighted covariance where time-points with similar global spatial brain activation are weighted higher. This produces, for each edge, a connectivity time series with fluctuating covariance.
More formally, the weights for estimating the connectivity at time t are derived by taking the distance between the activation of the ROIs at t and each other time point (indexed by v): where D is a distance function and y is the multivariate time series of the ROIs. The weight vector of t is created by applying eq. 17 for all v ∈ T, v = t. This implies that at the time point of interest, t, we calculate a vector of weights (indexed by v) that reflects how much the global spatial pattern of brain activity (i.e. all ROIs) differ in brain activity from t. For each weight vector w t , the values were scaled to range between 0 and 1. Finally, w t t (undefined in eq. 17) is then set to 1. For the distance function, we used Euclidean distance (i.e. ). After the derivation of the connectivity time series, a Fisher transform and a Box Cox transform were applied. For the Box Cox transform the λ parameter was fit by taking maximum likelihood after a grid-search procedure through -5 and 5 in 0.1 increments for each edge. Prior to the Box Cox transformation, the smallest value was scaled to 1 to make sure the Box Cox transform performed similarly throughout the time series (78). Each connectivity time series was then standardized by subtracting the mean and dividing by the standard deviation. Binary t-graphlets were created by setting edges exceeding two standard deviations to 1, otherwise 0.
Our thresholding approach to create binary connectivity matrices is suboptimal and could be improved upon in future work (see Discussion). The need to formulate more robust thresholding practices has been an ongoing area of research in static network theory in the neurosciences (79). Similar work needs to be carried out for temporal networks as a limitation of the current approach is a heightened risk of false positive connections.
An example of a binary slice graph representation of dynamic fMRI brain connectivity is shown in Figure 5 where the connectivity within the visual subnetwork was computed using the weighted Pearson correlation method in a single subject (31 nodes belonging to the visual brain network).

Tools for temporal network theory
We have implemented all temporal network measures described in the present work in a Python package of temporal network tools called Teneto (http://www.github.com/wiheto/teneto) for python 3.x, although the package itself is still under development. The package currently contains code for all the measures mentioned above and plotting functions for slice plots. Data formats for both the t-graphlet and contact sequence data representation are available.

Statistics
All between group comparisons in the next section use the between group permutation method outlined previously. Null distributions were created with 100000 permutations, shuffling which group each subject's EO/EC results belonged to and all comparisons were two tailed. For between subject comparisons a Spearman rank correlations were used.

Applying temporal degree centrality and temporal closeness centrality
With temporal centrality measures we can formulate research questions along the following lines: (i) which nodes have the most connections through time (temporal degree centrality), or (ii) which nodes have short temporal paths to all other nodes (temporal closeness centrality). For the shortest paths calculations, we allowed all possible steps at a single time point to be used in this example.
The node centrality, averaged over subjects, was calculated for both centrality measures. The centrality estimates for nodes were compared across imaging sessions to evaluate whether there was a similar temporal pattern across subjects. The temporal degree centrality was correlated significantly for the EO and EC conditions ( Figure 6A, ρ=0.35, p<0.0001). A similar, but slightly weaker trend was observed for temporal closeness centrality ( Figure 6B, ρ=0.62, p<0.0001). This entails that nodes appear to have similar centrality properties for the EO and EC resting state conditions. Although both centrality measures showed between session correlations, there was no consistent relationship between the two measures. The temporal degree centrality was correlated against the temporal closeness centrality. No significant relation was observed in the EO session ( Figure 6C, ρ=0.09, p=0.15), and a negative relation for the EC session ( Figure 6D, ρ=0.45, p<0.0001). This results suggest that the two different temporal centrality measures identify different nodes in the brain as being central. Figure 6: Applying temporal degree centrality and temporal closeness centrality for the EO and EC conditions. Each dot represents centrality for a node. (A) Temporal degree centrality for the EO condition versus the EC condition. (B) Same as A, but for temporal closeness centrality. (C) Temporal degree centrality versus temporal closeness centrality in the EO condition. (D) same as C, but for EC. *** signifies p<0.001

Applying burstiness
By applying the burstiness measure (B) to a fMRI dataset we can ask questions related to the temporal distribution of connections in brain connectivity. To illustrate that there is indeed a bursty pattern of brain connectivity, we first plot the distribution of all inter-contact times taken from all subjects and edges for the EO session and observe a heavy tailed distribution ( Figure 7A).
We then considered the question regarding the most robust way to calculate B, given that our example fMRI dataset has a rather low temporal resolution and only span a limited time period. It is possible that there may not be enough edges present in each subject for a stable estimate of B in a single subject. To test this concern, we evaluated whether there was a difference in B for a single subject versus the case of concatenating the inter-contact times over multiple subjects. This was done for a single edge that connects right posterior cingulate cortex and right medial prefrontal cortex in the EO session. As shown in Figure 7B, there is a considerable variance in the individual subject estimates of burstiness. But if we cumulatively add subjects, the estimate of burstiness stabilizes after approximately 12 subjects. This illustrates the importance of having enough data to calculate reliable B estimates. Henceforth, all B estimates have been calculated by pooling inter-contact times over subjects.
We then wished to contrast EO versus EC in terms of burstiness. Both conditions showed a bursty distributions across all edges (see Figure 7C) and slightly more so for the eyes closed compared to the eyes open condition. Both withinand between-network connectivity showed a bursty distribution of connectivity patterns in both conditions (Figures 7DE).
Given that both EO and EC showed bursty correlations, we tested whether values of B correlated between conditions ( Figure 7F). We found a weak, but significant, correlation between conditions (ρ = 0.066, p<0.0001). This weak correlation (i.e. less than one percent of the between-condition variance is accounted for) suggests that burstiness may relate to task related edges but more research is needed on this topic.

Applying fluctuability
Using the fluctuability measure, researchers may ask questions regarding how many unique edges exist in a temporal network model of the dynamic functional brain connectome, indicating whether more resources (i.e. diversity of connections) are required during a given task.
The fluctuability measure was applied and contrasted between the EO and EC conditions ( Figure 8A) and between-subjects ( Figure 8B). We observed no significant between-subject correlation in F (ρ= 0.18, p=0.23) but found a difference for the average value of F between conditions (p=0.0020), with the EO condition having a higher degree of fluctuability. Thus, the EO condition

Applying volatility
With volatility, we can ask whether the connectivity changes more quicker or slower through time. Some tasks might require the subject to switch between different cognitive faculties or brain states while other tasks may require the brain to be more stable and switch states less.
As with fluctuability, we computed volatility for both between-subjects ( Figure  8C) and between-conditions ( Figure 8D). We observed a significant correlation for between-subject volatility over the two conditions (ρ = 0.46, p=0.0012, Figure  8C) was obtained. Additionally, no significant difference in volatility between EO and EC was observed (p=0.051, Figure 8D).

Applying reachability latency
The measure of reachability latency addresses the following question regarding the overall connectivity pattern along the temporal axis: how long does it take to travel to every single node in the temporal network? For example, reachability latency may be useful for evaluating the dynamics when functional or structural connectomes differ substantially. We computed the reachability latency by setting r=1 (i.e. all nodes must be reached).
The results are shown in Figure 9 and a significant difference in average reachability latency between conditions was found ( Figure 9A, EO: 21.07 EC: 22.96, p=0.0005). Given that there was an overall increase in reachability latency during EC compared to EO, we decided to, post hoc, unpack this finding and check whether the discovered global difference in reachability could be localized to brain networks that should differ between the EC and EO conditions. So, rather than calculating reachability latency for the entire brain, we averaged the measure of reachability latency (to reach all nodes) for 10 preassigned brain networks (technically modules in network theory terminology). In this post hoc analysis, we see that the brain networks with the highest differences in reachability latency were the visual, dorsal attention, and the fronto-parietal brain networks ( Figure 8B). Thus, the results suggests a longer reachability latency for these networks, i.e. it takes more time to reach all nodes in the visual and attention networks in the EC condition compared to EO, seems biologically reasonable.
Despite this between-condition differences in reachability, we observed that there was also a significant between-subject relationship (ρ=0.36, p=0.015, Figure  8C). Taken together with the previous finding, our results show that measures of reachability latency reflects both between-conditions and between-subjects differences.

Applying temporal efficiency
Finally, we computed the global temporal efficiency for both EO and EC conditions. While reachability latency employs the shortest temporal path to calculate how long it takes to reach a certain percentage of nodes, temporal efficiency relates to the average inverse of all shortest temporal paths.
We found that temporal efficiency is significantly larger during EO than EC (p=0.0011, Figure 10A). This finding suggests that, on average, the temporal paths are shorter in the EO condition compared to the EC condition. We  Figures 10BC). This suggests that there is a relationship between the longest shortest temporal path (part of reachability) and the average shortest temporal path (part of efficiency) in this dataset.

Discussion
Our overarching aim in this work was to provide an overview of the key concepts of temporal networks for which we have introduced and defined temporal network measures that can be used in studies of dynamic functional brain connectivity. Additionally, we have shown the applicability of temporal metrics in network neuroscience and provided results that pertain to their characteristics by applying them onto a resting-state fMRI dataset.

Summary of applying temporal network measures to fMRI data
Both temporal degree centrality and closeness centrality were correlated across conditions whereas no correlation between the two centrality measures was observed. This result suggests that the two centrality measures quantify different dynamic properties of the brain. At a global network level, we examined the temporal uniqueness of edges (fluctuability) as well as the rate of change of connectivity (volatility). We could identify a significant condition-dependent difference in fluctuability, but no difference was observed for volatility between conditions. Conversely, a significant between-subjects correlation was found for temporal network volatility, but the between-subjects correlation in fluctuability not significant. These results suggest that the observed differences in volatility, i.e. differences in brain connectivity at two different points in time, were to a relatively larger extent driven by inter-subject differences in connectivity dynamics than by differences related to the tasks (EO/EC) per se.
Our results regarding reachability latency during EO and EC conditions indicates task driven changes in latency, especially since the connectivity of visual and attention brain networks are known to reconfigure between EO and EC conditions (80). Thus, the observed difference in reachability latency might be a reflection of a putative network reconfiguration. Further, reachability also showed a between-subject correlation across conditions. The distribution of inter-contact time points of connectivity between brain nodes is bursty is in agreement with our previous findings (41). Notably, our previous findings were obtained at a high temporal resolution (TR = 0.72 seconds) and it is therefore reassuring that we are able to detect similar properties of burstiness in brain connectivity also at a lower temporal resolution (TR = 2 seconds). Of note, the between-network versus within-network connectivity here differed from that obtained in a previous study which found between-network connectivity to be significantly more bursty than within-network connectivity (41). This difference is probably due to the different kind of thresholding being applied. Here a variance based thresholding was applied instead of the magnitude based in the previous study. We have discussed previously that these different strategies will prioritize different edges (81,78).

Other approaches to temporal network theory
The list of measures for temporal networks described here is far from exhaustive. While we have primarily focused on temporal properties that can be defined on a nodal and/or global level, detecting changes in network modularity over time is an active part of network theory research (82,83). This has recently been applied to the brain connectome (84,85) and applied in the context of learning (86,87). In a similar vein, the presence of hyperedges have explored and identified groups of edges that have similar temporal evolution (88,89). Similarly, studies investigating how different tasks evoke different network configurations (90,75,91) is also an active research area. Another recent exciting development is to consider a control theory based approach to network neuroscience (92), which can be applied to networks embedded in time (93).
Yet another avenue of temporal network research is to adopt static network measures to each t-graphlet and then derive time series of fluctuating static measures (94,95). It is also possible is to quantify properties of dynamic fluctuations in brain connectivity through time and then correlate them with the underlying static network. Using such a strategy, between-subjects differences for the dynamic and the static networks can be revealed (e.g. 96).
Finally, there are considerably more measures within the temporal network literature that can be put to use within the field of network neuroscience. For example, the list of centrality measures provided here is not complete. For example, a temporal extension of the betweenness centrality, which is often used for static networks, can be adopted to the temporal domain (97). Another example is the computation of spectral centrality in the temporal domain (see 68 for further details).

When is temporal network theory useful?
As stated in the introduction, graphs are an abstract representation corresponding to the some state in the world. The properties quantified on these representations try to reflect the properties corresponding to the world. Not every representation of brain function will require time, which makes temporal network measures unsuitable. Under what conditions will temporal network theory be of use? If the brain is viewed as a system that is processing information within different brain networks and this information is communicated between these brain networks, then we believe that temporal network theory can be advantageous to quantify these interactions.
There are a couple of additional considerations when applying temporal network theory. Interpreting what a measure means can only be done in relation to the temporal resolution of the data. For example, volatility when applied to a dataset obtained with a temporal resolution of years will obviously entail a different interpretation compared to a dataset acquired with a temporal resolution of milliseconds.
Finally, consideration is also needed about which temporal network measure(s) that should be applied to a research question. Although temporal network theory provides a wide array of measures to the users disposal, we advise against applying the entire battery of measures to a given dataset. Given a hypothesis about some state of the world (S), this should first be translated into a hypothesis about which network measure will quantify the network representation of S. A more exploratory analysis showing significant (and multiple comparison corrected) correlations in five out of ten measures, when these measures where not first formulated in relation to S, become hard-to-impossible to translate into something meaningful.

Limitations and extensions for temporal network measures
Our scope was limited to temporal measures that operate on binary time series of brain connectivity (i.e. binary t-graphlets). Most of the measures discussed here can be extended and defined for series of weighted connectivity matrices. However, certain temporal measures are not straightforward to convert to the weighted case. Pertinent examples are burstiness and reachability for which no simple strategy on how to apply them in case of a weighted connectivity context is apparent.
Regardless of the method used to derive the brain connectivity time series, it is important that adequate pre-processing steps are performed on the data to avoid potential bias in the analysis. Our proposal of deriving t-graphlets with weighted Pearson correlation coefficients to compute time series of brain connectivity constitute no exception to this concern. In a connectivity analysis that is based on sequences of binary t-graphlets, an absence or presence of an edge might potentially be influenced by the user's selection of thresholding. Hence, the strategy regarding how to optimally threshold the t-graphles into binary graphlets is of vital importance. We believe that it is important to keep in mind that comparisons of variance as well as the mean of connectivity time series might be biased by the underlying mean-variance relationship (81,78). This further emphasizes the need for adequate thresholding strategies for connectivity time series. Moreover, subject head motion, known to be a large problem for fMRI connectivity studies, (72,73,98), can also lead to spurious dynamic properties (99) and should be controlled for.

Outlook
By providing a survey of the theory of temporal networks and by showing their applicability and usefulness in network neuroscience, we hope that we have stirred the readers interest in using models based on temporal networks when studying the dynamics of functional brain connectivity. To this end, we have implemented all temporal network measures described in the present paper in a software package that is freely available (Teneto, written in Python and can be downloaded at github.com/wiheto/teneto). The plan for the Teneto package is to include additional temporal network measures, plotting routines, wrappers for other programming languages, and dynamic connectivity estimation.