Guided graph spectral embedding: Application to the C. elegans connectome

Graph spectral analysis can yield meaningful embeddings of graphs by providing insight into distributed features not directly accessible in nodal domain. Recent efforts in graph signal processing have proposed new decompositions—for example, based on wavelets and Slepians—that can be applied to filter signals defined on the graph. In this work, we take inspiration from these constructions to define a new guided spectral embedding that combines maximizing energy concentration with minimizing modified embedded distance for a given importance weighting of the nodes. We show that these optimization goals are intrinsically opposite, leading to a well-defined and stable spectral decomposition. The importance weighting allows us to put the focus on particular nodes and tune the trade-off between global and local effects. Following the derivation of our new optimization criterion, we exemplify the methodology on the C. elegans structural connectome. The results of our analyses confirm known observations on the nematode’s neural network in terms of functionality and importance of cells. Compared with Laplacian embedding, the guided approach, focused on a certain class of cells (sensory neurons, interneurons, or motoneurons), provides more biological insights, such as the distinction between somatic positions of cells, and their involvement in low- or high-order processing functions.


INTRODUCTION
Many aspects of network science relate to graph partitioning-the grouping of nodes in subgraphs-and graph embedding-their representation in a low-dimensional space that accounts for graph topology (Von Luxburg, 2007).Spectral graph theory motivates analytical methods based on the eigenvectors of fundamental graph operators, such as the adjacency and the Laplacian operators (Chung, 1997).For instance, the well-known graph cut problem can be convexly relaxed and solved by thresholding of the Laplacian eigenvector with the smallest non-zero eigenvalue, known as the Fiedler vector (Fiedler, 1989).More recently, new approaches in graph signal processing have taken advantage of the Laplacian eigenvectors to define the graph Fourier transform, which can then be used to process (i.e., filter) graph signals in the spectral domain (Ortega, Frossard, Kovačević, Moura, & Vandergheynst, 2018;Shuman, Narang, Frossard, Ortega, & Vandergheynst, 2013); the spectral graph wavelet transform by Hammond, Vandergheynst, and Gribonval (2011) is one such example.
The Laplacian eigenvectors also provide a meaningful embedding by mapping nodes onto a line, or higher-dimensional representation, that minimizes distances between connected nodes (Belkin & Niyogi, 2003).Other well-known embedding techniques use different metrics for distance in order to assess local graph properties, ranging from simple Euclidean distance in locally linear embedding (Roweis, 2000), to shortest path in Isomap (Tenenbaum, 2000), transition probability (Shen & Meyer, 2008), or conditional probability of an edge in t-distributed stochastic neighbor embedding (van der Maaten & Hinton, 2008).
A time-dependent dynamical similarity measure has also been introduced (Schaub, Delvenne, Lambiotte, & Barahona, 2018).Efforts have also been made to employ global properties of the graph, such as in Sammon mapping (Sammon, 1969) optimizing a cost function including all pairwise distances.In this manner, embedding is performed while taking in consideration both local (neighborhood) and global (distant nodes) properties of the graph.However, these techniques are not aware of the network at the mesoscale.One cannot guide the embedding by giving a certain subgraph more importance while still preserving local features and global topology characteristics.
In essence, the most powerful feature of graph spectral embedding is to effectively summarize local structure across the graph into low-dimensional global patterns.This is achieved, for instance, with the recently introduced concept of graph Slepians; i.e., graph signals that are bandlimited and take into account a subset of selected nodes.Specifically, two types of Slepian designs that respectively optimize for energy concentration and modified embedded distance have been introduced (Van De Ville, 2016;Van De Ville, Demesmaeker, & Preti, 2017b).
In this work, we further build on this framework by providing a simple way to guide analyses with additional flexibility.Guidance includes the selection of a given subgraph or group of nodes to study, and the ability to specify the intensity of the focus set on these selected nodes.With respect to graph Slepians, we hereby provide several extensions.First, we allow the selection process to be weighted, so that the importance of a node can be gradually changed.Second, we propose a new criterion that meaningfully combines the two existing ones; i.e., we want to maximize energy concentration and minimize modified embedded distance at the same time.Third, as we detail below, these two criteria are counteracting, and hence, we obtain stable solutions even at full bandwidth, where the original Slepian designs degenerate numerically.Fourth, we show how this criterion can be rewritten as an eigenvalue problem of an easy modification of the adjacency matrix, which can be interpreted as reweighting paths in the graph, and thus significantly simplifies the whole Slepian concept.The solution of the eigendecomposition then defines the guided spectral domain, spanned by its eigenvectors.We illustrate the proposed approach with a proof-of-concept on the Caenorhabditis elegans (C.elegans) connectome.
Through spectral embedding-based visualization, we observe the effects of focusing on a specific cellular population made of sensory neurons, interneurons or motoneurons, and we reveal trajectories of these neurons as a function of focus strength.The purpose of this work is to introduce guided spectral analysis; that is, to indicate direction by selecting a subset of nodes, and to adjust the strength of the focus set on this subset.Each colored circle in the figure depicts one C. elegans neuron.Light gray strokes link the cells that are connected by gap junctions or chemical synapses.Labels and connectivity were retrieved from Varshney et al. (2011).

Essential Graph Concepts
We consider an undirected graph with N nodes, labeled 1, 2, . . ., N .The edge weights are contained in the symmetric weighted adjacency matrix Ã with non-negative real-valued elements a i,j , i, j = 1, . . ., N .
We also assume that the graph contains no self-loops; i.e., all diagonal elements a i,i are zero.The degree matrix D is the diagonal matrix with elements d i,i = N j=1 a i,j .The graph Laplacian is defined as L = D − Ã and can be interpreted as a second-order derivative operator on the graph.Here, we use the symmetrically normalized variants of the adjacency A and graph Laplacian L defined as This normalization is often used in applications to emphasize the changes in topology and not in nodal degree (De Lange, De Reus, & Van den Heuvel, 2014).
Let us define a graph signal as a vector of length N that associates a value to each node (Shuman et al., 2013).One way to recognize the importance of the Laplacian and its eigendecomposition is to consider the smoothness of a graph signal x as which sums squared differences between signal values on nodes that are connected, proportionally to their link strength a i,j .The eigenvectors of L minimize this distance that is reflected by the eigenvalues, sorted by convention increasingly as λ 1 = 0 ≤ λ 2 ≤ . . .≤ λ N .Therefore, considering the eigenvectors associated to the smallest non-zero eigenvalues provides the Laplacian embedding of the nodes that minimizes distance in a lower-dimensional space (Belkin & Niyogi, 2003).The eigenvector with the smallest non-zero eigenvalue is also known as the Fiedler vector (Fiedler, 1989), which relates to the solution of the convex relaxation of the graph cut problem (Von Luxburg, 2007).
Therefore, the eigendecomposition L = UΛU of the graph Laplacian is the cornerstone of spectral methods for graphs, as the eigenvectors {u k }, k = 1, ..., N (columns of U) play the role of graph Fourier components, and the associated eigenvalues {λ k }, k = 1, ..., N , of frequencies (Chung, 1997).The graph Fourier transform (GFT) then provides the link between a graph signal x and its spectral coefficients given by vector x: x = Ux, and x = U x.

Graph Slepians
In earlier work, the combination of the concepts of selectivity and bandwidth for graph signals has been used to define "graph Slepians" (Tsitsvero, Barbarossa, & Di Lorenzo, 2016;Van De Ville, 2016;Van De Ville et al., 2017b); i.e., bandlimited graph signals with maximal energy concentration in the subset of nodes S-a generalization of prolate spheroidal wave functions that were proposed fifty years ago on regular domains to find a trade-off between temporal and spectral energy concentrations (Slepian, 1978;Slepian & Pollak, 1961).The presence or absence of a node in S is encoded by the diagonal elements of the selection matrix S; that is, we have S i,i = δ i∈S , i = 1, . . ., N , where δ is the Kronecker delta.The Slepian design then boils down to finding the linear combination of Laplacian eigenvectors, encoded by spectral coefficients ĝ, within the bandlimit W with maximal energy in S, reverting to the Rayleigh quotient where W is a spectral selection matrix that has W ones on its diagonal followed by N − W zeros.This problem can be solved by the eigendecomposition of the concentration matrix C = W U SUW as .., W , are orthonormal over the entire graph as well as orthogonal over the subset S; i.e., we have g k g l = δ k−l as well as For the purpose of this work, we introduce the set of bandlimited graph signals such that we can then rewrite the Slepian criterion of Eq. ( 2) directly in the vertex domain as An alternative Slepian design was also proposed in Van De Ville et al. (2017b)-see also Huang et al. (2018), modifying the Laplacian embedded distance of Eq. ( 1) as follows: It is important to realize that the eigenvalues {µ k } of the original design reflect the energy concentration in the subset S, while the eigenvalues {ξ k } of the alternative design correspond to a modified embedded distance that can be interpreted as a "frequency value" localized in S, in analogy to the global GFT case.Consequently, "interesting" eigenvectors correspond to those with high µ k , concentrated in the subset S, or low ξ k , showing the main localized low-frequency trends, respectively.
However, the eigendecompositions, taken individually, do not necessarily lead to eigenvectors that combine both virtues.

Guiding Spectral Embedding Using a New Criterion
We hereby propose to further generalize the Slepian design in a number of ways.First, we relax the selection matrix S to a cooperation matrix M with diagonal elements that can take any non-negative real values m l ≥ 0, l = 1, . . ., N .This allows to gradually change the impact of a node on the analysis, between an enhanced (m l > 1), an unmodified (m l = 1) and a reduced (m l < 1) importance with respect to the selection matrix case.Second, we combine the criteria of both already existing Slepian designs by subtracting the modified embedded distance from the energy concentration: Third, we remove the bandlimit constraint and allow g to be any graph signal, which is an operational choice due to the joint optimization of both criteria, as will be illustrated and discussed later on.
Using the Taylor series approximation of the square root function, we derive L 1/2 in terms of the adjacency matrix A: . Details on the series expansion are discussed in section 3.3.We can then further rewrite the internal part of the criterion (5) as By convention, the associated eigenvalues are sorted in decreasing order.Based on the fact that eigenvalues of the symmetric normalized Laplacian are greater or equal to 0 and lower or equal to 2, one , where m max is the highest cooperation value appearing in M, using bounds from Corollary 2.4 in Lu and Pearce (2000).
In what follows, we will be considering the linear and quadratic approximations of the new criterion's eigenvalues: Interestingly, the combination of both existing Slepian criteria leads to the emergence of the adjacency matrix A as the key player in our new formalism.In fact, when the cooperation matrix is the identity matrix, the criterion reverts to the eigendecomposition of A itself.
Let us now interpret the impact of the cooperation weights: obviously, an element a i,j of the adjacency matrix contains the weight of a direct path from i to j.The linear approximation ζ lin reweights such a direct path with the average (m i + m j )/2 of the cooperation weights that are attributed to nodes i and j, as illustrated in Fig. 2A (left illustration).Notice that paths where only one node has a cooperation weight equal to 0 are still possible, as the other cooperation weight is then simply divided by two.
As for the quadratic approximation, it takes into account length-2 paths between nodes i and j.For instance, the sum of all length-2 paths between i and j can be read out from the squared adjacency matrix: where the inner product reveals the kernel interpretation of the length-2 walk matrix.Therefore, as illustrated in Fig. 2A (right illustration), the term reweights all length-2 paths by the summed cooperation weight between the start and end nodes, while subtracting the term penalizes the path according to the cooperation weight of node l through which the path passes.
Analogously, the term A k in the criterion introduces modifications of k-length paths in the graph.
However, for k > N , reweighting reduces to modifications of lower-length paths.The Cayley-Hamilton theorem implies that for every matrix A of size N × N , the matrix A N can be written as a linear combination of matrices A k for k = 0, 1, . . .N − 1.By induction, it holds that A k for every k > N can also be written as a linear combination of the same set of N matrices.Hence, modifications of paths longer than N − 1 can be seen as a linear combination of additional modifications of paths of length 0 to N − 1.

MATHEMATICAL CONSIDERATIONS
This section provides mathematical foundations supporting the methods and the results presented in this work.We start by discussing the link between the selection matrix and the eigenspectrum associated to the energy concentration criterion, and the relationship with the modified embedded distance criterion, using full bandwidth.Then, we provide a formal justification of the Taylor series approximation of the square root matrix function used in Eq. ( 6) and discuss the error associated to this approximation.
3.1 Eigenspectrum associated to the energy concentration criterion In the case of two nodes i and j, the average of their cooperation weights yields the multiplying factor for a i,j (blue term).When a third node l is added, the difference between average cooperation weight between nodes i and j (light blue term), and the cooperation weight of node l (salmon term), multiplies the length-2 path and then also contributes to the output entry.B. For full bandwidth, the concentration matrix is defined as C = U SU, where U is the matrix whose columns are eigenvectors of the graph Laplacian, and S is a diagonal selection matrix.Hence, the eigendecomposition of C is trivial: its eigenvectors are the rows of U, and the eigenvalues of C correspond to the diagonal entries of S, as can be seen from Fig. 3A for W = 279.

Eigenspectrum associated to the modified embedded distance criterion
We show that for full bandwidth, the number of zero eigenvalues of the modified embedded distance matrix, denoted z λ , is lower-bounded by the number of zeros on the diagonal of the selection matrix, denoted z S .To see this, consider the following decomposition of the modified embedded distance matrix C emb : where i k is the index of the k th non-zero entry of the selection matrix S, and l i k denotes the i k th column vector of the matrix L 1/2 .From this expression, it can be seen that the rank of C emb is at most N − z S and hence z λ ≥ z S .Equality holds when the set of vectors l i k corresponding to the non-zero entries of S are linearly independent.This is the case for connected graphs, as any subset (with cardinality strictly less than N ) of the columns of L 1/2 is linearly independent.This relationship is observed in Fig. 3B for W = 279.

Taylor series of matrix-valued functions
The Taylor expansion of L 1/2 proposed in Eq. ( 6) is derived using the scalar Taylor series of f (x) = √ x evaluated around the point a = 1: where The square root matrix of L then writes: Since the Laplacian and adjacency matrices are normalized, their eigenvalues verify Λ L = I − Λ A and their eigenvectors are equal (U L = U A ) when ordered following increasing and decreasing eigenvalues, respectively.The previous equation finally reduces to: , which is the expression used in Eq. ( 6).
Truncation of the Taylor series of a function f (x) to a finite upper bound on k ≤ K leads to an approximation error which can be estimated by the Lagrange form of the remainder where the (K + 1) th derivative is evaluated at the point ξ found between x and 1.On the other hand, since the eigenvectors forming U L are unit-norm vectors, the distance d K between a finite sum approximation of L 1/2 and the true square root of the matrix is bounded as: where || • || F denotes the Frobenius norm.In the case of a first order Taylor approximation (K = 1), we get: The eigenvalues λ i range from 0 to 2, and all contribute to the total approximation error d 1 , with eigenvalues further from 1 contributing more.Since the second-order derivative of the square root function increases as its argument approaches 0, the most contributing factors of the error derive from Taylor approximations at near-zero eigenvalues.Hence, graphs whose Laplacian spectrum exhibits higher eigengaps in the lower band tend to have lower approximation error.
Finally, the Frobenius distance d K,M between the true proposed criterion M − L 1/2 ML 1/2 and its approximation using a K th -order Taylor approximation of L 1/2 verifies: where ||M|| F corresponds to the Frobenius norm of the cooperation matrix.Hence, the upper bound on d K,M reduces as the nodes are given less importance; i.e., when the cooperation values get closer to 0.
The C. elegans worm is an intensely studied model organism in biology.In particular, the wiring diagram of its 302 neurons has been carefully mapped during a long and effortful study (White, Southgate, Thomson, & Brenner, 1986).Here, we use the symmetric normalized adjacency matrix that summarizes data from 279 somatic neurons, and combined connectivity from chemical synapses and gap junctions (Chen, Hall, & Chklovskii, 2006).We retrieved the type of each neuron (sensory neuron, interneuron or motoneuron) from the WormAtlas database (http://www.wormatlas.org/).
In their modeling work, Varshney et al. (2011) studied network properties of the worm connectome using different approaches, including Laplacian embedding.In particular, the topological view generated by mapping nodes on the first two eigenvectors with smallest non-zero eigenvalues already reveals interesting network organization (see Fig. 1).The horizontal dimension (u 2 ) mainly distinguishes the motoneurons from the head (right green circles) and from the ventral cord (left green circles).The vertical dimension (u 3 ) reflects information flow from sensory neurons and interneurons of the animal's head (top) to the nerve ring and ventral cord circuitries (bottom).

Eigenvalues of Different Criteria
To illustrate the eigenvalues obtained with the existing Slepian designs, as well as the newly proposed criterion, we considered the 128 motoneurons and "unselected" them by setting their respective entries in Next, we applied the modified Slepian design inspired by the Laplacian embedded distance.As shown in Fig. 3B, the eigenvalues ξ k reflect the modified embedded distance, which we now want to minimize.
For increasing bandwidth (plainer curves), its smallest values can be made lower; however, the subset of nodes with S i,i entries set to 0 is also described by eigenvectors with small eigenvalues.This becomes even clearer at full bandwidth, a case for which a subspace of 128 dimensions spanned by eigenvectors with a modified embedded distance of 0 is retrieved, as indicated by the green area in Interestingly, computing the eigenspectrum using a linear approximation of the criterion matrix (Fig. 3D, light brown curve) leads to very similar results, which only slightly vary for the largest  and 279 with increasingly lighter blue or purple shades, respectively.In the full bandwidth case, the shaded green areas highlight eigenvalues linked to optimal solutions of the respective criteria.In the third case, equivalent µ and ξ eigenspectra are plotted in blue and purple on top of the ζ one.The full ζ eigenspectrum is also compared to approximations obtained through Taylor series of increasing order (D), from linear to order 20, as depicted by increasingly darker brown curves.The two smaller plots are insets sampled at the start and at the end of the main plot, respectively.
eigenvalues.When the approximation order is increased up to 20 (increasingly dark brown curves), this low error further diminishes, although a mild difference remains with the ground truth.Inspection of the Slepian vectors related to several locations of the eigenspectrum (Supplementary Fig. S2) confirmed that the only salient differences actually involved the first Slepian vector (largest eigenvalue one).

Topology Revealed by Guided Spectral Analysis
We now guide the spectral analysis to focus on the three different types of neurons.For instance, when focussing on the role of the sensory neurons, we gradually decrease the cooperation weights of interneurons and motoneurons from 1 to 0. For each setting, we then visualize the topology revealed by the guided analysis by projecting the nodes on the eigenvectors with the second and third largest eigenvalues.We build the trajectory of each node through this two-dimensional embedding, after applying the Procrustes transform (Schönemann, 1966)  In Figs.4A and B, the trajectories are depicted when focussing on the sensory neurons by attributing cooperation weights to the other types of neurons ranging from 1 to 0.5, and from 0.5 to 0, respectively.
During the first half (Fig. 4A), the network organization is only slightly altered with respect to the initial view of Fig. 1; i.e., the sensory neurons move slightly more to the periphery, while the interneurons and motoneurons move to the origin.In the second part of the trajectory (Fig. 4B), a major split occurs in the bottom right branch of Fig. 4A between the left and right versions of a whole series of neurons, while the bottom left branch neurons move back to the center of the coordinate frame.The cell types found in the top branch are AWA, AWC, ASE, ASIL and AFD (all bilateral), while the segregated right/left pairs include the six members of the IL1 (IL1, IL1V, IL1D) and IL2 (IL2, IL2V, IL2D) families, the four cells of the OLQ (OLQV, OLQD) and CEP (CEPV, CEPD) subtypes, as well as the OLL and ADE neurons (all bilateral).In addition, URXR also lies in the right side, while URYL can be seen amongst members of the left sub-branch.Intriguingly, in addition to the above, we notice the segregation of the RIP interneurons, despite setting the focus on sensory neurons.
In Figs.5A and B, we then focus on the interneurons by reducing the cooperation weights of sensory neurons and motoneurons in two steps.As expected, the interneurons move towards the periphery.Their organization does not seem to be dominated by left versus right variants, as we found for sensory neurons, but rather by a set of well-defined clusters related to their functional involvement in the C.
elegans neuronal circuitry: in the first quadrant, we find the isolated AIA bilateral pair.Moving clockwise, a large cluster of neurons includes the bilateral AIY, AIZ, AIN, AIB and the single neuron RIR (top half of the cluster); and the bilateral RIA, AUA, RIB, RIG as well as the single neuron RIH  (bottom half of the cluster).A compact set of five cells follows (the AVE pair, AVKL, and the single neurons DVA and PVT), before reaching another large ensemble of neurons at the bottom left including bilateral AVA, PVC, LUA, PVW, AVD, and the single neuron PVR.Moving back upwards, AVJ neurons can be seen as a segregated pair, before eventually reaching a last group of cells containing the bilateral RIF, AVH, AIM, PVQ and AVF.
Finally, in Figs.6A and B, the organization of motoneurons is examined.Already in the first step (Fig. 6A), when reducing the cooperation weights of the sensory and interneurons from 1.0 to 0.5, we observe much stronger changes than in the previous cases.In particular, the initial organization completely collapses and the left branch of the motoneurons spreads out.This branch then develops into a peripheral organization when further decreasing the cooperation weights (Fig. 6B), with three main subsets of neurons: at the top, we retrieve members such as VC01-02, VD01-05, DB01 and DD01-02.In the bottom left branch, we find VA08-09, VB06/08-09, VD07-10 and DD04-05.In the bottom right branch, we locate the motoneurons VA11-12,VB10-11, VD12-13, DA09, DD06, AS11, PDA and PDB.
In addition, DVB is found as an intermediate in-between both bottom branches.
importance of selected nodes.The new criterion lets the adjacency matrix emerge as the central graph operator, instead of the Laplacian, and is operational at full bandwidth.This is surprising at first sight, because neither of the conventional Slepian criteria is practical without the bandlimit constraint.For the energy concentration with binary cooperation weights, as shown in Fig. 3A for an illustrative example on the C. elegans connectome, full bandwidth leads to two eigenvalues (1 and 0), the dimensionality of the corresponding subspaces being the number of nodes with cooperation weight 1 and 0, respectively.For the modified embedded distance, as shown in Fig. 3B, full bandwidth creates a subspace with eigenvalue 0 of dimensionality equal to the number of nodes with cooperation weight 0. Therefore, subtracting both criteria leads to opposing objectives; i.e., at full bandwidth, an energy concentration of 1 encodes the subspace for nodes with weight 1, while a modified embedded distance of 0 encodes the subspace for nodes with weight 0.
The obtained eigenspectrum for the new criterion, shown in Fig. 3C, illustrates that only a few eigenvectors are able to combine high energy concentration with low modified embedded distance, a counterbalance that can be further revealed by measuring µ and ξ separately for these new eigenvectors.
Such a large eigengap is also good news for numerical computation of the leading eigenvectors for large graphs when relying upon efficient large-scale solvers (Lehoucq & Sorensen, 1996) implemented in widely available software libraries such as ARPACK.
Intriguingly, the approximation error was already low using a linear approximation, and did not noticeably decrease further, except for the first Slepian vector, when resorting to higher-order terms (see Fig. 3 and Supplementary Fig. S2).Modifying the importance of a node via the corresponding cooperation value affects all-length paths through that node according to the series expansion (8), where the power of A in each term corresponds to the affected path length.Once we restrict the criterion to a linear approximation, the only paths whose importance is changed are those of length 1.This does not mean that other paths are not included in the graph analysis, but rather that they are included with their original (unmodified) effect on the topology.Low error of linear approximation suggests that the highest percentage of topological importance of a node falls into the importance of its length-1 paths.Further, a slightly higher error at eigenvectors with the highest ζ may be explained similarly.Not modifying higher order paths produces greater error at these eigenvectors because of their increased relative importance due to the fact that high ζ eigenvectors tend to be very smooth (even approaching a constant signal); thus, in order to even out the values at all nodes in the process, one needs to "reach" far enough.
The proposed criterion should not be confused with the Sobolev norm that is sometimes used to regularize graph signals (Mahadevan & Maggioni, 2006).Specifically, in the case of M = I, our criterion of Eq. ( 5) applied to g reverts to g g − g Lg, whereas the Sobolev norm of g reads g g + g Lg.The difference in the sign of the second term introduces significantly distinct optimization goals regardless of the apparent similarity of the two expressions.
As for future extensions of our approach, one could envisage to dig into the relationship with graph uncertainty principles (Agaskar & Lu, 2013;Teke & Vaidyanathan, in press;Tsitsvero et al., 2016), to consider statistical resampling for graphs (Pirondini, Vybornova, Coscia, & Van De Ville, 2016), or to focus on the discovery of hierarchical graph structure (Arenas, Fernández, & Gómez, 2008;Irion & Saito, 2014) by gradual refinement of the subgraph.The design could also be extended for directed graphs using recent extensions of spectral decompositions for directed graphs (Mhaskar, 2018;Sandryhaila & Moura, 2013).

Gaining Insights on C. elegans
The application of our newly developed approach to the C. elegans connectome enabled to confirm many past findings from the literature, and to shed light on additional cellular targets that may deserve further experimental analyses.At the level of sensory neurons (Fig. 4, Supplementary Fig. S3A), AWA and AWC are involved in odortaxis, as they integrate attractive sensory cues (Bargmann, Hartwieg, & Horvitz, 1993;Z. Li et al., 2012).Both subtypes differ in their mode of action: AWC actually respond to the removal of odor (Chalasani et al., 2007;Gray et al., 2004), and recently, it was discovered that while AWA stochastically pulse in response to odorant stimuli, AWC deterministically respond in graded manner (Itskovits, Ruach, & Zaslaver, 2018).AFP are thermosensors (Mori & Ohshima, 1995), while ASE and ASI are implicated in chemotaxis (Bargmann & Horvitz, 1991;Luo et al., 2014).The latter also play a role in eliciting a roaming (or dispersal (Gray et al., 2004)) behavior (Flavell et al., 2013).In summary, all those neurons act as low-order sensors; the fact that they cluster together may indicate the requirement for cross-modality information exchange across different detection modules.Perhaps unsurprisingly, all also share the ability to modulate some aspects of locomotory behaviour (Wakabayashi, Kitagawa, & Shingai, 2004).
Regarding the neurons at the bottom of the two-dimensional visualization, IL1 and OLQ have jointly been implicated in the worm foraging response (Hart, Sims, & Kaplan, 1995), to which CEP were also more recently associated in the context of head withdrawal (Kindt et al., 2007).The unraveling of RIP as the only interneurons found within our focussed analysis of sensory neurons may arise in this context: as they inhibit pharyngeal pumping and receive presynaptic inputs from IL1 (White et al., 1986), they may guide a decrease in pumping paralleling the withdrawal response.In addition, OLQ and CEP have both been found to contribute to a hub-and-spoke network processing gentle head touch (Chatzigeorgiou & Schafer, 2011).
It is interesting to notice that the mechanosensory transduction process conducted by those neurons, and revealed by our visualization, is not the canonical one, which relies on distinct cellular intermediates (Rankin, 2002;Wicks & Rankin, 1995).A reason may be the focus on sensory neurons of our exploratory analysis, which could filter out the otherwise dominating role of the sensory neurons that mostly interact with other neuronal subtypes (i.e., interneurons and motoneurons from the locomotion circuitry).Along this line of reasoning, Supplementary Fig. S4 (second row) shows that for higher-order Slepian vectors (fourth and fifth), which are less concentrated in S, additional contributors emerge, such as the bilateral PHA/PHB (right sub-branch, involved in posterior mechanosensory transduction (Rankin, 2002;Wicks & Rankin, 1995)) or ASK (top sub-branch, members of the chemical avoidance circuitry (Hilliard, Bargmann, & Bazzicalupo, 2002)).
A second set of antagonistic functions shared by several of the colocalizing cellular intermediates are food search and reproductory drive: indeed, CEP and ADE are linked in a dopaminergic slowing down response upon food sensing (Sawin, Ranganathan, & Horvitz, 2000), while in addition, URX receive serotonergic inputs driving neuroendocrine release towards body fat loss (Noble, Stieglitz, & Srinivasan, 2013).At the same time, URX, URY and OLL all express the neuropeptide PDF-1, tying them to reproductive drive (Barrios, Ghosh, Fang, Emmons, & Barr, 2012); along the same line of evidence, Macosko et al. (2009) discovered the implication of IL2 and URX in a pheromone attraction hub-and-spoke circuitry, and IL2 were also found to release extracellular vesicles functioning in mating-related behaviours (Wang et al., 2014).
The aforementioned neurons were separated in the analysis between their right and left counterparts, and further inspection (Supplementary Fig. S5A) showed that this segregation involved chemical synapses, but not gap junctions.This may highlight a need for laterally independent processing in the mechanosensory and/or reproductory functions carried by the involved neurons, at least at the start of the process; polysynaptic interactions may then arise through interneuron relays.
Turning to interneurons (Fig. 5, Supplementary Fig. S3B), AIA, AIB, AIY and AIZ are jointly known for their role on locomotory behaviour (Gray, Hill, & Bargmann, 2005;Wakabayashi et al., 2004), through the response to odor stimulation (Chalasani et al., 2007;Hilliard et al., 2002).Fitting with their isolation in the visualization, AIA stand apart in that they mostly relate to pheromone sensing (Macosko et al., 2009) and the egg-laying motor program (Hobert, 2003), whereas AIB, AIY and AIZ are stimulated by thermo-and chemosensory neurons (Hobert, 2003) and act as first-relay drives for locomotory behaviour (Luo et al., 2014), playing a role in distinct facets of locomotion (Gray et al., 2005).AIY and AIB are particularly coupled in that they respond oppositely to olfactory cues: while AIY exhibit potentiated activity, AIB become more inhibited (Chalasani et al., 2007).Intriguingly, AIN were also retrieved alongside the above neurons in our inspection, and have not been a largely studied cellular target to date; to our knowledge, the only hint at a possible involvement in the described mechanisms is the faint side detection of the str-1::GFP reporter construct (linked to the expression of an odorant receptor) in AIN on top of AWB neurons (Troemel, Kimmel, & Bargmann, 1997).As for RIR, it has also been scarcely studied, and is one of the few unpaired neurons of the worm's head (Hobert, Johnston Jr, & Chang, 2002).
Fittingly, RIA and RIB (which appeared in the same cluster of neurons) make the continuation of the locomotory pathway, acting as second-layer intermediates that then relay the signals to head motoneurons or command interneurons (Gray et al., 2005).RIA and AUA have also been jointly involved in a complex response to food stimulation upon serotonergic signaling (Chang, Chronis, Karow, Marletta, & Bargmann, 2006;Z. Li et al., 2012;Noble et al., 2013), resulting in modulatory effects on locomotion (Flavell et al., 2013;Harris et al., 2011).AVE neurons, which stood in the next unraveled ensemble of neurons, belong to the command interneurons acting as final relays to locomotion-inducing motoneurons; in particular, AVE neurons contribute to backward locomotion (Haspel, O'Donovan, & Hart, 2010;Hobert, 2003;Kawano et al., 2011).Unsurprisingly, they were retrieved as members of the C. elegans rich club (Towlson, Vértes, Ahnert, Schafer, & Bullmore, 2013), and were also found as connector hubs (Pan, Chatterjee, & Sinha, 2010).Interestingly, in that same study, AVKL also stood out as a connector hub, and was put forward by the authors as a target for future experimentations.
Clustering together with AVE and AVKL, DVA has also been implicated in movement as a stretch-sensing neuron that can have positive and negative effects on locomotion (W.Li, Feng, Sternberg, & Xu, 2006).More precisely, it appears to play a role in mating behaviour, since it undergoes NLP-12 release that strongly potentiates locomotion (Hu, Pym, Babu, Murray, & Kaplan, 2011), and expresses nematocin to trigger mating-related responses through muscular contractions in the sexual apparatus of male worms (Garrison et al., 2012).Their shared localization in the visualization may imply that DVA modulates reproductory functions through AVE; this view would square well with the observation that AVE neurons show particularly short neurites compared to other command interneurons, which project down to the vulva region (Gray et al., 2005).Similarly, PVT was linked to reproductory functions, as it was found to release the neuropeptide PDF-1 in the context of mate-searching behaviours (Barrios et al., 2012).
In the following set of neurons, AVAL and AVD are command interneurons regulating backward locomotion, while PVC rather control forward motions (Chalfie et al., 1985;Hobert, 2003;Zhen & Samuel, 2015).Their central involvement in locomotory behaviour explains that all belong to the worm's rich club (Towlson et al., 2013).Upon controllability analyses, AVA were also found crucial for the controllability of muscles, while AVD and PVC were shown essential for the controllability of the motoneuron layer (Yan et al., 2017).As for LUA, they contact AVA and AVD within the posterior touch circuitry, but could be attributed no quantifiable impact on locomotion (Chalfie et al., 1985;Wicks & Rankin, 1995).PVR is more difficult to relate to the other members of the cluster, but we remark that it was already initially segregated from the other neurons in the Laplacian view (see Supplementary Fig. S3B, left column), which may indicate a more complex role that does not solely relate to the interneuron circuitry.
AVJ neurons were seen in isolation; they act as relays between the AVB command interneurons, which mediate forward locomotion (Hardaker, Singer, Kerr, Zhou, & Schafer, 2001) like PVC, and the AVF interneurons, which receive vulvar muscle-related inputs.Thus, AVJ are implicated in the coordination  Summary of the main functions operated by the sensory neurons (A), interneurons (B) and motoneurons (C) unraveled by guided spectral analysis.Clusters of neurons discussed in Sec.5.2 are delineated and color coded according to their main roles: this may be in sensing (thermosensation in red, olfactory sensation in yellow, chemosensation in green and mechanosensation in blue), higher-order functions (reproduction in pink, food responses in brown), or locomotion (from first cellular relays to effector motoneurons in increasingly darker shades of gray).A gradient in the color coding indicates that more than one function is performed by neurons from a given cluster.Neurons that could not be clearly related to the rest of the unraveled circuitry are encircled in white.
between the locomotor and egg-laying responses.AVF were part of the final revealed cell cluster, and were jointly involved in sexual conditioning, through interplays with the glia-derived MCM neurons, with PVQ and RIF (Sammut et al., 2015).Finally, AIM neurons reuptake serotonin and are part of a slowing down of motor responses upon the presence of food (Jafari, Xie, Kullyev, Liang, & Sze, 2011).
Considering motoneurons (Fig. 6, Supplementary Fig. S3C), the three major subdivisions nicely fit somatic location (see Supplementary Fig. S6): top neurons in the visualization are located most anteriorly, bottom left ones in the middle, and bottom right ones at the posterior end of the worm.It was the only cell type bearing this characteristic, which also highlights that the information extracted for sensory neurons and interneurons goes beyond their mere somatic positioning.
A-type and B-type cholinergic motoneurons (which respectively drive backward and forward motion (Haspel et al., 2010;Kawano et al., 2011)) were found in all clusters alongside the inhibitory D-type motoneurons, to which they project (Schuske, Beg, & Jorgensen, 2004;Zhen & Samuel, 2015).AS11 is a posterior motoneuron involved in male posture during mating (Jarrell et al., 2012).Through control theory tools, all of these neuron subtypes were shown to be crucial for proper locomotion (Yan et al., 2017); intriguingly, the authors then also highlighted an important role of the PDB motoneuron, which was previously unknown and also transpires from our results, since it stands amongst the bottom right sub-branch of the display.Aside PDB, the authors specifically put forward VB11, VD13, VA12 and VD12 as receiving independent signals from receptor neurons, and all those cells lie in the bottom right sub-branch as well.
As for DVB, we noticed its intermediate position between both bottom sub-branches of the display; this may arise from its implication in the defecation mechanism, which requires a temporal sequence of anterior and posterior body contractions prior to enteric muscle activation (Schuske et al., 2004).
In summary, as illustrated in Fig. 7, all three types of neurons found in the C. elegans nematode (sensory neurons, interneurons and motoneurons) could be arranged in a meaningful hierarchy thanks to our introduced guided graph spectral analysis method.Sensory neurons were separated between first-order sensors for thermal, chemical or olfactory stimuli, and cells involved in higher-order reproductory or food-related responses, as well as head mechanosensation.A segregation of the left and right components of this latter system was also evidenced, and at the same time, many of the involved neurons were not purely sensory (as depicted by red circles), but rather possessed a dual sensory neuron/interneuron label, or even a triple assignment (dark red or dark brown circles), perhaps highlighting the fact that reproduction-and food-related behaviours lie more downstream in the typical response pathway of the worm.
In accordance to this hypothesis, raw sensing was much less present in the interneuron circuitry, except to recruit first-layer members of the locomotion network.Different levels of processing of motor functions were clearly separated (see the gradient from white to dark gray tones going clockwise in the interneuron summary figure), with the eventual recruitment of motoneurons, which were the only class of cells that were separated on the basis of somatic location.
Future analyses will enable not only to gain insight into each type of neurons independently (by setting the cooperation weights of other types to 0 as we did here), but also to study them in conjunction through more elaborated combinations.In addition, it will be interesting to see whether future experimental work can shed light on some of the neurons that were extracted here without being yet extensively documented in the literature, such as AVKL or RIR.

Figure 1 .
Figure 1.Spectral embedding of the C. elegans connectome according to the eigenvectors of the Laplacian matrix with second and third smallest eigenvalues.
to 0. We applied the original, concentration-based Slepian design for different bandwidths W = 100, 150, 200, 279, the latter corresponding to full bandwidth.The eigenvalues µ k , which reflect energy concentration in the 151 remaining neurons, are shown in Fig.3A.The characteristic behavior of classical Slepians is preserved for the graph variant; i.e., eigenvalues cluster around 1 and 0 for well and poorly concentrated eigenvectors, respectively, and the phase transition occurs more abruptly at higher bandwidth.For full bandwidth, perfect concentration becomes possible, and the problem degenerates in retrieving two linear subspaces of 151 and 128 dimensions spanned by eigenvectors with concentration 1 and 0, respectively (see Sec. 3.1 for a proof on the number of distinct eigenvalues).In practical terms, for high but not full bandwidth, the "interesting" eigenvectors with large concentration correspond to the part indicated by the green area on the plot, and become numerically indistinguishable.A few indicative examples are displayed in Supplementary Fig.S1C.
Fig. 3B and explicitly demonstrated in Sec.3.2.Some examples can be seen in Supplementary Fig. S1D.The degeneracies of the Slepian designs at full bandwidth are instructive about the opposing effects of maximizing energy concentration and minimizing modified embedded distance; i.e., the subspaces indicated by the green areas in Figs.3A and B, which are optimal for the corresponding criteria, are actually different ones, representing signals on sensory and interneurons (151 nodes) on the one hand, and on motoneurons (128 nodes) on the other hand (compare Supplementary Figs.S1C and D, first rows).This leads us to the eigenvalues ζ k of the proposed criterion, as shown in Fig. 3C (black curve).The maximum eigenvalue peaks close to 1, a case reflecting jointly high equivalent µ k (blue curve) and low equivalent ξ k (purple curve); i.e., a high energy concentration at the same time as a low modified embedded distance (low localized graph frequency) within S. The low amount of such solutions shows that it is difficult to conceal high energy concentration and small modified embedded distance.As values of ζ k decrease, we first observe a rise in modified embedded distance (eigenvectors remain reasonably concentrated within S, but rapidly exhibit a larger localized graph frequency), and then a decrease of both µ k and ξ k , which indicates that eigenvectors become less concentrated within the subset of interest.Afterwards, we observe a regime in which both quantities are null at the same time; that is, a subspace spanned by eigenvectors that are fully concentrated outside S. Notice that this set of eigenvectors is now "pushed away" from the meaningful low ξ k ones, and lie in the middle of the spectrum.Finally, the sign of ζ k switches, and the right hand side of Fig.3Cdenotes eigenvectors of increasing concentration within S and localized graph frequency, the latter effect dominating over the former.

Figure 3 .
Figure 3. Plots of eigenvalues obtained using different Slepian criteria: (A) energy concentration µ, (B) modified embedded distance ξ, and (C) our new proposed criterion ζ.For the first two cases, in which the design depends on a bandwidth parameter, eigenvalue spectra are plotted for W = 100, 150, 200

Figure 4 .
Figure 4. Focussing on the sensory neurons by reducing the cooperation weights of the interneurons and motoneurons (A) from 1 to 0.5, and (B) from 0.5 to 0. The trajectory of a neuron is represented by a colour change from light to dark tones, and dots represent final positions.Note that the starting configuration

Figure 5 .
Figure 5. Focussing on the interneurons by reducing the cooperation weights of the sensory neurons and motoneurons (A) from 1 to 0.5, and (B) from 0.5 to 0. The trajectory of a neuron is represented by a colour change from light to dark tones, and dots represent final positions.Note that the starting configuration

Figure 6 .
Figure 6.Focussing on the motoneurons by reducing the cooperation weights of the sensory neurons and interneurons (A) from 1 to 0.5, and (B) from 0.5 to 0. The trajectory of a neuron is represented by a colour change from light to dark tones, and dots represent final positions.Note that the starting configuration Figure 7. Summary of the main functions operated by the sensory neurons (A), interneurons (B) and motoneurons (C) unraveled by guided spectral

Figure S4 .Figure S5 .Figure S6 .
Figure S2.Eigenspectrum of the newly developped ζ criterion (A), with vertical bars highlighting the locations of the spectrum at which four pairs of In an example three-node network, output entries for different examples where cooperation weights are either set to 0 (white nodes) or to 1 (black nodes).Edge thickness is proportional to the output entry weight.Red strokes denote positive edge values, while blue strokes highlight negative edge values.All non-zero entries of the normalized adjacency matrix of the example network equal to compensate for any irrelevant global transformations.As a complementary visualization, note that we provide the start, intermediate and end points of each trajectory as separate figures in Supplementary Fig. S3.Example visualizations when resorting to different Slepian vectors are provided in Supplementary Fig. S4.