Biomarkers of Depression Symptoms Defined by Direct Intracranial Neurophysiology

Quantitative biomarkers of depression are critical for development of rational therapeutics, but limitations of current low-resolution, indirect brain assays may impede their discovery. We applied graph theory and machine learning to a large unique dataset of intracranial electrophysiological recordings to generate a four-dimensional whole-brain model of neural activity. Using this model, we found patterns of network activity that correctly classified depression in over 80% of individuals. These complex patterns were especially evident in alpha and beta spectral power across frontal and occipital brain regions, respectively. Our findings reveal a widespread network of abnormal activity that may inform advanced personalized treatment.


Abstract:
Quantitative biomarkers of depression are critical for development of rational therapeutics, but limitations of current low-resolution, indirect brain assays may impede their discovery. We applied graph theory and machine learning to a large unique dataset of intracranial electrophysiological recordings to generate a four-dimensional whole-brain model of neural activity. Using this model, we found patterns of network activity that correctly classified depression in over 80% of individuals. These complex patterns were especially evident in alpha and beta spectral power across frontal and occipital brain regions, respectively. Our findings reveal a widespread network of abnormal activity that may inform advanced personalized treatment.

Main:
Neurons work together in integrated brain networks to enable high-level functions such as human emotion and complex behavior 1,2 . Dysfunction in these networks can result in major depressive disorder, which is a leading cause of world-wide disability and lacks adequate treatment options (World Health Organization, 2012). A quantitative biomarker of depression that characterizes network dysfunction could enable early detection, objective disease classification, monitoring of disease progression, and an understanding of circuit physiology from which rational treatment design could evolve 3 .
However, the heterogeneity of depression and the inadequate resolution of current measurement tools may have impeded detection of reliable depression signals. Direct sampling from brain regions themselves through intracranial EEG (iEEG) may overcome such challenges and complement noninvasive techniques to provide insight into the architecture of abnormal depression circuits. These high-resolution recordings from cortical and subcortical brain regions can be performed for multi-day time periods in patients with epilepsy undergoing pre-surgical mapping. Such patients have high rates of co-morbid depression [35][36][37][38][39][40] that shares origin [41][42][43][44][45] and treatment response 46 characteristics with primary depression. Findings in these patients therefore may inform understanding of depression circuits more generally.
Despite clear advantages of iEEG, population-level iEEG analyses of major depression have not been performed. One major barrier is the heterogeneous sampling of disease-relevant networks across subjects, owing to the clinical placement of electrodes for seizure monitoring.
To overcome this challenge, we utilized graph theory and a novel mathematical approach that allowed us to create a model of multiregional iEEG activity for each individual based on information provided by the correlational structure of activity across our population (SuperEEG 47 ). The resulting model provided a highly detailed representation of naturalistic brain activity across space and time, and its inherent organization into functional brain networks. By quantifying the multi-dimensional nature of corresponding brain dynamics, our study enabled us to better understand how patterns of rhythmic activity underlying functional connectivity differ in depressed and non-depressed brain networks 48 .
We detected patterns of neural activity and connectivity that effectively distinguished depressed and non-depressed individuals. Yet the same network changes were not present in everyone with depression. Two depression-associated patterns emerged, and the presence of either one increases the likelihood of depression in any individual. One pattern was strongly characterized by excessive frontal cortical connectivity with decreased activity in the alpha spectral frequency band. The second was marked by decreased connectivity across the occipitotemporal region and dominant beta band activity.

Overall Approach
We took a two-step approach to quantify iEEG biomarkers of depression. With the first step ("Model building", Fig. 1), our overall goal was to identify large-scale functional networks across iEEG electrodes. To tackle inconsistent network sampling across individuals, we utilized a novel method called SuperEEG 47 that uses the correlational relationship of brain activity across electrodes to generate a whole-brain iEEG model for each subject. We next uncovered the inherent organization of the brain's correlational structure from the output of this model by examining network modularity, a graph theoretic concept that quantifies the organization of a system into constituent modules 49,50 , and has been used to reveal system disruptions in disease states 51-57 including MDD 29 . This approach revealed 6 functional networks in our data, which we refer to as "modules." We utilized direct neural recordings from 54 patients to construct a whole-brain model of iEEG activity based on correlational relationships of neural LFP time series signals across all electrode pairs. We then parcellated this model into functional network modules using graph theory metrics. Model Utilization: We used the whole-brain iEEG model to study how brain activity and connectivity measures relate to depression status. We first defined spectral power features across network modules and applied supervised machine Construction of whole-brain iEEG model from correlational relationships between LFP time series across all electrode pairs Data-driven parcellation of whole-brain correlational structure from model output to derive 6 functional brain modules Extraction of spectral power features from each brain module to derive spectral-spatial features Unification of depression biotypes with connectivity signatures Intra/inter modular connectivity signatures: Quantification of differences in group-level correlation structure across (intra) and within (inter) brain modules bewteen depression and control groups

Activity Analysis Connectivity Analysis
Group-Level depression biomarker: Application of machine learning algorithms to identify the spectral-spatial features that discriminate depression at group-level Depression Biotypes: Subdivision of group-level biomarker into patient specific network expression patterns (NEPs) to reveal 2 depression biotypes learning to identify a group-level biomarker of depression (Activity analysis). In parallel, we identified alterations in functional network connectivity and organization between depressed and control groups (Connectivity analysis). Common features of the group-level biomarker expressed at the individual level clustered participants into neurophysiological biotypes with distinct patterns of altered activity and connectivity.
In the second step ("Model utilization", Fig. 1), we used the whole-brain model to relate the architecture and intrinsic neural activity of functional networks to depression status as defined by a validated self-report measure, the Patient Health Questionnaire (PHQ-9) 58-60 . Informed by previous work that identified spectral power features from quantitative EEG that were predictive of depression diagnosis and treatment response 11-16 , we first performed an activity analysis, where we extracted spectral power features across each module, and then tested their ability to detect depressed from non-depressed patients at the group level using supervised machine learning. We next asked whether biomarkers defined at the group level account for inter-individual heterogeneity; a critical barrier to their clinical application. To address this question, we examined whether common features of the group-level biomarker clustered depressed patients into different neurophysiological biotypes. Finally, to assess the relationship between the brain's neural activity and its underlying functional connectivity, we performed a connectivity analysis in parallel to the activity analysis, where we quantified differences in the group-level correlation structure across depressed and non-depressed groups. By unifying results from both analyses, we linked distinct patterns of network disorganization with dysfunctional activity across depression subtypes.

Derivation of functional modules
Our total population consisted of 54 patients with intracranial electrodes placed for the purpose of pre-surgical mapping for treatment-resistant epilepsy. The number of electrodes per subject ranged from 33-201 (mean=120, SD=37). We utilized continuous iEEG recordings across this population to build a whole-brain, multi-subject model of iEEG recordings that served as the basis for studying system-level organization and signatures of depression (Fig2a-c, see Methods). In brief, to generate this model, we first constructed subject-level full-brain correlational models.
Interelectrode correlation matrices were constructed from activity where sensors were present and learned radial-basis function weighted averages were used to generate correlational information at locations where sensors were not present. The subject-level models were then averaged to generate a population-level model. We then used Gaussian process regression based on the population-level model and individual time series for each subject to reconstruct whole-brain local field potentials for each subject. The output of this SuperEEG 47 model is therefore an estimate of iEEG time series for a union set of electrodes across our total population (4,244 electrodes/per patient).
After construction of the model, we next utilized principles of graph theory to identify data-driven functional networks (modules) across it. Our rationale was that the inherent network structure present in the model could enhance discovery of depression circuitry over an atlas-based approach because atlases apply boundaries to brain regions based on structural rather than functional organization and thus, while they provide a useful comparison to a data-driven parcellation scheme, there is no reason to assume their boundaries will perfectly align. We applied a modularity optimization technique known as community detection which groups electrodes into non-overlapping modules by their correlational relationships 61,62 ( Fig. 2d-e, and see Methods).
Using this method, it is possible to study network organization at different levels of granularity. To objectively select the level of granularity, we compared the similarity of our data-driven partitions with those of a commonly employed Lausanne atlas, by calculating the adjusted Rand similarity index 63 . This method yielded parcellation into 6 stable modules (Jaccard index, p<0.05, permutation test, Supplementary Fig. 2). A graph of the network and subdivision into modules is shown in Fig. 2e where module membership is indicated by the color of nodes (iEEG electrodes) and edges (inter-electrode correlation from whole-brain model). To name each module, we pinpointed regions (hubs) with the most influential connectivity profile for each module  To generate a multi-subject whole-brain model of iEEG activity, patient's electrode locations across participants were first represented in a common space (Montreal Neurologic Institute (MNI) space). Electrode locations and sample recordings for a few example patients are shown. Activity was then randomly sampled in 1 min intervals across daytime hours to obtain a stable representation of resting-state brain activity across a 2-hour period. b. Individual inter-electrode correlation matrices were constructed for each participant at locations where electrodes were present. Subject-level full brain correlational models were then predicted using radial basis function (RBF)-weighted averages to estimate brain activity correlations at locations where sensors were not present (left). Subject-level correlational models were then averaged to generate a population level whole-brain correlational model (right). c. Local field potential activity for each of the 4,244 electrodes was then reconstructed using Gaussian process regression with the population-level model as a prior and  activity where electrodes were present as the marginal likelihood. d. Multiscale community detection was applied to the whole-brain model to group electrodes (nodes) into non-overlapping modules (communities) by their correlational relationships 61,62 . First, the population-level correlational model was reordered by the module assignment according to the modularity cost function. Network modules were identified at different levels of granularity by varying the tuning parameter, γ 29,49 . Increasing γ partitions the brain into increasing numbers of modules with a limit equal to the number of electrodes, as shown here for 3 values of γ (left). Next, the stability of this clustering at each value of γ was assessed by calculating module allegiance, which describes the probability that any two electrodes occupy the same module on repeated module detection 67 (right). A γ-value of 1.19 was selected by comparing the similarity of partitions generated by values of γ with those of a commonly used brain atlas 63 , resulting in 6 modules. Of note, one of the modules is small and difficult to resolve in the figure. e. The graph of the large-scale network with module membership delineated by the color of the nodes and edges for selected γ of 1.19 is shown along with a schematic representation of the 6 modules. f. Hubs for each module were identified by selecting electrodes with the lowest 10% participation coefficients. Values were then averaged for each Lausanne brain region per module and weighted by the distribution of electrodes across Lausanne regions in all modules. Hub weight is indicated by the size of hub, and module assignment is indicated by hub color. Module 5 contained insufficient number of electrodes for hub identification (0.3% of total sample) and coefficients across all electrodes were utilized to name this module. L-DLPFC=left dorsolateral prefrontal cortex, L-OT=left occipitotemporal cortex, L-OFC=left orbitofrontal cortex, R-MF=right medial frontal cortex, R-FT=right frontotemporal cortex

Relationship of functional network identification to depression status
We next investigated the whole-brain iEEG model to study brain networks underlying depression.
All patients in this study were undergoing surgical mapping with multi-channel iEEG for seizure localization as part of their standard medical care. Depression was measured coincident in time with iEEG recordings, using the validated PHQ-9 measure 58-60 . In accordance with literaturederived rates of depression in this population [35][36][37][38][39][40] , 43% of our population had self-reported depression as defined by this measure (PHQ-9≥10, n=23), and 33% had mild or no symptoms of depression, which defined our control group (PHQ-9≤5, n=18). The two groups did not vary in age, sex, type of epilepsy, antidepressant usage, or anti-epileptic drug class (t-test, X 2 , p>0.4,

Supplementary Table 1).
Work over the past 50 years has highlighted the predictive potential of EEG to detect depression [68][69][70][71] . Disruption in frontal alpha power is one of the earliest and most-studied markers of depression [11][12][13][14][15][16] , but its predictive potential remains unclear due to inconsistent findings across studies [17][18][19] . We hypothesized that alpha power may be one part of a larger network story, and that by leveraging the high temporal resolution of iEEG, as well as the direct access to subcortical structures, we could perform a spectral analysis of brain activity across functional modules that would identify a network biomarker of depression. We calculated relative spectral power from the where each feature contained information about a spectral power band across one functional module; we therefore collectively referred to these as "spectral-spatial features." We then utilized supervised machine learning to identify network activity that was predictive of depression. Our machine learning classification pipeline is shown in Fig. 3a. Neuronal activity is highly collinear, which suggests an underlying latent structure. We used principal component analysis to not only reveal these underlying hidden network variables but also to mitigate overfitting concerns by condensing the representation into fewer dimensions. We then utilized an ensemble of random forest and logistic regression machine learning classifiers with leave-one-out cross-validation to identify the spectral-spatial network features with the greatest discriminatory power. We found that four of the principal components had the strongest predictive ability to detect depressed from non-depressed subjects. Their component weights represent their contribution towards likelihood of depression. We show these weights in Fig. 3b. Utilizing the four most discriminative components alone, we achieved a mean classification accuracy of 82.7% on the training set and 82.4% on the test set (sensitivity=0.833, positive-predictive value = 0.87). In comparison, the same classification pipeline applied to a null model obtained from randomly permuting the target class labels led to a classification accuracy of 50.4%. The receiver operating curve (ROC) derived from cross-validation for the learned model in comparison to a null model is shown in Fig. 3c with an area under the curve (AUC) of 0.88. Together, these data suggest that a parsimonious model with four principal components, which capture major sources of variance in spectral-spatial features, can detect subjects with depression from the control group significantly better than chance (see also Supplementary Fig. 4).
We next turned to an examination of the individual spectral-spatial features contained within the four components to identify the frequency bands and modules that comprise this putative biomarker. The component loadings on the individual spectral-spatial features within the four components in Fig. 3b represent the circuit activity that detects depression (for full description see Supplementary Table 4). To represent this biomarker spatially, we weighted the spectral distribution of the components with respect to their component loadings and component weights for each module (Fig. 3d). On visual inspection two gross patterns of spectral activity across the modules emerged. The first was high alpha power across the L-OT, R-FT, and mid-hemispheric modules (attention and default mode regions, modules 2,5, and 6 in Fig. 3d). The second was high delta and low alpha and theta power in the L-DLPFC and OFC modules (executive and limbic regions, modules 1 and 3 in Fig. 3d). These results suggest that low-to-mid-frequency activity across broad networks characterize depression at the group level. These observations motivated the subsequent statistical analysis to define the two patterns quantitatively.

Distinct Network Expression Patterns Define Depression Biotypes
We next asked how the biomarker that detected depression at the group level was expressed at the patient level. The rationale for this analysis is that depression is most likely to be a collection of several underlying disorders, which suggests there may be inter-individual heterogeneity in neural circuits involved. In moving toward a personalized approach to medicine, these individual biomarkers could identify and eventually track disease status in a continuous manner. To answer this question, we identified clusters of subjects who expressed different neurophysiological activity patterns. We first identified the log-odds impact of each original spectral-spatial feature from the group-level biomarker at the patient level. We then tested the distribution of feature impact on depression classification probability across participants by grouping covariates with similar feature impact scores using an agglomerative hierarchical clustering algorithm (see Methods) [72][73][74][75] . We found two distinct patterns of network activity (network expression patterns (NEPs)) that were distributed across two groups of patients (Fig. 4a). The first network (NEP1) was marked by increased beta power in the L-OT module, and increased alpha and decreased delta power over the L-OT and R-FT modules. The second network (NEP2) was marked by decreased theta in the L-DLPFC, L-OFC, and R-FT modules, and decreased alpha, beta power together with increased delta power within the L-DLPFC and L-OFC modules. These two NEPs were consistent with the core features of the group-level biomarker. However, their existence importantly demonstrated that different core features were relevant in different subjects.
We next quantified the impact of each NEP on each participant's probability of being classified as depressed using a sensitivity analysis. The difference in classification probability when one NEP was left out was attributed to the presence of that NEP, which quantified the relative importance of that NEP toward the classification of depression. Fig. 4b shows the probability contribution of each NEP for each subject in the depressed group (top plot) and control group (bottom plot).
While we anticipated that each individual would exhibit several NEPs with differing contributions to their depression classification, an alternate pattern emerged from the data. We found that increased activity in either NEP was correlated with depression, but that each patient exhibited activity in only one of the two NEPs. Thus, depressed participants fell into two groupings based on NEP activity which we refer to as biotype-1 (40% depressed subjects) and biotype-2 (35% depressed subjects). Biotype-1 was largely driven by NEP1 (n=8, mean probability contribution =0.44, SD=0.17) and biotype-2 was largely driven by NEP2 (n=7, mean probability contribution=0.40, SD=0.16, Fig. 4c). The remaining 25% of participants belonged to a group with mixed or no contribution from the two identified NEPs and may be evidence of additional biotypes that were not resolved in our dataset. Two distinct groups also emerged from the control participants. The first group (43% control patients) exhibited mixed or no contribution from the two depression NEPs, as we anticipated. In the second group, there was a strong decrease in one of the two NEPs that countered a smaller increase in the opposing NEP (57% of control group). This reciprocal activity was also observed in the depressed group, but in the opposing direction, with a strong increase in one NEP. We might speculate that relative NEP activity could represent either risky or conversely, protective activity profiles, and that NEP activity could be modulated in either direction to treat depression. The anatomical distribution of the two circuits underlying these depression biotypes is shown in Fig. 4d. The probability difference was attributed to the presence of the NEP. This probability contribution is represented by the colored bars overlaid over each patient's total probability of being depressed as derived from the machine learning classification model (gray bars, probability > 0.5 leads to classification of depression). The perturbations do not sum to produce the total classification probability; rather each quantifies the relative importance of that NEP toward depression. Bars in the positive direction indicates a positive contribution toward depression, and those in the negative direction indicate a protective contribution toward depression. Subjects where one NEP did not drive classification probability as evidenced by equal contribution of each NEP to classification in both the positive and negative direction shown with bars in dark gray (mixed profile). Subjects classified incorrectly shown on far right of each plot (misclassified). c. Mean probability contribution of each NEP to the associated biotype is shown. NEP-1 (purple bars) contributed most strongly to the probability of depression in biotype-1 (mean=44% probability contribution, SD=0.17) and NEP-2 (blue bars) contributed most strongly to biotype-2 (mean=40% probability contribution, SE=0.16). Number of participants who exhibit each NEP shown above

Network Organization is Disrupted Across MDD Biotypes
In addition to alterations in the spectral content of network activity in depression, previous studies have observed distinct deficiencies in connectivity across depression networks 27,28,31-33 . A fundamental interest in neuroscience is the relationship between the brain's neural activity and its underlying functional and structural connectivity, which remains unknown. We expected that alterations in functional network topology would be present in our depressed population and that we could delineate new relationships between activity and functional connectivity with our highresolution dataset to more fully characterize network markers of depression. We performed a connectivity analysis using correlation of local field potential activity across modules as an estimate of functional connectivity between electrodes. The graph of our whole-brain iEEG model ( Fig. 2e) defines correlational relationships between electrodes across our total population. We thus examined these correlational relationships across control and depressed groups independently in order to measure the relative differences of functional network organization between the two groups. To do so, we examined the relative difference in connectivity strength across electrodes within and between functional modules. Fig. 5a shows the two-dimensional representation of the functional network structure for control (left) and depressed (right) groups.
In comparison to the control group, we qualitatively observed an overall reduction in the segregation between modules in the depression network.
To quantify these differences, we first calculated the inter-and intra-modular connectivity strength by calculating the correlation between electrodes located in the same module (intra-modular) and across different modules (inter-modular). We next compared the distribution of correlation strengths between depressed and control groups for each possible module pair using a Cohen's On the basis of the above analyses we were able to parse specific connectivity components that characterize the two depression biotypes (Fig. 5c), unifying both activity and connectivity analyses across cortical and deep structures in a way that has not previously been possible. In biotype-1 we observed increased beta power in the L-OT module, and right-left asymmetry in the alpha and delta bands over right frontal/L-OT modules with weaker intra-and inter-modular connectivity throughout. In biotype-2 we observed a hyperactive left frontal cortex that was more highly connected within itself but more weakly connected to R-FT module. Lower theta bilaterally was observed in this biotype.

Fig. 5. Intra-and Inter-modular connectivity signatures of depression and control groups.
a. Connectivity structure derived using whole-brain iEEG model recalculated for the control group (left) and depressed group (right) with module membership delineated by the color of the nodes (electrodes), and edges (connections between electrodes) delineated by the black interconnecting lines. b. Heatmap of significant Cohen's d values calculated from the distribution of correlation strengths between depressed and control groups for each possible module pair and compared to Cohen's d values for a null distribution derived from permuted nodal module assignment. Those that survived multiple comparison testing (p<0.001) were retained (red: increased connectivity for depressed group; blue: increased connectivity for control group; white: not significant). c. Schematic of biotype 1 (left) and biotype 2 (right) showing both connectivity and spectral power underlying each pattern. Increased connectivity strength shown in red, and decreased connectivity shown in blue (hub=intramodular, line=intermodular connectivity). Color of shaded area refers to module number as shown in color legend in panel a.

Discussion
The field of depression is limited by the absence of quantitative biomarkers, which are critical to objectively assess disease state and develop advanced treatments. In this report, we present the largest study of direct neural recordings aimed at identifying depression networks, made possible by iEEG recordings paired in time with a depression measure. The opportunity to directly record a b

Figure 5. Network organization
semi-chronically from cortical and subcortical structures in this manner enabled us to build computational models of neural activity and incorporate both activity and connectivity analyses to resolve new biotypes of depression. We found a neurophysiological biomarker that was able to detect depression with 82% accuracy. We further found two distinct patterns of circuit dysfunction across this biomarker that were expressed in 84% of our depressed patients.
Overall, our findings suggest that two highly interconnected circuit activity patterns are involved in depression and that capturing this complexity is required for an objective diagnosis of Recently, Betzel and colleagues successfully applied a correlational network model similar to SuperEEG to multi-subject iEEG recordings, followed by community detection, and found network organization to be representative of that obtained from DTI and fMRI 86 . We extended these findings, by applying the iEEG model to the study of disease status for the first time. Drysdale and colleagues used hierarchical clustering, similar to the clustering we utilized, on fMRI connectivity features and identified four depression biotypes 72 . One of these biotypes was dominated by globally decreased connectivity focused in the right OFC and left temporal-occipital region consistent with our biotype-1. Another exhibited increased frontal and decreased temporal connectivity consistent with our biotype-2. Of note, Drysdale and colleagues matched symptom profiles with fMRI data as a starting dimension for further exploration of depression biotypes. In contrast, in our method we sought to take those features that classified depression and determine if common activity patterns were observed independent of subject or physician input. Despite these differences, the similarity in biotypes -each developed in an empirical, data-driven approach from different modalities -is quite remarkable. The semi-chronic nature of our recordings enabled us to assess trait markers with neurophysiology alone and enhanced the connectivity analysis with an understanding of the neural activity that accompanied altered connectivity. Future analyses should evaluate how these putative biotypes map to symptom profiles of depression.
We acknowledge some weaknesses in the results presented. It remains unknown whether the depression networks we identified are related to the presence of epilepsy. Depression in epilepsy is thought to arise from similar origins to primary depression (stress 41 , inflammation 42 , medications 43,44 , circuit dysfunction 45 ), and is responsive to antidepressants 46 suggesting it can provide valuable insight into depression broadly. At minimum, the results presented here shed light on understanding depression in epilepsy patients. More importantly, the methods and findings provide a basis for applying similar approaches to broader depression populations and the potential for integrating such findings with non-invasive network features. While our whole-brain iEEG model was extensive in coverage, we did not have electrodes placed in all brain regions, including some regions implicated in depression [87][88][89][90][91] and the density of electrode sampling varied across brain regions leading to uncertainty in the accuracy of estimation in sparsely sampled areas 47 . We dealt with this constraint by discounting the effect of each individual node degree before running community detection and comparing network measures to a null model that accounted for overall node density. At the time we carried out this analysis, SuperEEG was the only method developed to carry out a group-level analysis on this type of data. This approach bordered on computational infeasibility, even with the use of an 8-node high performance computing cluster. More methods are likely to emerge with advancements in data processing capabilities and accessibility. With such methods, we may be able to reduce assumptions and the estimation burden, extend coverage to more brain regions, and utilize larger samples.
In conclusion, using a valuable dataset of intracranial recordings, we identified novel biomarkers of depression. These results have important implications for disease subtyping, diagnosis, treatment planning, and monitoring of depression status. Quantitative markers of depression could enable a new wave of closed-loop interventional therapies that personalize treatment based on neurophysiological signals.

Methods: Patient Characterization:
Participants included 54 adults (49% female) aged 20-67 who had medication-refractory epilepsy and were undergoing intracranial monitoring as part of their standard medical care. Neural data from these participants comprised our full dataset and was utilized to build the whole-brain iEEG model of LFP time-series. Participants were screened for depression following electrode implantation and concurrent with neural recordings using the Patient Health Questionnaire-9 (PHQ-9), a 9-item self-report instrument validated for depression screening [58][59][60] . A score ≥ 10 was used to define the MDD group (moderate depression) and a score ≤ 5 defined the non-depressed control group generating a sample of 23 depressed subjects (56%) and 18 controls (44%). The remaining 13 patients were used in the first step of the study (model building) but not the second (model utilization). Data comprised a consecutive series of patients recruited from University of California, San Francisco and Kaiser Permanente, Redwood City, California over a 5-year period.
This study was approved by the University of California, San Francisco Institutional Review Board with written informed consent provided by all subjects.

Data Acquisition and Pre-Processing:
Data acquisition of iEEG recordings were acquired using the Natus EEG clinical recording system at a sampling rate of 1-2 kHz. Standard iEEG/ECoG pre-processing techniques were conducted in python including application of a 2-250Hz bandpass filter, notch filters at line noise frequency and harmonics (60Hz, 120Hz, 180Hz, 240Hz), down sampling to 512Hz, and common average referencing to the mean of all channels.

Construction of Whole-Brain iEEG Model:
A functional connectivity imputation technique was utilized to estimate whole-brain iEEG activity for each subject (SuperEEG, developed by Jeremy Manning and Lucy Owens) 47 . This method involved four main steps outlined in Supplementary Fig. 1. First, pre-processed iEEG signals were chunked into 60s non-overlapping blocks and filtered for putative epileptiform activity or artifacts. We then randomly sampled the 60s intervals across daytime hours (8am-10pm) and concatenated them into 2 hours blocks, each representative of resting-state activity. To filter the activity of putative epileptiform activity or artifacts we identified signals that deviated from the norm using kurtosis, a measure of infrequent extreme peaked deviations 95 . Individual 60s blocks were retained if the number of artifactual signals (kurtosis>10) was less than 5% of the total sample. where was set to 20 based on prior work that sought to maximize both spatial resolution and generalizability to any brain location 64 . This relationship was utilized as the Weight matrix ( ) and applied to every electrode set ( , ) for each patient as where the rbf function expands the single patient to include estimated temporal recording relationships from sensors in the full population. Third, the Weight matrix was then combined with the single patient correlation matrix ( " ( , )) to estimate each patient's full population anatomical connection structure using, ? " = @ 9 " 9 " A where r(·) represents the Fisher-z transformation. The patient specific estimates of full brain correlation matrix 9 " and the spatial proximity weighting matrix 9 " were combined in a weighted average to compute a single whole-brain expected correlation matrix 9 as 9 = @ ∑ 9 " * "'% ∑ 9 " Finally, estimated LFP activity for each electrode was reconstructed using Gaussian process regression according to where represents the patient, ! is the set of indices where we had observed recordings for patient and " is the set of indices that were to be estimated. The submatrix 9 , ! ,-! represents the estimated correlations between locations of known activity and locations of reconstructed activity. The submatrix 9 -! ,-! represents correlations between known neural recordings.
Construction of the above model required extensive computational resources. Therefore, we sought to utilize the minimum required information required to obtain the majority of resting state information to enable computational feasibility. Using the 10h benchmark as the largest feasible model we could build, we compared 2, 4, 6 and 8 hour models to the 10 hour model and found that the difference in adding additional time beyond 2h was marginal and could be computed at a fraction of the computational cost. We therefore utilized the 2h model for further analysis ( Supplementary Fig. 2a).

Signal Processing:
Standard signal processing techniques were applied to the time-series activity across all reconstructed electrodes. This included continuous wavelet transformation using the Morlet transform wavelet method (6-cycles) 96 performed in 30s intervals to obtain power spectra in 6 frequency bands (delta = 1-4Hz, theta = 5-8Hz, alpha = 9-12Hz, beta = 13-30Hz, low gamma (gammaL) = 31-70Hz, high gamma (gammaH) = 71-150Hz). Relative power was calculated by dividing the power of each frequency band by the total power across the 6 frequency bands for each electrode. Signals were summarized by taking the mean power across time for each spectral band and were z-scored across patients.

Electrode Clustering into Functional Modules:
We used a well-validated technique, multiscale community detection, to identify distributed functional sub-networks (modules) across the model 61,62,97-100 . Individual functional connectivity models generated in the whole-brain iEEG reconstruction were used as a starting point in this analysis. Using the Louvain algorithm 101 , we identified an optimal parcellation of electrodes into discrete functional modules by maximizing a modularity cost function defined by the following relationships, where is a ones matrix, ∘ is the Hadamard product and Gi,j is 0 if node pair ( , ) are assigned to different modules and 1 if the pair is assigned to the same module, is modularity, is the connection weights (correlation) between node and , is the Newman-Girvan null model 62 and is the weighting of that null model which is tuned to obtain network modules of different sizes.
As the network fractures into many small modules for higher , the overall modularity decreases. We examined network modularity at values of between 0.5 and 2.1. We then assessed the stability of clustering at each value of by examining module allegiance 67 . It was calculated by repeating module detection 100 times and evaluating the probability that two electrodes occupied the same module.
To select an optimal , we compared partitions generated by each value of with partitions generated by a well-known validated atlas (the Lausanne atlas) by calculating the adjusted Rand score index 102 , which is a measure of similarity between two clusterings (Supplementary Fig.   2b-c). Significance was assessed by a permutation test where the null model was generated by randomly assigning electrodes to each module and calculating the confidence interval of the similarity index generated from 1,000 random permutations and tested at significance level 0.05 for a 2-tailed test. Two similarity peaks were identified, with values of γ that generated 6 and 1,855 modules respectively. The peak with the highest modularity (lowest number of clusters) was selected for further analysis due to our goal of examining the brain at a low level of granularity.
This selection enabled subsequent classification of activity across these clusters without overfitting our model.

Assigning Names to Modules:
We assigned a name to each module by examining the location of each module's most influential electrodes. We utilized the participation coefficient (PaC), which is a degree-based measure of network connectivity that describes a node's functional interaction within and across network modules [64][65][66] . This metric is typically utilized to identify influential hubs across a large-scale network. We utilized it in our study to identify the location of hubs that were most important for driving connectivity in each module identified through community detection. Groups of electrodes with low PaC values indicate hubs with high intramodular connectivity, also known as provincial hubs 103 . Similarly, connector hubs are those with high PaCs and drive intermodular connectivity.
The PaC describes the weight of edges from node i to all other nodes in the same module relative to the weight of edges from that node to all nodes in the network according to where yi is node 's participation coefficient, C is the set of all modules, # ( ) is the sum of all correlations between node and other members of module C and # is the sum of all correlations between node and members of all modules. We calculated the PaC for each electrode across our model, and then selected those with high and low participation values (top/bottom 10%). We then grouped these selected nodes by Lausanne atlas region, eliminating or combining a minority of regions due to having too few electrodes for analysis. We addressed the non-uniform distribution of electrodes across the model by then assigning each Lausanne region a score according to the following hub weight:

Classification:
We utilized a machine learning algorithm to classify subjects with and without depression. Given our high dimensionality feature space generated by spectral decomposition across the wholebrain model (4,244 electrodes x 6 spectral bands), we recognized the need for dimensionality reduction to learn a stable model. Our primary dimensionality reduction technique consisted of averaging spectral power across the electrodes within each module for each spectral frequency band. This yielded 36 features (6 frequency bands x 6 modules) that contained information both about the spectral power and the location of the activity. These features, referred to as spectralspatial features, served as our starting feature space for entry into our classification pipeline. This data was then transformed using principal component analysis (PCA) 104 . PCA enabled us to reduce collinearity and to identify latent variables that described maximally distinct network-based features inclusive of multiple frequency bands and modules. Feature selection was performed by fitting a random forest classifier which evaluates covariates to identify the set of features with the highest discriminatory power 105 . A logistic regression classifier was then utilized to assess classification accuracy. Model parameterization was identified using leave-one-out cross validation 106 . A sparsity promoting regularization (L1) was picked as it selected fewer principal components while controlling for overfitting allowing better generalization. The mean accuracy, precision, recall and ROC-AUC was reported.
To further asses our model validity, we repeated our classification pipeline on a null model obtained from randomly permuting the target class labels. We also examined the stability of classification across models. Supplementary Fig. 5a shows the classification accuracies, precision and recall for the learned model in comparison to the null model and two nonlinear classifiers (random forest, support vector machine).
In order to control for possible differences in epileptiform activity residual to data-cleaning across the modules we calculated line-length, a commonly utilized measure for the detection of epileptiform activity 107 for each electrode and averaged across the modules. We used a logistic regression model to show that line-length across the six modules was not a significant predictor of depression status (R 2 =0.15, p=0.13, logistic regression).

Hierarchical clustering to identify biotypes:
We reasoned that we could utilize the group-level biomarker to identify common features that defined depression at the individual level. To do so, we mapped the principal component values (feature loadings ≥0.2) back to the original space weighted by the logistic regression coefficients.
This provided the log-odds impact of each original feature. We then tested the distribution of feature impact on depression classification probability across depressed participants by grouping similar log-odds impact covariates (thresholded at 0.15) utilizing an agglomerative hierarchical clustering algorithm [72][73][74][75] . A log-odds threshold of 0.15 was selected because it retained classification results for 98% of subjects while isolating the most contributory spectral-spatial features (see Supplementary Fig. 5b for non-thresholded model for comparison). The clustering yielded both patient and feature groupings that defined neurophysiological network expression patterns (NEPs) of depression. We quantified the impact of these NEPs on each participant's probability of being classified as depressed by performing a sensitivity analysis where we withheld each NEP and then attributed the probability decrement from the total classification probability to the withheld activity pattern. Biotypes were defined based on the relative contribution of each NEP to classification probability. Subjects in whom the contribution was not clear or was mixed across the NEPs were placed into a third mixed group.

Connectivity Analysis:
In order to understand the effect of depression on network topology we examined changes in connectivity across the control and depressed subjects. First, inter-and intramodular connectivity strengths were assessed by looking at the correlations between all electrodes within the same module (intramodular) and the correlations between electrodes across all pairs of modules (intermodular). Next, to assess the connections that significantly distinguished depressed subjects from controls, we used a Cohen's d effect size metric and compared the distribution of correlation strengths across depressed and control groups for each possible module pair. To assess significance across these connections we generated a null distribution of Cohen's d values for each module pair, and retained the true Cohen's d values that survived multiple comparisons testing (p<0.001).

Data Availability
Raw data is provided in Supplementary Fig.5 for the hierarchical clustering and in Supplementary Tables 2 (participation coefficient) and 3 (principal component analysis). All datasets generated and analyzed in the current study are available from the corresponding authors upon request.

Code Availability
All custom code, statistical analysis and visualizations were performed in Python and are demonstrated in a set of Jupyter notebooks available for download. The following open-source code was used as described in the methods: SuperEEG (https://github.com/ContextLab/supereeg).