Supplementary Materials For: Title: Global Absence and Targeting of Protective Immune States in Severe COVID-19

Alexis J. Combes1,2,3,‡,*, Tristan Courau1,2,3,‡, Nicholas F. Kuhn1,2,‡, Kenneth H. Hu1,2,‡, Arja Ray1,2,‡, William S. Chen2,4, Simon J. Cleary2,5, Nayvin W. Chew1,2,3, Divyashree Kushnoor1,2,3, Gabriella C. Reeder1,2,3, Alan Shen1,2,3, Jessica Tsui1,2,3, Kamir J. Hiam-Galvez2,6,7, Priscila Muñoz-Sandoval2,7,8, Wandi S Zhu2,7,8, David S. Lee2,9, Yang Sun2,9, Ran You1,2, Mélia Magnen2,5, Lauren Rodriguez2,6, Aleksandra Leligdowicz5, Colin R. Zamecnik2,10, Rita P. Loudermilk2,10, Michael R. Wilson2,10, Chun J. Ye2,9, Gabriela K. Fragiadakis2,3,9, Mark R. Looney2,5, Vincent Chan1,2, Alyssa Ward9, Sidney Carrillo5, The UCSF COMET Consortium, Michael Matthay5 , David J. Erle2,3,5, Prescott G. Woodruff2,5, Charles Langelier11, Kirsten Kangelaris12, Carolyn M. Hendrickson5, Carolyn Calfee5 , Arjun Arkal Rao1,2,3,* and Matthew F. Krummel1,2,*


Abstract.
While SARS-CoV-2 infection has pleiotropic and systemic effects in some patients, many others experience milder symptoms. We sought a holistic understanding of the severe/mild distinction in COVID-19 pathology, and its origins. We performed a wholeblood preserving single-cell analysis protocol to integrate contributions from all major cell types including neutrophils, monocytes, platelets, lymphocytes and the contents of serum. Patients with mild COVID-19 disease display a coordinated pattern of interferonstimulated gene (ISG) expression across every cell population and these cells are systemically absent in patients with severe disease. Severe COVID-19 patients also paradoxically produce very high anti-SARS-CoV-2 antibody titers and have lower viral load as compared to mild disease. Examination of the serum from severe patients demonstrates that they uniquely produce antibodies with multiple patterns of specificity against interferon-stimulated cells and that those antibodies functionally block the production of the mild disease-associated ISG-expressing cells. Overzealous and autodirected antibody responses pit the immune system against itself in many COVID-19 patients and this defines targets for immunotherapies to allow immune systems to provide viral defense.

Main
To understand immune biology amongst COVID-19 patients, we compared them to patients presenting with similar respiratory symptoms but who were not infected with the SARS-CoV-2 virus. We prospectively enrolled 21 SARS-CoV-2 positive inpatients, 11 inpatients with similar clinical presentations consistent with acute lung injury (ALI) or acute respiratory distress syndrome (ARDS), who were SARS-CoV-2 negative-those caused by other infections or of unknown origin-and 14 control individuals. We further categorized these over the next weeks as 'mild/moderate' (M/M: typically short stays in hospital with no need for mechanical ventilation and intensive care) or 'severe' (requiring intubation and intensive care) according to the full clinical course of their disease (Fig  1A/S1A and Table S1). Hence, our study includes patients with mild/moderate (n=11) or severe (n=10) COVID-19 and patients with mild/moderate (n=6) or severe (n=5) non-COVID-19 ALI/ARDS. With the exception of one individual, all our patients who presented with mild/moderate disease remained mild/moderate during hospitalization (Fig S1A), suggesting that mild/moderate and severe are more stable states rather than transient phases of disease in this cohort.
Since the majority of COVID-19 mortality is among patients with the (ARDS)characterized by an exuberant immune response with prominent contributions from neutrophils, monocytes, platelets-we focused upon faithfully collecting these cells along with other major populations. We thus processed early morning blood samples from all individuals within 3 hours of sampling, and after red blood cell lysis, we analyzed the remaining white blood cells by single-cell RNA sequencing (scRNA-seq). After merging, batch-correction and doublet-removal our data comprised 116,517 cells (Fig 1B/S1B) among which we identified neutrophils, platelets, mononuclear phagocytes, T/NK cells, B cells, plasma cells and eosinophils (Fig S1C). We confirmed a positive association between neutrophil count and disease severity and an inverse correlation for lymphoid populations (Fig 1B/S1D) (1)(2)(3). At this level of resolution, findings were similar between SARS-CoV-2 negative and positive individuals (Fig S1E).
Within the neutrophils, we identified seven subtypes (Fig 1C/D), consistent with previous studies (2,4). One population, harboring a strong interferon-stimulated gene (ISG) signature and henceforth termed ISG neutrophils, was highly enriched in SARS-CoV-2 positive patients but not in those whose disease was severe (Fig 1E/F/G). Analysis of populations using a pseudotime method to estimate differentiation trajectories (5) assigned the starting population as the stem LCN2 population (Fig 1H/S1F-G) and suggested three putative late populations: the ISG-expressing population (state 1), a collection of populations sharing expression of NEAT1, MALAT1 and FTH1 (state 2), and a population enriched for ribosomal genes (RIBO.; state 3) which may be en route to cell death. Of these late stages, the ISG subtype was the only one found significantly altered between mild/moderate and severe patients (Fig 1I) and specifically within the SARS-CoV-2 positive individuals (Fig 1J/S1H). ISG signature genes include master anti-viral regulators such as ISG15 and IFITM3 which restricts viral entry into the cytosol (6).
We also undertook a second form of analysis of differentially expressed genes (DEG) from SARS-CoV-2 positive versus negative patients, and from mild/moderate versus severe patients across all neutrophils (Fig S1I-L). This demonstrated that ISG signature genes are differentially higher in all neutrophils, of all subsets, specifically in SARS-CoV-2 positive mild/moderate patients, than in SARS-CoV-2 positive patients with severe disease (Fig  1K/S1M-O). In contrast, a separate neutrophil degranulation gene program is upregulated in mild/moderate as compared to severe disease regardless of COVID status (Fig S1P-Q). This suggests a shared program of degranulation enhancement in all respiratory infections regardless of causative pathogen, and a global rise of the ISG program in all neutrophils in mild/moderate SARS-CoV-2 positive cases that is absent in severe SARS-CoV-2 infection (3).
Assessing the mononuclear phagocytes-monocytes, macrophages, dendritic cells and plasmacytoid dendritic cells (pDC)-yielded 7 clusters of transcriptionally distinct cells subsets, evenly distributed across our cohort with a heterogenous number of genes and unique molecular identifiers detected for each cluster (Fig 2A-B/2SA-D). We identified ISG classical monocytes as being enriched in SARS-CoV-2 positive patients, and particularly those having mild/moderate disease, similarly to neutrophils (Fig 2B-C/S2E-G). pDCs which are substantial producers of the cytokine IFNa are also typically elevated in mild/moderate SARS-CoV-2 positive patients although this falls short of statistical significance in our dataset (Fig 2C). In contrast, elevated DCs that were previously considered a hallmark of COVID-19 patients when compared to healthy controls (2) are also elevated in SARS-CoV-2 negative patients (Fig 2C). ISG monocytes also expressed genes associated with glycolysis, compared to the S100A12 subset that were enriched for genes associated with OxPhos metabolism, consistent with previous reports in bacterial sepsis (7) (Fig S3A). As for neutrophils, differential gene expression analysis demonstrated that ISGs were the dominant genes associated with mild/moderate phenotypes when the entire mononuclear phagocyte pool was assessed (Fig 2D).
ISG monocyte and ISG neutrophil frequencies were strongly correlated with one another in mild/moderate SARS-CoV-2 positive individuals (Fig 2E). Correlating multiple neutrophil subsets versus mononuclear phagocyte subsets across the entire cohort confirmed the strong correlation between ISG neutrophils and ISG monocytes and highlighted a significant correlation between pDCs and NEAT1 neutrophils, which have been previously described in viral infection (30072977) (Fig 2F). Similarly, we performed a comprehensive analysis of T cell and B cell frequencies (Fig S4) and found again that both cell types are significantly enriched in ISG signatures, specifically in mild/moderate COVID-19 patients (Fig 2G). On a patient-by-patient basis, the populations of ISG+ in one compartment correlated with the frequency of ISG expressing cells in another, for example of ISG+ T cells and ISG+ neutrophils, uniquely in mild/moderate patients (Fig 2H). Spearman correlation analysis across multiple cell types in all patients thus showed a collection of correlated ISG populations and a second anti-correlated block of other cell populations, notably those expressing S100A12 (Fig 2I).
Platelets are the mediators of blood coagulation and their activation can be associated with inflammation and infectious diseases, termed immunothrombosis (8). In patients infected with SARS-CoV-2, 25-33% of patients present with thrombocytopenia or thromboembolic events (9). We sought to determine how this might be reflected in changes in the expressed genes and concomitantly in the heterotypic cell doublet frequencies in SARS-CoV-2 infections versus other respiratory infections. Our scRNA-seq whole blood data set allowed us to identify and subset platelets based on established platelet signature genes: PPBP, PF4, CLEC1B, RGS10, RGS18 (Fig 1B/S1C). Analysis of these platelets revealed six clusters (Fig 3A/B), including three subsets ("H3F3B", "HIST1H2AC", and "RGS18") characterized by high expression levels of histone protein-encoding transcripts. Such populations may represent early platelets, based upon the suggestion that these anucleated cells carry transcripts acquired from their parental cells, megakaryocytes, during recent platelet formation (10). One by one comparison of these platelet subtypes among healthy controls and patients with mild/moderate and severe disease revealed only minor depletion of HIST1H2AC platelets with disease severity and increase in ACTB platelets-expressing genes associated with cytoskeletal functions-in both mild/moderate and severe patients (Fig 3C).
To compare platelet turnover between disease severity groups, we overlaid the expression of BCL2L1 onto our platelet data set. This transcript encodes the antiapoptotic protein Bcl-xL, which has been identified as a 'molecular clock' for platelet lifetime (11). This identified the H3F3B cluster as representing 'young' platelets (Fig 3D), a result supported by a second signature of transcripts in young, reticulated platelets(12) (Fig S5A). This population was thus supported as the starting point for a pseudotime analysis, in which the histone-high clusters (H3F3B, HISTH2AC, RGS18) and cytoskeletal protein-high clusters (ACTB, PPBP, TMSB3X) were observed at the start and end of the pseudotime trajectory, respectively (Fig 3E/S5B). Plotting the platelet cell frequencies of our cohort along the trajectory suggested that platelets from all patients with disease were broadly overrepresented at the end of the trajectory (Fig 3F), supporting a discrete and systemic relative loss of young platelets in all patients as compared to controls. Platelets did not harbor an ISG-specific cell cluster (Fig S5C), but akin to myeloid and lymphoid cells, the ISG score in all platelets from mild/moderate patients was increased relative to controls and was comparatively decreased in severe patients, particularly in SARS-CoV-2 infected individuals (Fig 3G).
Taking advantage of this unique dataset, we explored the identification of heterotypic aggregates between platelets and non-platelets by using a 'Platelet First' approach ( Fig  3H, see methods). This approach analytically prioritized capture of every cell that was aggregated with a platelet, prior to doublet assessment. The 'Platelet First' object revealed the presence of platelet transcripts associated with cells that also bore signatures of other major blood cell types (Fig 3H/S5D-E). We separately analyzed a smaller object that included only libraries containing samples pooled prior to cell encapsulation, allowing us to assess inter-patient doublets to conclude that the majority of these aggregates were not formed during the scSeq pipeline (Fig S5F). In plotting the blood cell frequencies within the 'Platelet First' object against the cell frequencies within the original data set, we found a largely linear relationship between the frequency of a given aggregating population (x-axis) and the frequency with which that cell is found in an aggregate (y-axis) (Fig 3I). This first-order linear relationship suggests that, at least in circulating blood, platelets form aggregates indiscriminately with varying other cell types without favoring one or the other. Possibly activated vasculature would provide a cue that would change this pattern of aggregation.
After observing that ISG expression profiles were elevated in every cell type among pateints with mild/moderate disease and COVID-19 but globally reduced with severe COVID-19 disease, we turned our attention to a hypothesis generating holistic approach to data analysis. In an attempt to visualize the global shift in gene expression data across cell types to identify trends with clinical correlates. We first undertook a phenotypic earth mover's distance (PhEMD) approach (13) that identifies and differentiates joint cell frequency patterns to highlight sources of patient-to-patient variation. PhEMD embedding of patients based on their subtype frequencies revealed eight distinct groups of patients (Fig 4A/S6A) where progression from A through H represent patients with generally increasing relative frequency of neutrophils, C, D, G and H include patients with relative enrichment in monocytes and E represents patients with an enrichment of ISG neutrophils and mostly consists of SARS-CoV-2 positive patients with mild/moderate disease (Fig 4B-C). In contrast, Group G, which is an alternative and 'severe' fate for patients is enriched for neutrophil and has a high ratio of cell frequency of S100A12 to ISG neutrophils (Fig S6A).
When we examined serum IFNa levels, we found that mild/moderate individuals made more of this cytokine on average as compared to severe patients, which would be consistent with higher levels of ISG cell populations, however there were patient with severe disease individuals who also made high levels of IFNa (Fig 4D). To integrate the scRNA-Seq cell populations in the context of other clinical and serum fractions, we constructed a Spearman correlation matrix comparing all subtype frequencies described above with a collection of serum cytokines, antibodies and clinical variables (Fig 4E). Following hierarchical clustering of variables, ISG subtypes cluster together and are correlated with serum IFNa concentrations, consistent with these cells arising in response to globally high concentrations of this cytokine as shown previously (3) (Fig S6B). ISGexpressing populations are also associated with low severity of COVID-19 illness, lower plasma levels of SP-D (indicative of alveolar epithelial injury) and, only modestly, with IL-6 levels. We also note the absence of significant correlation between ISG populations and days after symptoms onset, indicative of a disease 'state' (Fig 4E and Fig S1A). We further correlated our subtype frequencies against a high-dimensional panel of plasma protein levels ( Fig S6B) and again observed clustering of most ISG subtypes, which correlated with a host of factors indicative of a strong ISG and Th1 response (CXCL1/6/10/11, TNFB, IL-12B, MCP-2/4), while inversely correlated with others (CCL23, MMP10, HGF). An unexpected anticorrelate of the ISG state was the concentration of serum antibodies against the SARS-CoV-2 Spike and Nucleocapsid proteins. This anticorrelation was profound (Fig 4F) and we considered it a paradox that severe patients have higher levels of potentially neutralizing antibodies. This is in apparent contradiction with a previous report showing that viral load is associated with severity and mortality in COVID-19 (14,15) a difference which could be explained by the fact that these studies focus on patients with high mortality, which was a very rare event in our cohort (Sup Table S1). Both antibody specificities were anticorrelated with the viral load as assessed from nasal swabs (Fig 4G) consistent with though not definitive for being neutralizing. As increased antibody titers and decreased viral load have been reported to be a feature of later disease stage(16), we considered the hypothesis that mild/moderate disease -characterized by high frequency of ISG+ signatures -would simply precede severe disease states. However, antibody titers in severe patients are consistently higher compared to mild/moderate patients even two weeks beyond symptom onset (Fig 4H), and only one of our 11 mild/moderate patients would go on to exhibit a severe disease ( Fig S1A). Moreover, we observed no statistical correlation between days of onset and the presence of ISG+ cell populations (Fig 4E). These elements would seem to argue against a simple time relationship between mild/moderate and severe states.
Returning to the profound antibody reactivity and the global loss of ISG populations even in the presence of serum IFNa (Fig 4D), we hypothesized that phenotypic differences in our two groups of COVID-19 patients might also be mirrored or influenced or by systemic factors such as those carried in the blood and affecting all cell populations. We thus first asked whether serum from severe patients contained antibodies against ISG-expressing cells by directly applying serum to healthy peripheral blood mononuclear cells (PBMCs) from heathy individuals that were first induced to express IFITM3 (part of our ISG signature, encoding a protein that blocks viral entry (6)) by culturing them with IFNa for 24 hours, followed by flow cytometry analysis of monocytes and lymphocytes (Fig S7A). While we observed only low levels of serum IgG binding in serum from 1/9 mild/moderate patients, sera from 5/7 patients with severe COVID-19 illness displayed significant binding (Fig 5A/S7B). Staining was more pronounced on monocytes as compared to T and B cells which were largely stained with similar frequencies when using mild/moderate or severe serum (Fig S7C). When examining specificities in those patients that did not stain ISGdifferentiated cells directly, we found that serum from patient 1050 produced antibodies that were specific for IFNa (right inset Fig5A), consistent with a very recent study (17) that also found these in approximately 12% of COVID patients. This finding nevertheless does not explain patients lacking ISG cells despite presence of IFNa in their serum ( Fig  4D).
We then asked whether factors in the serum of severe patients affect the induction of the ISG signature gene pattern, including IFITM3, in response to culture with IFNa. We thus added patient serum at 10% into the IFNa stimulation conditions and found that, whereas control serum or serum from mild/moderate patients had no effect on differentiation as measured by either IFITM3 level or the frequencies of CD14+CD16 + intermediate monocytes produced (Fig 5B/5D and Fig S7D), all severe patient serum tested had profound effects on differentiation, varying from absolute block to partial inhibition.
To test whether these effects on IFN-induced production of ISG cell populations were in fact mediated by antibodies, we pre-adsorbed patients' sera with Protein A/G beads to deplete them. This antibody depletion restored both IFITM3 induction and the total yield of interferon-stimulated monocyte cells (Fig 5C/D S7D). A similar block of ISG signature generation in response to IFNa was observed for other populations including lymphocytes, showing that this effect was global and similarly mediated by antibodies in serum severe patient (Fig 5E/S7D-E).
Taken together, this shows that an antibody response in severe patients targets ISG cell populations and their generation. In our cohort, this general effect of antibody manifests in all severe patients, whereas antibodies against the cytokine IFNa itself were seen only in one of seven patients, a similar frequency recently reported(17). Antibodies in many of these patients have direct specificity for determinants on the surface of ISG monocytes. The molecular specificities of these other antibodies are likely to be many and varied based on the differing patterns in this relatively small sample set and it will remain to be determined how and why tolerance is broken to the ISG pathway, in the course of infection. One likely candidate for modulation of B cell response is direct infection of monocytes by SARS-CoV-2. In vitro incubation of the virus with healthy cells indeed results in intracellular expression of both IFITM3 (indicating activation of this program) and spike protein ( Fig S7F). If, in an early immune response, the ISG program is preferentially presented alongside the proteins from the pathogen, and the immune system of the patient is not already self-tolerant to those ISG proteins due perhaps to a lifelong absence of their profound expression, tolerance to those cells and those proteins may be broken. Conversely, as inflammatory monocytes normally restrict antibody generation (18), their infection and lysis by virus may in turn release overexuberant B cell responses to many antigens, not just those that are newly produced during the infection. Regardless, this work suggests that targeting overexuberant and autoreactive B cells with drugs such as rituximab (19) or through introduction of IVIG(20), perhaps alongside introduction of convalescent serum-derived antibodies to provide ongoing viral protection, may be an avenue to defeat the global suppression of protective ISG mediated viral immunity. Statistical significance was assessed using a two-way ANOVA test with multiple comparisons for panels B, G, I and J, and using a Wilcoxon test for panel K. * p-value < 0.05; ** p-value < 0.01; *** p-value < 0.001; **** p-value < 0.0001.   Figure 1B (x-axis) versus same cell type frequency within 'Platelet First' object (y-axis). The identity line x=y is drawn as a reference. Each dot represents a healthy control or SARS-CoV-2 positive patient sample and are color-coded by disease severity. Pearson r correlation coefficient and two-tailed p value are shown for each cell type. Top: Box plots of y/xratio for each healthy control or patient sample, separated by disease severity. Differences in C. and I. were calculated using a two-way ANOVA test with multiple comparisons. * p.value < 0.05; ** p.value < 0.01; *** p.value <0.001; **** p.value < 0.0001; ns: non-significant. Severe (pink) sera were pre-treated with protein G/A before incubation with PBMC to deplete antibodies. Right: Box plot of IFITM3 induction in lymphocytes. Differences in D. and E. were calculated using a two-way ANOVA test with multiple comparisons. * p.value < 0.05; ** p.value < 0.01; *** p.value <0.001; **** p.value < 0.0001; ns: non-significant.

This PDF file includes:
Supplemental  Step 1). This was followed by retaining all cells with >1 platelet-specific transcripts PF4 or PPBP (Step 2).
Step 2 guaranteed analysis of cell events and aggregates containing platelets. Identically to our original data set in Figure 1B, integration of data was done using Harmony (Step 3), and the 'Platelet First' object was then analyzed using the Seurat v3 pipeline (Step 4). E. Violin plots of the percentage of mitochondrial and ribosomal genes within clusters identified in the 'Platelet First' object. F. Inter-sample doublet rates in inferred platelet-involved heterotypic doublets show that platelet aggregates occur in vivo. DBL, doublet. SNG, singlet. Differences in F. were calculated using a onesided Student's t test * p.value < 0.05 and ** p.value < 0.01.

Supplementary Figure 6: Identifying Immune System Correlates in SARS-CoV-2 positive and negative Patients Data
A. Cell fraction histograms representing bin-wise mean of relative frequency (i.e., cell fraction) of each immune cell subtype for all patients in a given group, colored as described in Fig4A. B. Matrix of Spearman correlation coefficients between all subtype frequencies (out of major celltypes e.g. Neut ISG represents % out of all Neutrophils) obtained from scRNA-Seq versus protein analyte abundance in plasma as measured using Olink assay on a patient-by-patient basis excluding healthy controls. Patients for which data were unavailable were excluded from correlation analysis for each comparison. Variables on both axes were ordered via hierarchical clustering. ISG subtypes and protein levels strongly correlated with their frequency highlighted in brown. Subtypes and proteins strongly anti-correlated with ISG+ subtypes highlighted in red. (n=31 for all comparisons) * p<0.05, ** p<0.005, *** p<0.0005.

Supplementary Figure 7. Quantification of Serum Staining and ISG Generation in the Presence of Patient Serum. A.
Gating strategy for PBMCs to identify different subpopulations. B. Geometric MFI of serum staining on CD14+ monocytes treated with IFNa, quantifying data in Figure 5A. C. Geometric MFI of serum staining of CD3+ and CD19+ lymphocytes, following treatment with IFNa, quantifying data in Figure 5A. D. Modulation of intermediate to classical CD14 monocytes transition by mild/moderate (orange) and severe (red) patient serum. Each plot represents a single serum sample. Representative experiment from three independent trials and two different healthy PBMC donors. E. Histograms CD3+ CD19+ lymphocytes from healthy donor cultured with IFNa and serum from heathy donor (blue), mild/moderate (orange) and severe (red) SARS-CoV-2 positive patients. Mild/Moderate (light yellow) or Severe (pink) sera were pre-treated with protein G/A before incubation with PBMC. Each plot represents a single serum sample. Representative experiment from two independent trials and two different healthy PBMC donors. F. Contour plots and histograms of CD14+ monocytes from healthy donor co-cultured with SARS-COV-2 virus for 48h at 0.1 (pink) and 1 (red) MOI. Differences in B. and C. were calculated using a two-way ANOVA test with multiple comparisons. * p.value < 0.05; ** p.value < 0.01; *** p.value <0.001; **** p.value < 0.0001. ns, non-significant.

Material and Methods
Patients, participants, severity score, and clinical data collection Patients admitted to the Hospital of the University of California with known or presumptive COVID-19 were screened within 3 days of hospitalization. Patients, or a designated surrogate, provided informed consent to participate in the study. This study includes a subset of patient enrolled between April 8 and May 1 in the COMET (COVID-19 Multi-immunophenotyping projects for Effective Therapies; https://www.comet-study.org/) study at UCSF. COMET is a prospective study that aims to describe the relationship between specific immunologic assessments and the clinical courses of COVID-19 in hospitalized patients. Healthy donors (Ctrl) were adults with no prior diagnosis of or recent symptoms consistent with COVID-19. This analysis includes samples from participants who provided informed consent directly, via a surrogate, or otherwise in accordance with protocols approved by the regional ethical research boards and the Declaration of Helsinki. For inpatients, clinical data were abstracted from the electronic medical record into standardized case report forms. We used both a severity score at the time of sampling and at the end of hospitalization (Fig S1A). In both cases, severity assessment was based on three main parameters: level of care, need for mechanical ventilation, and time under mechanical ventilation. Mild/moderate patients are floor/ICU patients who did not require mechanical ventilation during their time of hospitalization and spent no more than 1 day in ICU. Severe patients are patients who required intensive care and mechanical ventilation (typically 5 days or more). Therefore, our cohort is composed of 21 COVID-19 positive patients (11 mild/moderate and 10 severe), 11 COVID-19 negative patient (6 mild/moderate and 5 severe), and 14 Healthy participants. Information on age, sex, type of infection, days of on onset, viral load, and CBC count are listed in Table S1. The study is approved by the Institutional Review board: IRB# 20-30497.

Isolation of blood cells and processing for scRNA-seq:
ScRNA-seq was performed on fresh whole blood in order to preserve granulocytes. Briefly, peripheral blood was collected into EDTA tubes (BD, catalog no. 366643). Whole blood was prepared by treatment of 500µL of peripheral blood with RBC lysis buffer (Roche, 11-814-389-001) according to manufacturer's procedures. Cells were then counted and 15.000 cells per individual were directly loaded in the Chromium TM Controller for partitioning single cells into nanoliter-scale Gel Bead-In-Emulsions (GEMs) following manufacturer's procedures (10x genomics). Some samples were pooled together (at 15,000 cells/ sample) prior to GEM partitioning. Single Cell 5' reagent kit v5.1 was used for reverse transcription, cDNA amplification and library construction of the gene expression libraries (10x Genomics) following the detailed protocol provided by 10x Genomics. Libraries were sequenced on an Illumina NovaSeq6000 using 28 cycles for R1 and 98 cycles for R2. All samples were encapsulated, and cDNA was generated within 6 hours after blood draw. Data pre-processing of 10x Genomics Chromium scRNA-seq data: Sequencer-obtained bcl files were demultiplexed into individual sample the mkfastqs command on the Cellranger 3.0.2 suite of tools (https://support.10xgenomics.com). Feature-barcode matrices were obtained for each sample by aligning the raw fastqs to GRCh38 reference genome (annotated with Ensembl v85) using the Cellranger count. Raw feature-barcode matrices were loaded into Seurat 3.1.5(2) and genes with fewer than 3 UMIs were dropped from the analyses. Matrices were further filtered to remove events with greater than 20% percent mitochondrial content, events with greater than 50% ribosomal content, or events with fewer than 100 total genes. The cell cycle state of each cell was assessed using a published set of genes associated with various stages of human mitosis (3).

Inter-sample doublet detection
Libraries containing samples pooled prior to loading were processed using Freemuxlet (https://github.com/statgen/popscle), the genotype-free version of Demuxlet(4), to identify clusters of cells belonging to the same patient via SNP concordance. Briefly, the aligned reads from Cellranger were filtered to retain reads overlapping a high-quality list of SNPs obtained from the 1000 Genomes Consortium (1KG) (5). Freemuxlet was run on this filtered bam using the 1KG vcf file as a reference, the input number of samples/pool as a guideline for clustering groups of cells by SNP concordance, and all other default parameters. Cells are classified as singlets arising from a single library, doublets arising from two or more libraries, or as ambiguous cells that cannot be accurately assigned to any existing cluster (due to a lack of sufficient genetic information). Clusters of cells belonging to a unique sample were mapped to patients using their individual Freemuxlet-generated genotype, and ground truth genotypes per patient identified via bulk RNASeq. The pairwise discordance between inferred and ground-truth genotypes was assessed using the bcftools gtcheck command (6). Ambiguous, and doublet events were filtered from the major analysis (see platelet-first analysis).

Data quality control and Normalization:
The filtered count matrices were normalized, and variance stabilized using negative binomial regression via the scTransform method offered by Seurat (7). The effects of mitochondrial content, ribosomal content, and cell cycle state were regressed out of the normalized data to prevent any confounding signal. The normalized matrices were reduced to a lower dimension using Principal Component Analyses (PCA) and the first 30 principal coordinates per sample were subjected to a non-linear dimensionality reduction using Uniform Manifold Approximation and Projection (UMAP). Clusters of cells sharing similar transcriptomic signal were identified using the Louvain algorithm, and clustering resolutions varied between 0. 6 and 1.2 based on the number and variety of cells obtained in the datasets. Clusters were loosely grouped into major cell types (T/NK, B/Plasma, mononuclear phagocytes, Neutrophil, Platelet, and Erythrocytes) using a curated list of 5 genes per cell type (5-gene signature) and the Seurat AddModuleScore function. Briefly, genes in the library are binned into one of 12 bins based on average expression in the dataset. The average expression of the genes in each signature are compared to a background list of randomly selected from the bins and used to generate a score per cell for each signature.

Intra-sample heterotypic doublet detection
All libraries were further processed to identify heterotypic doublets arising from the 10X sample loading. Processed, annotated Seurat objects were processed using the DoubletFinder package (8). Briefly, the cells from the object are modified to generate artificial duplicates, and true doublets in the dataset are identified based on similarity to the artificial doublets in the modified gene space. The prior doublet rate per library was approximated using the information provided in the 10x knowledgebase (https://kb.10xgenomics.com/hc/en-us/articles/360001378811) and this was corrected to account for homotypic doublets using the per-cluster numbers in each dataset. Heterotypic doublets were removed from the major analysis (see platelet-first analysis).

Data integration and Batch correction:
The individual processed objects per library were filtered to remove Erythrocyte contamination. The raw and log-normalized counts per library were then pruned to retain only genes shared by all libraries. Pruned counts matrices were merged into a single Seurat object and the batch (or library) of origin was stored in the metadata of the object. The log-normalized counts were reduced to a lower dimension using PCA and the individual libraries were aligned in the shared PCA space in a batch-aware manner (Each individual library was considered a batch) using the Harmony algorithm (9). The resulting Harmony components were used to generate a batch corrected UMAP, and to identify clusters of transcriptionally similar cells. Clusters were broadly labeled based on the 5-gene signature ( Fig S1) using a modified, bootstrapped version of the Seurat AddModuleScore to account for the numerous sequencing batches in our dataset. The modified function ran the AddModuleScore on random subsets of the data (subsampling rate = 0.6) 10 times and averaged the score to provide a stable score per signature. A Seurat object was generated for each broad cell type containing clusters scoring highly for that cell type. Each broad cell-type object was subjected to the same harmony analysis to generate batch-aware log normalized counts that were used for visualization and subtype identification. To visualize the effect of harmonizing our single cell data, we identified the library diversity in the neighborhood of every cell on the plot. The neighborhood of a cell is defined as the collection of n nearest neighbors in the UMAP space (where n = sqrt(total cells)), pruned to retain cells lying within the 90th percentile of all calculated neighbor distances. The diversity is the set of all libraries represented within the neighborhood.

Differential expression tests and cluster marker genes, cluster annotation and volcano plot
Differential gene expression (DGE) tests were performed on log-normalized gene counts using the Poisson test (with a latent batch variable to account for multiple library preparations) provided by the FindMarkers/FindAllMarkers functions in Seurat. Genes with > 0.35 log-fold changes, an adjusted p value of 0.05 (based on Bonferroni correction), at least 25% expressed in tested groups, were regarded as significantly differentially expressed genes (DEGs). Cluster marker genes were identified by applying the DE tests for upregulated genes between cells in one cluster to all other clusters in the dataset. Top ranked genes (by log-fold changes) from each cluster of interest were extracted for further illustration. The exact number and definition of samples used in the analysis are specified in the legend of Figure 1, 2 and 3 and summarized in Table S1. The neutrophils, mononuclear phagocytic cells, T cells, B cells and platelets subtypes were identified by comparing cluster marker genes with public sources referenced in the text. The R package EnrichR were used to generate volcano plot from differential gene expression using FindMarkers function in Seurat.

Platelet First scSeq Analysis
To identify the differential coagulation of platelets, we reintroduced heterotypic doublets to each library and filtered them to extract cells expressing at least 1 UMI of PF4 or PPBP, both plateletspecific marker genes. The raw and log-normalized counts per library were integrated using Harmony and processed as above. Broad cell types were identified using the score generated with the bootstrapped AddModuleScore and the per-sample rate of platelet aggregation with each cell types was inferred to be the fractions of cell counts in this dataset to the fractions of cell counts in the overall analysis. Significance testing was conducted using a non-parametric Kruskal-Wallis test with multiple comparisons.

Monocle analysis
Raw counts from the Individual cell-specific were used to create a monocle3 (10-12) cell_data_set object, and the batch-corrected PCA and UMAP embeddings were imported directly from the Seurat object. Each cell-specific trajectory was inferred by reverse embedding the UMAP coordinates using the DDRTree method. The root cell states for the trajectory in monocytes and neutrophils were chosen based on literature, and for platelet cell based on the signature list defined in Figure S1. Relative pseudotime was obtained through a linear transformation relative to the cells with the lowest and highest pseudotimes ( (1-min_pseudotime)/max_pseudotime ).

Generation of gene expression scores
ISG and Degranulation scores were generated by taking the mean of log-normalized expression for a particular set of genes related to the specific pathway or phenotype.

Embedding a low-dimensional representation of patients using PhEMD:
PhEMD was employed to generate a three-dimensional embedding of patients based on their immune cell profiles (13). Briefly, PhEMD first generates a reference map of cell subtypes, then uses Earth Mover's Distance (EMD) to compute pairwise dissimilarities between patients (incorporating patient-to-patient differences in cell fractions of each cell subtype as well as intrinsic dissimilarities between subtypes based on the cell subtype reference map), and finally applies a dimensionality reduction technique to the patient-to-patient distance matrix to generate a final embedding of patients. The Seurat implementation of 3D Uniform Manifold Approximation and Projection (UMAP) was used to map the cell-subtype space using the Harmony batch-corrected components as input and a "min.dist" parameter of 0.4, and cell subtypes (i.e., clusters) were defined as described in the "Data quality control and Normalization" section of Methods (2,9). Dissimilarity between each pair of cell subtypes was defined as the distance between the centroids (in UMAP space) of all cells assigned to the two respective subtypes. PHATE was applied to the EMD patient-to-patient distance matrix to generate the final 3D embedding of patients (14).

Luminex Assay for Antibody Titer
Highly immunogenic linear regions of the SARS-CoV-2 proteome were isolated by ReScan and conjugated to Luminex beads as previously described (15). Briefly, high concentration T7 phage stocks displaying immunodominant epitopes of the S, N and ORF3a proteins were propagated and grown to high (>10 11 PFU/mL) titer then were each conjugated to unique bead IDs according to manufacturer's Antibody Coupling Kit instructions (Luminex). Whole N protein (RayBiotech) beads were conjugated similarly using manufacturer instructions with 5µg of protein per 1 million beads. For other whole protein Luminex-based beads, MagPlex-Avidin Microspheres (Luminex) were coated with either the S protein RBD (residues 328-533) or the trimeric S protein ectodomain (residues 1-1213). All beads were blocked overnight before use and pooled on day of use. 2000-2500 beads per ID were pooled per incubation with patient serum at a final dilution of 1:500 for 1 hour, washed, then stained with an anti-IgG pre-conjugated to phycoerythrin (Thermo Scientific, #12-4998-82) for 30 minutes at 1:2000. Primary incubations were done in PBST supplemented with 2% nonfat milk and secondary incubations were done in PBST. Beads were processed in 96 well format and analyzed on a Luminex LX 200 cytometer. Median Fluorescence Intensity from each set of beads within each bead ID were retrieved directly from the LX200 and log transformed after normalizing to the mean signal across two intra-assay negative controls (glial fibrillary acidic protein (GFAP) and Tubulin phage peptide conjugated beads).

Luminex Assay for Serum Cytokines
Soluble proteins were quantified in EDTA anticoagulated plasma using the Luminex® multiplex platform (Luminex, Austin TX) with custom-developed reagents (R&D Systems, Minneapolis, MN), as described in detail ((16) or single-plex ELISA (R&D Systems, Minneapolis, MN). Analytes quantified using the Luminex® multiplex platform were read on the MAGPIX® instrument and raw data were analyzed using the xPONENT® software. Analytes quantified using single-plex ELISA were read using optical density. Values outside the lower limit of quantification were assigned a value of 1/3 of the lower limit of the standard curve for analytes quantified by Luminex and 1/2 of the lower limit of the standard curve for analytes quantified by ELISA.

O-link Assay for Serum Factors
Circulating proteins were measured in plasma using the multiplexed Proximity Extension Assay (PEA) from Olink Proteomics AB (Uppsala, Sweden). 20 μL each of plasma collected from the COMET patient cohort (21 COVID-19 positive, 13 COVID-19 negative, and 14 healthy individuals) were analyzed using the Olink® Target 96 Inflammation panel, which is a set of 92 inflammationrelated protein biomarkers. Plasma for all samples regardless of COVID-19 status were inactivated using a final concentration of 1% (v/v) Triton-X-100 solution over 2 hours. Data from the analyzed protein biomarkers is presented as Normalized Protein eXpression (NPX) values, an arbitrary unit on a log2 scale.

ELISA Method for Serum IFNa Measure
IFN-α levels were quantified from serum by an ELISA (catalog numbers 41115 for IFN-α; PBL Assay Science). ELISA was performed according to the manufacturer's instructions with minor modifications. Briefly, an 8-point standard curve was prepared and diluted in sample buffer. Serum was also diluted by adding 80-90 ul serum to 30 ul sample buffer (depending on availability of serum). Samples were prepared in duplicates, whereas standards were prepared in triplicates. Initial incubation was performed for 20 hours at 4C. Antibody was added and incubated at 4C overnight. HRP was added and incubated 1hr at room temperature, TMB substrate was added for 30min and incubated in the dark at room temp, stop solution was added and samples were read using a SpectraMax M2 Microplate Reader (Molecular Devices) at 450nm. For analysis, a 4parameter logistic fit was applied to OD values of the standards after background subtraction. Samples with ODs below blank samples were considered as 0 pg/ml IFN-α.
Bead ELISA: 10 7 5um Sulfate latex polystyrene beads (Thermo Fisher) were resuspended in 1ml of PBS to which 1ug of proteins (BSA, huIFNa (Stemcell IFN alpha-2A) were added to bind by passive absorption over 1 hr on ice. Beads were washed 1x in PBS and blocked with 1ml of blocking buffer (PBS containing 1mM EDTA and 2%FCS) for one hour. Beads were spun and resuspended in 1ml of blocking buffer and 10ul (10 5 beads) were moved to individual tubes to which 5ul of sera was added followed by incubation for 1hr on ice). These were washed, resuspended in 50ul of blocking buffer containing Goat anti-human IgG-Alexa Fluor 647 (Jackson Immunoresearch) and incubated for 1hr on ice followed by a final wash and analysis by flow cytometry.

SARS-CoV-2 detection by PCR
PCR testing for SARS-CoV-2 was carried out on respiratory specimens mixed 1:1 in DNA/RNA Shield (Zymo Inc) using an in-house Clinical Laboratory Improvement Amendments (CLIA) validated assay at the UCSF Clinical Microbiology Laboratory. PCR primers targeted the SARS-CoV-2 envelope (E) and RNA-dependent RNA polymerase (RdRp) genes plus human RNAse P as a positive control.

SARS-CoV-2 infection of PMBC
SARS-CoV-2 isolate USA-WA1/2020 was provided by Dr. Melanie Ott and propagated in Vero E6 (ATCC CRL-1586) cells in Dulbecco's Modified Eagle Medium (UCSF Cell Culture Facility) supplemented with 10% FBS. Vero E6 cells were infected with the SARS-CoV-2 virus for 72h at 37C and 5% CO2. The supernatant was collected and viral titer was quantified using a plaque assay in Vero E6 cells. All work was done under BSL3 conditions. PBMC were infected with SARS-CoV-2 virus at a multiplicity of infection (MOI) 0.1 or 1 for 72 hours. Cells were then harvested and stained with fixable Live/Dead Zombie NIR (Biolegend) in PBS followed by fixation with 4% paraformaldehyde for 1 hour. Intracellular staining to Spike (SARS-CoV-2 Spike S1 Antibody, Rabbit mAb (SinoBiological) and IFITM3 was subsequently performed using eBioscience Foxp3/Transcription Factor Staining Buffer Set (Thermo Fisher Scientific) followed by surface antigen staining.

Statistical Analysis and Data visualization
Statistical analyses were performed using GraphPad prism or the R software package. Null hypotheses between two groups were tested using the non-parametric Mann-Whitney test to account for non-normal distribution of the data. Likewise, for multiple groups, comparisons were made by two-way ANOVA or non-parametric Kruskal-Wallis test followed by multiple comparisons. The specific statistical tests and their resultant significance levels are also noted in each figure legend. The R packages Seurat, ggplot2 (version 3.1.0) (Wickham, 2016) GraphPad Prism and Adobe Illustrator were used to generate figures.

Data Resources and Code Sharing:
Raw Gene expression matrices will be available on GEO at the time of publication. Scripts used to process all data will be shared on Github along with relevant clinical information for each patient.