Targeted Hybridization Capture of SARS-CoV-2 and Metagenomics Enables Genetic Variant Discovery and Nasal Microbiome Insights

ABSTRACT The emergence of novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genetic variants that may alter viral fitness highlights the urgency of widespread next-generation sequencing (NGS) surveillance. To profile genetic variants of the entire SARS-CoV-2 genome, we developed and clinically validated a hybridization capture SARS-CoV-2 NGS assay, integrating novel methods for panel design using double-stranded DNA (dsDNA) biotin-labeled probes, and built accompanying software. This test is the first hybrid capture-based NGS assay given Food and Drug Administration (FDA) emergency use authorization for detection of the SARS-CoV-2 virus. The positive and negative percent agreement (PPA and NPA, respectively) were defined in comparison to the results for an orthogonal real-time reverse transcription polymerase chain reaction (RT-PCR) assay (PPA and NPA, 96.7 and 100%, respectively). The limit of detection was established to be 800 copies/ml with an average fold enrichment of 46,791. Furthermore, utilizing the research-use-only analysis to profile the variants, we identified 55 novel mutations, including 11 in the functionally important spike protein. Finally, we profiled the full nasopharyngeal microbiome using metagenomics and found overrepresentation of 7 taxa and evidence of macrolide resistance in SARS-CoV-2-positive patients. This hybrid capture NGS assay, coupled with optimized software, is a powerful approach to detect and comprehensively map SARS-CoV-2 genetic variants for tracking viral evolution and guiding vaccine updates.

IMPORTANCE This is the first FDA emergency-use-authorized hybridization capture-based next-generation sequencing (NGS) assay to detect the SARS-CoV-2 genome. Viral metagenomics and the novel hybrid capture NGS-based assay, along with its research-use-only analysis, can provide important genetic insights into SARS-CoV-2 and other emerging pathogens and improve surveillance and early detection, potentially preventing or mitigating new outbreaks. Better understanding of the continuously evolving SARS-CoV-2 viral genome and the impact of genetic variants may provide individual risk stratification, precision therapeutic options, improved molecular diagnostics, and population-based therapeutic solutions.

This study by Nagy-Szakal, D., Couto-Rodriguez, M., et al. describes a hybridization capture NGS assay for identification and sequencing of clinical isolates of SARS-CoV-2. The developed assay is highly sensitive, comparable in reliability to traditional RT-PCR and, importantly, can identify SARS-CoV-2 genetic variants. In addition, the NGS pipeline was used to profile the microbiome isolated from the specimens. The coupling of SARS-CoV-2 detection and sequencing will provide a valuable tool for improving the surveillance of SARS-CoV-2 variants as well as the identification of potentially new variants of concern. The role of bacteria during COVID-19 has been debated, and evidence for and against has been published since the onset of the pandemic. Here, the authors identify a probable relationship between certain taxa of bacteria and SARS-CoV-2 infection. Overall, the methods and data presented in this study will be an important addition to the current SARS-CoV-2 surveillance toolkit. Listed below are some comments that will further broaden the application of the presented assay.
1. The authors observe an enrichment of certain bacterial taxa in SARS-CoV-2 positive specimens. Generating a correlation plot between the observed bacterial taxa and SARS-CoV-2 Ct values will provide a rationale for proposing a role of the specific bacteria during SARS-CoV-2 infection. Furthermore, an additional correlation between clinical score and bacteria should be presented.
2. Shotgun metagenomics was performed on DNA extracted from samples. Most respiratory viruses that impact global health are RNA viruses and would not be assessed by the current pipeline. This limitation of the current assay should be discussed and improved upon. The presence of influenza virus, a negative-sense RNA virus, in Figure 4 should be explained.
3. Several studies have demonstrated the potential of animal reservoirs of SARS-CoV-2, such as cats, dogs, and mink, among others. Surveillance of animal reservoirs using the presented assay will not only add to our understanding of SARS-CoV-2 adaptation during infection of animal hosts but could potentially identify viruses similar to SARS-CoV-2 with pandemic potential.
4. Seasonal respiratory viruses have a major yearly burden on human health. Adapting the current pipeline for surveillance of influenza viruses should be discussed.
5. In Figure 3D, the dot colors used to represent "TEST Sample" and "19A" are very similar and should be changed to make them clearly distinguishable.
6. Figures 5C and 5D are out of date and need to be updated to include the most recent genome sequence and phylogenetic clades, respectively.
7. In Figure 5B (Actinomyces graevenitzii), there are two clear outliers present in the positive samples. These specific samples should be discussed in the context of SARS-CoV-2 and the presence of other bacterial taxa.
Reviewer #2 (Comments for the Author):
The manuscript by Nagy-Szakal et al. describes the development and validation of a new NGS method for sequencing the SARS-CoV-2 genome. This new method depends on targeted capture and sequencing of the genome of the COVID-19-causing virus. The manuscript also describes a novel pipeline to analyze NGS data using the capture hybridization method. In addition, the authors profiled the metagenomics and the microbiome of nasopharyngeal samples from COVID-19 positive and negative cases. Generally, the manuscript is well-written and potentially impactful. However, I have the following comments that need to be addressed before publishing this paper:
[...] variants. While this is a hypothetical possibility, the authors provide no evidence that this is the case. Indeed, the positive and negative percent agreements for detection of SARS-CoV-2 with standard RT-PCR are 97 and 100%, respectively. These data suggest that amplicon-based sequencing can detect the different variants seen in this study.
-For metagenomic data, it is important to stress that SARS-CoV-2 negative samples should not be considered normal, as many of these samples were likely collected from cases with symptoms suggesting COVID-19, so many of these cases may have other respiratory diseases.
Staff Comments:

Preparing Revision Guidelines
To submit your modified manuscript, log onto the eJP submission site at https://spectrum.msubmit.net/cgi-bin/main.plex. Go to Author Tasks and click the appropriate manuscript title to begin the revision process. The information that you entered when you first submitted the paper will be displayed. Please update the information as necessary. Here are a few examples of required updates that authors must address:
• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
Please return the manuscript within 60 days; if you cannot complete the modification within this time period, please contact me. If you do not wish to modify the manuscript and prefer to submit it to another journal, please notify me of your decision immediately so that the manuscript may be formally withdrawn from consideration by Microbiology Spectrum.
If you would like to submit an image for consideration as the Featured Image for an issue, please contact Spectrum staff.
If your manuscript is accepted for publication, you will be contacted separately about payment when the proofs are issued; please follow the instructions in that e-mail. Arrangements for payment must be made before your article is published. For a complete list of Publication Fees, including supplemental material costs, please visit our website.
1. The authors observe an enrichment of certain bacterial taxa in SARS-CoV-2 positive specimens. Generating a correlation plot between the observed bacterial taxa and SARS-CoV-2 Ct values will provide a rationale for proposing a role of the specific bacteria during SARS-CoV-2 infection. Furthermore, an additional correlation between clinical score and bacteria should be presented.

Response:
Thank you for the feedback. In response, we generated a Pearson correlation plot between the 50 most abundant bacterial taxa and SARS-CoV-2 Ct values but observed no correlation between these variables. The enrichment reported in the paper was determined by comparing SARS-CoV-2-positive samples with SARS-CoV-2-negative samples. It is possible that viral load does not play a role in the observed enrichment, or that our data are not sufficient to address this question fully.
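The correlation analysis described in this response can be illustrated with a minimal sketch. It is not the authors' code: the function names (`pearson_r`, `correlate_taxa_with_ct`), the taxon labels, and all abundance and Ct values below are hypothetical and chosen only to show how a per-taxon Pearson coefficient against Ct could be computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def correlate_taxa_with_ct(abundance_by_taxon, ct_values):
    """Return {taxon: Pearson r between that taxon's abundances and Ct}."""
    return {
        taxon: pearson_r(values, ct_values)
        for taxon, values in abundance_by_taxon.items()
    }


# Hypothetical data: one Ct value and one relative abundance per sample.
ct_values = [18.0, 22.0, 26.0, 30.0]
abundance_by_taxon = {
    "taxon_A": [0.40, 0.30, 0.20, 0.10],  # falls linearly as Ct rises
    "taxon_B": [0.05, 0.20, 0.05, 0.20],  # no clear monotonic trend
}

correlations = correlate_taxa_with_ct(abundance_by_taxon, ct_values)
for taxon, r in sorted(correlations.items()):
    print(f"{taxon}: r = {r:.3f}")
```

Because a lower Ct indicates a higher viral load, a taxon that tracks viral load would show a negative coefficient here. With real data, a significance test and a correction for testing 50 taxa would also be needed; neither is shown in this sketch.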
2. Shotgun metagenomics was performed on DNA extracted from samples. Most respiratory viruses that impact global health are RNA viruses and would not be assessed by the current pipeline. This downfall of the current assay should be discussed and improved upon. The presence of influenza virus, a negative sense RNA virus, in Figure 4 should be explained.

Response:
We extended our discussion of applying the hybrid capture-based assay to other RNA and DNA viruses on P. 20, including: "Future work includes the extension of this hybrid capture NGS-based assay to characterize other viral and bacterial genomes, total nucleic acid extraction to define the entire microbiome and virome, and the collection of clinical metadata to define risk stratification and disease pathogenesis in relation to SARS-CoV-2 genetic variants."
Regarding influenza, following your feedback and our subsequent analysis, we determined that the detection of RNA viruses (such as influenza A) was accidental. We investigated this further by using BLAST to map against influenza and other viruses, and the detections proved to be false positives. We removed all RNA viruses from the manuscript text and updated Figure 4 to remove the viral information.
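One way such a BLAST-based check could work is sketched below. The response does not describe the actual BLAST settings or workflow, so everything here is an assumption: the sketch parses the standard 12-column tabular BLAST output (`-outfmt 6`), keeps the best hit per read by bit score, and confirms a read only if its best hit is to a target reference and passes identity and length cutoffs. The read and subject identifiers, the helper names, and the thresholds (90% identity, 100 aligned bases) are all hypothetical.

```python
from collections import namedtuple

# Subset of the standard 12-column tabular BLAST output (-outfmt 6):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
BlastHit = namedtuple("BlastHit", "query subject pident length evalue bitscore")


def parse_blast_tabular(lines):
    """Parse tab-separated BLAST hits, skipping blank and comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(
            BlastHit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11]))
        )
    return hits


def best_hit_per_query(hits):
    """Keep the highest-bit-score hit for each query read."""
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def confirmed_queries(hits, target_subjects, min_pident=90.0, min_length=100):
    """Reads whose best hit is a target subject and passes both cutoffs.

    The cutoffs are illustrative assumptions, not values from the manuscript.
    """
    confirmed = set()
    for query, hit in best_hit_per_query(hits).items():
        if (
            hit.subject in target_subjects
            and hit.pident >= min_pident
            and hit.length >= min_length
        ):
            confirmed.add(query)
    return confirmed


# Hypothetical hits: read1 aligns weakly to an influenza reference but much
# better to a human sequence, so it would be treated as a false positive.
example_lines = [
    "read1\tflu_ref\t82.0\t60\t10\t1\t1\t60\t100\t159\t1e-5\t50.0",
    "read1\thuman_ref\t99.0\t150\t1\t0\t1\t150\t500\t649\t1e-70\t270.0",
    "read2\tflu_ref\t98.5\t140\t2\t0\t1\t140\t200\t339\t1e-60\t250.0",
]

hits = parse_blast_tabular(example_lines)
confirmed = confirmed_queries(hits, target_subjects={"flu_ref"})
print(sorted(confirmed))
```

In this toy input only `read2` is confirmed. A read like `read1`, initially assigned to influenza by a classifier but aligning best elsewhere, is the kind of accidental detection the response describes.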
3. Several studies have demonstrated the potential of animal reservoirs of SARS-CoV-2 such as cats, dogs, mink among others. Surveillance of animal reservoirs using the presented assay will not only add to our understanding of SARS-CoV-2 adaptation during infection of animal hosts but could potentially identify viruses similar to SARS-CoV-2 with pandemic potential.

Response:
We agree this context is important. We added a section to Prospective (Supplementary Materials) explaining the key points of using this technology for surveillance of animal reservoirs: "NGS provides a valuable tool for the detection of emerging viruses in domestic animals and wildlife, and generates critical data that is needed to characterize the potential for a virus to be pathogenic in humans. Several studies have demonstrated the potential of animal reservoirs of SARS-CoV-2 such as cats, dogs, bats, minks among others (6). Surveillance of animal reservoirs using the presented assay will not only add to our understanding of SARS-CoV-2 adaptation during infection of animal hosts but could potentially identify viruses similar to SARS-CoV-2 with pandemic potential."
4. Seasonal respiratory viruses have a major yearly burden on human health. Adapting the current pipeline for surveillance of Influenza viruses should be discussed.

Response:
We extended the discussion with additional focus on using hybrid capture-based NGS assays to detect other respiratory pathogens, including influenza: "Future work includes the extension of this hybrid capture NGS-based assay to characterize other viral and bacterial genomes, total nucleic acid extraction to define the entire microbiome and virome, and the collection of clinical metadata to define risk stratification and disease pathogenesis in relation to SARS-CoV-2 genetic variants. The SARS-CoV-2 pandemic has shown the importance and utility of genomic variant surveillance as a tool to understand viral dynamics and evolution to guide public health strategies. A hybrid capture approach is not only applicable to SARS-CoV-2 but also for other respiratory pathogens such as influenza and future outbreaks."
5. In Figure 3D, the dot color used to represent "TEST Sample" and "19A" are very similar and should be changed to make them clearly distinguished.