The LDLR, APOB, and PCSK9 Variants of Index Patients with Familial Hypercholesterolemia in Russia

Familial hypercholesterolemia (FH) is a common autosomal codominant disorder, characterized by elevated low-density lipoprotein cholesterol levels causing premature atherosclerotic cardiovascular disease. About 2900 variants of LDLR, APOB, and PCSK9 genes potentially associated with FH have been described earlier. Nevertheless, the genetics of FH in a Russian population is poorly understood. The aim of this study is to present data on the spectrum of LDLR, APOB, and PCSK9 gene variants in a cohort of 595 index Russian patients with FH, as well as an additional systematic analysis of the literature for the period of 1995–2020 on LDLR, APOB and PCSK9 gene variants described in Russian patients with FH. We used targeted and whole genome sequencing to search for variants. Accordingly, when combining our novel data and the data of a systematic literature review, we described 224 variants: 187 variants in LDLR, 14 variants in APOB, and 23 variants in PCSK9. A significant proportion of variants, 81 of 224 (36.1%), were not described earlier in FH patients in other populations and may be specific for Russia. Thus, this study significantly supplements knowledge about the spectrum of variants causing FH in Russia and may contribute to a wider implementation of genetic diagnostics in FH patients in Russia.


Introduction
Familial hypercholesterolemia (FH) is a common autosomal codominant disorder, characterized by elevated low-density lipoprotein (LDL) cholesterol levels causing premature atherosclerotic cardiovascular disease [1]. In two meta-analyses of 2020, similar results were obtained on the prevalence of heterozygous FH (HeFH) in the general population: one USA). The panel flanked exonic and adjacent intronic sequences of 25 genes (UTR + CDS + 100 bp padding). VCF files were generated from BAM files on a Torrent Server (Thermo Fisher Scientific, Waltham, MA, USA) with default parameters. VCF files were annotated using Ion Reporter (Thermo Fisher Scientific, Waltham, MA, USA) with Annotate Variants analysis tool. For Nextseq 550, the library preparation was performed using the SeqCap EZ Prime Choice Library kit (Roche, Basel, Switzerland). Two Roche panels were used, consisting of 24 (CDS + 25 bp padding) and 244 (CDS + 25 bp padding) genes. All three panels included the LDLR, APOB and PCSK9 genes. All stages of sequencing were carried out according to the manufacturers' protocols. Reads were aligned to the reference genome (GRCh37). Sequencing analysis resulted in fastq files. Data processing was performed with BWA, Picard, bcftools, GATK3 and generally followed the GATK best practices for variant calling. We applied standard GATK hard filters for single nucleotide substitutions (MQ, QD, FS, SOR, MQRankSum, QUAL, ReadPosRankSum) and for short insertions and deletions (QD, FS, QUAL, ReadPosRankSum). Single nucleotide variants and short indels were annotated with ANNOVAR.

Whole Genome Sequencing and Bioinformatic Analysis
DNA was extracted from whole blood sample using QIAamp ® DNA Mini Kit (Qiagen, Hilden, Germany). A WGS library was prepared using Nextera DNA Flex kit (Illumina, San Diego, CA, USA) according to manufacturer instructions. Paired-end sequencing (150 bp) was performed to mean sequencing coverage of 30× or more. Reads were aligned to the reference genome (GRCh38) and small variants were called using Dragen Bio-IT platform (Illumina, San Diego, CA, USA) and joint-called with GLnexus [12].
Structural variant (SV) calling was performed with smoove software [13]. Annotation was performed using an Ensembl Variant Effect Predictor (VEP) [14]. All variants were visually inspected in an Integrative Genomics Viewer (IGV) [15] and breakpoint regions were investigated with PCR and Sanger sequencing. Mobile elements (ME) SVA, LINE1 and Alu were called using MELT software [16] and annotated with VEP [14]. Images were prepared using the R programming language. For Figure 1 a trackViewer package was used [17].

Clinical Interpretation
The following canonical transcripts were used in this work: NM_000527.5 (LDLR), NM_000384.3 (APOB), and NM_174936.4 (PCSK9). For clinical interpretation, short genetic variants with overall frequencies for European (non-Finnish) in the gnomAD database of <0.5%, or missing in the gnomAD, were selected. SV-only variants with frequencies of <0.5% for European (non-Finnish) were left for evaluation. No ME insertions were found for LDLR, APOB or PSCK9. Evaluation of the pathogenicity of the variants was carried out in accordance with the recommendations of the American College of Medical Genetics and Genomics (ACMG) with modifications [18]. The following types of variants are reported in the article: pathogenic (P), likely pathogenic (LP) and variant of unknown significance (VUS). All variants were analyzed for their presence in the databases (LOVD, ClinVar and HGMD) [5,19]. , and APOB genes, specific for the Russian population. For the LDLR gene, due to the large quantity, only 30 novel variants found in this study are shown (with the exception of four large structural variants presented in Figure 2). Number of index patient is indicated in the circle (0 is for variants found in other studies), color indicates clinical interpretation: red, orange and yellow for pathogenic (P), likely pathogenic (LP) and variant of uncertain significance (VUS), respectively. Coordinates are given in hg38 assembly.

Figure 2.
Exonic structure of the native LDLR gene and its large structural variants found in this study. Exon border shape (flat and right or left pointing) shows the phase of the reading frame (+0; +1; +2); if borders don't match, a frame shift occurs (deletions exon 9-10 and 16-17). NMD marks a variant that likely leads to the nonsense-mediated decay.

Sanger Sequencing
The validation of NGS results was done by Sanger sequencing. PCRs were performed in 20 µL of a mixture containing 0.2 mM of each nucleotide, 1× PCR buffer, 20 ng of the DNA, 10 ng of each primer, 2.5 U of DNA polymerase. Amplification was performed on a GeneAmp PCR System 9700 thermocycler (Thermo Fisher Scientific, Waltham, MA, USA) with the following parameters: 95 • C-300 s; 30 cycles: 95 • C-30 s, 62 • C-30 s, 72 • C-30 s; 72 • C-600 s. Before the Sanger reaction, the obtained amplicons were purified using ExoSAP-IT (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's protocol. The nucleotide sequence of PCR products was determined using the ABI PRISM ® BigDye™ Terminator reagent kit v. 3.1 followed by analysis of the reaction products on an automated DNA sequencer Applied Biosystem 3500 DNA Analyzer (Thermo Fisher Scientific, Waltham, MA, USA).

Genetic Test Results
In our study we performed genetic testing of 595 unrelated patients with FH, of which six patients demonstrated the phenotype of HoFH and the rest had clinical features of HeFH. Target sequencing was performed for 401 patients and whole genome sequencing was performed for 405 patients (both methods were performed for 211 patients). In 405 WGS patients we called SNPs, short indels, long SVs and ME insertions.
We identified 122 different potentially causative variants in LDLR, 13 variants in APOB, and 21 variants in PCSK9 in 294 unrelated patients (Figures 1 and 2, Tables A1-A3). No potentially causative variants were found in 301 of 595 patients (50.6%). Out of these 294 patients, one patient was a true homozygote, four compound heterozygotes with two LDLR variants on different chromosomes (in trans), one compound heterozygote with two LDLR variants on the same chromosome (in cis), two compound heterozygotes with two LDLR variants of unknown mutual arrangement of alleles, six double heterozygotes (harboring two variants in two different genes) and one patient with three variants in three genes (Table A4), the rest were simple heterozygotes. A total of 34 variants in LDLR, six variants in APOB and six variants in PCSK9, were found in this study for the first time. Most of these variants were unique but some LDLR variants occurred in several unrelated patients: p.Cys68Phe, p.Pro196Arg, p.Cys318Trp, p.Tyr375Asp and p.Ile566Phe. Of 35 variants previously described in the literature [6][7][8][9][10][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39] only for the Russian population, six variants were also found in this study. Most of these variants were also unique, except for variant LDLR-p.Cys160Gly, that was found in six unrelated patients. Of all variants (the percentage of all identified potentially causative alleles (310 alleles found in this study)) the most common were: LDLR-p.Gly592Glu-9.4%, LDLR-p.Leu401His-9%, APOB-p.Arg3527Gln-7.4%, LDLR-p.Cys329Tyr-2.6%, LDLR-p.Cys160Gly-1.9%. Most of the variants described above were SNPs and short indels. Only five large SVs were found in this study and all of them in LDLR gene ( Figure 2). Four novel deletions were found and a tandem duplication previously described in a patient of Czech origin (ClinVar ID: 251140). No ME insertions were found in any of the studied genes.

Description of All Variants in Russia
In total, when combining our data (156 LDLR, APOB and PCSK9 variants) and the data of the systematic review (91 LDLR, APOB and PCSK9 variants), we described 224 variants: 187 LDLR variants, 14 APOB variants, and 23 PCSK9 variants (Tables A1-A3). A significant proportion of variants-36.1% (67 LDLR variants, six APOB variants and eight PCSK9 variants)-was not described in FH patients in other populations and may be specific for Russia.
In accordance with the criteria of pathogenicity, 38 LDLR variants were classified as pathogenic (P), 53 as likely pathogenic (LP) and 95 as variant of unknown significance (VUS). In the APOB gene there were four LP and 10 VUS, and in the PCSK9 gene four LP and 19 VUS (Table 1). 14 (0/4/10) 6 (0/1/5) 6(0/1/5) Novel: variants found in this study for the first time. Possibly unique for the Russian population: variants found in this study and previously described only for the Russian population. (*)-for one variant it was impossible to determine the category of pathogenicity. However, it was earlier described in the literature as pathogenic.

Discussion
This study was based on the largest number of participants of any genetic FH study in Russia to date. Including collected literature data, this study reported 224 variants found in the Russian population, either novel or reported before, with 81 variants described only in Russian FH patients. These data on the spectrum of the LDLR, APOB and PCSK9 variants can be useful for clinical interpretation when carrying out a genetic diagnosis of FH in Russia. It also improved knowledge about the genetics of FH in general. Thus, according the results of this study, Russia is ranked fourth among countries with the largest number of variants described in FH patients, after the United Kingdom, the Netherlands and Italy [18]. In our study, we did not carry out a functional analysis of the identified variants and used ACMG recommendations to assess their pathogenicity. About half of the variants described here were assigned a category of uncertain significance and, possibly, in the future with the advent of new data, their causality may be revised. It would also be desirable to assess the clinical significance of the combined effect of two or more variants identified in patients with HeFH (Table A4).
The WGS-based SV analysis was performed for 405 patients for whom no relevant variants were found by targeted sequencing. The fact that no large SVs were found either in PCSK9 or in APOB may be explained by their gain-of-function pathogenicity model. Taking into account the literature data, nine large rearrangements in LDLR were described for the Russian patients earlier and their proportion of the total number of unique variants (n = 187) of the LDLR gene was 4.8%, which is slightly less than the share of large LDLR rearrangements in the ClinVar database (6.1%) [5]. The presence of large deletions, encompassing exonic LDLR regions, suggests that multiplex ligation-dependent probe amplification could be a useful method in genetic confirmation of FH.

Conclusions
This study significantly supplements knowledge about the spectrum of variants causing FH in Russia and may contribute to a wider implementation of genetic diagnostics in Russian FH patients. Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments:
The authors are grateful to the patients and their family for their continuous contributions and support of our research.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.     1 Only for variants found in this study a number of index patients is given. Variants from systematic review are labelled with "0". 2 1-described only in this study, 2-described in this study and in other studies in Russia, 3-described in this study, in other studies in Russia and other countries, 4-described in this study and other countries, 5-did not occur in this study, described in other studies in Russia, 6-did not occur in this study, described in other studies in Russia and other countries. * No data on coding sequence alteration in reference. c.*415G > A VUCS [32] 1 Only for variants found in this study a number of index patients is given. Variants from systematic review are labelled with "0".