Clonal hematopoiesis is associated with protection from Alzheimer's disease

Clonal hematopoiesis of indeterminate potential (CHIP) is a pre-malignant expansion of mutated blood stem cells that also associates with non-hematological disorders. Here, we tested whether CHIP was associated with Alzheimer's disease (AD). Surprisingly, we found that CHIP carriers had reduced risk of AD dementia or AD neuropathologic features in multiple cohorts. The same mutations found in blood were also detected in the microglia-enriched fraction of brain in 7 out of 8 CHIP carriers. Single-cell chromatin accessibility profiling of brain-derived nuclei in two CHIP carriers revealed that the mutated cells were indistinguishable from microglia and comprised between 42-77% of the total microglial pool. These results suggest a role for mutant, marrow-derived cells in attenuating risk of AD, possibly by supplementing a failing microglial system during aging.

1A, Fig. S1). We obtained similar results using Cox proportional hazards regression models or when including family as a clustered variable for FHS (Table S4).
We next sought confirmation of this surprising inverse association in an independent cohort, the Alzheimer's Disease Sequencing Project (ADSP), a case-control study for AD with whole exome sequencing (WES) data from brain or blood-derived DNA (13). APOE genotype is the strongest genetic risk factor for AD (14), with APOE 2 alleles conferring protection from disease and APOE 4 alleles conferring increased risk, as compared to those with APOE 33 (Fig. S1). The sample selection strategy for ADSP resulted in cases and controls that were not well-matched for age at WES blood draw for those carrying APOE 2 or 4 alleles (see Materials and Methods). Given CHIP's strong association with age, this selection bias presented a major source of confounding that precluded the analysis of carriers of these alleles. However, APOE 33 AD dementia cases and controls were well matched for age (Table S5), permitting us to test for an association with CHIP in this set. After excluding those with missing information on age at blood draw, a total of 1,446 controls and 1,104 AD dementia cases with blood-derived whole exome sequencing were available for this analysis (Table S5). In this set, there were no overlapping participants between ADSP and TOPMed. CHIP variants were identified in ADSP using an approach previously described (6) and the prevalence of CHIP was appropriate for the age of the cohort (Table S5). The sequencing depth in ADSP was higher than for TOPMed, which resulted in greater sensitivity to detect smaller clones (Fig. S2). Clone size, which is approximated by the variant allele fraction (VAF), has previously been shown to be an important predictor of risk for blood cancer (4,15) and cardiovascular outcomes (6,7). In order to directly compare outcomes in ADSP to TOPMed, we limited the definition of CHIP carriers to those with VAF>0.08 in this analysis, a cutoff chosen because it resulted in a VAF distribution that was nearly identical to TOPMed (Fig. S2).
medRxiv preprint doi: https://doi.org/10.1101/2021.12.10.21267552; this version posted December 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
We found that CHIP with VAF>0.08 was associated with reduced risk of AD dementia in ADSP (odds ratio [OR] 0.66, p=5.5 x 10^-4) (Fig. 1B, Table S6). In contrast, having VAF≤0.08 had no association with AD dementia (OR 1.25, p=0.23) (Table S7), suggestive of a dose-response relationship between the size of the mutant clone and protection from AD dementia. Indeed, higher VAF was also significantly associated with protection from AD dementia when modeled as a continuous variable (Table S7). A meta-analysis of ADSP, CHS, and FHS showed that CHIP carriers had a significant reduction in risk of AD (OR 0.64, p=3.0 x 10^-5) (Fig. 1C). In sum, our human genetic association analyses demonstrate that CHIP is associated with protection from AD dementia in multiple cohorts, and that the degree of protection is proportional to the size of the mutant clone.
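As an illustration of how a fixed-effects meta-analysis pools cohort-level odds ratios, the following is a minimal sketch of generic inverse-variance weighting on the log-odds scale. It is not the authors' analysis code: the function name `fixed_effects_meta` is hypothetical, the standard errors are assumed to be recoverable from 95% confidence intervals, and the three input tuples are placeholder numbers rather than values from any table in this study.

```python
import math

Z95 = 1.96  # normal quantile assumed for 95% confidence intervals


def fixed_effects_meta(estimates):
    """Pool (odds_ratio, ci_low, ci_high) tuples by inverse-variance weighting.

    Returns (pooled_or, ci_low, ci_high, two_sided_p).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        log_or = math.log(odds_ratio)
        # Standard error recovered from the width of the CI on the log scale.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * pooled_se),
        math.exp(pooled + Z95 * pooled_se),
        p_value,
    )


if __name__ == "__main__":
    # Placeholder cohort estimates (OR, CI low, CI high); NOT values from this study.
    cohorts = [(0.70, 0.50, 0.98), (0.60, 0.40, 0.90), (0.65, 0.50, 0.85)]
    pooled_or, low, high, p = fixed_effects_meta(cohorts)
    print(f"pooled OR {pooled_or:.2f} (CI95 {low:.2f}-{high:.2f}), p={p:.2g}")
```

Because each cohort is weighted by the inverse of its variance, larger or more precise cohorts dominate the pooled estimate, and the pooled odds ratio always lies between the smallest and largest input odds ratios.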
Mendelian randomization is a form of causal inference in which genetic variants known to influence the risk of a particular exposure or trait (in this case CHIP) are assessed for an association with a clinical outcome (in this case AD). We selected three independent common variants that reached genome-wide significance in a CHIP genetic association study as the instrumental variables for CHIP exposure (rs2853677, rs7726159, rs58322641) (7).
Summary statistics from a large AD genome-wide association study (GWAS) and GWAS-by-proxy meta-analysis (16) were then used to perform Mendelian randomization using the inverse-variance weighted method (17). Here, an increase in the genetic risk of CHIP was associated with reduced odds of AD (OR 0.92 per 1 log-odds increase in risk of CHIP, p=5.6 x 10^-3, Fig. S3).
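To make the inverse-variance weighted (IVW) estimator concrete, the sketch below implements the standard fixed-effect IVW formula from summary statistics: each instrument contributes the ratio of its outcome effect to its exposure effect, weighted by the precision of the outcome effect. This is a generic illustration, not the exact procedure of reference (17). The function name `ivw_mendelian_randomization` is hypothetical, and the per-variant effect sizes are placeholders, not the summary statistics for rs2853677, rs7726159, or rs58322641.

```python
import math


def ivw_mendelian_randomization(instruments):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    instruments: iterable of (beta_exposure, beta_outcome, se_outcome), where
    the betas are log-odds effects of the same allele on exposure and outcome.
    Returns (causal_beta, standard_error, two_sided_p).
    """
    numerator = 0.0
    denominator = 0.0
    for beta_x, beta_y, se_y in instruments:
        numerator += beta_x * beta_y / se_y ** 2
        denominator += beta_x ** 2 / se_y ** 2
    causal_beta = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    z = causal_beta / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return causal_beta, standard_error, p_value


if __name__ == "__main__":
    # Placeholder summary statistics for three variants; NOT values from this study.
    variants = [(0.20, -0.020, 0.010), (0.30, -0.030, 0.010), (0.10, -0.010, 0.020)]
    beta, se, p = ivw_mendelian_randomization(variants)
    # exp(beta) is the odds ratio for the outcome per 1 log-odds increase in exposure risk.
    print(f"OR per log-odds of exposure: {math.exp(beta):.2f}, p={p:.2g}")
```

An odds ratio below 1 from this estimator corresponds to the direction reported here, in which higher genetic liability to the exposure goes with lower odds of the outcome.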
The hallmark neuropathological features of AD, regional accumulation of beta-amyloid plaques and tau neurofibrillary tangles, can also be found in some people without a clinical diagnosis of dementia. A neuritic plaque density score developed by the Consortium to Establish a Registry for AD (CERAD) (18) and Braak stage for neurofibrillary tangle distribution (19) are commonly used to assess for these changes at brain autopsy, with an increasing score indicative of more extensive accumulation of pathologic features. A subset of participants in ADSP had brain autopsy performed after death, which allowed us to test whether CHIP was associated with AD-related neuropathologic change (ADNC) in those without dementia (Table S8). Here, the presence of CHIP was associated with having a lower CERAD neuritic plaque score (OR 0.50, p=3.2 x 10^-3) and Braak stage (OR 0.56, p=0.015) using ordinal logistic regression after adjusting for age, sex and APOE genotype (Fig. 1D-E, Table S8). This suggests that CHIP associates with protection from the burden of neuritic plaque and neurofibrillary tangle formation even in the absence of clinical dementia.
We next asked whether the protection from AD seen in CHIP carriers was influenced by APOE genotype. In those age 70 or older, APOE genotype was strongly associated with AD dementia risk in those without CHIP (p=5.8 x 10^-5 by log-rank test). This effect was not seen in CHIP carriers of the same age (p=0.70 by log-rank test), though the smaller sample size in this group may have limited our power to find an association (Fig. 2A). In competing risks regression models stratified by APOE genotype, there was a similar magnitude of AD dementia risk reduction in CHIP carriers who were APOE 33 or who carried an APOE 4 allele, but not in those who had the protective APOE 22 or APOE 23 genotypes (Fig. 2B, Table S9). These findings suggest that the mechanism by which CHIP associates with reduced risk of AD dementia might be redundant with the protection conferred through APOE 2.
We also assessed whether the risk of AD dementia varied based on the specific mutated gene. Of the most commonly mutated genes in CHIP, all were associated with protection from AD dementia to a similar degree (Fig. 2C, Table S10).
We wondered whether cells bearing CHIP-associated mutations could be found in brain, a finding that would strengthen the likelihood of a causal association between CHIP and AD risk. We obtained brain DNA-derived WES data from 1,776 persons in ADSP, of whom 1,462 had AD dementia (82.3%), and assessed for the presence of CHIP-associated variants in these samples. Similar to a prior study (20), we found mutations consistent with CHIP in 17 brain samples, including 15 with AD dementia (Fig. 3A, Table S11). Paired blood DNA was not available for the brain exome samples, so we could not determine whether the mutations we identified were indicative of blood-derived cells in brain.
Hematopoietic cells, nearly all of which are microglia, comprise ~1-10% of the total cells in the brain across brain regions (21) and the limit of detection for clonal hematopoiesis by WES at 80X sequencing depth is ~4% of cells harboring a mutation in a sample (4). Therefore, it would be difficult to detect clonal hematopoiesis mutations from unfractionated brain for the vast majority of CHIP carriers using WES. We hypothesized that most CHIP carriers would have the mutations detectable in the brain if examined using more sensitive methods. To test this hypothesis, we obtained tissue samples from the occipital lobe, and in some cases putamen or cerebellum, of 8 donors from the Adult Changes in Thought (ACT) cohort who were known to have CHIP from blood exome sequencing, as well as 1 person without CHIP (Table S12). All persons were in their 80s and 7 out of 9 were without dementia and had no/low ADNC at the time of death. The 8 CHIP carriers had mutations in DNMT3A, TET2, ASXL1, SF3B1, and GNB1 with the highest frequency in DNMT3A (4 out of 8) and TET2 (3 out of 8) (Table S12), which is representative of the relative proportion of these mutations in the general population (7).
In addition, 2 out of the 8 harbored two different CHIP mutations. To determine whether bone marrow-derived cells carrying CHIP mutations were present in the brains of these individuals, we digested the frozen brain tissue and isolated intact nuclei, from which we extracted DNA for amplicon sequencing. We detected the same mutations that were present in blood in 6 out of 8 unfractionated brains, with VAF ranging from 0.004 to 0.02 (Fig. S4, Table S12). Of note, the two donors in whom CHIP variants were not identified in unfractionated brain (ACT1, ACT8) both had AD dementia during life. In contrast, all 6 donors in whom CHIP variants were found in unfractionated brain were without dementia and free of ADNC.
The CHIP variants detected in whole brain DNA might have originated from residual circulating hematopoietic cells in the vasculature, such as granulocytes or lymphocytes. Alternatively, myeloid cells such as microglia in the brain parenchyma could have been the source of the mutations, although microglia are believed to have little contribution from HSC-derived cells in adulthood (22). To distinguish between these possibilities, we conceived a strategy to enrich for these cells from frozen brain tissue. Since the tissue was not viably cryopreserved, isolation of cells based on expression of membrane antigens was not possible. Instead, we used antibodies to nuclear transcription factors to enrich for mononuclear phagocytes, such as macrophages and microglia (Fig. 3B). We stained nuclei for the neuronal-specific transcription factor NeuN (RBFOX3) and c-Maf, a transcription factor expressed in phagocytes as well as some neurons and non-hematopoietic glial cells. We then sorted 4 populations based on the presence or absence of these markers (Fig. 3B). The CHIP somatic variants were found in the NeuN- c-Maf+ population in 7 out of 8 brains, with a VAF that ranged from 0.02 to 0.28 (representing 4% to 56% of NeuN- c-Maf+ nuclei). In contrast, CHIP somatic variants were not detected in the NeuN+ c-Maf- neuronal population and were absent or at low levels in the other two populations (Fig. 3C, Fig. S4). The VAF was lower in the NeuN- c-Maf+ nuclei from cerebellum compared to occipital cortex in the three samples where tissue from both regions was available.
In sample ACT8, CHIP variants were robustly detected in the NeuN- c-Maf+ population from occipital cortex but not putamen. These results indicate that there is a substantial contribution to the brain mononuclear phagocytic pool from circulating mutated cells, and that there is also regional heterogeneity in the frequency of infiltrating CHIP+ cells in brain.
Our flow cytometric analysis indicated that a prominent myeloid population bearing CHIP mutations was present in the brains of most CHIP carriers. However, it was unknown if the mutated cells were similar to endogenous brain microglia or were instead a distinct myeloid population not normally found in brain, such as monocyte-derived macrophages or dendritic cells. To better understand the phenotype of these mutated cells, we performed single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) on brain samples from three ACT participants: a brain donor without CHIP (ACT9), one with TET2-mutant CHIP (ACT6), and one with DNMT3A-mutant CHIP (ACT2) (Table S12). For the ACT6 donor, we analyzed tissue from cerebellum and putamen, whereas occipital cortex was assessed for the other two donors. scATAC-seq was performed on unsorted nuclei, as well as sorted NeuN- c-Maf+ nuclei for each sample. After aligning and filtering the scATAC-seq reads, our samples had a median of 12,287 fragments per cell and a median enrichment of fragments in transcription start sites of 9.31, indicating that we recovered high quality scATAC-seq libraries from these archived samples (Fig. S5). In total, we recovered high quality scATAC-seq profiles for 38,206 cells. We then aggregated our data with scATAC-seq data from 10 samples (an additional 72,984 cells) from a comprehensive scATAC-seq characterization of the adult human brain (Corces 2020) (21). After clustering and dimensionality reduction, we identified 18 clusters encompassing the major brain cell types (Fig. 4A-B), including one cluster that contained previously described microglia as well as myeloid cells from each of our samples (cluster 9) (Fig. 4B, Fig. S6, Table S13). No other hematopoietic cell types were observed in the human brain samples.
For cells within cluster 9 grouped by sample, inspecting pseudo-bulk ATAC-seq tracks revealed accessible chromatin at the microglia marker genes TMEM119, P2RY12, SALL1, CSF1R, and SIGLEC8 in each of our samples, which was visually similar to the reference microglia. Cells in this cluster also had high accessibility at the AD-associated genes APOE, TREM2, TYROBP, AXL, and MERTK (Fig. 4C, Fig. S6). As an additional control, we examined reference scATAC-seq profiles of blood monocytes and classical dendritic cells (cDCs) (23) and found that these cells had little to no accessibility at the aforementioned microglia genes. Furthermore, the Cluster 9 tracks in all brain samples had low accessibility at ANPEP (CD13), CXCL2, and ITGAE (CD103), in contrast to monocytes or cDCs (Fig. 4C, Fig. S6). Finally, we quantified the genome-wide similarity of microglia in each pair of samples by considering the number of differential peaks between each pair for cells within cluster 9. The putamen-derived sample showed modest differences from the occipital cortex-derived samples but otherwise essentially no differences were observed, and all comparisons were within the range of variation observed when comparing pairs of the Corces 2020 samples (Fig. S6). These results indicate that the cells in cluster 9 are indistinguishable from microglia and unlikely to contain contaminating monocytes or dendritic cells.
Having established that the only hematopoietic cell type present in these brains was microglia, we used the scATAC-seq data to evaluate the effectiveness of our flow cytometric method for enrichment of these cells. We compared the proportion of microglia in the sorted samples to that in the unsorted samples and observed that microglia were enriched 11- to 33-fold in the sorted samples from occipital cortex and cerebellum, but only modestly in the putamen sample (Fig. 4D). The percentage of microglia ranged from 7.2% to 25% in the sorted samples, indicating that there were still large numbers of contaminating non-microglial cells in the NeuN- c-Maf+ gate. This suggested that the true fraction of mutated microglia was substantially underestimated by our previous approach using amplicon sequencing of NeuN- c-Maf+ nuclei. To estimate the percentage of mutated microglia more accurately, we first assessed the VAF for the CHIP variant in each unsorted brain sample. Since these are heterozygous mutations, multiplying the VAF by 2 gives an estimate of the percentage of mutated cells in the sample. Since microglia were the only hematopoietic cell type present in these brains, we reasoned that we could divide the percentage of total mutated cells by the percentage of microglia in each unsorted sample to estimate the percentage of mutant microglia. Using this approach, we calculated that 43% of the microglia in ACT2 harbored the DNMT3A mutation, as compared to 28% of circulating cells in the blood. For the ACT6 donor, 77% and 42% of putamen and cerebellar microglia harbored the TET2 mutation, respectively, compared to 28% of circulating blood cells (Fig. 4E, Table S14). These results indicate that replacement of endogenous microglia by mutant, marrow-derived cells is widespread in the aging brain. The observation that the proportion of mutated cells is substantially greater in the microglial pool than in the blood also suggests that there is positive selection for the mutant cells in the brain microenvironment.
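The arithmetic described above can be sketched in a few lines. This is a minimal illustration under the same assumptions stated in the text (heterozygous mutations, and microglia as the only hematopoietic cell type in the sample). The function names are hypothetical, and the unsorted-sample inputs in the second example are placeholders rather than the measured values in Table S14.

```python
def mutated_cell_fraction(vaf):
    """Fraction of cells carrying a heterozygous mutation, given its VAF."""
    if not 0.0 <= vaf <= 0.5:
        raise ValueError("VAF of a heterozygous mutation should lie in [0, 0.5]")
    return 2.0 * vaf


def mutant_microglia_fraction(vaf_unsorted, microglia_fraction):
    """Estimate the fraction of microglia carrying the mutation.

    vaf_unsorted: VAF of the CHIP variant in unsorted brain nuclei.
    microglia_fraction: fraction of unsorted nuclei assigned to the
        microglia cluster (for example, from scATAC-seq clustering).
    Assumes every mutated cell in the sample is a microglial cell.
    """
    if not 0.0 < microglia_fraction <= 1.0:
        raise ValueError("microglia_fraction must be in (0, 1]")
    return mutated_cell_fraction(vaf_unsorted) / microglia_fraction


if __name__ == "__main__":
    # The text reports VAFs of 0.02 to 0.28 in sorted nuclei, i.e. 4% to 56% of nuclei.
    print(f"{mutated_cell_fraction(0.02):.0%} to {mutated_cell_fraction(0.28):.0%}")
    # Placeholder unsorted-sample inputs; NOT measurements from this study.
    print(f"{mutant_microglia_fraction(0.01, 0.05):.0%} of microglia mutated")
```

The estimate is sensitive to the denominator: because microglia make up only a small share of unsorted nuclei, a small error in the estimated microglia fraction produces a proportionally large change in the inferred share of mutant microglia.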
We show here that, unexpectedly, the presence of CHIP is associated with protection from AD dementia. This effect is seen in multiple cohorts, is not due to survival bias, is seen with several different mutated genes, and is strongest in carriers of APOE 33 or APOE 4 alleles. The degree of protection from AD dementia seen in CHIP carriers is greater than that conferred by carrying an APOE 2 allele, which is the most protective common inherited variant for AD (24). CHIP is also associated with lower levels of neuritic plaques and neurofibrillary tangles in those without dementia, indicating a possible modulating effect of CHIP on the underlying pathophysiology of AD. Consistent with this hypothesis, we also detect substantial infiltration of brain by marrow-derived mutant cells that adopt a microglia-like phenotype. We speculate that the mutations associated with CHIP confer on circulating precursor cells an enhanced ability to engraft in the brain, to differentiate into microglia once engrafted, and/or to clonally expand relative to unmutated cells in the brain microenvironment. These non-mutually exclusive possibilities could provide protection from AD by supplementing the phagocytic capacity of the endogenous microglial system during aging. Alternatively, or in addition, the mutations may alter the functionality of the engrafted myeloid cells in a manner that promotes clearance of pathologic beta-amyloid and/or tau. Understanding the interplay between CHIP and the aging brain may yield valuable information about the pathogenesis of AD and provide insights into slowing its progression.

A) Forest plot for risk of incident AD in CHIP carriers from Cardiovascular Health Study (CHS) and Framingham Heart Study (FHS) relative to non-carriers. Subdistribution hazard ratios (SHR), 95% confidence intervals (CI95) and Wald p-values were calculated for each covariate from competing risks regression models which included age at blood draw, sex, and APOE genotype as covariates. Results from CHS and FHS were then meta-analyzed using a fixed-effects model for the two cohorts (see Figure S1 for full regression results). B) Forest plot for risk of AD in CHIP carriers relative to non-carriers from the AD Sequencing Project (ADSP) with APOE 33 genotype. Odds ratio (OR), 95% confidence interval (CI95) and Wald p-value were calculated from a logistic regression model that also included age at time of blood draw and sex as covariates (see Table S6 for full regression results). C) Fixed-effects meta-analysis for risk of AD in CHIP carriers using logistic regression in ADSP, FHS, and CHS. D) Plot of odds ratio (OR) and 95% confidence interval (CI95) for increased CERAD neuritic plaque score in ADSP participants without a dementia diagnosis from an ordinal logistic regression model. The covariates included in the model are age at autopsy, APOE genotype, sex, and CHIP status. The p-values were calculated by comparing the t-statistic for each covariate against a standard normal distribution. Full regression results are in Table S8. E) Plot of odds ratio (OR) and 95% confidence interval (CI95) for increased Braak stage in ADSP participants without a dementia diagnosis from an ordinal logistic regression model. The covariates included in the model are age at autopsy, APOE genotype, sex, and CHIP status. The p-values were calculated by comparing the t-statistic for each covariate against a standard normal distribution. Full regression results are in Table S8.
Fig. 2. Associations of CHIP to AD by APOE genotype and mutated driver gene. A) Kaplan-Meier curve showing AD-free probability in CHIP non-carriers (left) and carriers (right), stratified by APOE genotype. Analysis was restricted to those older than 70 at time of blood draw. B) Forest plot for effect of CHIP on AD risk in participants from CHS and FHS stratified by APOE genotype. Participants were binned into those with neutral (APOE 33), low-risk (APOE 22 and 23), and high-risk (any APOE 4 allele) groups. Subdistribution hazard ratios (SHR), 95% confidence intervals (CI95) and Wald p-values were calculated for each covariate (age at time of blood draw for sequencing, sex, CHIP carrier status) from competing risks regression models, and results from FHS and CHS were then meta-analyzed using a fixed-effects model (see Table S9 for full regression results). C) Plot of odds ratios for effect of mutated CHIP gene on AD in participants from the TOPMed cohorts (CHS and FHS) and ADSP. Odds ratios (OR), 95% confidence intervals (CI95) and Wald p-values were calculated for each covariate (age at time of blood draw for sequencing, sex, cohort, and APOE genotype) from logistic regression models, and results from the TOPMed cohorts and ADSP were then meta-analyzed using a fixed-effects model (see Table S10 for full regression results).
Fig. 3. CHIP variants can be found in the microglia-enriched fraction of brain. A) Barplot of putative CHIP mutations identified from whole exome sequencing of brain DNA from 1,775 persons in ADSP. Full details on the variants identified are in Table S11. B) Schematic of experimental workflow. Autopsy samples from occipital cortex, cerebellum and putamen were digested to prepare single nuclei suspensions. Nuclei were then stained and sorted using antibodies to c-Maf (marker of myeloid cells) and NeuN (marker of neuronal cells), followed by amplicon sequencing for CHIP variants. C) Barplot of the variant allele fraction (VAF) of the CHIP variants from 8 donors (ACT1 to ACT8). For each sample, the VAF in the blood and in the brain c-Maf+ NeuN- population are shown. Occipital cortex was available for all 8 donors. A bar for cerebellum or putamen is shown if available, otherwise NA in the corresponding color designates lack of an available sample (purple for cerebellum and red for putamen). The CHIP mutations carried by each participant are reported in the box on the right of the barplot.
Fig. 4. scATAC-seq of brain samples from CHIP carriers reveals that the mutated cells are similar to microglia and comprise a large proportion of the microglial pool. A) scATAC-seq profiles of 111,190 cells from our dataset and the Corces 2020 adult human brain dataset. Each dot represents the scATAC-seq profile of one cell and is colored by its assigned cluster. B) scATAC-seq profiles of all cells, colored by sample of origin. Samples from Corces 2020 are aggregated and shown in grey. Sorted samples were from the c-Maf+ NeuN- gate. C) Pseudo-bulk tracks for selected gene loci. The top 5 tracks show scATAC-seq coverage of cells from the indicated sample (or aggregated Corces 2020 samples) within C9, the microglia cluster. The monocyte and classical dendritic cell (cDC) tracks are from the Satpathy 2019 hematopoiesis dataset. D) Fraction of cells in cluster C9 (microglia) for sorted versus unsorted brain samples. E) Proportion of microglia (mg) bearing a CHIP mutation in each sample, calculated by dividing 2 times the VAF of the CHIP mutation in each unsorted sample by the percentage of cells in cluster 9 for that sample. The brain regions are abbreviated as Ce for cerebellum, OC for occipital cortex, and P for putamen.