Integrated evaluation of telomerase activation and telomere maintenance across cancer cell lines

In cancer, telomere maintenance is critical for the development of replicative immortality. Using genome sequences from the Cancer Cell Line Encyclopedia and Genomics of Drug Sensitivity in Cancer Project, we calculated telomere content across 1299 cancer cell lines. We find that telomerase reverse transcriptase (TERT) expression correlates with telomere content in lung, central nervous system, and leukemia cell lines. Using CRISPR/Cas9 screening data, we show that lower telomeric content is associated with dependency of CST telomere maintenance genes. Increased dependencies of shelterin members are associated with wild-type TP53 status. Investigating the epigenetic regulation of TERT, we find widespread allele-specific expression in promoter-wildtype contexts. TERT promoter-mutant cell lines exhibit hypomethylation at PRC2-repressed regions, suggesting a cooperative global epigenetic state in the reactivation of telomerase. By incorporating telomere content with genomic features across comprehensively characterized cell lines, we provide further insights into the role of telomere regulation in cancer immortality.


25
Telomeres, repetitive nucleoprotein complexes located at chromosomal ends, are an important 26 component of genomic stability (De Lange, 2009). As protective chromosomal caps, telomeres 27 prevent potentially lethal end-fusion events (McClintock, 1941(McClintock, , 1942 and mis-processing of 28 chromosomal ends as damaged sites by the DNA repair machinery (De Lange, 2005; Verdun & 29 Karlseder, 2007). Due to factors such as incomplete DNA replication and oxidative stress, 30 telomeres gradually shorten with successive rounds of cell division (Olovnikov, 1973). If left 31 unchecked, telomere attrition eventually triggers growth arrest and senescence, and further 32 shortening can lead to acute chromosomal breakage and cell death. Telomere shortening 33 therefore acts as a major obstacle in the course of tumor development (Hackett & Greider, 34 2002), and inhibition of telomere maintenance offers still largely untapped opportunities for 35 targeted cancer therapies (Damm et al., 2001;Dikmen et al., 2005;Flynn et al., 2015). 36 Telomere shortening in embryonic development and in certain adult cell populations is 37 offset by telomerase (Greider & Blackburn, 1985), a ribonucleoprotein enzyme with a core 38 reverse transcriptase, TERT, that lengthens telomeres by catalyzing the addition of TTAGGG 39 nucleotide repeats from an inbuilt RNA template component, TERC (Feng et al., 1995;Yu et al., 40 1990). Although telomerase is transcriptionally silenced in the majority of somatic cells, 41 telomerase is reactivated in over 85% of all human cancers (N. W. Kim et al., 1994). 42 Reactivation of telomerase is associated with a diverse set of genomic alterations, the most 43 common of which include highly recurrent mutations in the TERT promoter (Horn et & Smith, 2015), comparisons between telomere 125 content and mutations in ATRX and DAXX yielded significant associations only between DAXX 126 alterations and GDSC WES telomere content (Supplementary Fig. 3a). We further repeated 127 association tests with TP53, VHL, and IDH1 as identified previously among TCGA samples 128 (Barthel et al., 2017), with which we confirm that truncating VHL mutations are associated with 129 reduced telomere lengths (Supplementary Fig. 3a) although this may be confounded by the 130 high occurrence of VHL mutations in kidney cell lines. Whereas we found relatively few 131 significant associations between telomere content and mutations, we note that we were limited 132 to an absolute estimate of telomere content as opposed to a relative measure of somatic 133 telomere lengthening, which requires a paired normal sample (Barthel et al., 2017). 134 Telomere content associates with CST complex dependencies 135 Having explored the association between telomere content and non-perturbative annotations, 136 we next considered whether variations in telomere content could confer or reduce selective 137 vulnerabilities to inactivation of certain genes. In particular, we hypothesized that telomere 138 content may be associated with vulnerabilities to reductions in the levels of telomere-regulating 139 proteins. To reveal such associations, we correlated our telomere content estimates with gene 140 inactivation sensitivities assessed via genome-wide CRISPR-Cas9 (Avana (Meyers et al.,141 2017)) and RNAi viability screens (Achilles RNAi (Tsherniak et al., 2017) and DRIVE (McDonald 142 et al., 2017)). Although we found no dependencies that displayed outlier associations with 143 telomere content in the Achilles RNAi screen (Supplementary Fig. 3b), we discovered that 144 sensitivity to Avana CRISPR-Cas9 knockouts of each of the three CST complex proteins was an 145 outlier association with telomere content estimates computed from both the GDSC WES and 146 CCLE WGS data (Fig. 2a,b, Supplementary Fig. 5b). Namely, increased sensitivities to 147 knockout of the CST complex components, which are key mediators of telomere capping and 148 elongation termination (Chen et al., 2012), were correlated with lower telomere content 149 ( Supplementary Fig. 5a). Although the CST complex was not assessed in the DRIVE 150 screening dataset, we also found TERF1, a key shelterin component, to be among the positive 151 correlated genes with telomere content in the DRIVE panel (Supplementary Table 2). 152 Using the CST complex as a seed set, we subsequently queried all dependencies under 153 the premise that associated gene dependencies reflect coordinated functions (Pan et al., 2018). 154 Within the Avana panel (n = 757-769), we found significant (FDR < 0.01) outlier associations 155 between the CST complex genes and genes encoding five additional telomere-associating 156 proteins (ACD, POT1, TERF1, TERF2, TINF2). These five additional proteins, together with 157 TERF2IP, comprise the shelterin complex, the protector and regulator of telomere length and 158 topology (De Lange, 2005). Interestingly, whereas the five other dependencies were positively 159 associated with telomere content, TERF2IP displayed a weak negative association 160 ( Supplementary Fig. 5c), suggesting that TERF2IP may play a distinct regulatory role in 161 shelterin function compared to the other members. To examine the dependency landscape of 162 the CST complex and these five other telomere-associated proteins, we computed a correlation 163 matrix involving these eight genes, clustering of which yielded two main subgroups: one 164 comprised of the CST complex members, and another of the five other genes (Fig. 2c). Despite 165 this separation, POT1 and TINF2 also displayed notable correlations with CST dependencies, 166 possibly serving as the primary mediators of previously-reported functional interactions between 167 the shelterin and CST complexes (Chen et al., 2012;Wan et al., 2009). 168 Whereas we found strong codependency relationships within this group of eight 169 telomere-associated proteins, we also noticed that certain shelterin members displayed notable 170 codependencies with p53 pathway members such as MDM2, ATM, and TP53 itself 171 ( Supplementary Fig. 6a,b). Because sensitivity to perturbation of the p53 pathway is highly 172 associated with TP53 mutations in cancer (McDonald et al., 2017), we asked if these 173 codependency relationships were also associated with hotspot mutations in TP53. In fact, TP53 174 was a significant (FDR < 0.001) outlier when a comprehensive set of hotspot and damaging 175 mutations was compared against sensitivity to ACD and TERF1 dependencies in the Avana 176 panel (Fig. 2d, Supplementary Fig. 6c) and against TERF1 and TINF2 dependencies in the 177 DRIVE panel (Fig. 2e, Supplementary Fig. 6d). These links between these gene 178 dependencies and TP53 mutation status reprise and extend previous reports of p53-dependent 179 DNA damage responses to TERF1 and TINF2 depletion (Pereboeva et al., 2016;Rosenfeld et 180 al., 2009). Taken together, we find that CST and shelterin dependencies are correlated with 181 each other, telomere content, and TP53 mutation status (Fig. 2f). 182 were found to be closely associated in the Avana dataset (Fig. 3a), with both genes being the 188 top codependency with the other. Although these dependencies were not present as top 189 correlates against telomere content, we reasoned that they may be associated with other 190 genomic markers. In fact, among all genes, PML, which has previously been implicated in the 191 organization and regulation of ATRX and DAXX, was the top gene expression correlate with 192 DAXX dependency (Fig. 3b). To further investigate the genomic markers of these 193 dependencies, we compared ATRX and DAXX dependencies to CCLE quantitative proteomics 194 measurements (Nusinow et al., 2020). We found an outlier association between reduced DAXX 195 dependency and increased protein levels of ZMYM3 (Fig. 3c), a chromatin-binding histone 196 deacetylase complex member involved in the DNA damage response (Leung et al., 2017). 197 Although ATRX and ZMYM3 are located on adjacent bands of the X chromosome, ATRX 198 protein levels were not significantly associated with DAXX dependencies. Interestingly, a 199 complementary association of ZMYM3 protein levels and all Avana gene dependencies not only 200 revealed DAXX as the top associate, but also a significant association with TERF1 dependency 201 ( Fig. 3d). Taken together, these results suggest additional modulators of the ATRX-DAXX axis  202 and connections between DAXX, ZMYM3, and TERF1 regulation. 203

Investigation of ATRX-DAXX dependencies
Patterns and mechanisms of telomerase expression 204 Having thoroughly characterized telomere content and its related dependencies across the 205 CCLE, we next focused on the regulation of TERT transcription. Across 1,019 samples 206 previously profiled with deep RNAseq, we found that hematopoietic cell lines (leukemias, 207 lymphomas, and myelomas) were associated with the greatest mean expression of TERT (P = 208 5.8×10 -26 , two-sided Mann-Whitney U test; we considered TERT promoter status for 503 cell lines. We found that only the C228T 228 (chr5:1,295,228 C>T) mutation was significantly (P = 2.8×10 -5 , two-sided Mann-Whitney U test) 229 associated with an increase in TERT expression (Supplementary Fig. 8b). Surprisingly, the 230 mean level of TERT expression in monoallelic contexts was only slightly lower than that of 231 biallelic contexts, with less than a 1.5-fold difference between the groups (P = 0.03, two-sided 232 Mann-Whitney U test). Given that cells with biallelic TERT expression do so with twice the 233 transcriptional source sites as those with monoallelic TERT expression, this reduced difference 234 may be a consequence of the effects of the TERT promoter mutation in producing particularly compared to a previous report using CCLE WGS data (Huang et al., 2015). Out of these 157 241 cell lines, 87 express TERT from a single allele. Moreover, of these 157 cell lines, 129 have a 242 sequenced promoter, with which we confirm that promoter mutations unanimously drive 243 monoallelic expression (Supplementary Fig. 7c). Our expanded set of cell lines also reveals 244 several new tissues of origin in which TERT is monoallelically expressed without a mutant 245 promoter, such as hematopoietic cell lines (Supplementary Fig. 7a,b). This high proportion of 246 TERT monoallelic expression then led us to ask whether there are genomic alterations aside 247 from promoter mutations that could lead to ASE. Under the assumption that such alterations 248 may also induce ASE in a larger region than a promoter mutation, we determined ASE status in 249 consequence of TERT promoter mutations. To address this hypothesis, we performed a 296 genome-wide search for CpG islands (CGIs) with significant differences in methylation levels in 297 TERTp mutant cell lines compared to TERTp wild-type ones. If TERT hypomethylation were a 298 downstream consequence of TERT promoter mutations, then we would expect TERT 299 hypomethylation to be an isolated event, and thus there would be few CGIs outside the vicinity 300 of TERT with methylation levels correlated with TERTp mutant status. Surprisingly, we instead 301 found a broad genome-wide distribution of CGIs that were hypomethylated in TERTp mut samples 302 relative to TERTp WT samples (Fig. 5a). Moreover, when correlated with a panel of global histone 303 modification levels, we found that TERTp mutants exhibited increased levels of 304 H3K9ac1K14ac0 and H3K9ac1K14ac1 marks (Fig. 5c), which have been suggested as marks 305 of transcriptionally active chromatin (Ruthenburg et al., 2007). Likewise, when H3K9ac1K14ac0 306 levels were compared against a genome-wide panel of CGI ASM levels, the TERT CGI 307 (chr5:1,289,275-1,295,970) was the top correlate ( Fig. 5d). 308 To better understand the distribution of these TERTp mut -hypomethylated CGIs, we 309 1,000 TERTp mut -hypomethylated CGIs (Fig. 5a), we found significant (FDR < 0.0001) and 312 robust 10-fold enrichment for polycomb repressive complex 2 (PRC2)-repressed regions (Fig.  313 5f) previously characterized in several cell lines (HepG2, GM12878, HeLa-S3, K562, and 314 HUVEC). Beyond these top 1,000 hypomethylated CGIs, CGIs overlapping with PRC2-315 repressed segments were broadly hypomethylated in TERTp mut cell lines and accounted for 316 nearly all of the previously observed skew towards hypomethylation (Fig. 5b). Interestingly, the 317 enrichment of PRC2 segments was much smaller (around 3.5-fold) in the remaining profiled cell 318 line, H1-hESC. Against ENCODE ChIP-seq peak region sets, we also found significant overlap 319 with the H3K9me3 and H3K27me3 heterochromatin marks (Fig. 5g). Furthermore, we also 320 observed a moderate 2-fold (P = 5.1×10 -24 , Fisher's exact test) enrichment for regions within ten 321 megabases of most telomeres, consistent with previous reports that PRC2-repressed and 322 H3K27me3-marked regions are enriched in telomeric and subtelomeric regions (Rosenfeld et 323 al., 2009). The enrichment of these hypomethylated regions among telomere-proximal regions 324 may also be indicative of a recently-reported telomere position effect, which has been shown to 325 affect the chromatin accessibility of the TERT locus (W. Kim Fig. 9a,b). Among 337 13,547 CGIs, we again found an enrichment of hypomethylation of PRC2-overlapping CGIs 338 ( Supplementary Fig. 9c), although this was less prominent than previously noted in the CCLE. 339 LOLA enrichment analysis for TERTp mut -hypomethylated CGIs in the TCGA likewise confirmed 340 significant enrichments (FDR < 0.0001) of PRC2-repressed regions and associated histone 341 modifications as the top enriched region sets (Supplementary Fig. 9d,e). However, the fold-342 enrichment was less (about 5-fold) than that observed in the CCLE and did not display any 343 significant enrichment in telomere-proximal regions (P = 0.70, Fisher's exact test). Although the 344 skew towards hypomethylation in TERT promoter mutants among these TCGA samples was 345 weaker than in the CCLE samples, this may be the result of the more heterogeneous nature of 346 these primary TCGA samples as well as the differences in coverage between the Illumina 450k 347 array and RRBS. 348 In addition to telomere content, we also investigated the landscape of telomere 373 maintenance mechanisms, namely mechanisms of TERT reactivation, across cancer cell lines. 374 The enrichment of TERT promoter mutations in certain tissues has inspired several 375 explanations, and our findings in both the CCLE and TCGA suggest a specific epigenetic 376 signature that may underlie this unique pathway of telomere maintenance. We found that in 377 TERT promoter mutants, CpG islands were preferentially hypomethylated in PRC2-repressed 378 regions located near telomeres, which may relate to previous reports of a long-range telomere When multiple read groups were present in a sample, telomere content was computed as a 424 mean of the individual read group estimates weighted by the total read count per group. 425 Whereas we found decent agreement between overlapping samples in CCLE WGS and GDSC 426 WES, we found a comparatively weak correlation between both sets and the CCLE WES 427 estimates (Supplementary Fig. 1). Therefore, we excluded the CCLE WES telomere content 428 estimates from subsequent analyses. 429 In comparing the CCLE WGS and GDSC WES estimates, we also noticed a batch effect 430 resulting in two clusters of GDSC WES estimates. To identify and correct this batch effect 431 across all GDSC WES estimates, we observed that these batches were distinguished by 432 frequencies of reads containing exactly 4, 5, and 6 telomeric motifs. We then ran a k-means 433 clustering on these read frequencies to estimate the clusters across all GDSC WES samples, 434 which were subsequently adjusted by re-centering the mean of one cluster (after applying a z-435 scored log-transformation) to match the mean of the other. 436 We also attempted to use Telseq to estimate telomeric repeat-containing RNA (TERRA) 437 expression across 1,019 RNA-seq samples from the CCLE. However, because the majority of 438 these samples were found to contain little or no reads containing telomeric reads, TERRA 439 capture was determined to be too low for any meaningful analysis.  We also considered processed methylation estimates available on the CCLE data portal, 463 namely the TSS 1kb upstream estimates as well as the promoter CpG cluster estimates, which 464 we correlated against telomere content estimates. For these annotations, we filtered out regions 465 with a standard deviation of less than 0.05. 466

Genomic and transcriptomic markers
Results of TERT and TERC expression associations, as well as telomere content 467 associations, are available in Supplementary Table 2 To identify codependencies with the CST complex members, we employed an iterative 478 approach to identify highly ranked correlations. In particular, starting with a seed set of genes 479 (the base case of which was the CST complex), we searched for codependencies between two 480 genes x and y under the criteria that the r 2 association between the two is among the top five for 481 x vs. all other genes, and among the top five for y vs. all other genes as well. We recursively 482 applied this method four times, which added the five shelterin components ACD, POT1, TERF1, 483 TERF2, and TINF2 to our gene set. To construct the clustered correlation matrix in Fig. 2c To rank and visualize the codependencies shown in Supplementary Fig. 6a,b and the 499 dependency-mutation associations shown in Fig. 2d,e, we used a signed q-value approach. We 500 first transformed the raw false discovery rates by taking the negative of the base-10 logarithm, 501 and we then applied a sign to this transformed value as determined by the direction of the 502 codependency (the sign of the correlation coefficient) or dependency-mutation association 503 (negative for greater sensitivity in mutants, and positive otherwise). 504 Dependency analyses results are available in Supplementary Table 3. 505

Characterization of allele-specific TERT expression
506 Allele-specific expression may be detected by looking for discordant counts of reads mapping to 507 single-nucleotide polymorphisms (SNPs) in DNA-sequencing vs. RNA-sequencing reads 508 (Huang et al., 2015). In particular, allele-specific expression is evidenced by the biased 509 frequency of a single allele of a heterozygous SNP in RNAseq reads compared to that of DNA-510 sequencing reads. To assess TERT expression in the context of allele-specificity, we examined 511 cell lines for which DNA (WES or WGS) and RNA (RNAseq) sequencing data were available. 512 To identify heterozygous anchor SNPs, we considered mutations in the TERT gene body called 513 using Mutect 1.1.6 (Cibulskis et al., 2013) with default settings. We then applied a filter for 514 mutations with at least eight reads supporting both the reference and alternate alleles that 515 passed the Mutect quality control filter (i.e. classified as PASS). To force call the matching allele 516 frequencies in RNA, we processed the matching aligned RNAseq reads using the 517 ASEReadCounter tool provided in GATK 3.6 (Van der Auwera et al., 2013) with arguments -518 minDepth 8, --minBaseQuality 16, --minMappingQuality 255, and -U 519

ALLOW_N_CIGAR_READS. 520
We then used these RNA and DNA allele frequencies to classify cell lines as monoallelic 521 and biallelic expressors of TERT as well as two neighboring genes, SLC6A19 and CLPTM1L. In 522 particular, we examined the odds ratio derived from a binary contingency table with the two sets 523 of categories being the context (DNA vs. RNA) and the allele (reference vs. alternate) of the 524 read counts. To account for edge cases where the denominator of the odds ratio was zero, we 525 added a pseudocount of 0.5 to each category before computing the odds ratio. We then 526 denoted MAE lines as those having an odds ratio computed using the major allele as the 527 denominator of greater than five. In instances where there were multiple informative SNPs, we 528 considered only the SNP with the greatest supporting total RNA-seq read count. In cases where 529 the same SNP was detected across multiple sources (for instance, in both CCLE WES and 530 WGS), we considered the source with the greatest coverage of the SNP. 531 Allele-specific calls for TERT, SLC6A19, and CLPTM1L are described in 532 Supplementary Table 4. 533

534
To characterize and compare CpG-level ASM around the TERT genomic region, we utilized 535 RRBS data generated by the CCLE (Ghandi et al., 2019). Mapped BAM files were downloaded 536 from the CCLE FireCloud workspace, and ASM levels for each CpG pair were estimated using 537 the allelicmeth command from the MethPipe package (Song et al., 2013). Within each sample, 538 we first included only CpG pairs with a minimum coverage of eight reads. Next, among all 928 539 cell lines, CpG pairs included in less than 5% of these samples were excluded. 540

541
To estimate ASM, we employed a strategy similar to the original MethPipe ASM pipeline. 542 For each pair of CpGs, we considered the four combinations of methylation states between the 543 two CpGs: methylated-methylated (mm), methylated-unmethylated (mu), unmethylated-544 unmethylated (uu), and unmethylated-unmethylated (mm). For semi-methylated CpG pairs not 545 subject to ASM, we would expect high and relatively equal frequencies of the mu and um pairs, 546 whereas for ASM CpG pairs, we would expect the allele bias to result in high mm and uu counts 547 and low um and mu counts. To quantify this imbalance, we used the mean square contingency 548 coefficient (Φ) with a pseudocount of 0.5. Namely, for each CpG pair, we computed 549 where ̇= + 0.5, ̇= + 0.5, ̇= + 0.5, and ̇= + 0.5. ASM CpG pairs 551 therefore had a positive Φ, whereas non-ASM pairs had a Φ of around 0. We rounded negative 552 Φ values to 0. Before computing these imbalance values, we excluded CpG pairs with a 553 methylation level of less than 0.1 or greater than 0.9 on either CpG, so as to filter out CpG pairs 554 that were likely to be fully methylated or completely demethylated. 555 We first examined the ASM levels of the TERT locus, which considered as the TERT 556 gene body as well as the flanking five kilobase regions. For these methylation estimates, we 557 excluded CpGs with less than 25% valid ASM estimates. We segmented these CpGs into five 558 To identify genome-wide methylation events indicative of TERT promoter mutations, we 563 searched for correlates with TERT promoter mutation status among average methylation levels 564 of CpG islands (CGIs). CpG island annotations were downloaded from the UCSC genome 565 browser at http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cpgIslandExt.txt.gz. To 566 filter out low-coverage CpG islands, we considered only CpG islands with at least eight CpG 567 sites. Methylation levels per island were then estimated by taking the mean across all CpGs 568 profiled within the island. Using the same filtering parameters, we also computed mean ASM 569 estimates across these CGIs. 570 TERT locus methylation estimates are described in Supplementary Table 5. DepMap data are available at online portals as previously described. 618 Figure 1. Telomere content and related genomic features across human cell lines. Cell lines were grouped by cancer type and ordered by telomere content within each type, and are displayed such that each column represents a cell line. Telomere content measurements reflect combined z-scored estimates derived from CCLE WGS and GDSC WES with means for samples with telomere content estimates from both sources. Relative copy number values are shown as log2(relative to ploidy + 1) -1. Cell lines shown are filtered such that annotations for telomere content, TERT and TERC RNA-seq expression, TERT and TERC copy number, and ATRX and DAXX mutation status are all available, and each cancer type is represented by at least ten cell lines (n = 683 cell lines total). RNA expression estimates are in terms of log2(TPM+1). CNS: central nervous system; PNS, peripheral nervous system; UADT, upper aerodigestive tract.   Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussian-based asymptotic approximation. *P < 0.05, **P < 0.01, n.s, not significant; two-sided Mann-Whitney U test.  Table 6). b, Kernel density distributions of rank-biserial correlations between CGI methylation levels for PRC2-overlapping regions and non-PRC2-overlapping regions. A negative correlation indicates that a CGI is hypomethylated in TERTp mut cell lines relative to TERTp WT ones, and a positive correlation indicates the opposite. PRC2 regions were sourced from the HepG2 segmentation. c, Rank-biserial correlations between TERTp status (mutant or wild-type) and global histone modification levels (n = 302-475). Significance determined by twosided Mann-Whitney U test. d, Pearson correlation levels between global H3K9ac1K14ac0 levels and ASM imbalance of CGIs (n = 261-884). e, H3K9ac1K14ac0 levels are significantly increased in TERTp mutants. Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussian-based asymptotic approximation. *P < 0.01, n.s, not significant; two-sided Mann-Whitney U test. f, LOLA core set enrichment analysis of CGIs hypomethylated in TERTp mut cell lines reveals enrichment of PRC2-repressed regions. g, LOLA ENCODE Roadmap region enrichment analysis of CGIs hypomethylated in TERTp mut cell lines reveals enrichment of H3K9me3 and H3K27me3 regions. Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussian-based asymptotic approximation. *P < 0.05, two-sided Mann-Whitney U test against WT/silent category; n.s, not significant. b, Volcano plots of Pearson correlations and false discovery rates (q values) of associations between merged telomere content estimates and several profiling datasets. Sample sizes listed in Supplementary   Figure S4. Transcriptomic associations between TERT, TERC, and telomere content. a and b, Pearson correlations between log2(TPM + 1) levels of TERT mRNA and telomere content within tissue subtypes in the CCLE WGS and GDSC WES datasets, respectively. CNS, central nervous system; PNS, peripheral nervous system; UADT, upper aerodigestive tract. c, Associations between total TERT (ENSG00000164362.14) mRNA, full-length TERT (ENST00000310581.5) mRNA, minus-beta TERT (ENST00000296820.5) mRNA, TERC (ENSG00000270141.2) RNA, and z-scored log2-transformed telomere content estimates in the CCLE WGS and GDSC WES datasets. mRNA expression measured as log2-transformed TPMs with a pseudocount of +1. d, Associations between exon inclusion levels of TERT exons 7  Table 3). c, Selected correlations between CCLE WGS and GDSC WESderived telomere content and Avana dependencies of CST and shelterin complex members. Figure S6. TP53 mutation status and shelterin member dependencies. a, Codependencies of CST and shelterin complex members in the Avana CRISPR-Cas9 dataset as measured by Pearson correlation and the associated two-sided P value (n = 710-769 cell lines). q-values are shown for correlations between each gene indicated on the x-axis and all other genes, with qvalues transformed and ranked by the sign of the correlation. b, Repeat of codependency analysis in (a), but for DRIVE RNAi dependencies (n = 88-386 cell lines). c and d, comparison of dependencies of select members (Avana and DRIVE, respectively) with respect to TP53 hotspot mutation status. Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussian-based asymptotic approximation. P values determined by two-sided Mann-Whitney U test.  , and mRNA expression as measured in log2(TPM + 1) from RNAseq (c). Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussianbased asymptotic approximation. *P < 0.01, two-sided Mann-Whitney U test against wild-type (WT) values; n.s, not significant. d, Correlation between TERT gene body methylation and mRNA expression in promoter-wildtype (left) and promoter-mutant (right) cell lines. e, Frequencies of TERT promoter mutations across different tissue subtypes. Figure S9. Global methylation changes associated with TERT promoter mutations. a, Methylation at the cg11625005 CpG probe and TERT mRNA expression across TCGA samples annotated by TERTp status (n = 1,553). b, Comparison of cg11625005 methylation levels in TERTp mutants and wild-type TCGA samples with TERT mRNA expression greater than 1 (as indicated in the horizontal line in (a). Boxes, interquartile range (IQR); center lines, median; whiskers, maximum and minimum or 1.5 × IQR; notches, 95% confidence interval of bootstrapped median using 1,000 samples and a Gaussian-based asymptotic approximation. P = 1.5 × 10 -26 , two-sided Mann-Whitney U test. c, Kernel density distributions of rank-biserial correlations between CGI methylation levels for PRC2-overlapping regions and non-PRC2overlapping regions. A negative correlation indicates that a CGI is hypomethylated in TERTp mut cell lines relative to TERTp WT ones, and a positive correlation indicates the opposite. PRC2 regions were sourced from the HepG2 segmentation. d, LOLA core set enrichment analysis of CGIs hypomethylated in TERTp mut samples reveals enrichment of PRC2-repressed regions. e, LOLA ENCODE Roadmap region enrichment analysis of CGIs hypomethylated in TERTp mut samples reveals enrichment of H3K9me3 and H3K27me3 regions. f, Genomic distribution of CGIs hypomethylated in TERTp mut CCLE cell lines. Shaded regions denote 10Mb chromosome ends. Table S1. Telomere content estimates and sample information. Table S2. Telomere content and transcriptomic associations. Table S3. Unsupervised dependency associations. Table S4. Allele-specific expression calls. Table S5. TERT locus methylation. Table S6. Genome-wide methylation analysis in the CCLE. Table S7. Genome-wide methylation analysis in TCGA.