Genome-wide DNA methylation and transcriptome integration reveal distinct sex differences in skeletal muscle

Nearly all human complex traits and diseases exhibit some degree of sex differences, and epigenetics contributes to these differences as DNA methylation shows sex differences in various tissues. However, skeletal muscle epigenetic sex differences remain largely unexplored, yet skeletal muscle displays distinct sex differences at the transcriptome level. We conducted a large-scale meta-analysis of autosomal DNA methylation sex differences in human skeletal muscle in three separate cohorts (Gene SMART, FUSION, and GSE38291), totalling n = 369 human muscle samples (n = 222 males, n = 147 females). We found 10,240 differentially methylated regions (DMRs) at FDR < 0.005, 94% of which were


Introduction
Sex differences are evident in nearly all complex traits. Various diseases, including but not limited to cancer, muscular dystrophy, and COVID-19 [1,2], display sex differences in prevalence, onset, progression, or severity. To improve treatment for such diseases, it is crucial to uncover the molecular basis for the sex differences and their consequences on organ function. Sexually differentiated traits and phenotypes stem from a combination of factors, including genetics (gene variants-by-sex interactions [3], XY chromosome complements [4][5][6][7], genomic imprinting [8]), the hormonal milieu [9,10], and gene regulation [11], with the latter likely contributing the most [1].
Recently, a large-scale study from the Genotype-Tissue Expression (GTEx) consortium unravelled mRNA expression differences between the sexes that are not driven by sex chromosomes, across all tissues. Skeletal muscle was particularly divergent between the sexes, as gene expression profiles in this tissue could predict sex with high specificity ≥ 90%, and sensitivity ≥ 98% [4]. These transcriptomic differences underpin the numerous physiological differences in skeletal muscle between males and females, such as differences in substrate metabolism [12][13][14]. For example, females oxidize more lipids and less carbohydrates and amino acids during endurance exercise, and albeit depending on training status, tend to have a higher proportion of type I (slow-twitch) muscle fibres [15], all of which inherently contribute to enhanced fatigue-resistance in female skeletal muscle [16]. As such, females exhibit higher mRNA and protein levels of lipid oxidation-related genes than males [13]. Interestingly, the top gene set corresponding to sex-biased genes in the GTEx study corresponded to targets of the epigenetic writer polycomb repressive complex 2 (PRC2) and its associated epigenetic mark (H3K27me3). This suggests that the sex-specific deposition of epigenetic marks may be the source of sex differences in gene expression. Epigenetics is a system of gene regulation that influences gene expression and is modulated by the genetic sequence and environmental stimuli. DNA methylation is currently the best-characterized epigenetic modification, and has been shown to differ between males and females in various tissues, such as pancreatic islets [17], blood [18,19], and more recently in cultured myoblasts and myotubes [20]. While there is ample evidence for transcriptomic sex differences in skeletal muscle [4,11,12,[21][22][23], it is unclear whether sex differences exist in the DNA methylome of skeletal muscle tissue, and to what extent.
Epigenome-wide association studies (EWAS) are ideal for investigating the impact of sex on genome-wide DNA methylation when addressing both the basis and translational aspect of sex differences. Therefore, we performed a large-scale EWAS meta-analysis to explore sex differences in the DNA methylome of human skeletal muscle tissue, using three datasets from our own laboratory and open-access databases (n = 369 individuals; 217 males, 152 females).
We established a list of robust DNA methylation (CpG) sites and regions showing DNA methylation differences between males and females, and explored their genomic context. We then integrated them with sex-biased gene expression from the GTEx, and inferred the potential downstream effects on skeletal muscle function. Lastly, we confirmed our findings with transcriptomic data from one cohort used in the meta-analysis and targeted qPCR from another cohort.

Males show profound genome-wide autosomal hypomethylation compared with females in human skeletal muscle
The DNA methylation meta-analysis was conducted on 369 individuals from three datasets (217 males, 152 females). We focused exclusively on the 22 autosomes to eliminate the confounding effect of sex differences in the sex chromosome complement where Xchromosome inactivation takes place exclusively in females. All of the Gene SMART cohort individuals were apparently healthy, while the FUSION cohort individuals were either healthy or diagnosed with type 2 diabetes mellitus, and the GSE38291 cohort individuals included monozygotic twins discordant for type 2 diabetes mellitus ( Table 1). Table 1 Characteristics of individuals in each data set included in the DNA methylation meta-analysis. Statistics shown for differences between males and females.
We found 56,813 differentially methylated positions (DMPs, single CpG sites) between males and females, spread across the 22 autosomes, at a stringent meta-analysis False Discovery Rate (FDR) < 0.005 (Figure 1, Supplementary Table 2). Ninety-four percent of DMPs were hypomethylated in males compared with females ( Figure 1A). On average, the magnitude of DNA methylation differences between males and females was +2.8% (hyper DMPs) and -3.5% (hypo DMPs), with the largest effect sizes reaching +15.2% and -35.7%. In each of the three cohorts, participants did not cluster according to sex when including the whole autosomal methylome, but they did cluster according to sex when only focusing on the 56,813 DMPs ( Figure 1B), suggesting that sex explained a substantial amount of variance at the DMPs.
Each data set had a unique study design that required adjustment for factors known to affect DNA methylation, such as age [24] and type 2 diabetes (T2D) [25]. We adjusted each dataset for these factors, but noted that sex was associated with T2D in the FUSION dataset, meaning that male participants from the FUSION cohort more commonly had T2D than females. Therefore, it is possible that the sex-related signal capture in this dataset was Volcano plot of DNA methylation differences between males and females. Each point represents a tested CpG (633,645 in total) and those that appear in color are DMPs at a meta-analysis false discovery rate < 0.005; red DMPs are hypermethylated in males compared with females; blue DMPs are hypomethylated in males compared with females. The x-axis represents the amount of DNA methylation difference between the sexes and the y-axis represents statistical significance (higher = more significant). Two DMPs that were present in all three studies and showed the largest effect size are labeled with the respective CpG and boxplots of β-values from each study appear to the right (hyper DMP) and left (hypo DMP). (B) Principal component analysis plots of the methylation values at the DMPs; each point on the graph represents an individual; males denoted in green, females denoted in orange. partially confounded by T2D. We repeated the meta-analysis excluding T2D participants from the FUSION cohort, but results remained unchanged (Supplementary figure 5). This confirms that our results are not confounded by T2D.
Since the effect of DNA methylation on gene expression depends on the genomic context, we explored the genomic locations of the DMPs to gain insights into their potential function [26]. We compared the distribution of hyper-, hypo-, and non-DMPs among the various chromatin states in human skeletal muscle using the Roadmap Epigenomics project [27]. DMPs were not randomly distributed in the chromatin states (χ 2 p-value < 2.2 x 10 -16 , (across many tissues including skeletal muscle) [28]. Therefore, we performed the chromatin state enrichment analysis on both the male and female chromatin state annotation in skeletal muscle, which yielded equivalent findings. We next determined whether the DNA methylation sex differences are enriched in regions in which the corresponding chromatin state displays sex differences. DMPs were indeed enriched in loci whose chromatin states differ between males and females: 38.7 % of DMPs vs. 32.4% of non-DMPs are in chromatin states that differ between males and females, which means that the odds of a DMP being located in a sex-differing chromatin state increased by a factor of 1. Differentially methylated genes (DMGs) were determined by identifying differentially methylated regions (DMRs), as DMRs remove spatial redundancy (CpG sites ~500 bp apart are typically highly correlated [29]), and may provide more robust and functionally important information than DMPs [30,31]. We identified 10,240 DMRs (Stouffer, harmonic mean of

Genes with sex-biased methylation exhibit sex-biased DNA expression in human skeletal muscle
To gain insights into the potential downstream effects of sex-biased DNA methylation on gene expression, we integrated results from the EWAS meta-analysis of sex with genes whose mRNA expression levels are known to differ between males and females. We used

Validation of GTEx sex-biased genes in the cohorts used for methylation analysis
We sought to validate the sex-biased gene expression obtained from GTEx in a subset of the samples used for methylation analysis since the DMGs and DEGs analyses were obtained from different muscle groups (the DMGs of the current study are from the vastus lateralis while the GTEx DEGs are from the gastrocnemius). Although both are skeletal muscle tissue from the leg, there may be differences in muscle phenotypes in differing muscle groups [32]. Analysis of RNA sequencing data from the FUSION cohort revealed 3,751 autosomal genes with sex-biased expression (FDR < 0.005). The FDR threshold we chose for the FUSION gene expression data was more stringent than the GTEx local false sign rate threshold (lfsr < 0.05), yet, ~34% of the genes which were both DEGs in GTEx and DMGs were also DEGs in the FUSION cohort, totalling 326 genes (hereinto referred to as `overlapping genes`) ( Figure 3A). Given that both the GTEx and FUSION cohorts include participants of relatively older ages, we sought to confirm the mRNA levels in the younger cohort in the analysis (the Gene SMART) for three genes that displayed sex differences at both the mRNA and DNA methylation levels (GGT7, FOXO3, and ALDH1A1) ( Table 2 Table 2 Gene expression and DNA methylation differences between males and females for three genes across the cohorts used in the analysis.

Gene set enrichment analysis of differentially methylated regions
We next performed Gene set enrichment analysis (GSEA) on the DMGs, as GSEA using epigenomic features may reveal distinct enriched pathways that may not display gene expression differences [11,28]. We performed GSEA on both the DMRs and DMPs ( Figure   4). GSEA on the DMRs revealed enrichment of several Gene Ontology (GO) terms, one  between males and females. Numbers next to pathways denote the number of enriched genes in the pathway; numbers next to genes denote the number of pathways (from the ones displayed) that the gene belongs to.

DNA methylation and gene expression of GGT7, FOXO3 and ALDH1A1 consistently differ between males and females in human skeletal muscle
Three-hundred twenty-six genes exhibited differential methylation in the metaanalysis and differential expression among the GTEx and FUSION cohorts, termed `overlapping genes`. Of those genes, we tested three for gene expression levels, GGT7,

Discussion
We conducted a large-scale meta-analysis of DNA methylation differences between males and females in skeletal muscle, and integrated them with transcriptomic data. We revealed that males display profound genome-wide hypomethylation compared with females.
We then showed that many sex-biased genes found in GTEx also exhibit sex-biased DNA methylation, which was partially confirmed in the FUSION cohort. We then validated the gene expression (qPCR) levels of three genes with large DNA methylation and expression differences between the sexes across cohorts, and confirmed the higher gene expression in males of GGT7 and ALDH1A1. Finally, we showed that the DMGs are overwhelmingly involved in muscle contraction, as well as other metabolic and anatomical structure-related pathways.
In the present study, the overwhelming majority (94 %) of the DMPs were hypomethylated in males. Interestingly, global autosomal hypomethylation in males has been observed in various other tissues [34], including blood [35,36] and pancreatic islets [17].
Uncovering the molecular mechanisms at the root of these epigenetic differences between the sexes was beyond the scope of this paper, but there are few possible explanations.
Differences in cell type proportions between the sexes may partly explain our findings [36][37][38], as type I fibres are hypermethylated compared with type II fibers [39], and as females tend to have a higher proportion of type I fibres than males [15]. The DNA methylation of MYH7 was higher in males across the meta-analysis, potentially indicating lower MYH7 gene expression, which is a unique characteristic of type I slow twitch fibres and not type II fast twitch fibres [40]. This may suggest that males in the meta-analysis may have less type I fibres, which may contribute to some of the observed DNA methylation sex differences.
However, neither MYH6 nor MYH7 displayed sex-biased gene expression in the GTEx or FUSION cohorts, although this may be confounded by age as both cohorts were comprised of older individuals, and older individuals tend to accrue more "hybrid" fibres (co-express more than one myosin heavy chain isoform) with age [41]. Nonetheless, the contribution of differing fibre type proportions on DNA methylation differences between males and females remains to be thoroughly elucidated. Although not well-understood, the sex chromosome complement may also influence autosomal DNA methylation patterns. In cultured fibroblasts, the presence of Sex-determining Region Y (SRY) is associated with lower autosomal methylation levels [42][43][44]. Additionally, a higher number of the X chromosomes, in the absence of SRY, leads to increased methylation levels at a specific sex-differentially methylated autosomal region [44]. This could be attributed to allele dosage compensation, a female-specific process that silences one of the X chromosomes in a cell [45,46].
Approximately one-third of genes 'escape' inactivation, remain transcriptionally active in XX cells, [46][47][48], and have been suggested to affect autosomal DNA methylation via their histone marks [44,49]. Moreover, females with Turner syndrome (partially/fully missing one X) and monosomy X have lower global methylation than XX females, but higher than XY males [50,51]. Finally, sex hormones may contribute to inherent autosomal sex-specific DNA methylation as has been shown in leukocytes [52], but this may only be apparent after taking cellular composition into account [53]. The effect of sex hormones on DNA methylation in skeletal muscle has yet to be explored.  Jones, 2012). Using a permutation test, we showed that DNA methylation differences between the sexes at promoters and enhancers were more often associated with lower gene expression than would be expected by chance alone. DNA methylation differences between the sexes were also particularly prominent in chromatin states that are known differ between males and females. This suggests that DNA methylation differences between males and females reflect alterations in chromatin activity, and differential epigenetic states and expression are likely functionally connected. In line with this, chromatin states that differ between the sexes have been shown to be enriched for sex-biased genes across various tissues, including skeletal muscle [28]. However, it is not yet possible to assess whether the relationship reflects correlation or direct causality. There is still debate around whether epigenomic features drive regulatory processes or are merely a consequence of transcription factor binding [28]. A recent study analysing sex differences in regulatory networks in the GTEx database identified that many transcription factors (TF) have sex-biased targeting patterns [11]. Further supporting the effect of TF on sex-biased gene expression, another recent study also on the GTEx database found enrichment of TF binding sites in the promoters of sex-biased genes [4].
We identified 326 genes with consistent differential skeletal muscle DNA methylation and expression across 1,172 individuals altogether (369 individuals from three cohorts for DNA methylation and 1,077 individuals from two cohorts for gene expression). Although we found profound global DNA hypomethylation in males, of the overlapping genes there were equivalent numbers of genes over-and under-expressed in males compared with females for both GTEx and FUSION. Indeed, hypermethylation is not always associated with decreased gene expression [54]. The substantial overlap between differentially methylated genes and differentially expressed genes highlights many genes that may be of interest for their roles in muscle-related processes. We focused on three of these genes that displayed a large DNA methylation difference between males and females, are highly expressed in skeletal muscle, or play a role in skeletal muscle function: HDAC4 given its role in neurogenic muscle atrophy [55,56] and in the response to exercise [57]; DEPTOR given its role in muscle glucose sensing which in turn augments insulin action [58]; GRB10 given that it is imprinted and has been shown to change in methylation with exercise/training [59]; FOXO3 for its role in ageing, longevity, and regulating the cell cycle [60]; ALDH1A1 for its role in aldehyde oxidation and because sex differences in skeletal muscle mRNA levels have been reported, suggesting that males might be able to metabolize aldehydes (i.e. alcohol) more efficiently than females [12]; and GGT7 for its role in antioxidant activity [61]. Of the three genes which were validated across GTEx, FUSION, and Gene SMART, two of them showed consistently higher male expression levels (GGT7, ALDH1A1) while one showed opposite sex-biased expression (FOXO3) in the young versus the old cohorts. FOXO3 expression was lower in males in the older cohorts (GTEx and FUSION), and higher in males in the younger cohort (Gene SMART). Other studies have shown that males have higher FOXO3 expression in young skeletal muscle [62] and that elderly females have higher skeletal muscle FOXO3 expression than younger females [63]. While FOXO3 skeletal muscle gene expression differs between males and females, it seems that the direction is opposite in young and old individuals, which emphasizes the caution that should be used when interpreting sex differences across a large age range of individuals. The promoter, 1 st exon, and gene body of GGT7 were hypomethylated in males and males had higher GGT7 expression. GGT7 is highly expressed in skeletal muscle and metabolises glutathione, which is a ubiquitous "master antioxidant" that contributes to cellular homeostasis. Efficient glutathione synthesis and high levels of glutathione-dependent enzymes are characteristic features of healthy skeletal muscle and are also involved in muscle contraction regulation [64].
In conclusion, we showed that the DNA methylation of hundreds of genes differs between male and female human skeletal muscle. Integration of the DNA methylome and transcriptome, as well as gene expression validation, identify sex-specific genes associated with muscle metabolism and function. Uncovering the molecular basis of sex differences across different tissues will aid in the characterization of muscle phenotypes in health and disease. The effects of upstream drivers on sex differences in the muscle methylome, such as transcription factors, the XY chromosomes, hormones, and cell type differences still need to be explored. Molecular mechanisms that display sex differences in skeletal muscle may help uncover novel targets for therapeutic interventions.

Datasets
We conducted a meta-analysis of three independent epigenome-wide association

DNA Extraction and Methylation Method-Gene SMART study samples
Genomic DNA was extracted from the samples using the AllPrep DNA/RNA MiniKit

PREPROCESSING
The pre-processing of DNA methylation data was performed according to the bioinformatics pipeline developed for the Bioconductor project [68]. Raw methylation data were pre-processed, filtered and normalized across samples. Probes that had a detection pvalue of > 0.01, located on X and Y chromosomes or cross-hybridizing, or related to a SNP frequent in European populations, were removed. It is important to note that the list of crosshybridizing probes was supplied manually [69] as the list supplied to the ChAMP package was outdated. Specifically, there are thousands of probes in the Illumina microarrays that cross-hybridize with the X-chromosome and may lead to false discovery of autosomal sexassociated DNA methylation [70]. The BMIQ algorithm was used to correct for the Infinium type I and type II probe bias. β-values were corrected for both batch and position in the batch using ComBat [71].

STATISTICAL ANALYSIS
We adjusted each EWAS for bias and inflation using the empirical null distribution as implemented in bacon [72]. Inflation and bias in EWAS are caused by unmeasured technical and biological confounding, such as population substructure, batch effects, and cellular heterogeneity [73]. The inflation factor is higher when the expected number of true associations is high (as it is for age); it is also greater for studies with higher statistical power [72]. The results were consistent with the inflation factors and biases reported in an EWAS of age in blood [72]. Results from the independent EWAS were combined using an inverse variance weighted meta-analysis with METAL [74]. Next, we integrated a comprehensive annotation of Illumina HumanMethylation arrays [79] with chromatin states from the Roadmap Epigenomics Project [27] and the latest GeneHancer information [80]. DMPs that were annotated to two differing chromatin states were removed for simplicity and because there were very few such DMPs. GSEA on KEGG and GO databases was performed on DMRs and DMPs using the goregion and gometh (gsameth for Reactome) functions in the missMethyl R package [81] [82].

Integration of DNA Methylation and Gene Expression
The Genotype-Tissue Expression (GTEx) Project sex-biased data was downloaded from the GTEx Portal on 08/26/2020 and filtered for skeletal muscle samples. The enrichment of DMG for GTEx DEGs was done by supplying the list of sex-biased genes to the gsameth function in the missMethyl R package [81,82], which performs a hypergeometric test, taking into account biases due to the number of CpG sites per gene and the number of genes per probe on the EPIC array. Therefore, caution should be taken when interpreting the number of DMPs reported per DMG. The analysis for direction of correlation between DNA methylation and gene expression was performed by randomly shuffling DNA methylation effect sizes and performing 10,000 permutations to assess how often a negative correlation occurs. This analysis was performed for both GTEx and FUSION transcriptome data and yielded similar results; data presented reflect results from the integration of differential methylation with differential GTEx expression. Significance reported for GTEx sex-biased genes is represented as the local false sign rate (lfsr) which is analogous to FDR [83]. GTEx effect sizes are represented as mash posterior effect sizes [83], in which positive values indicate male-biased genes and negative values indicate female-biased genes. FUSION and Gene SMART gene expression significance statistics are represented as FDR and p-value, respectively, and effect sizes as fold changes for both cohorts.

Validation of top genes with qPCR
Skeletal muscle previously stored at -80°C was lysed with the RLT buffer Plus buffer (Qiagen) and beta-mercaptoethanol using the TissueLyser II (Qiagen, Australia). DNA was extracted using the AllPrep DNA/RNA Mini Kit following the manufacturer guidelines (Qiagen, Australia). RNA yield and purity were assessed using the spectrophotometer (NanoDrop One, Thermofisher). RNA was reverse transcribed to cDNA using a commercially available iScript Reverse Transcriptase supermix (cat #1708841) and C1000