Identification of MicroRNAs as Diagnostic Biomarkers for Breast Cancer Based on the Cancer Genome Atlas

Breast cancer is the most common cancer among women worldwide. MicroRNAs (miRNAs or miRs) play an important role in tumorigenesis and have therefore been identified as potential targets for translational research as diagnostic, prognostic, and therapeutic markers. This study aimed to identify differentially expressed (DE) miRNAs in breast cancer using the Cancer Genome Atlas. The miRNA profiles of 755 breast cancer tissues and 86 adjacent non-cancerous breast tissues were analyzed using Multi Experiment Viewer; miRNA–mRNA network analyses and KEGG pathway analyses of the predicted target genes were also performed. The clinical relevance of miRNAs was investigated using area under the receiver operating characteristic curve (AUC) analysis, sensitivity, and specificity. The analysis identified 28 DE miRNAs in breast cancer tissues, including nine upregulated and 19 downregulated miRNAs, compared to non-cancerous breast tissues (p < 0.001). The AUCs of 17 DE miRNAs (miR-10b, miR-21, miR-96, miR-99a, miR-100, miR-125b-1, miR-125b-2, miR-139, miR-141, miR-145, miR-182, miR-183, miR-195, miR-200a, miR-337, miR-429, and let-7c) each exceeded 0.9, indicating excellent diagnostic performance in breast cancer. Moreover, 1381 potential target genes were predicted using the target prediction tool miRNet. These genes are related to PD-L1 expression and PD-1 checkpoint in cancer, MAPK signaling, apoptosis, and TNF pathways; hence, they are implicated in the development, progression, and immune escape of cancer. Thus, these 28 miRNAs can serve as prospective biomarkers for the diagnosis of breast cancer. Taken together, these results provide insight into the pathogenic mechanisms and potential therapies for breast cancer.

Early detection and improved treatment can improve survival and outcomes in patients with breast cancer. Mammography is a widely used screening tool for breast cancer. However, its widespread use has been hindered by the cost and expertise required. Alternative methods, such as ultrasound screening, are highly operator-dependent. In addition, tumor serum markers, such as carbohydrate antigen 15-3 (CA-15-3) and carcinoembryonic antigen (CEA), have limited sensitivity and specificity [5,6].
Even though well-characterized subtypes and early detection have reduced the burden of treatment for patients, more specific molecular targets are needed to increase the survival

The Cancer Genome Atlas (TCGA) Data Analysis
Raw data for miRNAs and clinical information of breast cancer were obtained from the TCGA open source repository (http://firebrowse.org/) on 01/28/2016. To verify clinical diagnostic values, data for all clinical samples, including age, race, tumor stage, molecular subtype, and reads per million miRNAs, were included for 755 breast cancer samples and 86 adjacent non-cancerous breast tissues. Other clinical variables (treatment, surgical type, etc.) were not analyzed in the current study. Data were divided into different stages, including early stage (stages 1 and 2), locally advanced stage (stage 3), and metastatic stage (stage 4). Data from eight samples with unknown stages were excluded from this study. Clinical information from TCGA data is shown in Table S1 (Supplementary Material).
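The stage grouping described above can be sketched in a few lines. This is only an illustration: the record layout (`id`, `stage`) and the sample values are hypothetical and do not reflect the actual TCGA clinical field names.

```python
from collections import Counter

# Grouping taken from the text: stages 1-2 early, 3 locally advanced, 4 metastatic.
STAGE_GROUPS = {1: "early", 2: "early", 3: "locally advanced", 4: "metastatic"}


def group_stage(stage):
    """Return the stage group, or None when the stage is unknown."""
    return STAGE_GROUPS.get(stage)


def summarize(samples):
    """Count samples per stage group, dropping those with an unknown stage."""
    groups = (group_stage(s.get("stage")) for s in samples)
    return Counter(g for g in groups if g is not None)


# Hypothetical records used only to exercise the functions.
samples = [
    {"id": "s1", "stage": 1},
    {"id": "s2", "stage": 2},
    {"id": "s3", "stage": 3},
    {"id": "s4", "stage": 4},
    {"id": "s5", "stage": None},  # unknown stage, excluded from the counts
]
counts = summarize(samples)
```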

miRNA Expression Profiles
To determine miRNA expression profiles and identify DE miRNAs, hierarchical clustering and volcano plot analyses were performed using Multi Experiment Viewer (MEV) software version 4.4. Principal component analysis (PCA) was also performed to assess population clustering and the parameters responsible for the distinction between the groups. The mean values of DE miRNAs between cancerous and non-cancerous breast tissues were compared using Student's t-test, and the false discovery rate-adjusted p-value (q-value) was calculated.
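The text does not state which false discovery rate procedure produced the q-values. As a minimal sketch, assuming the widely used Benjamini-Hochberg adjustment, q-values can be derived from a list of per-miRNA p-values as follows (the p-values are made up for illustration):

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the minimum
    # of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values, one per miRNA.
p = [0.001, 0.008, 0.039, 0.041, 0.27]
q = benjamini_hochberg(p)
```

The running minimum keeps the q-values monotone in the p-values, so a miRNA with a smaller p-value never receives a larger q-value than one with a larger p-value.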

Constructing Regulatory Network between miRNAs and Their Targets and Pathway Enrichment Analysis
The target genes of the selected DE miRNAs were predicted using miRNet (http://www.mirnet.ca/). The miRNA target gene network was constructed based on mapping analysis [22].
Furthermore, the target genes in the network were analyzed using Cytoscape software version 3.8.1 with ClueGO for the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. ClueGO parameters were set as indicated: GO term fusion selected; only display pathways with p < 0.001 with Bonferroni step-down analysis; and kappa score of 0.4 [23].
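"Bonferroni step-down" is commonly understood to be the Holm procedure. Assuming that reading, the following sketch shows how adjusted p-values would be computed and filtered at the p < 0.001 threshold; ClueGO's internal implementation may differ, and the input p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm (step-down Bonferroni) adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical enrichment p-values for four pathways.
raw = [0.0001, 0.0004, 0.02, 0.0002]
adj = holm_adjust(raw)
# Indices of pathways that pass the p < 0.001 display threshold.
kept = [i for i, value in enumerate(adj) if value < 0.001]
```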

Statistical Analysis
All statistical analyses were performed using GraphPad Prism software version 6 (La Jolla, CA, USA), SPSS Statistics software (version 21.0; IBM, Armonk, NY, USA), and Multi Experiment Viewer (MEV) software version 4.4. Student's t-test was used to compare the expression of miRNAs between cancerous and non-cancerous breast tissues. Receiver operating characteristic (ROC) curve analysis and the area under the ROC curve (AUC) were used to assess the diagnostic utility of the selected miRNAs. Analysis of the association between survival and DE miRNAs was performed using miRpower, a web tool to validate survival-associated miRNAs [26]. A database was established using miRNA expression data from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). Survival was estimated using the Kaplan-Meier method and evaluated using a log-rank test. In all analyses, p < 0.05 was considered statistically significant.
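To make the ROC quantities concrete, the sketch below computes the AUC as the probability that a randomly chosen cancer sample has a higher expression value than a randomly chosen non-cancerous sample (ties count as one half), and computes sensitivity and specificity at a fixed cutoff. The expression values and the cutoff are hypothetical, and this is not the procedure used by GraphPad Prism or SPSS.

```python
def roc_auc(cancer_values, normal_values):
    """AUC as P(cancer value > normal value), counting ties as 0.5."""
    wins = 0.0
    for c in cancer_values:
        for n in normal_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_values) * len(normal_values))


def sensitivity_specificity(cancer_values, normal_values, cutoff):
    """Call a sample positive when its value is at or above the cutoff."""
    true_pos = sum(v >= cutoff for v in cancer_values)
    true_neg = sum(v < cutoff for v in normal_values)
    return true_pos / len(cancer_values), true_neg / len(normal_values)


# Hypothetical expression values for an upregulated miRNA.
cancer = [8.1, 7.4, 6.9, 9.0, 5.2]
normal = [4.0, 5.5, 3.8, 4.9]

auc = roc_auc(cancer, normal)
sens, spec = sensitivity_specificity(cancer, normal, cutoff=5.6)
```

For a downregulated miRNA, this definition yields a value below 0.5; taking 1 - AUC gives the equivalent discriminative ability with the direction of the test reversed.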

Patients' Characteristics
The miRNA sequencing dataset comprising a total of 755 breast cancer and 86 adjacent non-cancerous breast tissues was obtained from the TCGA breast cancer project. Demographic data and clinical characteristics of the patients are shown in Table 1.

Diagnostic Utility of Selected miRNAs
To investigate the diagnostic value of the 28 selected miRNAs, their expression levels were compared between groups and found to be significantly higher or lower in breast cancer tissues than in adjacent non-cancerous tissues (p < 0.001, Figures 3 and 4). The diagnostic performance of the 28 DE miRNAs was determined using ROC curve analysis. The AUCs of the 28 DE miRNAs are listed in Table 2. The AUCs for the top five miRNAs exceeded 0.97, including miR-139 (0.99, 95% CI = 0.98-1.00) and miR-21 (0.98, 95% CI = 0.97-0.99).
Heat map of the selected 28 miRNAs in cancerous and non-cancerous tissues. The heat map was obtained using two-way hierarchical clustering of the 28 significantly DE miRNAs (Pearson correlation, p < 0.05 by hierarchical clustering analysis). A red dot represents an upregulated miRNA, and a green dot represents a downregulated miRNA.

Subsequently, the expression levels of the DE miRNAs were investigated according to the cancer stage (Figure 5). Among the DE miRNAs, the expression level of miR-21 was significantly higher in the metastatic stage (stage 4) and the locally advanced stage (stage 3) than in the early stages (stages 1 and 2) (p < 0.05); the expression levels of miR-141 and miR-200c were significantly higher in the early stages than in the locally advanced stage (p < 0.01 and p < 0.05, respectively). The expression levels of miR-28, miR-139, and miR-143 were significantly lower in the early stages than in the locally advanced stage (p < 0.001, p < 0.05, and p < 0.01, respectively). Furthermore, the expression levels of the DE miRNAs were also analyzed by breast cancer subtype (Figure 6). Among the nine upregulated miRNAs, the expression level of miR-21 was significantly higher in luminal B than in luminal A and TNBC (p < 0.05 and p < 0.001, respectively). The expression of miR-183 was significantly upregulated in HER2-positive patients compared to the other subtypes (luminal A, luminal B, and TNBC) (p < 0.001, p < 0.01, and p < 0.01, respectively).
The miR-200a expression level in TNBC was higher than that in the luminal A and HER2-positive subtypes (p < 0.05 for both). For the 19 downregulated miRNAs, the expression levels of miR-28 and miR-3199-1 in luminal A and luminal B were significantly lower than those in HER2-positive and TNBC. The expression levels of miR-511-1 and miR-511-2 in luminal B were lower than those in HER2-positive and TNBC (p < 0.05 and p < 0.001 for both miR-511-1 and miR-511-2). The miR-139 expression level in luminal A was significantly higher than that in luminal B, HER2-positive, and TNBC (p < 0.05, p < 0.001, and p < 0.01, respectively); miR-676 expression in TNBC was significantly higher than that in luminal A, luminal B, and HER2-positive (p < 0.01, p < 0.001, and p < 0.05, respectively).

Identification of Downstream Target Genes of miRNAs in Breast Cancer
To elucidate the underlying biological functions of miRNAs via negative regulation of the expression of downstream target genes, miRNet was used to predict the target genes of the 28 DE miRNAs. As shown in Figure 7a, a total of 1381 predicted target genes of miR-21 (yellow), miR-125b (green), miR-200b (blue), and miR-429 (red) were obtained. The list of predicted target genes is shown in Table S2.
Next, the predicted target genes were analyzed using KEGG pathway enrichment analysis with the ClueGO plug-in of Cytoscape (kappa score = 0.4, p < 0.001 with Bonferroni step-down analysis) (Figure 7b, Table 3). The relationships between pathways were examined, and the predicted target genes were found to be enriched in mitogen-activated protein kinase (MAPK) signaling, hypoxia-inducible factor-1 (HIF-1) signaling, central carbon metabolism in cancer, programmed cell death-ligand 1 (PD-L1) expression and programmed cell death-1 (PD-1) checkpoint pathway in cancer, phosphatidylinositol 3-kinase (PI3K)-Akt signaling, apoptosis, signaling pathways regulating pluripotency of stem cells, tumor necrosis factor (TNF) signaling, pathways in cancer, and microRNAs in cancer pathways. KEGG pathways for the predicted targets are summarized in Table S3.
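The kappa score threshold controls which pathways are linked according to the genes they share. As a rough sketch, assuming the standard Cohen's kappa computed on binary gene membership over a common gene universe (ClueGO's exact computation may differ), the score for a pair of pathways can be calculated as follows. The gene identifiers are placeholders, not real pathway members.

```python
def kappa_score(genes_a, genes_b, universe):
    """Cohen's kappa for the agreement of two gene sets over a gene universe."""
    u = set(universe)
    a = set(genes_a) & u
    b = set(genes_b) & u
    n = len(u)
    both = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    neither = n - both - only_a - only_b
    # Observed agreement: genes in both sets or in neither set.
    observed = (both + neither) / n
    # Agreement expected by chance from the marginal set sizes.
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both sets are empty or both cover the universe
    return (observed - expected) / (1.0 - expected)


# Placeholder gene identifiers.
universe = [f"G{i}" for i in range(10)]
pathway_a = ["G0", "G1", "G2", "G3"]
pathway_b = ["G0", "G1", "G2", "G4"]

kappa = kappa_score(pathway_a, pathway_b, universe)
linked = kappa >= 0.4  # threshold reported in the text
```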

Discussion
Breast cancer is one of the most commonly diagnosed cancers and a major cause of cancer-related death in women worldwide [27]. Despite the constant development of diagnostic approaches for cancer, early diagnosis of breast cancer and improvement in survival remain difficult. Various imaging approaches, such as mammography, magnetic resonance imaging, positron emission tomography, computed tomography, and single-photon emission computed tomography, can be used for the diagnosis and monitoring of breast cancer patients at various stages [28-30]. Currently, numerous studies on new diagnostic approaches for breast cancer using circulating tumor cells, circulating tumor DNA, exosomes, and microRNAs are underway [31-34].
The miRNAs, a group of small, single-stranded, non-coding RNA molecules, are frequently dysregulated in cancers, including breast cancer [35]. Recent studies have found that specific miRNAs are associated with breast cancer [36,37]. Studies on the clinical applications of miRNAs in the diagnosis, prognosis, and treatment of cancer, including breast cancer, are also gaining prominence [21]. Here, a systematic analysis of miRNA expression profiles from TCGA was performed to identify potential miRNAs for the diagnosis of breast cancer. First, comparison of breast cancer tissues with adjacent non-cancerous tissues identified 28 DE miRNAs, of which nine were upregulated and 19 were downregulated. Of these, miR-21 and miR-139 were the most significantly upregulated and downregulated miRNAs, respectively, in breast cancer tissues. Previous studies have shown that miR-21 overexpression in breast cancer is associated with cell proliferation, progression, metastasis, and poor prognosis [38,39]. It has also been reported that miR-21 promotes invasion and cell proliferation by targeting programmed cell death 4 (PDCD4) [38]; miR-139 has been reported to act as a tumor suppressor in several cancer types, such as prostate cancer, endometrial cancer, and breast cancer [40-42]. In addition to miR-21 and miR-139, other selected DE miRNAs have been reported by other groups to play major roles in cancer biology. The identified DE miRNAs have been studied for their tumor-suppressive or oncogenic functions, but their diagnostic potential in clinical settings has not been fully elucidated.
Therefore, to evaluate the selected DE miRNAs as diagnostic tools for breast cancer, their sensitivity, specificity, and AUC were analyzed. The results showed a sensitivity ranging from 76% to 97% and a specificity ranging from 80% to 98%. The AUC values ranged from 0.83 (95% CI = 0.77-0.88) to 0.99 (95% CI = 0.98-1.00) (p < 0.0001). At the upper end, these values compare favorably with the previously reported sensitivity of 67%-95% and specificity of >95% for current standard diagnostic tools, such as mammography [43,44]. Several studies have investigated miRNAs for the diagnosis of breast cancer. Hastings et al. reported that the expression levels of miR-148b, miR-376c, and miR-409-3p were upregulated in benign breast tissues compared to those in breast cancer tissues [45]. Additionally, Cookson et al. showed that upregulation of miR-16, miR-21, and miR-451 and downregulation of miR-145 in the plasma of breast cancer patients could serve as screening biomarkers [8]. Moreover, the miRNA profile analysis of miR-1, miR-92a, miR-133a, and miR-133b in breast cancer suggested their potential diagnostic performance with high AUC values (0.90 to 0.91) [46]. Taken together, the clinical relevance of the 28 selected DE miRNAs was comparable to that of previously reported miRNAs.
To establish the functional features of the DE miRNAs, miRNet was used to predict target mRNAs, and KEGG pathway analysis of the predicted targets was performed. Significant target genes for miR-21, miR-125b, miR-200b, and miR-429 were identified. The identified target genes are involved in breast cancer, PD-L1 expression and PD-1 checkpoint pathway in cancer, MAPK signaling, apoptosis, and TNF pathways. In particular, PD-1 and PD-L1, which function as immune checkpoint molecules, are expressed on the surface of immune cells, such as T cells, B cells, and natural killer T cells [56-59]. PD-L1 in cancer cells binds to PD-1 present in T cells, inhibiting T cell function [60]. PD-L1 expression is associated with larger tumor size, high grade, and estrogen receptor-negative, progesterone receptor-negative, and HER2-positive breast cancer [61]. PD-L1 is also expressed in 20% of TNBCs [62]. Recent studies have attempted to block the PD-1/PD-L1 pathway to achieve stronger tumor regression in cellular immunotherapies [63-67]. Thus, these results could inform further research on immunotherapeutic strategies.

Conclusions
In conclusion, this study provides a comprehensive analysis of DE miRNAs and their potential targets and diagnostic performance in breast cancer. These miRNAs may serve as promising diagnostic biomarkers. Additionally, these dysregulated miRNAs should be further investigated using tissue samples and blood samples collected from multiple centers at various stages and subtypes, such as luminal A, luminal B, HER2, and basal breast cancer. Further studies are also needed to validate the DE miRNA targets and their prognostic value, considering survival days, lymph node status, surgical type, and adjuvant treatment.
Supplementary Materials: The following are available online at https://www.mdpi.com/2075-4418/11/1/107/s1; Figure S1: Kaplan-Meier survival for low DE miRNAs versus high DE miRNAs expression level, Table S1: Clinical information from TCGA, Table S2: List of predicted target genes, Table S3: KEGG pathways for predicted targets.
Funding: This study was supported by the research fund of the Catholic University of Pusan 2020.