A High Resolution Melting Analysis-Based Genotyping Toolkit for the Peach (Prunus persica) Chilling Requirement

The chilling requirement (CR) is the main factor controlling the peach floral bud break and subsequent reproductive growth. To date, several peach CR quantitative trait loci (QTLs) have been identified. To improve the accessibility and convenience of this genetic information for peach breeders, the aim of this study was to establish an easy-to-use genotype screening system using peach CR molecular markers as a toolkit for marker-assisted selection. Here, we integrated 22 CR-associated markers from three published QTLs and positioned them on the Prunus persica physical map. Then, we built a PCR-based genotyping platform by using high-resolution melting (HRM) analysis with specific primers and trained this platform with 27 peach cultivars. Due to ambiguous variant calls from a commercial HRM software, we developed an R-based pipeline using principal component analysis (PCA) to accurately differentiate genotypes. Based on the PCA results, this toolkit was able to determine the genotypes at the CR-related single nucleotide polymorphisms (SNPs) in all tested peach cultivars. In this study, we showed that this HRM-PCA pipeline served as a low-cost, high-throughput, and non-gel genotyping solution. This system has great potential to accelerate CR-focused peach breeding.


Introduction
The chilling requirement (CR) is a characteristic that evolved in deciduous fruit tree species, such as peach trees (Prunus persica), and is a measure of the period of time during which trees withstand low temperatures to avoid flowering in winter. The CR duration is the major climatic adaptation that limits the geographic distribution in which an individual tree can flower [1]. To effectively increase the distribution of peach cultivation regions from temperate zones, it is necessary to breed new cultivars with low CRs that are better adapted to subtropical or tropical regions [2,3]. Furthermore, it is important to breed peach cultivars with low CRs given that global climate change has contributed to insufficient chill accumulation that has resulted in decreased peach production [2,4]. However, peach breeding to produce low CRs requires years of cultivation for individuals to reach the adult stage, and CR phenotyping requires multiple years of observation of cumulative chilling and blooming dates. Additionally, the CR in peach trees is known to be a polygenic and heritable trait [5,6] Int. J. Mol. Sci. 2020, 21, 1543 2 of 23 with high broad-sense heritability (e.g., H 2 = 79.5%) [7]. The characteristics of the CR increase the complexity of breeding for CR traits. Therefore, comprehensive genetic CR studies are needed that provide the necessary knowledge to improve breeding efficiency and adapt peach varieties to future climatic conditions. Three current studies have made major genetic breakthroughs by mapping the quantitative trait loci (QTLs) associated with CRs and blooming dates [7][8][9][10]. The first study included 378 individuals of an F 2 population derived from the parental cultivars 'Contender' × 'Fla.92-2C' with high-and low-CR, respectively, for QTL mapping [7]. Four QTLs were found to be responsible for controlling the CR. This study also utilized two low-CR and two high-CR individuals from the F 2 population for whole genome sequencing to generate a list of candidate genes associated with the QTLs, the intervals of which were represented using 18 simple sequence repeat (SSR) markers [8]. Two independent QTL studies used similar bi-parental approaches on two different progeny populations [9,10]. Briefly, 12 QTLs controlling the CR and six representative single nucleotide polymorphisms (SNPs) were identified in the 'V6' × 'Granada' F 2 population [9]. The third study utilized the genotyping-by-sequencing (GBS) method to identify nine SNPs and one SSR representing 10 QTLs from the 'Hakuho' × 'UFGold' F 2 population [10]. These mapping results have contributed positional genetic data and have provided important information for further functional genomic research. Integrating the positions of these QTLs that were identified from each study is important to determine the overlapping QTL region.
These CR-related QTLs are promising in terms of genetic mapping, but the development of convenient breeding tools is required to break the technical entry barriers associated with the use of marker-assisted selection (MAS) [11]. SSR markers can be detected using polymerase chain reaction (PCR)-based techniques with common lab equipment, which is accessible for most breeders. Eighteen SSRs have been used to indicate eight CR-related QTL intervals [8], but these SSR markers only cover a portion of the reported peach CR-related QTLs [7][8][9][10]. Although SSR markers are highly accessible and cost effective, the efficiency of such gel-based and laboratory-intensive approaches may be greatly reduced when the number of samples or markers increases.
A shift in molecular marker approaches from SSR markers to SNP markers has been observed in this decade [12]. The release of the peach reference genome has facilitated the exploration of SNPs from broader genetic pools using genome-wide sequencing and has further developed the SNP array [13][14][15]. As far as peach SNP-based mapping goes, GBS has been utilized to identify SNPs for the construction of linkage maps and to identify QTLs associated with CRs and blooming dates [10]. The peach 9K Infinium II array [13] has been used to genotype SNPs for QTL mapping of the CRs, flowering times, heat requirements, and ecodormancy releases [9]. These sequencing-and array-based approaches are the high-throughput and multiplex platforms. These platforms are able to sequence multiple samples in a single Illumina HiSeq lane and/or to identify thousands of SNPs simultaneously for each sample in an array. Nevertheless, data management, bioinformatic platforms, and analytical tools are necessary to convert sequence information to SNPs [11]. Since a 9K SNP array would generate extra data that are not needed for MAS and the array approach is not designed for targeting particular traits in a MAS when the sample number is large, the development of an accessible and economical MAS platform to detect peach SNPs for CR-related QTLs is important.
The detection of SNPs can be accessible for common labs when various gel-free fluorescence-based methods are employed, including TaqMan®assays, kompetitive allele-specific PCR (KASP) assays, and high-resolution melting (HRM) analysis. The TaqMan®assay utilizes two PCR primers and a dual-labelled allele-specific probe [16,17]. KASP uses three unlabeled primers in combination with two universal fluorescence resonance energy transfer (FRET) quenching reporters [18]. HRM only requires two allele-specific primers for PCR and the melting behavior of the PCR products is used to discriminate SNP variance [19,20]. Given that HRM and KASP do not required fluorophore-labelled allele-specific oligonucleotides and that HRM and KASP reactions can be carried out with universal master mixes and universal FRET cassettes, respectively, these two methods are sufficiently flexible and cost-effective to be adopted by breeders. Current studies have successfully utilized HRM for high-throughput plant genotyping [21][22][23], cultivar identification, and the authenticity analysis of food products [24][25][26]. Considering that HRM only requires two short primers, the cost of HRM is likely to be a little bit lower than that of KASP. Advanced statistical methods for HRM analysis are available to improve variant calling [27], and using HRM for plant genotyping may be a feasible approach.
The establishment of a peach breeding toolbox is usually initiated from mapped genes or loci that have been highly linked with horticulture traits, followed by the identification of specific genetic markers according to genetic position for the implementation of further MAS. Major genes or markers that are highly linked with several qualitative peach traits have been identified. For example, a carotenoid cleavage dioxygenase gene has been characterized and shown to control for yellow or white fruit flesh color [28,29]. Two genes encoding endopolygalacturonases (endoPGs) have also been reported to be candidates that control traits related to stone adhesion and flesh texture [30,31]. Furthermore, the difference between peaches and nectarines is due to fruit skin pubescence, which is controlled by a MYB transcription factor [32]. In addition, qualitative peach resistance to the green peach aphid is conferred by an Rm2 dominant allele that encodes a member of the nucleotide binding site leucine-rich repeat (NBS-LRR) resistance protein family [33]. Lastly, flat and round peach fruit shapes are linked with an SSR marker, UDP98-412, that is present in most flat fruit cultivars [34]. For the implementation of MAS, the association of fruit shape traits is further improved by the use of a haplotype that is represented with 3 SNPs or by a primer pair for two small indels [35]. Notably, the peach blush trait, the color red on the fruit skin surface, is able to be routinely predicted using a 5-SNP haplotype test in a single PCR-based assay [36]. The potential implementation of high-throughput MAS has been further developed based on the positional information for these multiple qualitative traits using peach SNP arrays [37,38] or GBS [39].
To facilitate peach cultivar breeding with the desired CR traits, we aimed to establish an easy-to-use genotyping platform with which plant breeders can develop new peach cultivars by genotyping the representative QTL markers that control the CR. In our study, the publicly available whole genome sequences in the Genome Database for Rosacea [40] were used as a physical reference to integrate published CR-related QTLs, and HRM-optimized primer pairs were designed to be specific to each major-effect QTL. To improve the genotyping accuracy for HRM data, we developed a free and open source R-based pipeline for variance calling. A genotyping toolkit that consists of these primer sets and a pipeline for variant calling will allow breeders to determine the genotypes of the SNP markers that are linked to the CR-related QTLs.

Integration of Genetic Cofactors Representing Major-and Minor-Effect CR-related QTLs on a Physical Peach Genome Map
To situate the genetic cofactors representing the CR-related QTLs onto a physical map, we took advantage of the Prunus persica genome v.2.0.a1 [14], using it as the physical reference with which to position the cofactors based on SSR allele-specific primer sequences or SNP marker sites that were used for QTL mapping in previous studies [8][9][10]. All cofactors were assigned to represent either peach CR major-effect QTLs (Table 1) or minor-effect QTLs (Appendix A Table A1), according to the statistics and descriptions of the original QTL mapping studies [8][9][10].  6 Physical position: the position of the original cofactor on the P. persica genome v.2.0.a1 physical map. The physical position of each marker was defined by scaffold and bp position (scaffold:bp position for SNP and scaffold:bp.bp region for simple sequence repeat (SSR)) in the P. persica genome v.2.0.a1. 7 Marker distance from the cofactor indicates the upstream (+) or downstream (−) distance of the cofactor from the selected SNP marker. When the cofactor is an SSR marker, the distance is counted from the side closest to the selected SNP marker (i.e., from the left side of the SSR for upstream SNPs; from the right side of the SSR for downstream SNPs). 8 N/A: not applicable.

Selection of SNP Markers Presenting Cofactors and the Detection of the Selected SNPs via HRM Analysis
For the establishment of a breeder-accessible genotyping system, cofactors were surveyed by detecting their flanking SNPs via HRM analysis. In order to increase the uniformity of the HRM analysis, all cofactors linked with the SSR markers and some SNP markers were replaced with adjacent SNP markers (i.e., selected markers). To perform HRM genotyping, specific primer pairs with a near-100% PCR efficiency were designed for all selected markers (Appendix A Table A2). The performance of HRM using these primer pairs was tested by polymorphism genotyping in our collection of 27 peach cultivars, including 15 low-chill cultivars and 12 high-chill cultivars. Nonetheless, no suitable primers were designed among the 28 cofactors that were able to differentiate the genotypes of the four cofactors linked to the four minor-effect QTLs in the genomic region, including SNP_IGA_293752, ssrPACITA21, PacC13, and BPPCT038 (Appendix A Table A1). In addition, no polymorphisms were detected on the flanking SNPs of two other minor-effect cofactors, Pchgms170 and SNP_IGA_635355, in any of the 27 tested peach cultivars. Considering that the influence of these minor-effect cofactors on the observed phenotypes was relatively low, these six cofactors were excluded from subsequent experiments. In summary, a total of 22 SNP markers were selected that were comprised of 11 major-effect (Table 1) and 11 minor-effect (Appendix A Table A1) QTLs.
Using the P. persica genome v.2.0.a1 [40] as a reference, we positioned all 22 SNP markers on the physical map to represent the locations of the major-and minor-effect CR-related QTLs ( Figure 1). Notably, the SNPs linked to the major-effect QTLs were mainly located in LG1, LG4, and LG7 and the markers linked to the minor-effect QTLs were scattered on seven out of a total of eight linkage groups (LGs) (Figure 1). In addition, the farthest physical distance between the original cofactors and the selected SNP markers was 20 kbp and most were located within 10 kbp of each other (Table 1 and Appendix A Table A1), indicating that the recombination rate between these cofactors and the SNP marker pairs was very low.
For the establishment of a breeder-accessible genotyping system, cofactors were surveyed by detecting their flanking SNPs via HRM analysis. In order to increase the uniformity of the HRM analysis, all cofactors linked with the SSR markers and some SNP markers were replaced with adjacent SNP markers (i.e., selected markers). To perform HRM genotyping, specific primer pairs with a near-100% PCR efficiency were designed for all selected markers (Appendix A Table A2). The performance of HRM using these primer pairs was tested by polymorphism genotyping in our collection of 27 peach cultivars, including 15 low-chill cultivars and 12 high-chill cultivars. Nonetheless, no suitable primers were designed among the 28 cofactors that were able to differentiate the genotypes of the four cofactors linked to the four minor-effect QTLs in the genomic region, including SNP_IGA_293752, ssrPACITA21, PacC13, and BPPCT038 (Appendix A Table A1). In addition, no polymorphisms were detected on the flanking SNPs of two other minor-effect cofactors, Pchgms170 and SNP_IGA_635355, in any of the 27 tested peach cultivars. Considering that the influence of these minor-effect cofactors on the observed phenotypes was relatively low, these six cofactors were excluded from subsequent experiments. In summary, a total of 22 SNP markers were selected that were comprised of 11 major-effect (Table 1) and 11 minor-effect (Appendix A Table A1) QTLs.
Using the P. persica genome v.2.0.a1 [40] as a reference, we positioned all 22 SNP markers on the physical map to represent the locations of the major-and minor-effect CR-related QTLs ( Figure 1). Notably, the SNPs linked to the major-effect QTLs were mainly located in LG1, LG4, and LG7 and the markers linked to the minor-effect QTLs were scattered on seven out of a total of eight linkage groups (LGs) (Figure 1). In addition, the farthest physical distance between the original cofactors and the selected SNP markers was 20 kbp and most were located within 10 kbp of each other (Table 1 and Appendix A Table A1), indicating that the recombination rate between these cofactors and the SNP marker pairs was very low. LG). The selected SNP markers are shown as closed circles (markers linked to major-effect QTLs) and open circles (markers linked to minor-effect QTLs). Circles filled with red, blue, and yellow represent the markers identified from the 'V6' × 'Granada' [9], 'Contender' × 'Fla.92-2C' [8], and 'Hakuho' × 'UFGold' [10] F2 populations, respectively. The map was plotted using the genoPlotR package in RStodio [41,42].
The raw fluorescence data from the 22 SNP markers of the HRM analysis were analyzed with the commercial software, High Resolution Melt Software v. 2.0 (Applied Biosystems, Waltham, MA, USA). The temperature regions of the melt curves, such as the pre-melt and post-melt regions, that were used for each primer pair are listed in Table 2. In each of the difference plots, the curves of the 27 cultivars were clustered into 2-3 variants based on the reassembled melt curve patterns ( Figure 2). Cultivars in the same cluster were assumed to share the same genotypes for the given SNP marker. To assess the genotypes of the SNP targets, at least three cultivars from each cluster were randomly selected for Sanger sequencing. The sequencing results showed that some cultivars in the same cluster possessed different genotypes ( Figure 2). As an example, the accuracy of three marker (i.e., SNP_IGA_780662, SNP_IGA_786935, and 8_1171818744) variant calls from HRM Software v. 2.0 (Applied Biosystems, Waltham, MA, USA) was 55.6%, 64.3%, and 80%, respectively ( Figure 2). To determine whether this low observed accuracy of the two SNPs was attributed to either the HRM reactions or the use of HRM Software v. 2.0 (Applied Biosystems), the reproducibility of the HRM reactions was assessed by carrying out independent HRM reactions for correctly and incorrectly genotyped samples. After the subsequent analysis of these samples with HRM Software v. 2.0 (Applied Biosystems), we concluded that it is likely that the observed low accuracy was due to the use of HRM Software v. 2.0 (Applied Biosystems) since both the correctly and incorrectly genotyped samples were reproducible in multiple independent experiments. Thus, an alternative analysis method to improve the variant call was required. 1 Selected marker: the flanking SNP marker replacing the original cofactor as the selection marker. The related cofactors are shown in Table 1 and Appendix A Table A1. 2 Pre-melt/ Post-melt region: the temperature regions before/after the active melt region that are used to align the data and perform clustering with the Applied Biosystems High Resolution Melt software v. 2.0. 3 Lower / Upper limit: the minimum/maximum temperatures of the active melt region that are used to align the data and perform clustering via PCA in R studio.

HRM Followed by Principal Component Analysis (PCA) Is a Robust Variant Calling Method for Differentiating Genotypes Based on the Selected SNP Markers
To provide a robust method to call variants for HRM analysis, we established a free and open source R-based pipeline to execute variant calling with PCA, based on the rationale of a previous study [27]. This study claimed regular HRM software uses the shape of melting curves that are not supported by statistics, and suggested automated statistical methods such as PCA was more appropriate [27]. In our simplified version, the normalization of the raw fluorescence data for HRM analysis was carried out using the optimized lower limit (pre-melt region) and upper limit (post-melt region) temperatures of the melt curve analysis for each primer pair ( Table 2). All peach cultivars were grouped into 2-3 clusters, and we randomly selected some cultivars from each cluster for Sanger sequencing to validate the genotypes. As a result, the prediction accuracy was significantly increased with the PCA pipeline method compared to that of the results produced by HRM Software v. 2.0 ( Figure 2). For example, the genotyping accuracy of SNP_IGA_780662 and 8_11718744 increased from 55.6% and 80% to 100%, respectively, and the accuracy of SNP_IGA_786935 increased from 64.3% to 92.9% when the PCA clustering method was applied ( Figure 2). These results support the conclusion that the developed R-based PCA script, which is free to use, is a highly reliable tool for HRM variant calling.

Genotyping of 22 CR-related SNP Markers for 27 Peach Cultivars
The PCA pipeline was subsequently used to determine the genotypes of all the selected SNP markers, including the 11 major-effect markers (Table 3) and the 11 minor-effect markers (Appendix A  Table A3) for 15 low-chill and 12 high-chill peach cultivars. The accuracy of the PCA for most markers was higher than 80%, including 14 markers that showed 100% agreement with the results of the Sanger sequencing validation. All 22 SNPs were observed to have only two types of sequence variants for each SNP position. Notably, the presence of certain SNPs was relatively higher in either the low-chill or high-chill cultivars. For example, SNP_IGA_122351, a selected SNP marker linked to major-effect CR-related QTLs, was homozygous for the A/A genotype in 12 out of 12 high-chill cultivars, and this SNP presented the G/G or A/G genotypes in 14 out of 15 low-chill cultivars (it presented the A/A genotype in the low-chill 'Kuu Taur' cultivar), suggesting that A/A is associated a high-chill genotype (Table 3). Another major-effect SNP, SNP_IGA_427604, was homozygous for the G/G genotype in 15 out of 15 low-chill cultivars but homozygous for only 2 out of 11 high-chill cultivars, suggesting a low-chill associated role of the G/G genotype (Table 3). To further support the association between genetic markers and CR traits, a chi-squared (X 2 ) goodness-of-fit test was used to assess the observed versus expected genotype frequency in low-chill and high-chill peach cultivars. All 11 major-effect markers were significantly associated (p < 0.05) with CR traits, and seven of them presented p values < 0.001 (Table 3). Taken together, the developed PCA pipeline for HRM analysis was successfully applied with 22 CR-related markers for the genotyping of 27 peach cultivars, and potential low-chill associated genotypes for these SNPs were observed.

Potential CR-Related Haplotypes
By combining several SNPs at a specific locus into a single haplotype, the haplotype would then represent multiple CR-related QTLs. Although it would have been a more stringent to develop haplotypes that encompassed each previously reported QTL, we alternatively considered that nearby regions were putative haplotypes given that CR-related QTLs were mainly located on nearby regions of chromosome (Chr) 1, 4, and 7 ( Figure 1). Instead of determining the genotypes of multiple CR-related SNPs, we designed four putative haplotypes to represent these four CR-related loci. Two, 2, 2, and 3 SNPs were used in the haplotype analysis of CR-related loci on Chr 1 (i.e., two positions: Chr1-1 and Chr1-2), Chr 4, and Chr 7, respectively ( Figure 3). The frequencies of certain genotypes in some of these SNPs were enriched in high-CR or low-CR cultivars ( Figure 3, upper panel). Haplotypes were inferred using the Expectation-Conditional-Maximization (ECM) algorithm (CHAPLIN software) [43,44]. For Chr1-1 and Chr1-2, the frequencies of the CA haplotype (53.4%) of Chr1-1 and the GA haplotype (47.1%) of Chr1-2were high and may represent high CR traits (Figure 3, bottom panel, left). The AA and GA haplotypes of Chr4 could also represent high CR traits (Figure 3, bottom panel, middle). Furthermore, the ATA haplotype of Chr7 may represent high CR traits, while GCC and ACC haplotypes could represent low CR traits (Figure 3, bottom panel, right). Low-chill cultivars Okinawa 100 Significance (X 2 -test) 3 *** *** *** * *** ** *** *** ** * *** Accuracy (ratio; %) 4 9/9; 100% 9/9; 100% 9/9; 100% 6/9; 66.7% 8/8; 100% 13/14; 92.9% 9/9; 100% 9/9; 100% 6/7; 85.7% 8/10; 80% 9/9; 100% Alternatively, SNP markers for peach CR-related QTLs are available from previous studies [8][9][10], which can be detected using either GBS or SNP arrays if the breeder is able to afford the expense, equipment, and possesses the knowledge and facilities required for an integrated analysis. In addition, targeted genotyping and sequencing services are available from commercial companies, such as SeqSNP by LGC Biosearch Technologies (Petaluma, CA, USA). As such, indicating the SNPs of interest provided in this study to these services could be an efficient evaluation approach.  [45]. Putative CR-associated SNPs for high CR (red text) or low CR (blue text) is based on SNP frequency and association analysis. CHAPLIN software was used for haplotype analysis and to determine haplotype frequencies.

Advantages and Limits of Using HRM for Genotyping
Although GBS and SNP arrays are high-throughput approaches to determine the genotypes of the markers linked to the CR-related QTLs [8][9][10], some main issues impeding their widespread use in common breeding operations have not yet been resolved. First, no low-cost and high-efficiency genotyping method has been proposed for the genotyping of peach CR-associated SNP markers prior to this study. Either GBS or the peach 9 K Infinium II array may not be cost-effective for the genotyping of only 10-20 markers with hundreds of samples. Several CR-related QTLs can also be genotyped by SSR markers [8], which may be cost-effective as this method only requires the use of Figure 3. Summary of 4 potential CR-related marker sets based on putative haplotyping. The SNP frequency is reflected by the stack symbol height (upper panel) generated by WebLogo [45]. Putative CR-associated SNPs for high CR (red text) or low CR (blue text) is based on SNP frequency and association analysis. CHAPLIN software was used for haplotype analysis and to determine haplotype frequencies.

Breeder Toolbox for Peach CRs
In agreement with qualitative trait approaches, we developed a toolkit in this study to analyze markers associated with a quantitative trait, CR, based on the allelic positions from QTL studies for the further implementation of MAS. Prior to this study, breeders would need to retrieve the genomic sequences close to the CR-related markers from the breeder's toolbox marker converter of the Genome Database for Rosacea [40]. To determine the variants of the retrieved markers, intensive work and validation are required, including the design of primers or probes, the validation of primer efficiency and specificity, the collection of germplasms with low and high CR, and the validation of variant calls using Sanger sequencing. Our study presents a comprehensive approach with experiment-based primer validation that was successfully applied to a collection of peach cultivars with low and high CR levels. A free and open-source R script for HRM variant calling is provided with step-by-step user instructions in addition to a data input template (Supplementary Materials). Alternatively, SNP markers for peach CR-related QTLs are available from previous studies [8][9][10], which can be detected using either GBS or SNP arrays if the breeder is able to afford the expense, equipment, and possesses the knowledge and facilities required for an integrated analysis. In addition, targeted genotyping and sequencing services are available from commercial companies, such as SeqSNP by LGC Biosearch Technologies (Petaluma, CA, USA). As such, indicating the SNPs of interest provided in this study to these services could be an efficient evaluation approach.

Advantages and Limits of Using HRM for Genotyping
Although GBS and SNP arrays are high-throughput approaches to determine the genotypes of the markers linked to the CR-related QTLs [8][9][10], some main issues impeding their widespread use in common breeding operations have not yet been resolved. First, no low-cost and high-efficiency genotyping method has been proposed for the genotyping of peach CR-associated SNP markers prior to this study. Either GBS or the peach 9 K Infinium II array may not be cost-effective for the genotyping of only 10-20 markers with hundreds of samples. Several CR-related QTLs can also be genotyped by SSR markers [8], which may be cost-effective as this method only requires the use of common lab equipment. Nevertheless, there are some drawbacks associated with the use of SSR markers. For example, when assessing the genotypes of the 'Hongqingshui' cultivar, the amplification efficiency for ssrPACITA21 was observed to be very low, and multiple products were found in the PCRs of AMPA103 and PacC13, indicating that the PCR amplification efficiency and specificity of some of the SSR primer pairs were low for some cultivars. Additionally, genotyping using the gel-based SSR method is relatively labor-intensive. This limits the SSR method from being applied in large-scale genotyping projects and makes it particularly susceptible to human error.
We propose that HRM may be a more suitable method for the genotyping of CR-related markers. Firstly, HRM instruments and reagents are less expensive and more universal than those of either the microarray or sequencing approaches. Secondly, the in silico genotyping pipeline provided here not only increases the throughput but also decreases the chance of producing biased results due to human error. Although the singleplex nature of HRM limits its throughput potential for detecting large numbers of SNPs, HRM remains a rapid and simple solution for genotyping peach CR-related markers when compared to that of GBS or SNP arrays. Among all 22 selected SNPs representing the CR-related QTLs, 11 of the selected SNPs were linked with major-effect cofactors (Figure 1), which are the first priority for the implementation of MAS. With respect to qualitative traits, the SNPs associated with six qualitative traits were identified [37,39]. If one SNP represents each qualitative trait and the HRM-PCA pipeline is adopted, a breeder toolbox consisting of a total of 17 SNPs associated with quantitative CR traits and six qualitative traits could be developed in the future. HRM-based methods would thus represent an efficient approach for analyzing this number of markers.

Advantages of Using PCA for Variant Calls
To date, commercialized software packages, such as HRM Software v. 2.0 (Applied Biosystems, Waltham, MA, USA), are probably still the most common tools for analyzing HRM results, but there are some drawbacks associated with the use of these commercial tools. In this study, when HRM Software v. 2.0 (Applied Biosystems) dealt with the widespread melt curves, its accuracy decreased. For example, the accuracy of HRM Software v. 2.0 (Applied Biosystems) for analyzing SNP_IGA780662, SNP_IGA786935, and 8_17718144 was only 55.6%, 64.3%, and 80%, respectively ( Figure 2). In addition, the price of commercial software may represent a huge expense for some breeders with limited budgets. An alternative statistical software, ScreenClust (Qiagen, Venlo, The Netherlands), has been developed to improve the HRM allele assortment and is powered by PCA clustering [27]. By adopting this concept and simplifying the analysis approach, we developed an R script using PCA for melt curve clustering. As a result, our R script presents several advantages. First, the accuracy of our R script is higher when compared to that of HRM software v. 2.0 (Applied Biosystems). For example, the accuracy of our R script for analyzing SNP_IGA780662, SNP_IGA786935, and 8_17718144 was 100%, 92.9%, and 100%, respectively ( Figure 2). Moreover, our R script is free and can be modified by users according to their needs. Not only are step-by-step user instructions described in the Materials and Methods section, but the R script and an input data template file are also available in the Supplementary Materials.

Putative Low CR-Assoicated SNPs Are Potential Candidates for MAS
When reviewing most of the selected markers in the cultivar collection, the frequency of one specific SNP variant was found to be much higher in cultivars with low CR than that of cultivars with high CR (Table 3). These types of SNPs are considered to be putative low-chill associated SNPs. For example, with SNP_IGA_780662, there are more C/C homozygous genotypes observed in the cultivars with low CR, while more A/A homozygotes or A/C heterozygotes are found in high-chill cultivars. Two popular high-chill cultivars, 'Hongqingshui' and 'Nakatsu Hakuto,' in the Taiwan market both show the A/A genotype at this SNP site. The association between genotypes and CR traits is strengthen by the results of our chi-squared test, and this SNP site was found to be highly significant (p < 0.001). Although more germplasms and SNPs in the genome may be required to further the statistical support for this association, it is still very likely that the C/C genotype is a low-chill specific SNP for SNP_IGA_780662 ( Table 3). The putative low chill-associated SNPs of most of the selected markers have been identified (Table 3 and Appendix A Table A3), and they could be potential genotypes for the selection of parental germplasms and candidate progenies for breeding cultivars with low CR.

Recommended Marker Lists for Users of This Toolkit
When linkage maps and QTLs are built for traits controlling the CR [7][8][9][10], it becomes possible to carry out CR MAS at the seedling stage. Three peach F 2 populations have been employed for the mapping of these QTLs [7][8][9][10]. The molecular markers used for genotyping the first F 2 population were SSRs and amplified fragment length polymorphism (AFLP) markers [7]. This was followed by re-sequencing to refine the QTL resolution and develop gene-targeted SSR markers [8]. The other two F 2 peach populations have been mapped using high-throughput SNP arrays and GBS [9,10]. To adopt such various marker systems for MAS, SNP arrays or GBS have been considered to be useful tools to integrate the markers from the three QTL maps, but these tools require advanced facilities and bioinformatic analysis, which represent a barrier to breeders from common labs, as previously discussed. Another feasible strategy is to adopt only one of these three QTL maps and stick with the original marker system used for mapping. By doing so, breeders can avoid the processes of QTL integration and the installation of multiple marker systems in their labs. Nevertheless, a main issue related with the use of only one QTL map derived from one F 2 population is the lack of information of other major-effect QTLs mapped by other F 2 populations. This issue was highlighted in the integration of information in this study given that many major-effect SNP markers that originated from different maps did not overlap and were not co-localized on the physical map produced (Figure 1). To address these issues, this study has contributed a simplified and cost-effective HRM-PCA pipeline, integrated with most of the available major-effect CR-related markers and with a low entry threshold for peach breeders. To further reduce the costs and labor associated with genotyping all selected markers, we offer recommendations to prioritize candidate markers.
The nature of a quantitative trait, such as the CR, is that the trait is controlled by multiple genes and some of these genes contribute to a higher genetic variance of the trait. These genes are known as major-effect genes. As such, selecting genes with high effectiveness and gene pyramiding are key concepts for the prioritization of candidate markers. To this end, markers in this study were categorized as either major-or minor-effect markers according to their contribution to the CR (Figure 1; Table 1; and Appendix A Table A1). Among these markers, 11 SNPs for major-effect QTLs are recommended to breeders as part of a priority list for genotyping. Considering the concept of gene pyramiding, these markers have been suggested so that breeders may survey the genotypes of their germplasm collections.
With respect to the technical limitations of the HRM-PCA method and the polymorphic diversity of the CR-related markers in the peach gene pool, the marker lists provided in this study may even be shortened depending on the accuracy of the HRM primer pairs and the associations between markers and traits. Even though we have optimized the amplification efficiency for all primer pairs to nearly 100%, some of the primer pairs still exhibit lower accuracies than the others (Table 3 and  Appendix A, Table A3). For example, the accuracy of SNP_IGA_134905, 1_40995799, and 4_13747914 was 66.7%, 85.7%, and 80%, respectively. For breeders with limited budgets, we recommend only using markers with high accuracy for major-effect QTLs. Similarly, for breeders who aim to genotype minor-effect QTLs, we recommend first selecting the markers with 100% accuracy. Considering the significant level of association between genetic markers and peach CR traits, a contingency table chi-squared test was carried out to assess this association. A concise list of markers can be generated by selecting the markers with highly significant levels of association. Since 11 major-effect markers and 4 minor-effect markers were found to be significantly associated (p < 0.05) with CR-related traits (Table 3 and Appendix A Table A3), the selected markers can be filtered down from 22 to 15 markers according the association level. Alternatively, putative haplotypes on three loci would be useful to simplify the genotyping for peach CR traits (Figure 3, bottom). These three sets of putative haplotypes can be assessed by determining a total of only 9 SNPs. Nevertheless, this haplotype analysis was based on only 27 germplasms that are relevant to CR studies of peach cultivars in the Taiwan and Southeast Asia regions, so a large number of progeny or peach germplasms that have already been phenotyped for the CR may be required to confirm the association of these putative haplotypes.
In summary, although a total of 22 markers are provided in this study, we recommend that breeders use the marker sets that represent putative haplotypes. If possible, breeders can also select several or all of the markers that are linked with major-effect and/or minor-effect QTLs. We suggest that breeders adjust the number of markers by considering the accuracy of the HRM primer pairs and the significance of the associations between markers and traits.

Conclusions
This study provides a toolkit with optimized primers and experimental settings for assessing the genotypes of 22 SNP CR-related peach markers, including 11 markers linked to major-effect QTLs and another 11 markers linked to minor-effect QTLs. A cost-effective HRM genotyping system was connected with a free and open-sourced R-based PCA variant calling pipeline to empower this toolkit with high accessibility and flexibility for peach breeders. This pipeline has successfully determined genotypes for a collection of peach germplasms consisting of low-chill and high-chill cultivars. Although SNP arrays and GBS are high-throughput genotyping approaches, this simplified and rapid toolkit still has high implementation potential. With this toolkit, breeders are able to assess the genotypes of germplasms, screen a large number of progeny during the early developmental stages, and accelerate peach breeding using MAS for quantitative CR traits. A future extension of this HRM-based toolkit is slated to include SNPs and putative haplotypes linked with qualitative peach traits, which will further establish this toolbox as a comprehensive tool for determining quantitative CR traits and multiple qualitative traits. Considering the germplasms evaluated in this study, the recommended markers are applicable for the germplasms relevant to the Taiwan and Southeast Asia regions, and further studies are needed to validate these SNPs as CR markers in other germplasms outside of Taiwan and Southeast Asia.

Plant Materials
The young and fully expanded leaves of a total of 27 peach (Prunus persica L.) cultivars, spanning low to high CRs, were collected for HRM genotyping. The CR phenotypes (i.e., chilling hours) of 20 cultivars have been previously reported (Table 3)

Genomic DNA Extraction
Genomic DNA was extracted from peach leaf tissues by a modified cetyltrimethyl ammonium bromide (CTAB) method [55]. Briefly, peach leaves were homogenized using a pestle and mortar in liquid nitrogen. One gram of the homogenized sample was resuspended with 11 mL of DNA extraction buffer (100 mM Tris-HCl at pH 8.0, 25 mM EDTA, 1.4 M NaCl, 0.1% polyvinylpyrrolidone with an average mol wt of 40,000 (PVP-40), 2% CTAB, 0.2% β-mercaptoethanol, and 0.15 mg·mL −1 Proteinase K). After being mixed well via inversion and incubated at 65 • C for 30 min, the extract was centrifuged at 3000× g for 10 min. The supernatant was transferred to a fresh tube and then mixed with the same volume of chloroform:isoamyl alcohol (24:1), followed by centrifugation at 8000× g for 10 min. The upper aqueous phase was transferred to a new tube, and a half volume of 5 M NaCl and a 0.6-0.7 volume of ice-cold isopropanol were added. After standing the mixture at 25 • C for 1 h, precipitated genomic DNA was pelleted by centrifugation at 10,000× g for 20 min. The genomic DNA pellet was washed with 70% ethanol and air dried. The dried pellet was dissolved in 0.5 mL of high salt TE buffer (1 M NaCl, 10 mM Tris-HCl pH 8.0, and 1 mM EDTA). To clean RNA contamination, the genomic DNA solution was digested by the addition of 0.5 µL RNase A (20 mg·mL −1 ), followed by incubation at 37 • C for 30 min. The RNase-treated DNA was cleaned up again by a chloroform:isoamyl alcohol (24:1) extraction and subsequent alcohol precipitation with a 0.6-0.7 volume of cold isopropanol. After spinning at 12,000× g for 30 min at 4 • C, the pellet was washed with 70% ethanol and air dried. The dried genomic DNA pellet was dissolved in DNase-free water for subsequent analysis.
The integrity of the genomic DNA was visualized with 0.8% agarose gel electrophoresis. The DNA concentration and purity (A 260 /A 280 and A 260 /A 230 ratios) were evaluated using an Epoch Microplate Spectrophotometer (BioTek, Winooski, VT, USA). The final concentration of DNA was adjusted to 20 ng·µL −1 and stored at −20 • C for subsequent HRM analysis.

HRM Analysis
Specific primers with a near−100% PCR efficiency (Appendix A Table A2) were designed for the HRM assays. To increase PCR efficiency, primer pairs are designed for amplicons with lengths shorter than 150 bp [56]. In addition, PCR efficiency was determined by the construction of optimal standard curves with Ct values and genomic DNA quantity. Briefly, peach leaf genomic DNA was serially diluted with deionized water and used as a template for qPCR analysis. The PCR efficiency was calculated with StepOnePlus v. 2.3 software (Applied Biosystems, Waltham, MA, USA) using a formula of PCR efficiency (PCR efficiency = 10 −1 /slope −1 ) with two technical repeats. HRM was conducted with 2.5 ng of genomic DNA as a template using a StepOnePlus Real-Time PCR System (Applied Biosystems) and MeltDoctor HRM Master Mix (Applied Biosystems) according to the instructions from the manufacturer. The HRM thermocycles were set as follows: enzyme activation at 95 • C for 10 min, 40 cycles of denaturation at 95 • C for 15 s, and annealing/extension at 60 • C for 1 min. Prior to the melting curve analysis, the PCR amplicons were denatured at 95 • C for 15 s and reannealed at 60 • C for 1 min. The melting curves were generated by dissociation at 95 • C with a 1% ramp rate. The amplification and dissociation curves were analyzed using StepOnePlus v. 2.3 software (Applied Biosystems). In addition, a melt curve analysis at the post-PCR step was carried out using HRM software v. 2.0.1 (Applied Biosystems).

PCA for HRM Output Result Clustering
After HRM analysis, raw melt region temperature data and melt region normalized fluorescence data were exported with the StepOnePlus v. 2.3 software. Normalization and PCA were performed on the HRM output raw fluorescence data in RStudio [42]. The samples were clustered and visualized using the 'mclust' [57] and 'plot3D' packages, respectively. To verify the clustering results of the HRM data analyzed via PCA, randomly selected samples from each cluster were sequenced by Sanger sequencing (performed by Mission Biotech Co., Taipei, Taiwan). All R studio scripts are available (Supplementary Materials, also available on GitHub, https://github.com/choulin2/PCA_HRM.git). Please see Section 5.5 for step-by-step PCA pipeline instructions.

Instructions of the HRM Fast Genotyping Platform Analyzed with the PCA Pipeline
After PCR amplification and HRM dissociation, temperature data and normalized fluorescence data of the melt region may be exported as a single file with StepOnePlus v. 2.3 software (Applied Biosystems, Waltham, MA, USA). Afterwards, the temperatures of all wells on each fluorescence read are averaged to represent the temperature of each fluorescence read. All single SNP fluorescence data are listed with the corresponding temperature on each read. This data should be arranged based on the format of the provided template (please see Supplementary Materials "PCA_HRM.example.csv", also available on GitHub, https://github.com/choulin2/PCA_HRM.git) and saved as a '.csv' file using Microsoft Excel (Microsoft, Redmond, WA, USA).
After converting the raw data into this R-readable format, the R scripts we provided (see Supplementary Materials "PCA_HRM.v8.R", also available on GitHub, https://github.com/choulin2/ PCA_HRM.git) can be loaded into R studio. Before running the R scripts, we recommend setting the working directory to the folder containing this R script file and the data .csv file (Session /Set Working Directory /Choose Directory). Then, the data can be imported by running the following scripts: > input_file = readline('Enter the file name: ') > exported.data.file = read.csv(input_file,header = T) After data input is complete, the upper and lower melt temperature limits may be set according to the optimized temperature for each marker ( Table 2) as follows: > max_lim = readline('Enter the upper limit of the melt region: ') > min_lim = readline('Enter the lower limit of the melt region: ') After setting the melt region for analysis, the scripts may be directly run line-by-line from step 2 to step 6 to complete data normalization and conduct the PCA. Afterwards, the latest version of the 'mclust' and 'plot3D' packages should be installed from The Comprehensive R Archive Network (CRAN, https://cran.r-project.org/) for clustering and three-dimensional (3D) graph plotting: In the above script, both of the selected PCs and the expected genotype number for clustering are adjustable. The number of selected PCs may be adjusted by changing the number of analyzed columns from PCA.analysis.file$rotation[,1:3], while G defines how many expected genotype groups were used for clustering.
In order to visualize the clustering results, scatter diagrams of the chosen PCs may be plotted with the 'plot' function of step 7. In addition, the 'identify' function was used to identify selected points on the interactive graphs by pressing the mouse button over the desired point. After running the 'identify' function, it is necessary to press the 'esc' button or the finish tab in order to finish the interactive plot and to run the following scripts. In the last data export step, 3D clustering figures may be obtained, such as in Figure 2, in '.pdf' format as well as a variant calling file, such as "PCA_HRM.example.csv variant call.csv" (see Supplementary Materials). According to the information of the variant call, samples that are clustered in the same group may be identified. A web-based demonstration version of this R script and example data can be accessed on NextJournal (https://nextjournal.com/RNA-Sick/hrm-analysis-with-pca-and-clustering). To validate genotypes, at least one selected sample from each group is subjected to Sanger sequencing, and the represented genotype of each variant call may be defined. In this manner, we may genotype all samples in silico for all the selected CR-related markers (Table 3 and Appendix A Table A3).

Association Analysis and Haplotype Analysis
Peach cultivars were classified by high and low CR, and SNP markers were grouped by genotypes. An association analysis between genetic markers and peach CR traits was performed using a chi-square test in a contingency table that contained the counts of individuals in each category. The chi-squared test statistic and p value of the test were calculated with the 'CHITEST' function in Microsoft Excel to assess the changes of the observed versus expected counts in each category. Regarding the haplotype analysis, genotype data of SNPs in three chromosome regions from all cultivars were converted to "0", "1", or "2". The homozygous AA SNP sequence was "0", the heterozygous AB sequence was "1", and the homozygous BB sequence was "2". These genotypic information were then formatted for Case-control haplotype inference (CHAPLIN) software [43,44], setting low CR and high CR as the case and control, respectively. The modeling of this likelihood approach was set as 'general'.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.  7 Marker distance from the cofactor indicates the upstream (+) or downstream (-) distance of the cofactor from the selected SNP marker. When the cofactor is an SSR marker, the distance is counted from the side closest to the selected SNP marker (i.e., from the left side of the SSR for upstream SNPs; from the right side of the SSR for downstream SNPs). 8 N/A: not available, indicating that no suitable primers with high PCR efficiency or specificity are available or that no polymorphisms were found at adjacent SNPs in all tested peach cultivars. 9 N/A: not applicable.