Large-scale development of cost-effective SNP marker assays for diversity assessment and genetic mapping in chickpea and comparative mapping in legumes

A set of 2486 single nucleotide polymorphisms (SNPs) were compiled in chickpea using four approaches, namely (i) Solexa/Illumina sequencing (1409), (ii) amplicon sequencing of tentative orthologous genes (TOGs) (604), (iii) mining of expressed sequence tags (ESTs) (286) and (iv) sequencing of candidate genes (187). Conversion of these SNPs to the cost-effective and flexible throughput Competitive Allele Specific PCR (KASPar) assays generated successful assays for 2005 SNPs. These marker assays have been designated as Chickpea KASPar Assay Markers (CKAMs). Screening of 70 genotypes including 58 diverse chickpea accessions and 12 BC3F2 lines showed 1341 CKAMs as being polymorphic. Genetic analysis of these data clustered chickpea accessions based on geographical origin. Genotyping data generated for 671 CKAMs on the reference mapping population (Cicer arietinum ICC 4958 × Cicer reticulatum PI 489777) were compiled with 317 unpublished TOG-SNPs and 396 published markers for developing the genetic map. As a result, a second-generation genetic map comprising 1328 marker loci including novel 625 CKAMs, 314 TOG-SNPs and 389 published marker loci with an average inter-marker distance of 0.59 cM was constructed. Detailed analyses of 1064 mapped loci of this second-generation chickpea genetic map showed a higher degree of synteny with genome of Medicago truncatula, followed by Glycine max, Lotus japonicus and least with Vigna unguiculata. Development of these cost-effective CKAMs for SNP genotyping will be useful not only for genetics research and breeding applications in chickpea, but also for utilizing genome information from other sequenced or model legumes.


Introduction
Chickpea (Cicer arietinum) is the third most important legume crop, a source of dietary protein and a beneficial agricultural crop in the semi-arid regions of the world. The development of sustainable high-yielding varieties that withstand persistent abiotic and biotic stresses is a prerequisite for addressing world hunger. Molecular breeding strategies have been adopted to enhance crop improvement programmes in several crops including legumes such as soybean and common bean (see Chamarthi et al., 2011). In chickpea, however, progress in implementing markers in breeding programmes has been relatively slow. The limited availability of molecular markers, coupled with narrow genetic diversity, has been the major constraint hampering the development of genetic maps and trait mapping studies. Marker genotyping cost is another critical factor that determines adoption of markers in breeding programmes, as it involves genotyping a large number of segregating lines.
Among different marker systems, simple sequence repeats (SSRs) and SNPs are the markers of choice for genetics and plant breeding applications (Gupta and Varshney, 2000). Although the genotyping assays are expensive and/or time consuming, SSR markers have to date been the usual choice in many crop species including chickpea for large-scale characterization of germplasm collections (Upadhyaya et al., 2008), construction of genetic maps (Choudhary et al., 2009; Nayak et al., 2010; Thudi et al., 2011; Winter et al., 1999) and QTL identification (Aryamanesh et al., 2010; Santra et al., 2000). On the other hand, SNPs are biallelic and the most abundant genetic variations, which are evenly distributed at high frequency throughout the genome of most plant species (Allen et al., 2011; Yan et al., 2009). As these markers are amenable to automation and high-throughput approaches, the genotyping costs for SNPs can be lowered. As a result, SNP genotyping of large segregating populations as well as germplasm collections becomes cost-effective for developing high-density genetic maps, genome-wide association mapping, marker-assisted selection (MAS) and genomic selection (GS) studies (see Varshney, 2010).
Depending on the sample size and number of markers to be analysed, medium- to high-throughput assay platforms such as BeadXpress and GoldenGate assays from Illumina Inc. (San Diego, CA) with varying sets of multiplexes (96, 384, 768 or 1536 SNPs per assay) are available. Such platforms have been developed and used in several crop species such as barley, wheat (Akhunov et al., 2009), maize (Yan et al., 2009), oilseed rape (Durstewitz et al., 2010), soybean (Hyten et al., 2008), cowpea (Muchero et al., 2009), pea (Deulvot et al., 2010) and chickpea (Choudhary et al., 2012; R.V. Penmetsa, N. Carraquilla-Garcia, A.D. Farmer, R.K. Varshney, D.R. Cook, unpublished data). These platforms, however, are cost-effective only when a minimum of 96, 384, 768 or 1536 SNPs are used for genotyping a large number of genotypes (R.R. Mir, P.J. Hiremath, O. Riera-Lizarazu, R.K. Varshney, unpublished results). In molecular breeding applications such as MAS, where only a few markers are required for genotyping a large number of segregating lines, Illumina-based genotyping assays do not seem to be cost-effective. In such cases, the Competitive Allele Specific PCR (KASPar) assay from KBiosciences (Hertfordshire, UK) (http://www.kbioscience.co.uk) seems to be an attractive marker genotyping assay (Allen et al., 2011; Cortes et al., 2011). The KASPar assay is a novel PCR-based homogeneous fluorescent SNP genotyping system. It is a very flexible assay and can be carried out on an undefined set of markers (http://www.kbioscience.co.uk/reagents/KASP_manual.pdf, http://www.kbioscience.co.uk/download/KASP.swf).
This study has been undertaken in chickpea with the following objectives: (i) to compile a large set of informative SNPs, (ii) to develop KASPar assays for cost-effective SNP genotyping, (iii) to analyse genetic diversity in the selected Cicer spp. accessions, (iv) to develop a second-generation genetic map based on SNPs, and (v) to determine the extent of genetic synteny of chickpea with some closely related legume species.

Large-scale identification of SNPs
With the objective of developing cost-effective KASPar assays for chickpea genetics and breeding applications, 2486 informative SNPs were compiled following four approaches (Figure 1).

Mining of Sanger ESTs
On the basis of cluster analysis of 27 259 Sanger expressed sequence tags (ESTs), 9569 unigenes including 2431 contigs and 7138 singletons were identified in an earlier study. A set of 729 contigs having ESTs from at least two genotypes and a read depth of ≥5 was explored for SNP selection. From each contig, one SNP with a high polymorphism information content (PIC) value (≥0.5) and at least a 50-bp window on either side was considered. Finally, a nonredundant set of 286 SNPs from 286 TUSs was selected (Figure 1).

Allele-specific sequencing of candidate genes
Allele resequencing of 220 genes on a set of 2-20 genotypes representing nine Cicer species provided 1893 SNPs in our earlier study. By selecting one SNP with the higher PIC value from each gene and requiring a 50-bp region on both flanking sides of the SNP, a total of 183 SNPs present in 183 genes were selected. In addition, four SNPs coming from two drought-responsive genes (Nayak et al., 2009) were also selected (Figure 1).
… (Table S1). It is important to mention here that except for the 187 SNPs from allele resequencing of candidate genes and 604 SNPs from TOGs, the assembled SNPs were not validated earlier. Therefore, the compiled SNPs can be considered as putative SNPs.

Development and validation of KASPar assay
The selected set of 2486 SNPs was used for developing KASPar assays (Table S1). The developed KASPar assays have been designated as Chickpea KASPar Assay Markers (CKAMs). All 2486 CKAMs were used for validation on a panel of 70 genotypes (Table S2). These genotypes include 55 lines/varieties of the cultivated species (C. arietinum) from 11 countries, three accessions from the wild species (C. reticulatum) and 12 BC3F2 lines generated after introgressing a genomic region containing QTLs for several drought tolerance traits from ICC 4958 into JG 11 by using a marker-assisted backcrossing approach (unpublished results).
A total of 2005 (80.6%) CKAMs were validated of the 2486; of these, 1341 (66.8%) CKAMs were polymorphic among 58 genotypes, 664 (33.1%) were monomorphic in the genotypes tested, and 481 (19.4%) failed to generate a useful amplification signal (Table S1, Figure 2). No attempt was made to redesign the primer for failed CKAMs. A comparison of SNP predicted in silico (assembled) and alleles called in the KASPar assays for the 2005 validated CKAMs showed 100% consistency. The PIC values for the polymorphic CKAMs varied between 0.02 and 0.50 with an average of 0.12 (Table S1).
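The upper bound of 0.50 in the reported PIC range is worth a short check. For a biallelic marker, the gene diversity form 1 − Σp² peaks at 0.5, whereas the Botstein PIC cannot exceed 0.375. The text does not say which formula was used, so the sketch below simply computes both from allele frequencies. The function names and the toy genotype calls are illustrative assumptions, not part of the study.

```python
from collections import Counter


def allele_frequencies(calls):
    """Allele frequencies from diploid calls such as "AA" or "AG"; None = missing."""
    counts = Counter()
    for call in calls:
        if call is None:
            continue
        counts.update(call)  # "AG" contributes one A and one G
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotype calls available")
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2); at most 0.5 for two alleles."""
    return 1.0 - sum(p * p for p in freqs)


def botstein_pic(freqs):
    """Botstein PIC: gene diversity minus sum over i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return gene_diversity(freqs) - cross


# Hypothetical calls for one marker across five accessions (not study data).
toy_calls = ["AA", "AA", "GG", "AG", None]
freqs = allele_frequencies(toy_calls)
print(freqs)
print(gene_diversity(freqs.values()), botstein_pic(freqs.values()))
```

With equal allele frequencies the two measures give 0.5 and 0.375 respectively, so a reported maximum of 0.50 is compatible with the first form only.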
Analysis of CKAMs on the parental genotypes of the mapping populations showed higher polymorphism in interspecific (C. arietinum × C. reticulatum) crosses than in intraspecific (C. arietinum × C. arietinum) crosses. Among interspecific crosses, the maximum number of polymorphisms (930 CKAMs) was observed in the reference mapping population (ICC 4958 × PI 489777) followed by crosses segregating for Helicoverpa resistance, that is, ICC 3137 × IG 72953 (620 CKAMs) and ICC 3137 × IG 72933 (276 CKAMs). In the case of the intraspecific crosses, maximum polymorphism was identified between Arerti and ICC 4958 (159 CKAMs), which represent parents of the MABC population for improvement of chickpea for drought tolerance. The polymorphism status of CKAMs between different parental combinations is given in Table 2.

Genetic diversity analysis
Genotyping data obtained for all 1341 polymorphic CKAMs on 58 chickpea genotypes (Table S3) were used for assessing genetic diversity and understanding genetic relationships. Genetic dissimilarity between pairs of genotypes varied from 0.02 (ICC 7554 and ICC 3137) to a maximum of 0.74 (PI 489777 and IG 72933) with a mean of 0.37. On the basis of the dissimilarity data and the UPGMA method, a hierarchical cluster analysis was performed on all 58 genotypes using DARwin v5.0.128 software (Perrier et al., 2003) (Figure 3). In the dendrogram, the genotypes were grouped into two discrete major clusters: Cluster-I comprised only two wild species (C. reticulatum) genotypes (IG 72953 and PI 489777), and Cluster-II comprised 56 genotypes of C. arietinum, with the exception of one genotype, IG 72933, which belongs to C. reticulatum and branches off at the base of the dendrogram closer to Cluster-I. In Cluster-II, a few landraces and cultivars from India (Annigeri, ICC 4593, ICCC 37, ICCV 05530), Ethiopia (Arerti), Mexico (ICC 12037) and Israel (ICC 7571) formed a clear outlying group, with the remaining 48 genotypes clustering into two main groups, Cluster-IIa and Cluster-IIb. Cluster-IIa has 13 genotypes that mainly belong to Afghanistan (2), Chile (1), Ethiopia (1), Iran (4), Portugal (1), Turkey (1), Mexico (1) and former USSR (2). Cluster-IIb comprises 35 genotypes, of which 33 belong to India, one to Iran and one to Cyprus. Within Cluster-IIb, ICC 1882 is separated from the rest of the genotypes. Overall, the clustering pattern showed a distinctive grouping of genotypes into separate clusters based on their geographical origin and also on species background (Figure 3a).

Figure 2 Snapshots showing SNP genotyping with KASPar assays. Different possible scenarios of SNP genotyping in the germplasm collection (a-c) and the interspecific RIL mapping population (d-f) are shown. Marker genotyping data generated for each genotype were used for allele calling using the automatic allele calling option. Allelic discrimination (two alleles) for a particular marker in the genotypes examined is shown on a scatter plot with axes 'X' and 'Y'. Snapshot (a) shows a monomorphic pattern, that is, occurrence of only one allele (blue spots) for the CKAM0790 marker. Snapshot (b) shows a polymorphic pattern, that is, occurrence of two alleles (blue and red spots) for the CKAM1175 marker in almost equal proportion in the germplasm collection. All germplasm accessions show homozygosity for the corresponding alleles, and one accession shows missing data (pink spot). Snapshot (c) shows heterozygosity, that is, occurrence of both alleles (green spots) for the CKAM1802 marker in nine germplasm accessions, in addition to occurrence of two alleles in homozygous condition in several accessions (blue and red spots) and three accessions with missing data. Snapshot (d) shows occurrence of one allele (red spots) in the majority of RILs, except two RILs with the other allele (blue spots) and two RILs with missing data (brown spots). Snapshot (e) shows two clusters of about 50% of RILs each with one allele (blue and red spots), along with two RILs with missing data (brown spots). Snapshot (f) shows occurrence of one allele (blue spots) in several RILs and missing data in the majority of the lines.
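The clustering step described above can be illustrated with a small pure-Python sketch: a pairwise dissimilarity is computed as the share of markers with differing calls, and clusters are merged by UPGMA (average linkage). This is only an approximation of the idea; the exact dissimilarity index and settings used in DARwin are not given in the text, and the accession names and calls below are hypothetical.

```python
def dissimilarity(a, b):
    """Share of markers with differing calls, ignoring markers missing in either line."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no markers scored in both genotypes")
    return sum(x != y for x, y in shared) / len(shared)


def upgma(names, calls):
    """Return a nested (left, right, height) tuple built by average linkage."""
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (tree, size)
    dist = {
        frozenset((i, j)): dissimilarity(calls[i], calls[j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of current clusters
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        height = dist.pop(pair) / 2
        merged = {}
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            # Size-weighted mean keeps every original pair equally weighted.
            merged[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist.update(merged)
        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


def leaves(tree):
    """List the accession names under a (sub)tree."""
    if isinstance(tree, str):
        return [tree]
    left, right, _height = tree
    return leaves(left) + leaves(right)


# Hypothetical accessions and calls (not study data).
toy_names = ["wild_1", "wild_2", "cult_1", "cult_2"]
toy_calls = [
    ["AA", "CC", "GG", "TT", "AA", "CC"],
    ["AA", "CC", "GG", "TT", "AA", "GG"],
    ["GG", "TT", "AA", "CC", "AA", "CC"],
    ["GG", "TT", "AA", "CC", "GG", "CC"],
]
tree = upgma(toy_names, toy_calls)
print(leaves(tree[0]), leaves(tree[1]), round(tree[2], 3))
```

In this toy data the first split separates the two "wild" lines from the two "cult" lines, mirroring the species-level split between Cluster-I and Cluster-II.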

Relationship of BC3F2 lines with the recurrent parent
A set of 12 BC3F2 lines, generated by introgressing a genomic region containing QTLs for several drought tolerance-related traits into the JG 11 variety through marker-assisted backcrossing (MABC) with the ICC 4958 genotype, was tested with all 2005 CKAMs to assess the genome recovery of the JG 11 parent in the MABC lines. As a result, 108-117 markers showed similarity between the given BC3F2 line and JG 11 (Table S4). In brief, the tested BC3F2 lines showed genome recovery of JG 11 from 91% (BC3F2_170, BC3F2_187, BC3F2_195) to 98% (BC3F2_120, BC3F2_248) (Figure 3b). Furthermore, comparison of the BC3F2 lines with ICC 4958 showed the presence of the ICC 4958 allele in the BC3F2 lines for 10 CKAMs (CKAM0017, CKAM1802, CKAM1444, CKAM0042, CKAM0043, CKAM1641, CKAM1963, CKAM1933, CKAM1709 and CKAM1604). These markers seem to be potential mappable markers in the genomic region transferred from ICC 4958 to JG 11.
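As a rough illustration of how recurrent-parent genome recovery can be scored from such data, the sketch below counts, among markers that distinguish the two parents, the share at which a backcross line carries the recurrent-parent call. This is an assumed definition; the text does not spell out how the percentages were derived, and all marker names and calls here are hypothetical.

```python
def recurrent_parent_recovery(line, recurrent, donor):
    """Share of parent-distinguishing markers where the line matches the recurrent parent."""
    informative = matched = 0
    for l, r, d in zip(line, recurrent, donor):
        if None in (l, r, d) or r == d:
            continue  # skip missing calls and markers that do not distinguish the parents
        informative += 1
        matched += l == r
    if informative == 0:
        raise ValueError("no informative markers")
    return matched / informative


def donor_allele_markers(marker_names, line, recurrent, donor):
    """Markers at which the line carries the donor call (candidate introgressed loci)."""
    return [
        name
        for name, l, r, d in zip(marker_names, line, recurrent, donor)
        if None not in (l, r, d) and r != d and l == d
    ]


# Hypothetical data: five markers, one backcross line (not study data).
toy_markers = ["m1", "m2", "m3", "m4", "m5"]
toy_recurrent = ["AA", "CC", "GG", "TT", "AA"]
toy_donor = ["GG", "CC", "AA", "CC", "TT"]
toy_line = ["AA", "CC", "AA", "TT", None]

recovery = recurrent_parent_recovery(toy_line, toy_recurrent, toy_donor)
print(round(100 * recovery, 1), donor_allele_markers(toy_markers, toy_line, toy_recurrent, toy_donor))
```

The second helper corresponds in spirit to the list of CKAMs carrying the ICC 4958 allele: markers flagged this way are candidates for the introgressed region.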

Second-generation genetic map of chickpea
The reference mapping population (ICC 4958 × PI 489777) was targeted for integrating CKAMs in the genetic map of chickpea. In this context, a total of 930 CKAMs showed polymorphism between the parental genotypes. The polymorphic CKAMs include 503 Solexa/Illumina SNPs, 377 TOG-SNPs and 50 candidate gene sequencing-based SNPs. As genotyping data were already available on the reference mapping population for all 377 TOG-SNPs via GoldenGate assay, only 118 markers representing all the linkage groups were selected for genotyping via KASPar assays, mainly for quality control. Therefore, genotyping data were generated on the reference mapping population for a total of 671 CKAMs (503 Solexa/Illumina SNPs, 50 candidate gene SNPs and 118 TOG-SNPs). High-quality genotyping data, however, were generated for 651 CKAMs (492 Solexa/Illumina SNPs, 46 candidate gene SNPs and 112 TOG-SNPs). Analysis of genotyping data showed a Mendelian segregation ratio for a total of 525 markers, and the remaining 126 (19.3%) markers exhibited segregation distortion (Table S5) owing to skewed occurrence/distribution of one of the two parental alleles or a high percentage (60%) of missing allele data (Figure 2d,e,f).
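A common way to flag segregation distortion in a RIL population is a chi-square goodness-of-fit test against the expected 1:1 ratio of parental alleles. The sketch below assumes that test, a 5% significance level (critical value 3.841 for one degree of freedom) and a hypothetical missing-data cut-off; the text does not state which test or thresholds the authors applied.

```python
CHI2_CRIT_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def segregation_chi2(n_a, n_b):
    """Chi-square statistic for observed allele counts against a 1:1 expectation."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("no scored lines")
    expected = total / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def classify_marker(calls, max_missing=0.6):
    """Label a marker from RIL calls coded 'A', 'B' (parental alleles) or anything else.

    Calls other than 'A' or 'B' (missing or heterozygous) are treated as unscored.
    max_missing is a hypothetical cut-off, not a value taken from the study.
    """
    if not calls:
        raise ValueError("no calls supplied")
    n_a = calls.count("A")
    n_b = calls.count("B")
    missing = len(calls) - n_a - n_b
    if missing / len(calls) >= max_missing:
        return "mostly missing"
    chi2 = segregation_chi2(n_a, n_b)
    return "distorted" if chi2 > CHI2_CRIT_DF1_P05 else "mendelian"


print(segregation_chi2(60, 40))                   # 60:40 split among 100 lines
print(classify_marker(["A"] * 55 + ["B"] * 45))   # near 1:1
print(classify_marker(["A"] * 60 + ["B"] * 40))   # skewed toward one parent
print(classify_marker(["A"] * 10 + ["-"] * 90))   # mostly unscored
```

The three labels loosely correspond to the patterns in Figure 2e (balanced), 2d (skewed) and 2f (mostly missing).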
As genotyping data were available for a total of 429 TOG-SNPs via GoldenGate assay (R.V. Penmetsa, N. Carraquilla-Garcia, A.D. Farmer, R.K. Varshney, D.R. Cook, unpublished data) and high-quality genotyping data were generated for 112 TOG-SNPs from this set via KASPar assay in this study, the genotyping data for the remaining 317 TOG-SNPs generated via GoldenGate assay were added to the data set of 651 CKAMs. In addition, genotyping data were also assembled for (i) 61 genic molecular markers (GMMs) including 31 CGMMs, 15 CISRs and 15 ICCeMs, and (ii) 335 legacy markers including SSRs from different sources (H-series, ICCMs, CAMs, SSRs-Frankfurt University, ISSRs), SNaPshot assay-based SNPs, CAPS, DArTs and RAPDs. In summary, genotyping data were compiled for 1364 markers and used for constructing the genetic map. The most likely order of the markers was determined based on the verified position of GMMs, TOG-SNPs (R.V. Penmetsa, N. Carraquilla-Garcia, A.D. Farmer, R.K. Varshney, D.R. Cook, unpublished data) and legacy markers (Nayak et al., 2010; Thudi et al., 2011). Using the JOINMAP v 4.0 program (Van Ooijen et al., 2006), a total of 1328 markers were mapped onto eight linkage groups (CaLG01-CaLG08) as per the nomenclature given in Thudi et al. (2011). The developed genetic map spans a total distance of 788.6 cM with an average inter-marker distance of 0.59 cM (http://cmap.icrisat.ac.in/cmap/sm/cp/hiremath/) (Figure 4). Details about the different types of markers integrated in this map are given in Table 3. The number of markers per linkage group varied from 107 (CaLG08) to 255 (CaLG04). The total distance of individual linkage groups ranged from 70.5 cM (CaLG08) to 116.6 cM (CaLG01).
Uneven distribution and clustering of markers were observed along the length of all the chickpea linkage groups in this map. Both minor (3-5 cM) and major (>5 cM) gaps between adjacent loci were observed (Table 4). A detailed observation revealed extensive clustering of CKAMs and TOG-SNPs near the telomeric regions of CaLG03, CaLG06, CaLG07 and CaLG08 (Figure 4). In the case of CaLG01, CaLG02, CaLG04 and CaLG05, more CKAMs were clustered near the subtelomeric regions.

Figure 3 Genetic relationships in germplasm and BC3F2 lines. Hierarchical clustering of chickpea accessions was carried out based on UPGMA using DARwin. Part (a) of the figure shows phylogenetic relationships among 58 germplasm lines based on allelic data for 1341 CKAMs. All the genotypes analysed could be grouped into two main clusters (I and II). Cluster-I comprises two wild species genotypes (Cicer reticulatum) and Cluster-II comprises accessions mainly of Cicer arietinum coming from 11 different countries. Part (b) of the figure shows genetic dissimilarity of 12 BC3F2 lines with JG 11, the recurrent parent.

Figure 4 … LGs, namely CaLG02, CaLG07 and CaLG08, are split into A and B parts; three LGs, namely CaLG04, CaLG05 and CaLG06, are split into A, B and C parts; CaLG01 is divided into A, B, C and D parts; and CaLG03 is divided into A, B, C, D and E parts. Map distances (cM) are presented on the left side of the bars, and corresponding markers are listed on the right side of the bars. Each marker class is colour coded as follows: green, CKAMs; red, TOG-SNPs; black, CGMMs; dark blue, CISRs; golden yellow, ICCeMs; light blue, DArTs; and brown, legacy markers. A high-resolution genetic map is available at http://cmap.icrisat.ac.in/cmap/sm/cp/hiremath/.
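The map statistics reported above can be reproduced with simple arithmetic, and the gap classes (minor, 3-5 cM; major, >5 cM) are easy to apply to any ordered list of marker positions. In the sketch below the total length and locus count come from the text, while the marker positions are hypothetical. Note that the reported 0.59 cM matches total length divided by the number of loci; dividing by the number of intervals instead would give a slightly larger value.

```python
def average_spacing(total_cm, n_loci):
    """Map length per locus, the form that reproduces the reported average."""
    return total_cm / n_loci


def classify_gaps(positions_cm, minor_range=(3.0, 5.0)):
    """Split gaps between adjacent loci into minor (3-5 cM) and major (>5 cM)."""
    ordered = sorted(positions_cm)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    low, high = minor_range
    minor = [g for g in gaps if low <= g <= high]
    major = [g for g in gaps if g > high]
    return minor, major


# Figures taken from the text: 788.6 cM, 1328 loci, eight linkage groups.
print(round(average_spacing(788.6, 1328), 2))
print(1328 / 8)  # mean number of loci per linkage group

# Hypothetical marker positions on one linkage group (not study data).
toy_positions = [0.0, 0.5, 4.0, 10.0, 10.2]
minor_gaps, major_gaps = classify_gaps(toy_positions)
print(minor_gaps, major_gaps)
```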

Comparison of the developed genetic map with other chickpea maps
The developed genetic map with 1328 marker loci was compared with the 1291-loci genetic map and the 300-loci transcript map of Gujaria et al. (2011). The details of comparison of these maps are available at http://cmap.icrisat.ac.in/cmap/sm/cp/hiremath/. These comparisons reflect a high degree of congruence in terms of grouping of markers into specific linkage groups. A few exceptions were also observed. For instance, the TA4L-TA199R-3_300 and TA4L-TA191R_291-284 loci were mapped on LG04 by Thudi et al. (2011) and on LG06 by Gujaria et al. (2011); these loci have been assigned to CaLG07 in the present map. Similarly, the marker loci TA5L-TS38R-1_470 and TA5L-TS129R_208 that were present on LG05 and LG08 of genetic maps developed by Thudi et al. (2011) …

In the case of chickpea and Medicago, 555 unique chickpea loci showed significant matches with 1558 genomic regions on Medicago chromosomes (Table 5). Most of the chickpea loci have ≥2 matches in Medicago. About 111 chickpea loci from CaLG01 showed similarity with MtChr02 genomic regions. Similarly, loci from CaLG02 showed maximum matches to MtChr05, followed by CaLG03 with MtChr07, CaLG04 with MtChr01, CaLG05 with MtChr03, CaLG06 with MtChr04, CaLG07 with MtChr04, and CaLG08 with MtChr05. In brief, each linkage group of chickpea showed considerable synteny with one or more chromosomes of Medicago, although internal duplication of DNA sequences/blocks was not observed (Figure 5a).
In the comparison of chickpea with soybean, 494 unique chickpea loci matched 1798 short stretches distributed on different chromosomes of soybean (Glyma1 assembly) (Figure 5b, Table S6). Each chickpea marker locus showed similarity to approximately 3-4 regions on Glyma1. This reflects the number of matches one would expect to see based on the one round of whole genome duplication in soybean. Only 267 unique chickpea loci matched with 438 regions on Lotus (Table S7, Figure 5c). In the case of cowpea, for which a genetic map was used for the comparison, the fewest matches were observed between the chickpea and cowpea genomes. Only 50 unique chickpea loci showed synteny with 55 loci of the cowpea map (Table S8, Figure 5d).

Figure 5 Genome relationships of chickpea with closely related legume species. The homologous relationship of the chickpea genome with four legume species, that is, Medicago truncatula (a), soybean (b), Lotus japonicus (c) and cowpea (d), is shown by comparing sequence data of 1064 mapped markers of chickpea with the genome sequences of Medicago (Mt 3.5), L. japonicus (Lj 2.5 pseudomolecules) and soybean (Glyma1 genome assembly) and with the cowpea genetic map (Muchero et al., 2009). Maximum similarity was observed with Medicago (1558), followed by the soybean genome (1798), Lotus (438) and least with cowpea (55). The percentage of matches in each species is in congruence with their phylogenetic distances.
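The linkage-group-to-chromosome assignments described above amount to filtering sequence matches and counting, for each chickpea linkage group, which target chromosome receives the most hits. The sketch below shows that tally under stated assumptions: the record layout and the toy hits are hypothetical, and the filter values mirror the similarity thresholds quoted later in the text (>70% identity, E-value 1E-05).

```python
from collections import Counter, defaultdict


def best_chromosome_per_lg(hits, min_identity=70.0, max_evalue=1e-5):
    """Return {linkage_group: (chromosome, n_hits)} for the most-matched chromosome."""
    counts = defaultdict(Counter)
    for lg, _locus, chrom, identity, evalue in hits:
        if identity > min_identity and evalue <= max_evalue:
            counts[lg][chrom] += 1
    return {lg: tally.most_common(1)[0] for lg, tally in counts.items()}


# Hypothetical hit records: (linkage group, locus, chromosome, % identity, E-value).
toy_hits = [
    ("CaLG01", "locus_a", "MtChr02", 88.0, 1e-20),
    ("CaLG01", "locus_b", "MtChr02", 91.5, 1e-30),
    ("CaLG01", "locus_b", "MtChr05", 75.0, 1e-8),
    ("CaLG01", "locus_c", "MtChr07", 65.0, 1e-9),   # fails the identity filter
    ("CaLG02", "locus_d", "MtChr05", 80.0, 1e-12),
    ("CaLG02", "locus_e", "MtChr01", 82.0, 1e-3),   # fails the E-value filter
]
best = best_chromosome_per_lg(toy_hits)
print(best)

# Mean matched regions per unique chickpea locus, using counts from the text.
print(round(1558 / 555, 1), round(1798 / 494, 1), round(438 / 267, 1), round(55 / 50, 1))
```

The ratio for soybean works out to about 3.6 regions per locus, in line with the "approximately 3-4 regions" stated above.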

Cost-effective KASPar assays for SNP genotyping
Until recently, SSR markers were the commonly used markers for chickpea genetics research and breeding applications. Nevertheless, in some cases, genetic maps have also been developed using DArTs, CISRs and SNPs/CAPS (Choudhary et al., 2012; Gujaria et al., 2011; Nayak et al., 2010). With the availability of whole genome or EST sequences in many crop species, the use of SNP markers has proven attractive for high-throughput use in molecular breeding (Rafalski, 2002; Varshney, 2010). High-throughput SNP genotyping platforms such as Illumina's GoldenGate or Infinium assays are being used for large-scale SNP genotyping. While the high-throughput SNP genotyping platforms are very useful for rapid genotyping of mapping populations or germplasm collections, they are not generally economical for projects such as in silico SNP validation, gene-specific SNP assays, marker saturation in regions of interest and marker application projects that utilize a defined set/panel of a smaller number of SNP markers on a varying number of genotypes. For such cases, SNP genotyping technologies such as arrayed primer extension reaction (APEX) (Podder et al., 2008), dynamic allele-specific hybridization (DASH) (Podder et al., 2008), molecular beacons (Mhlanga and Malmberg, 2001), primer extension followed by MALDI-TOF (alternative to Sequenom's assays) (Sauer et al., 2000) and the KASPar assay (http://www.kbioscience.co.uk/reagents/KASP.html) have been developed. When choosing a particular SNP genotyping platform, several features such as reproducibility, accuracy, capability of multiplexing, level of throughput, time consumption and cost (considering both the equipment required and the cost per genotype) need to be considered. As molecular breeding applications generally require screening of large populations with a few markers, this study developed cost-effective KASPar marker assays for SNP genotyping in chickpea.
A total of 2486 SNPs were assembled from different sources for developing KASPar assays. KASPar assays developed for chickpea are referred to as CKAMs. Genotyping of these 2486 CKAMs on a panel of 70 genotypes provided a validated set of 2005 CKAMs. This includes KASPar assays for 539 TOG-SNPs that were initially assayed on GoldenGate assays. Conversion of these TOG-SNPs into KASPar assays will facilitate the use of TOG-SNPs in chickpea genetics and breeding applications.
To compare the success rate of converting putative SNPs into successful and informative KASPar assays, amplification and polymorphism statistics were checked across the four sets of SNPs. The set of markers with the highest failure rate was the SNPs identified from alignments of Sanger ESTs (172 SNP markers, i.e. 60% of a total of 286). The possible reasons are primarily that (i) SNPs were mined from ESTs with sequencing artefacts, (ii) the frequency of one of the two alleles for a given SNP is very low in the EST data set, and (iii) not all the genotypes for which the EST-based mining approach provided SNPs were included in the genotype panel used in the current study. The remaining markers that could not be validated include 222 (15.7% of a total of 1409) Alpheus pipeline-predicted SNPs, 65 (10.7% of 604) TOG-SNPs and 22 (11.7% of 187) SNPs from allele resequencing data. Overall, the KASPar assay showed an 81% validation success rate in our study. A comparison of the costs and time involved in genotyping the same set of SNPs via KASPar and GoldenGate assays in this study showed the superiority of KASPar assays over GoldenGate assays, especially when a limited number of SNPs (<500) are genotyped on <100 lines.
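The per-source failure counts given above can be cross-checked against the overall totals with a few lines of arithmetic. The sketch below uses only counts stated in the text; the dictionary labels and function name are illustrative.

```python
# (compiled SNPs, assays that could not be validated), as stated in the text.
SNP_SOURCES = {
    "Sanger EST mining": (286, 172),
    "Solexa/Illumina (Alpheus pipeline)": (1409, 222),
    "TOG amplicon sequencing": (604, 65),
    "Candidate gene resequencing": (187, 22),
}


def conversion_summary(sources):
    """Per-source counts of validated assays and failure percentages."""
    rows = {}
    for name, (compiled, failed) in sources.items():
        rows[name] = {
            "compiled": compiled,
            "failed": failed,
            "validated": compiled - failed,
            "failure_pct": 100 * failed / compiled,
        }
    return rows


summary = conversion_summary(SNP_SOURCES)
total_compiled = sum(row["compiled"] for row in summary.values())
total_failed = sum(row["failed"] for row in summary.values())
total_validated = total_compiled - total_failed

for name, row in summary.items():
    print(f"{name}: {row['validated']}/{row['compiled']} validated, "
          f"{row['failure_pct']:.1f}% failed")
print(total_compiled, total_failed, total_validated,
      round(100 * total_validated / total_compiled))
```

The totals agree with the 2486 compiled, 481 failed and 2005 validated CKAMs reported earlier, and the 539 validated TOG-SNPs match the figure given for TOG-based assays. Percentages rounded to one decimal can differ by 0.1 from those in the text, which appear to be truncated rather than rounded.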
The PIC values of validated CKAMs varied from 0.02 to 0.50 with an average of 0.12. Low range of PIC value of CKAMs is not unexpected as genetic variation in the chickpea gene pool is limited (Nayak et al., 2010;Thudi et al., 2011). Also, this study identifies polymorphic markers (15-930) for different mapping populations segregating for drought, salinity, Fusarium wilt, Ascochyta blight, etc. It, therefore, provides opportunities for mapping resistance to biotic and tolerance to abiotic stresses in chickpea.

Diversity analysis and molecular breeding applications
This study demonstrates the suitability of KASPar assays for SNP genotyping for understanding the relationships in the germplasm collection as well as for molecular breeding applications. Despite using a wide diverse collection of genotypes with all 2005 CKAMs, an overall success rate of 81% was achieved. The genetic dissimilarity analysis of the germplasm accessions determines relationships of accessions with each other. The dendrogram developed based on genetic dissimilarity coefficient depicted clear clustering of chickpea accessions into two main clusters as per their geographical origin and species type of all 58 accessions (55 accessions of C. arietinum species and three accessions of C. reticulatum species) analysed. Two accessions of C. reticulatum are resolved as a separate group; however, IG 72933, a C. reticulatum, was found closer to C. arietinum. Similar results were observed in an earlier genetic diversity study using 513 SSR markers in which the IG 72933 genotype showed 40% similarity with the C. arietinum genotypes (Gudipati, 2007). The Cluster-II contained more geographically divergent material of the C. arietinum species. As expected, accessions of all Indian origin formed a separate clade, and the remaining accessions from other countries were grouped into another clade (IIa). Overall, these results are in general congruence with earlier studies and indicate that the cluster topology is reliable.
The study also demonstrates the utility of CKAMs for assessing the genome recovery of BC3F2 lines. This study identified five lines (BC3F2_120, BC3F2_170, BC3F2_187, BC3F2_195 and BC3F2_268) with >95% genome recovery of JG 11 in MABC experiments. These lines may be used for multi-location field trials for evaluating agronomic performance as well as for developing near isogenic lines (NILs) for fine mapping the QTLs.

© 2012 The Authors. Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd, Plant Biotechnology Journal, 10, 716-732

Second-generation genetic map of chickpea with more anchoring points with other legume genomes
As expected, the number of polymorphic markers observed in interspecific mapping populations is higher than in intraspecific mapping populations. For instance, the maximum number of polymorphic markers is 930 (ICC 4958 × PI 489777) in interspecific crosses as compared with 159 (Arerti × ICC 4958) in intraspecific crosses. As ICC 4958 × PI 489777 is a reference mapping population, genotyping data were generated for the polymorphic CKAMs. Although genotyping data were earlier generated for TOG-SNPs on the mapping population via GoldenGate assays (R.V. Penmetsa, N. Carraquilla-Garcia, A.D. Farmer, R.K. Varshney, D.R. Cook, unpublished data), a set of 118 TOG-SNPs distributed on all eight LGs was also targeted for generating genotyping data via KASPar assays for quality control purposes. Comparison of high-quality data for 112 markers generated via KASPar assay with those from the GoldenGate assay showed no discrepancy. After assembling genotyping data for the 539 remaining CKAMs, 317 TOGs and 396 marker loci from other sources (Nayak et al., 2010; R.V. Penmetsa, N. Carraquilla-Garcia, A.D. Farmer, R.K. Varshney, D.R. Cook, unpublished data; Thudi et al., 2011), genotyping data for a total of 1364 marker loci were considered for mapping. As a result, a comprehensive genetic map comprising 1328 marker loci including 939 new marker loci (625 CKAMs, 314 TOG-SNPs) and 389 already published mapped marker loci was developed. The second-generation genetic map covers a genetic distance of 788.6 cM. On average, each linkage group has 166 markers and spans 98.6 cM. This map probably has the highest number of gene-based SNP markers (1088) mapped in chickpea so far.
Prior to this map, Gujaria et al. (2011) developed a transcript map with 126 gene-based markers and Choudhary et al. (2012) developed a genetic map with 406 marker loci including 177 gene-based markers. This map has approximately eightfold more gene-based markers than the abovementioned studies. Another important feature of this genetic map is the availability of cost-effective KASPar assays for the mapped gene-based markers, which can be used in any number as well as on a variable number of lines. The quality and accuracy of the second-generation genetic map were evaluated by comparing it with several genetic maps developed in earlier studies (Nayak et al., 2010; Thudi et al., 2011; Winter et al., 1999).
Clustering of two or more markers is a commonly occurring phenomenon observed in several earlier genetic maps of chickpea (Nayak et al., 2010; Thudi et al., 2011; Winter et al., 1999). Only CKAMs and TOG-based SNPs were clustered, which constitute a large proportion of mapped markers [i.e. 625 CKAMs and 314 TOG-SNPs (939, 71%) of 1328] compared with other marker types. This clustering may be attributed mainly to random selection of markers from closely spaced regions of the genome that have undergone comparatively fewer recombination events.
As a complement to the gene-based linkage map developed in this study, we compared the sequences of the mapped loci with genome assemblies/genetic maps of four legume species (Medicago, Lotus, cowpea and soybean). Through the comparative analysis, high conservation of synteny was observed between chickpea and Medicago, whereas the lowest level of synteny conservation was observed between chickpea and cowpea. At the time of analysis, genome sequence information was not available for cowpea; hence, the analysis was carried out by comparing with the high-density linkage map developed by Muchero et al. (2009), which was available then. As a result, least similarity was identified between chickpea and cowpea, although chickpea is phylogenetically closer to cowpea than it is to soybean, which shares the same common ancestor relative to the ancestor of chickpea, Medicago and Lotus (Wojciechowski et al., 2004). In all the other cases, a high level of similarity (>70%, 1E-05) was observed between sequences of chickpea and those of the compared legumes; the syntenic regions, however, are often punctuated or interrupted by chromosomal rearrangements, resulting in disruption of the linear order of the genes. Such variations (insertion, deletion, duplication or rearrangement) form the basis for the evolution of diverse genomes. One or more chickpea loci match a single locus on a Medicago chromosome, and a similar pattern was observed for the remaining three legume genomes. This may reflect segmental duplication events of chromosomal stretches, or the mapped loci may correspond to paralogous genes or members of the same gene family. Recent analysis of the Medicago genome has revealed that higher rates of mutation and chromosomal rearrangement occurred after the whole genome duplication event than in other model legumes such as Glycine max and L. japonicus (Young et al., 2011).
A number of chickpea unique loci matching different chromosomal regions on Mt 3.5, Glyma1, Lj 2.5 and the cowpea genetic map were identified. Of the 227 regions, distributed over the eight chromosomes of Medicago, to which 69 chickpea unique loci mapped, approximately 49% (i.e. 111 of 227) matched MtChr02 and the remaining 116 were similar to regions on other chromosomes. Only 53 loci are in linear order with MtChr02 chromosomal regions, and the remaining are in nonlinear positions. These findings support the earlier reports by Choi et al. (2004), Nayak et al. (2010) and Zhu et al. (2005) that one-to-one synteny does not hold true between chickpea and the compared legume species, and that synteny is restricted to small genetic or genomic intervals (Young et al., 2011). Our comparative results showed that regions of CaLG02 and CaLG08 are strongly similar to Mt05, which in turn shows high similarity to regions on Gm01, Gm02 and Gm11; this is consistent with the findings of Young et al. (2011).

Conclusions
The study reports the compilation of a large number of SNPs and their conversion into cost-effective KASPar assays. A set of 2005 KASPar assays has been developed for accelerating chickpea genetics research and breeding applications. Together with recently developed SSR markers from genomic libraries (Nayak et al., 2010) and BAC-end sequences, DArT markers, and CISR- and CAPS-based CGMMs, these markers bring the number of markers available in chickpea to >10 000. The available marker resource should help to tackle the issue of narrow genetic diversity in the gene pool, as it is now possible to identify a reasonable number of polymorphic markers in any given cross combination. The genetic structure information gained on 58 chickpea accessions may be useful in finding suitable parental combinations for developing new mapping populations segregating for different traits of interest to chickpea breeders. Furthermore, a number of polymorphic markers were identified in many existing mapping populations; these can be used for developing genetic maps and mapping different agronomic traits. Many polymorphic markers were common to several mapping populations, which makes them useful as bridging markers for comparing different chickpea maps. The genetic map developed here is the most enriched chickpea genetic map for gene-based markers. This map should be useful not only in comparing different chickpea genetic maps, but also in anchoring the physical map, currently underway, as well as in establishing more anchor points among the genomes of chickpea and other legume species.

Plant material and DNA extraction
A set of 70 different chickpea genotypes was used for validation of SNPs using KASPar assays. Details of these genotypes are given in Table 2 and Table S2. Furthermore, a set of 131 recombinant inbred lines (RILs) derived from the cross between ICC 4958 (C. arietinum) and PI 489777 (C. reticulatum) was used for genetic mapping.
Total genomic DNA of all the accessions was extracted from leaves of two-week-old seedlings using the high-throughput mini DNA extraction protocol described in Cuc et al. (2008). The quality and quantity of the extracted DNA were assessed on 0.8% agarose gel. The DNA was normalized to 5 ng/µL for genotyping.

RNA sequencing by Solexa/Illumina
Five different chickpea genotypes, viz. ICC 4958, ICC 1882, PI 489777, ICC 506 and ICCC 37, which are parents of different mapping populations, were selected for RNA sequencing. Roots of 22-day-old seedlings of ICC 4958 and ICC 1882 were subjected to drought stress, and subsequently total RNA was extracted from both genotypes. About 22-day-old leaves of ICC 506 and ICCC 37 were infested with larvae of Helicoverpa armigera for a period of 5 days under greenhouse conditions (temperature of 28 ± 5°C and relative humidity of >65%). After this brief infestation period, leaf samples from both genotypes were harvested for total RNA extraction. Total RNA was also extracted from 22-day-old root tissues of PI 489777, a wild species genotype. Subsequently, the total RNA samples of all the genotypes were sent for Solexa/Illumina sequencing at the National Center for Genome Resources (NCGR), USA.

Development and analysis of KASPar assays
For developing the KASPar assays, 50 bp upstream and 50 bp downstream flanking sequences around the variant position (SNP) were selected (Table S1). Subsequently, KASPar assays for the targeted SNPs were carried out at KBioscience, UK. Complete details on the principle and procedure of the assay are available at http://www.kbioscience.co.uk/reagents/KASP_manual.pdf and http://www.kbioscience.co.uk/download/KASP.swf. On the basis of the fluorescence obtained, allele call data are viewed graphically as a scatter plot for each marker assayed using the SNPViewer. The consistency between the predicted SNP and assayed ones was checked for each SNP marker.
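The selection of 50 bp flanking sequences on either side of a SNP can be sketched in Python as follows. This is only an illustration: the function name, the 0-based coordinate convention and the bracketed [A/B] output notation are assumptions made for the example and are not taken from the KBioscience submission requirements.

```python
def kaspar_design_sequence(sequence, snp_index, alleles, flank=50):
    """Return 'upstream[A/B]downstream' for a SNP at 0-based snp_index.

    Returns None when fewer than `flank` bases are available on either
    side of the SNP, since such a SNP cannot supply full flanks.
    """
    start = snp_index - flank
    end = snp_index + flank + 1
    if start < 0 or end > len(sequence):
        return None
    upstream = sequence[start:snp_index]
    downstream = sequence[snp_index + 1:end]
    return f"{upstream}[{'/'.join(alleles)}]{downstream}"


# Hypothetical 121 bp contig with a C/T SNP at position 60 (0-based).
contig = "A" * 60 + "C" + "T" * 60
design = kaspar_design_sequence(contig, 60, ("C", "T"))
```

SNPs lying closer than 50 bp to either end of a contig return None in this sketch, so they can be set aside before assay design.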

Evaluation of polymorphism in chickpea accessions
The polymorphism information content (PIC) refers to the value of a marker for detecting polymorphism within a given germplasm, depending on the number of detectable alleles and the distribution of their frequencies. In this study, the PIC value of markers was calculated using the following formula (Anderson et al., 1993): $PIC = 1 - \sum_{i=1}^{n} p_i^2$, where 'n' denotes the total number of alleles and 'p_i' refers to the frequency of the 'i'th allele at a genetic locus in different genotypes.
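A PIC calculation of this kind can be sketched in Python, assuming the widely used form PIC = 1 - sum(p_i^2) over the n alleles observed at a locus. The function name and the convention of recording a missing call as None are assumptions made for the example.

```python
from collections import Counter


def pic(allele_calls):
    """PIC = 1 - sum(p_i^2) over observed alleles; None (missing) is skipped."""
    observed = [a for a in allele_calls if a is not None]
    if not observed:
        return 0.0
    counts = Counter(observed)
    total = len(observed)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical calls for one SNP across four accessions.
balanced = pic(["A", "A", "G", "G"])
monomorphic = pic(["A", "A", "A", "A"])
```

Under this form a biallelic SNP reaches its maximum PIC of 0.5 when both alleles are equally frequent, and a monomorphic marker scores 0.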

Genetic diversity analysis
To evaluate the relationship between chickpea germplasm accessions, SNP allele call data obtained for polymorphic markers were used for calculating both pair-wise genetic distance and per cent dissimilarity matrix to construct a dendrogram using DARWIN V5.0.128 software (darwin.cirad.fr/darwin/Home.php, Perrier et al., 2003). Cluster analysis was carried out using the UPGMA method.
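The general idea of a pair-wise dissimilarity matrix followed by UPGMA clustering can be sketched in Python. This does not reproduce the DARwin implementation: the simple-matching dissimilarity (fraction of markers with differing calls), the function names and the nested-tuple tree output are all assumptions made for the example.

```python
from itertools import combinations


def dissimilarity(calls_a, calls_b):
    """Fraction of markers with differing calls, skipping missing (None) data."""
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def upgma(genotypes):
    """Cluster accessions by UPGMA; returns a nested tuple of accession names."""
    sizes = {name: 1 for name in genotypes}
    dist = {frozenset(pair): dissimilarity(genotypes[pair[0]], genotypes[pair[1]])
            for pair in combinations(genotypes, 2)}
    while len(sizes) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(sizes, 2), key=lambda p: dist[frozenset(p)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            # Size-weighted mean equals the average over all member pairs.
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical allele calls for four accessions at four SNPs.
calls = {
    "acc1": ["A", "A", "G", "G"],
    "acc2": ["A", "A", "G", "T"],
    "acc3": ["T", "T", "C", "C"],
    "acc4": ["T", "T", "C", "G"],
}
tree = upgma(calls)
```

The nested tuple groups the most similar accessions first; a dedicated package would additionally report branch lengths and bootstrap support.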

Comparative mapping between chickpea and closer legumes
Sequence data for mapped chickpea marker loci were queried using BLAST against the genomes of M. truncatula (Mt 3.5), L. japonicus (Lj 2.5 pseudomolecules) and soybean (Glyma1 genome assembly) and against the cowpea genetic map (Muchero et al., 2009). All the databases mentioned are available at http://comparativelegumes.org/. Hits matching a minimum of 70% sequence identity were retained for the comparative study. Identification of homologous blocks was performed using I-ADHORE v2.1 (Vandepoele et al., 2002). For the purpose of developing Circos images, cM distances on the chickpea linkage groups were scaled up by a factor of 250 000 to match similar base pair lengths of the chromosomes of the other legume genomes. Visualization of blocks was performed with Circos. Scales along the outer edge of the chickpea linkage groups show actual cM distances, while the scale along the outer edge of the Medicago chromosomes is in Mb.
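The two numerical steps described above, retaining hits with at least 70% identity and scaling cM positions by 250 000 for plotting, can be sketched in Python. The dictionary field names, locus labels and function names are hypothetical; the actual BLAST output parsing and Circos input files are not shown.

```python
MIN_IDENTITY = 70.0       # minimum per cent identity stated in the text
CM_TO_BP_SCALE = 250_000  # scaling factor stated in the text


def filter_hits(hits, min_identity=MIN_IDENTITY):
    """Keep only hits at or above the identity threshold."""
    return [h for h in hits if h["identity"] >= min_identity]


def scale_cm(position_cm, factor=CM_TO_BP_SCALE):
    """Convert a cM position into pseudo base pairs for plotting."""
    return int(round(position_cm * factor))


# Hypothetical hits; the keys are assumptions for this example.
hits = [
    {"locus": "locus_1", "target": "MtChr02", "identity": 85.0, "cm": 2.0},
    {"locus": "locus_2", "target": "MtChr05", "identity": 65.2, "cm": 10.4},
]
retained = filter_hits(hits)
plot_positions = {h["locus"]: scale_cm(h["cm"]) for h in retained}
```

With this scaling, a linkage group of 100 cM is drawn at the same length as a 25 Mb chromosome.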

Supporting information
Additional Supporting information may be found in the online version of this article.