Microsatellite diversity among the primitive tribes of India

The present study was undertaken to determine the extent of diversity at 12 microsatellite short tandem repeat (STR) loci in seven primitive tribal populations of India with diverse linguistic and geographic backgrounds. DNA samples of 160 unrelated individuals were analyzed for 12 STR loci by multiplex polymerase chain reaction (PCR). Gene diversity analysis suggested that the average heterozygosity was uniformly high (>0.7) in these groups and varied from 0.705 to 0.794. The Hardy-Weinberg equilibrium analysis revealed that these populations were in genetic equilibrium at almost all the loci. The overall GST value was high (GST = 0.051; range 0.026 to 0.098 among the loci), reflecting the degree of differentiation among the seven populations at these loci. Cluster analysis and multidimensional scaling of genetic distances revealed two broad clusters of populations, with Moolu Kurumba maintaining a distinct genetic identity vis-à-vis the other populations. As reflected by the neighbor-joining tree (NJT) and multidimensional scaling (MDS) plots, the genetic affinity among the three tribes of the Indo-European family could be explained by geography and language, but that among the four Dravidian tribes could not. For the overall data, the non-significant Mantel correlations between genetic, linguistic and geographic distances suggest that the genetic variation among these tribes is not patterned along geographic and/or linguistic lines.


Introduction
The most remarkable feature of the Indian population structure is the clear division of its population into strictly defined endogamous castes, tribes and religious groups.
With the exception of Africa, India harbors more genetic diversity than other comparable global regions. It is generally believed that the tribal people, who constitute 8.2% of the total population (2001 census of India), are the original inhabitants of India. The total number of tribal groups is estimated to be 461; they speak about 750 dialects belonging to one of four language groups: Austro-Asiatic, Indo-European, Dravidian and Tibeto-Burman. [1,2] It is possible that populations living in close geographic proximity are more likely to exchange genes, thereby enhancing genetic similarity, even when these populations do not belong to the same sociocultural stratum.
Indeed, geographical clines have been reported for traditional genetic markers like ABO allele frequencies. [6] Nevertheless, clines for other genetic markers are observed to be restricted to a very small radius rather than extending over long distances, as reflected by the autocorrelation analyses of traditional genetic markers and quantitative variables like anthropometry and dermal ridge counts. [7,8] This has been ascribed to the unique Indian population structure, characterized by strict endogamy of the castes and tribes, which fits an island model rather than the isolation-by-distance model of population structure. It has also been argued that tribes belonging to different language families represent different genetic lineages; hence, they are genetically different. [9] Based on autosomal markers, Roychoudury et al. reported close genetic affinity for populations from similar linguistic backgrounds. [10] The present study was undertaken to determine the extent of genetic variation based on 12 STR loci among seven primitive tribal populations of India, belonging to the same ethnic group traditionally described as Australoid. These tribes speak languages belonging to two different linguistic families and are widely separated geographically.
Original Article DOI: 10.4103/0971-6866.60187

Materials and Methods
The locations of the study populations, along with their linguistic backgrounds and sample sizes, are presented in Figure 1. Five to 10 ml of blood was collected in EDTA after informed consent from 160 unrelated individuals belonging to the seven tribal groups. DNA was isolated from the leucocytes by using standard protocols. [11] Twelve dinucleotide microsatellite STR loci (D12S83, D13S218, D12S78, D13S217, D12S1659, D13S285, D13S170, D12S1723, D13S175, D13S263, D12S1617 and D12S346) were analyzed by multiplex PCR using commercially available ABI Prism Linkage Mapping sets V2.5 kits (Applied Biosystems, Foster City, California, USA). The samples were run on the ABI Prism 310 Genetic Analyzer (Applied Biosystems) using the GeneScan program. The resultant data analysis was carried out using the Genotyper software. Alleles at the 12 loci were designated by repeat numbers.
Allele frequencies at each locus were calculated by a simple gene counting method. [12] The Hardy-Weinberg equilibrium for each locus was tested by the exact test, which was performed using the software Arlequin version 3.0. [13] Nei's coefficient of gene differentiation (GST), which is based on the mean heterozygosity within populations (HS) and the mean heterozygosity for the total sample (HT), GST = 1 - (HS/HT), was calculated using the software Dispan. [14] Pair-wise genetic distances between populations (DA distance) following Nei et al. [15] were computed, and a phylogenetic tree based on the neighbor-joining (NJ) method proposed by Saitou and Nei [16] was constructed, using the software Dispan. Arlequin software version 3.0 was used for the analysis of molecular variance (AMOVA). The geographic distance between places was computed based on the geographic coordinates of the places from where the samples were collected. Linguistic distance for the correlation analyses was based on the linguistic trees, [17] considering the different languages spoken by these tribes within each of the linguistic families. The scheme used was one unit of distance between populations for each branch node separating them.
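
The GST definition above can be illustrated with a minimal sketch for a single locus. This is not the Dispan implementation: it assumes equal weighting of populations and applies no sample-size correction, and the function names and allele frequencies are hypothetical, chosen only to show the arithmetic of GST = 1 - (HS/HT).

```python
def heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(pop_freqs):
    """Nei's GST for one locus from per-population allele frequencies.

    pop_freqs: list of populations, each a list of allele frequencies
    given in the same allele order. Populations are weighted equally
    and no small-sample bias correction is applied (assumption).
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # HS: mean of the within-population heterozygosities
    hs = sum(heterozygosity(f) for f in pop_freqs) / n_pops
    # HT: heterozygosity computed from the mean allele frequencies
    mean_freqs = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    ht = heterozygosity(mean_freqs)
    return 1.0 - hs / ht


# Hypothetical allele frequencies for three populations at a three-allele locus
example = [
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.6, 0.2, 0.2],
]
print(round(gst(example), 4))
```

Averaging HS and HT over loci before taking the ratio would give an overall value of the kind reported for all 12 loci; the exact estimator used by Dispan may differ from this sketch.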

Results
Allele frequencies at 12 STR loci among the seven tribal groups showed the presence of the same common alleles and, in a majority of the cases, the most predominant allele was the same, with a variable frequency. The observed heterozygosity at each locus and the average heterozygosity over all the loci for each of the study populations are given in Table 1.
The average heterozygosity, which indicates the degree of within-population variation, is uniformly high (>0.7). AMOVA was performed, grouping the tribes according to state, to test whether geographically closer tribes are also genetically closer [Table 4]. Genetic differentiation between tribal populations within the geographic groups is significant and larger (FST value 0.03) than that among the geographic groups, which is negligible and non-significant. Similar results were obtained when AMOVA was performed based on the linguistic groupings: genetic differentiation within the linguistic groups was significant, whereas that between the two linguistic groups was not.
Mantel correlations between genetic distance and geographical or linguistic distances were not significant. The partial Mantel correlations were also not significant, either between genetic and geographical distances controlling for linguistic distance (P = 0.30) or between genetic and linguistic distances controlling for geographic distance (P = 0.58).
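
The logic of a simple Mantel test can be sketched as follows. The source does not state which software was used for these correlations, so this is only an illustration under stated assumptions: a one-sided permutation test on the Pearson correlation between the upper triangles of two distance matrices, with hypothetical function names and made-up matrices. The partial Mantel test reported above additionally controls for a third matrix and is not shown.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling rows/columns by `order`."""
    return [matrix[order[i]][order[j]] for i, j in combinations(range(len(order)), 2)]


def mantel(a, b, permutations=999, seed=0):
    """Simple Mantel test: returns (observed r, one-sided permutation p-value)."""
    identity = list(range(len(a)))
    b_vec = upper(b, identity)
    r_obs = pearson(upper(a, identity), b_vec)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute population labels of one matrix only
        if pearson(upper(a, perm), b_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical genetic and geographic distance matrices for four populations
gen = [[0, 1, 2, 4], [1, 0, 3, 5], [2, 3, 0, 6], [4, 5, 6, 0]]
geo = [[0, 100, 200, 400], [100, 0, 300, 500], [200, 300, 0, 600], [400, 500, 600, 0]]
r, p = mantel(gen, geo, permutations=999, seed=1)
print(r, p)
```

Permuting population labels, rather than individual matrix entries, preserves the dependence among distances that share a population, which is why the test is run on whole rows and columns.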

Discussion
The immense cultural, linguistic and ethnic diversity in the Indian population, which has crossed one billion, offers tremendous scope for genetic diversity studies in the country. Microsatellites are STRs composed of a core unit of one to five bases and are considered to be highly informative markers in the various fields of modern genetics. The dinucleotide microsatellite markers are known to be selectively neutral in nature.
Therefore, observed variations in the allele frequencies could be due to random genetic drift or admixture. Because the populations under study have generally remained endogamous, similarities of allele frequencies among them are probably a reflection of their common ancestry. [8] These populations show high levels of average heterozygosity (about 74%), suggesting high within-population diversity at these 12 loci. The average GST is observed to be 5.2%, suggesting a significant amount of inter-tribal differentiation. Further, this GST value is much higher than that observed for the traditional markers (1.5%) among the Indian populations. Based on STR loci, previous studies have reported a relatively high GST value among the Indian populations representing northern, eastern and northeast regions [17,18] as compared to the GST value based on STR, VNTR and other DNA marker loci among the subcastes of Golla from Southern Andhra Pradesh, [19] Bhargavas, Chaturvedis and Brahmins of North India [20,21] and also the tribal populations of Madhya Pradesh [8] and Orissa. [22] The average GST value calculated for three STR loci common to 23 Indian populations ranged from 3.2% among the Golla subcastes at the local level to 6.7% for all the 23 groups representing different regions of India. [23] Based on the 12 microsatellite loci, the GST value for the tribal populations of the present study is quite high as compared with the other continental populations (GST < 2%) such as Africans, Caucasians and Mongoloids. [24,25] This could be due to strict endogamy and small population sizes, which might have led to rapid genetic differentiation.
The genomic affinities among the groups studied [Figures 2 and 3] indicate that Katkaris, Kolchas and Kotvadias, the Indo-European-speaking tribes, are closer to each other, while the Madias (a Dravidian-speaking tribe) join this cluster as an outer element. Although Madias and Katkaris are from the same state of Maharashtra, the two are from places that are geographically wide apart and also belong to different linguistic groups. On the other hand, Kolcha and Kotvadia from Gujarat are geographically close to Katkaris from Maharashtra. The genetic affinities between the tribes Kolcha, Kotvadia and Katkari can thus be explained in terms of linguistic affiliation and geographic proximity. Irulas and Kurumbas, both Dravidian-speaking tribes, are also genetically closer to each other. On the other hand, Moolu Kurumba, which is also a Dravidian-speaking tribe of Tamil Nadu, does not cluster with Irulas and Kurumbas. The samples for the three Dravidian-speaking tribes Irula, Kurumba and Moolu Kurumba were collected from the same place. Thus, for the Dravidian-speaking tribes, neither linguistic affiliation nor geographical proximity explains the population relationships. The position of Moolu Kurumba is hard to explain as, linguistically, Kurumba and Moolu Kurumba are closest to each other. The much longer branch length for Moolu Kurumba in the NJ tree probably indicates earlier separation and/or distinct origin as compared with the other Dravidian-speaking tribal populations. Although Madia also belongs to the Dravidian linguistic group, it is not close, linguistically, to Irula, Kurumba or Moolu Kurumba. This might be the reason why we find that, genetically, Madia is not close to the other Dravidian groups. Previous studies on tribal groups like Katkaris and Irulas for other STR markers reported that these tribes are genetically distinct from other tribal groups. [26,27]
The analysis of molecular variance reveals that the extent of genetic differentiation among the states or among linguistic families is not significant, although the genetic differentiation between populations within the states or within the linguistic families was significant, suggesting that the current geographic and/or linguistic boundaries are not significant determinants of genetic

Figure 1: Map of India showing the locations from where samples were collected along with information on languages spoken by these tribes
The Indo-European language-speaking tribes Kolcha, Kotvadia and Katkari are from the neighboring states of Gujarat and Maharashtra. The three Dravidian language-speaking tribes Irula, Kurumba and Moolu Kurumba are from the Nilgiri district of Tamil Nadu, whereas Madia are a group belonging to the Dravidian language family and are from the state of Maharashtra, which is predominantly inhabited by populations speaking Indo-European languages. Interestingly, the Gondi language spoken by Madia has an Indo-European script in the north and a Dravidian script in the south.
Average heterozygosity varies from 0.705 to 0.795. Among the different loci, D13S263 showed the highest level of heterozygosity in the populations (0.643-0.923) and D12S1659 the lowest (0.300-0.678). The Hardy-Weinberg equilibrium analysis revealed that these populations are in genetic equilibrium at almost all the loci studied. Gene diversity analysis for individual loci and for all the loci taken together is presented in Table 2. The coefficient of gene differentiation among the populations is variable across the loci. The overall extent of genetic differentiation among the seven groups is high (GST = 0.051). However, there is considerable heterogeneity in the degree of differentiation at different loci, high (9.8%) in the case of D13S175 and low (2.6%) in the case of D13S170. The power of discrimination (PD) is an index of the ability of a particular locus to discriminate between individuals in a population, and the results suggest that most of these loci show a high value of this index. Pair-wise genetic distances between the study populations were computed from the allele frequencies of the 12 loci [Table 3] and an unrooted NJ tree was constructed from the distance matrix [Figure 2]. The study populations grouped themselves into two broad clusters in the tree, one formed by Kolchas, Katkaris and Kotvadias, the Indo-European-speaking tribal groups of Gujarat and Maharashtra, along with Madias, a Dravidian-speaking tribe living in Maharashtra, and the other by Irulas and Kurumbas from Tamil Nadu, belonging to the Dravidian linguistic family. The Dravidian-speaking Moolu Kurumba tribe from the

Figure 2: Neighbor-joining tree based on DA distances depicting genomic affinities among seven tribal population groups of India

Figure 3: Bidimensional plot of tribal populations based on multidimensional scaling of the Nei's DA distance matrix