Multivariate statistical assessment of heavy metal pollution sources of groundwater around a lead and zinc plant

The contamination of groundwater by heavy metal ions around a lead and zinc plant has been studied. As a case study, groundwater contamination in Bonab Industrial Estate (Zanjan, Iran) by iron, cobalt, nickel, copper, zinc, cadmium and lead was investigated using differential pulse polarography (DPP). Although cobalt, copper and zinc were found in 47.8%, 100.0% and 100.0% of the samples, respectively, none of the samples contained these metals above their maximum contaminant levels (MCLs). Cadmium was detected in 65.2% of the samples, and 17.4% of the samples exceeded its MCL. All samples contained detectable levels of lead and iron, with 8.7% and 13.0% of the samples, respectively, above the MCLs. Nickel was found in 78.3% of the samples, and 8.7% of the samples exceeded its MCL. In general, the results revealed the contamination of groundwater sources in the studied zone. The higher health risks are related to lead, nickel and cadmium ions. Multivariate statistical techniques were applied to interpret the experimental data and to describe the pollution sources. The data analysis showed correlations and similarities between the investigated heavy metals and helped to classify these ions into groups. Cluster analysis identified five clusters among the studied heavy metals. Cluster 1 consisted of Pb and Cu, cluster 3 included Cd and Fe, and each of the elements Zn, Co and Ni formed a single-member group. Similar results were obtained by factor analysis. The statistical investigations revealed that anthropogenic factors, notably the lead and zinc plant, and pedo-geochemical pollution sources influence water quality in the studied area.


Introduction
Water is one of the essential compounds for all forms of plant and animal life [1]; its pollution is therefore generally considered more serious than that of soil and air. Owing to its specific characteristics, this liquid has unique properties. It is the most effective dissolving agent, and it adsorbs or suspends many different compounds [2].
More than one billion people in the world do not have suitable drinking water, and two to three billion lack access to basic sanitation services. About three to five million people die annually from water-related diseases [3].
Surface water (freshwater lakes, rivers, streams) and groundwater (borehole water and well water) are the principal natural water resources. Water contamination is now one of the most important environmental issues [4,5], and heavy metals are among the major pollutants of water sources [6]. Heavy metals are also sensitive indicators for monitoring changes in the marine environment. Owing to human industrial activities, the levels of heavy metals in the aquatic environment are increasing markedly and have become a major global concern [7,8]. Some of these metals are essential for the growth, development and health of living organisms, whereas others are non-essential; they are indestructible, and most of them are classified as toxic to organisms [9]. Nonetheless, the toxicity of metals depends on their concentration in the environment. As their concentrations in the environment increase and the capacity of soils to retain them decreases, heavy metals leach into groundwater and soil solution. These toxic metals can then accumulate in living tissues and concentrate through the food chain.
Cadmium is regarded as the most serious contaminant of the modern age [10]. Copper is classified as a priority pollutant because of its adverse health effects [11]. Zinc and iron are essential elements and are generally considered to be non-toxic below certain levels [12]. Lead is not an essential trace element in any organism and has no known biological function. It can cause a variety of harmful health effects [13] and is known as a fatal neurotoxicant [14]. Excessive concentrations of cobalt can cause death, and various compounds of nickel are carcinogenic [15]. These hazards make the monitoring of heavy metals along this chain important for the protection of public health.
A variety of techniques, including X-ray fluorescence (XRF), neutron activation analysis (NAA), inductively coupled plasma-atomic emission spectrometry (ICP-AES), atomic absorption spectrometry (AAS) and graphite furnace atomic absorption spectrometry (GFAAS), have been used to evaluate heavy metal concentrations in environmental samples [16-20]. Besides their valuable characteristics, these techniques suffer from some disadvantages, such as high capital cost, expensive maintenance and insufficient sensitivity for very low concentrations of metals. Voltammetric methods are known as sensitive techniques for the determination of a variety of chemical species [21]; among these techniques, differential pulse polarography (DPP) offers some advantages for the accurate and precise detection and determination of trace amounts of heavy metal ions in environmental samples [22,23].
Evaluating the contaminants resulting from the excavation of zinc and lead mines and the development of related industries in Zanjan province, Iran, together with their negative environmental impacts, is of critical importance. The lack of a systematic investigation of probable heavy metal contamination around the National Iranian Lead and Zinc Company (NILZ) in Bonab Industrial Estate (BIE), Zanjan province, prompted this assessment of the quality of groundwater sources in this industrial zone. These are the main sources of drinking water and irrigation for some of the people who live around the NILZ Company. In this research, the DPP technique was used to determine the concentrations of seven heavy metals (iron, cobalt, nickel, copper, zinc, cadmium and lead) in water samples, and the results were compared with the maximum contaminant levels (MCLs) specified by WHO and by the Institute of Standards and Industrial Research of Iran (ISIRI). Multivariate statistical analysis was conducted to categorize the metals and to identify the sources of the contaminants.

Study area
Zanjan province (located in north-west Iran) has a large metalliferous site and has been considered a traditional mining region since antiquity [24]. There are still large reserves of lead and zinc in the area. Both the mines and the smelting units within the province present a risk of contamination of soils, plants, and surface and groundwater resources through the dissemination of metal-bearing particles by wind action and/or by runoff from the tailings [25]. Transportation of concentrated ore by truck over about 110 kilometers from the mines in Angouran to NILZ is another anthropogenic source of metal contamination, especially along the roads.
In this study, Bonab Industrial Estate (BIE) and its neighborhood were selected for detailed study. The research focused on the environmental impacts of the NILZ Company (36°660 N, 48°480 E), located within BIE about 12 km east of Zanjan city. The NILZ Company was established in 1992, with a current consumption of about 300,000 tons of raw ore and an annual production of 55,000 tons of Pb and Zn [26,27]. The plant is situated over an aquifer, which is the only source of fresh water available in the area; it supplies part of the drinking water of Zanjan city and its neighboring areas as well as water for agricultural and industrial use. The tailings from BIE, estimated at about 2.5 million tons, contain a variety of toxic elements, notably Pb, Zn and Cd [26]. They are dumped in the vicinity of the Estate and are exposed to wind and rain, contributing to soil, surface water and groundwater contamination.

Sample collection and storage
To examine the extent of the contamination by toxic metals leached from the tailings, 23 spring and groundwater samples were collected from the studied area and analyzed. Nineteen groundwater and four spring water stations were selected within a radius of 5 km from the NILZ Company (Figure 1).
Sampling stations were selected by taking into account the direction of groundwater flow (west), the direction of the prevailing winds (west and south-west) and the density of the population within the studied area. However, the number and distribution of sampling stations were limited by the spatial distribution of available bore wells within the studied area. Table 1 shows the location of the sampling stations for this study.
From each station, three replicate samples were collected for analysis. Glassware and vessels were soaked in 10% (v/v) nitric acid solution for 24 h and then washed with distilled and deionized water. The samples were collected in polypropylene containers and labeled; a few drops of HNO3 (ultra pure grade) were immediately added to bring the pH below 2, in order to prevent loss of metals and bacterial and fungal growth, and the samples were then stored in a refrigerator.

Reagents and standards
The chemicals used in this study were reagents of the highest grade (Merck) and were used without further treatment.

Sample digestion
Groundwater samples were filtered through 0.45 μm filters. To remove organic impurities from the samples and thus prevent interference in the analysis, the samples were preserved and digested with concentrated nitric acid. To this end, 1 mL of nitric acid was added to the water sample in a 50 mL volumetric flask.

Sample analysis in the field
The pH, electrical conductivity (EC) and dissolved oxygen (DO) of the samples were measured immediately at the sampling stations using a portable digital pH meter (Hach HQ 40d). The recorded pH and EC of the samples varied in the ranges 7.2-8.3 and 326-1857 μS/cm, respectively (Table 1). The pH values of all samples were within the WHO range (6.5-8.5), and the EC values were below the MCL announced by WHO (1500 μS/cm), except for samples W6 and W8.

Sample analysis
Water samples were analyzed for the presence of iron, cobalt, nickel, copper, zinc, cadmium and lead using differential pulse polarography (Metrohm 797 VA). Dissolved air was removed from the solutions by degassing with N2 gas (99.999%) for 5-10 min prior to each run. The standard addition method was used for the analysis. The polarography parameters are given in Table 2. Digested samples were analyzed in triplicate, and the average concentrations of the metals were reported in μg/L.
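The standard addition calculation mentioned above can be illustrated with a minimal sketch. It assumes a linear response of peak current to concentration and negligible dilution by the spikes; the function name and all numbers are hypothetical and are not taken from this study.

```python
def standard_addition(added_conc, peak_current):
    """Estimate the analyte concentration in the cell by standard addition.

    added_conc   : concentration added by each spike (0 for the unspiked sample)
    peak_current : peak current measured for the same solutions
    Assumes a linear response and negligible dilution by the spikes.
    """
    n = len(added_conc)
    mean_x = sum(added_conc) / n
    mean_y = sum(peak_current) / n
    sxx = sum((x - mean_x) ** 2 for x in added_conc)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(added_conc, peak_current))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The fitted line reaches zero signal at -intercept/slope, so the
    # concentration already present in the sample is intercept/slope.
    return intercept / slope


# Hypothetical data: spikes of 0, 5, 10 and 15 ug/L, currents in nA.
added = [0.0, 5.0, 10.0, 15.0]
current = [10.0, 20.0, 30.0, 40.0]
estimate = standard_addition(added, current)
```

If the sample was diluted before measurement (for example during digestion), the estimate would still have to be multiplied by the dilution factor.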

Statistical analysis
The SPSS statistical package (version 18 for Windows) and Excel 2007 were used for data analysis. The experimental data were analyzed by one-way ANOVA, Pearson correlation matrix, Cluster Analysis, Principal Component Analysis (PCA) and Factor Analysis (FA) methods [28,29]. The Pearson correlation matrix can indicate a probable common source of the pollutants. Cluster analysis is used to divide the studied metal ions into similar classes with respect to their normalized concentration levels. PCA is designed to transform the original variables into new, uncorrelated variables (axes), called the principal components. Factor Analysis is similar to the Principal Component Analysis method except for the preparation of the observed correlation matrix for extraction and the underlying theory [30].
The one-way ANOVA method tests whether the means of several groups differ significantly. For this test, each sampling location was treated as a group and its heavy metal concentration as the corresponding variable. The ANOVA test rests on three assumptions: random occurrence of the observations, homogeneity of variance, and normal distribution of the metal ion concentrations in the sample stations. These were tested using the Runs test, the Levene statistic and the K-S (Kolmogorov-Smirnov) method, respectively. Instead of the ANOVA test, one can use the Kruskal-Wallis test, a non-parametric test that does not require the assumptions of ANOVA [28,29]. In this work, both methods were applied for comparison.
The bivariate correlation procedure computes the pairwise associations for a set of variables and displays the results in a matrix. It is useful for determining the strength and direction of the association between two variables. The correlation coefficients computed by this procedure lie in the range −1 (a perfect negative relationship) to +1 (a perfect positive relationship); a value of 0 indicates that there is no linear relationship between the variables. For normally distributed variables, the Pearson correlation was used; otherwise, the non-parametric Spearman method was applied.
Cluster analysis is a method for dividing a group of metals into classes so that metals that are similar with respect to the variable space fall in the same class. The groups are not known before the analysis is applied, and no assumption is made about the distribution of the variables [28,29]. PCA reduces the dimensionality of the data by forming linear combinations of the original data to generate new latent variables that are orthogonal and uncorrelated to each other [32]. The major objective of FA is to reduce the contribution of less significant variables and thereby simplify further the data structure given by PCA. This goal can be achieved by rotating the axes defined by PCA and constructing new variables, also called varifactors [31]. All significance statements reported in this study are at the P < 0.05 level.
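As a rough illustration of the two tests compared here, the sketch below computes the one-way ANOVA F statistic and the Kruskal-Wallis H statistic in plain Python. It is not the SPSS procedure used in the study: the function names and the triplicate values are hypothetical, the H statistic omits the tie correction, and p-values are not computed.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    values = [v for g in groups for v in g]
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    values = [v for g in groups for v in g]
    ranks = average_ranks(values)
    n_total = len(values)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical triplicate results (ug/L) for three sampling stations.
stations = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
f_stat = one_way_anova_f(stations)
h_stat = kruskal_wallis_h(stations)
```

F is compared with the F distribution with (k − 1, N − k) degrees of freedom, and H is approximately chi-square distributed with k − 1 degrees of freedom, where k is the number of groups and N the total number of observations.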
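The difference between the two coefficients can be sketched as follows: Spearman's coefficient is simply Pearson's coefficient computed on the ranks of the data. The function names and the example values are hypothetical and serve only as an illustration.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical concentrations of two metals in five samples: the relation
# is monotonic but not linear, so Spearman is 1 while Pearson is below 1.
metal_a = [1.0, 2.0, 3.0, 4.0, 5.0]
metal_b = [1.0, 4.0, 9.0, 16.0, 25.0]
r_pearson = pearson(metal_a, metal_b)
r_spearman = spearman(metal_a, metal_b)
```

Because it depends only on ranks, the Spearman coefficient makes no assumption about the distribution of the concentrations.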

Extent of heavy metals contamination
The results of the analysis of the target metal ions, i.e. Fe, Co, Ni, Cu, Zn, Cd and Pb, in samples from the 23 sampling stations are given in Table 3. The reported values are based on three replicate determinations.

Comparison of the concentration of heavy metals
To examine the frequencies of the concentrations of each metal in the samples, the Chi-Square test was applied [29]. Here, frequency means the number of times a given range of concentrations occurs, and the Chi-Square test is used to examine whether the observed frequencies differ significantly from those expected under the null hypothesis. This test indicates that there is no significant difference between the observed frequencies of the heavy metals.
The random and normal distribution assumptions were checked by the Runs and K-S methods, respectively. Another requirement for applying the ANOVA test is that the variances of the groups are equal. According to the Levene test, the variances differed significantly among the groups (Levene statistic = 5.696, P < 0.001). Although the Levene statistic rejects the null hypothesis that the group variances are equal, the ANOVA test can still be used, since it is reasonably robust to this violation when the groups are of equal size. Alternatively, homogeneity and normality can be approached by transforming the data to another mathematical form that reduces the differences between the values, for example by taking logarithms. In addition, one can use a non-parametric test, which does not require the homogeneity assumption.
The ANOVA method was used under two conditions. First, ANOVA was applied to the untransformed data even though homogeneity of variance had not been shown. Second, it was applied to the logarithmically transformed data, for which homogeneity was achieved. The non-parametric Kruskal-Wallis method, which is based on the ranks of the data, was used for the same purpose as ANOVA. Both the parametric and the non-parametric methods show statistically significant differences in heavy metal concentrations among the sampling sites.
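The effect of the logarithmic transformation on the homogeneity of variance can be sketched with the Levene statistic. The version below uses absolute deviations from the group means, which is assumed here to correspond to the statistic reported above; the data are hypothetical and were chosen so that the spread grows in proportion to the concentration level.

```python
import math


def levene_w(groups):
    """Levene statistic based on absolute deviations from the group means."""
    z = []
    for g in groups:
        mean = sum(g) / len(g)
        z.append([abs(v - mean) for v in g])
    k = len(z)
    n_total = sum(len(zi) for zi in z)
    z_means = [sum(zi) / len(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical triplicates at three stations (ug/L); each station has the
# same relative spread but a ten-fold higher level than the previous one.
raw = [[1.0, 2.0, 4.0], [10.0, 20.0, 40.0], [100.0, 200.0, 400.0]]
logged = [[math.log10(v) for v in g] for g in raw]
w_raw = levene_w(raw)      # large: variances differ strongly between stations
w_log = levene_w(logged)   # essentially zero: equal spread on the log scale
```

The statistic is compared with the F distribution with (k − 1, N − k) degrees of freedom; a large value, as for the untransformed data here, leads to rejection of the hypothesis of equal variances.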

Bivariate correlations of investigated heavy metals
To identify probable common sources of the metals in the water samples, the bivariate correlation procedure was used (Table 5). This procedure computes the pairwise associations for a set of metals and displays the results as a matrix. It is useful for determining the degree of association between the investigated metals. Because the obtained data were not normally distributed, the Spearman method was applied.

Classification of the investigated heavy metals by cluster analysis
Cluster analysis (CA) grouped the studied heavy metals into clusters (called groups in this study) on the basis of similarities within a group and dissimilarities between different groups. CA was performed on the data using Ward's method with squared Euclidean distance. The dendrogram produced by the cluster analysis is shown in Figure 2. The seven studied heavy metals were classified into five groups based on spatial similarities and dissimilarities.
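The merging rule behind Ward's method with squared Euclidean distance can be sketched in a few lines: at each step the two clusters whose fusion gives the smallest increase in the within-cluster sum of squares are merged. This is a naive illustration, not the SPSS implementation; the labels M1 to M5 and their profiles are hypothetical.

```python
def ward_clusters(profiles, k):
    """Merge labelled profiles into k clusters using Ward's criterion.

    profiles : dict mapping a label to a list of (ideally normalized) values
    Returns a sorted list of sorted label lists.
    """
    # Each cluster is stored as (member labels, centroid).
    clusters = [([label], list(vals)) for label, vals in profiles.items()]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                sq_dist = sum((a - b) ** 2 for a, b in zip(ci, cj))
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters[i] = (mi + mj, centroid)
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical two-value profiles for five variables.
profiles = {"M1": [0.0, 0.0], "M2": [0.0, 1.0], "M3": [10.0, 10.0],
            "M4": [10.0, 11.0], "M5": [50.0, 50.0]}
groups = ward_clusters(profiles, k=3)
```

In an analysis like the one described here, each label would be a metal and each profile its normalized concentrations across the sampling stations; recording the merge cost at each step gives the heights of the dendrogram.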

Principal component analysis and factor analysis
PCA reduces the dimensionality of the data by forming linear combinations of the original data to generate new latent variables that are orthogonal and uncorrelated to each other [32]. Prior to PCA and FA, the raw data were normalized to avoid misclassifications due to the different orders of magnitude and ranges of variation of the analytical parameters [30]. The rotation of the principal components was executed by the Varimax method with Kaiser normalization. Four principal components were obtained for the heavy metals through FA performed on the PCA. This indicates that four main controlling factors influenced the quality of groundwater in the study area. The corresponding components, variable loadings and variances are presented in Table 6. Only PCs with eigenvalues greater than 1 were considered. PCA of the whole data set yielded four components explaining 88.92% of the total variance. The first component, which explained 32.02% of the total variance, is correlated with Pb and Cu. The second component is due to Zn and Co. The third component is loaded only by Ni. The last extracted factor is related to Fe and Cd.
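The eigenvalue criterion used above (retain only components with eigenvalues greater than 1) can be illustrated with a minimal sketch that extracts the leading eigenvalues of a correlation matrix. Power iteration with deflation is used here only to keep the example dependency-free; it is not the algorithm of the statistical package, the Varimax rotation is not included, and the 4 × 4 correlation matrix is hypothetical.

```python
import math


def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dominant_eigenpair(m, iterations=2000):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary start vector
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v


def kaiser_components(corr):
    """Return (eigenvalue, eigenvector) pairs with eigenvalue > 1."""
    m = [row[:] for row in corr]
    size = len(m)
    kept = []
    for _ in range(size):
        value, vec = dominant_eigenpair(m)
        if value <= 1.0:
            break
        kept.append((value, vec))
        # Deflation: subtract the component just found before the next search.
        m = [[m[i][j] - value * vec[i] * vec[j] for j in range(size)]
             for i in range(size)]
    return kept


# Hypothetical correlation matrix: variables 1-2 and 3-4 form two pairs.
corr = [[1.0, 0.8, 0.0, 0.0],
        [0.8, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.6],
        [0.0, 0.0, 0.6, 1.0]]
components = kaiser_components(corr)
explained = sum(value for value, _ in components) / len(corr)
```

For a correlation matrix the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the fraction of the total variance explained by the component.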

Discussion
According to the results in Table 4, all of the samples contained Co, Cu and Zn at levels below their MCLs. In contrast, in 13.0% and 8.7% of the samples the amounts of Fe and Ni, respectively, were above the WHO MCLs. The amount of cadmium in 34.8% of the samples was below the detection limit of the DPP method, but 17.4% of the samples contained this metal ion above the ISIRI and WHO MCLs. This is of concern because cadmium has carcinogenic properties as well as a long biological half-life, leading to chronic effects as a result of accumulation in the liver and renal cortex. It can also cause kidney damage and produce acute health effects resulting from overexposure to high concentrations [20]. Because of the possible long-term effects of chronic exposure, the presence of lead in drinking water is a serious public health concern. Although all of the samples contained this metal, only 8.7% of them contained lead ions above the levels proposed by the WHO and ISIRI MCLs. The overall average concentration of heavy metals in the water samples varies as Zn > Fe > Cu ≈ Ni > Co > Pb > Cd. The results reveal that the amount of heavy metals depends on the sampling location.
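The percentages quoted above follow from simple counts over the 23 samples. The sketch below shows the arithmetic for a cadmium-like case; the individual concentrations and the threshold are hypothetical, and only the counts (15 detections and 4 exceedances out of 23) are chosen to be consistent with the reported 65.2% and 17.4%.

```python
def exceedance_summary(concentrations, mcl):
    """Percent of samples with a detectable value and percent above the MCL.

    concentrations : list of values, with None for results below the
                     detection limit
    """
    n = len(concentrations)
    detected = [c for c in concentrations if c is not None]
    above = [c for c in detected if c > mcl]
    return round(100 * len(detected) / n, 1), round(100 * len(above) / n, 1)


# Hypothetical values (ug/L) for 23 samples: 8 below the detection limit,
# 11 detected below the threshold and 4 above it.
samples = [None] * 8 + [0.5] * 11 + [4.0] * 4
pct_detected, pct_above = exceedance_summary(samples, mcl=3.0)
```

The share of samples below the detection limit is the complement of the detected share, here 8 of 23, or 34.8%.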
As shown in Table 5, the close relationships within the couples Fe/Pb, Cu/Pb, Cd/Pb, Co/Zn, Ni/Zn, Ni/Cd and Zn/Cd suggest a probable common source for each couple. A further statistical investigation was performed by testing the correlation between the determined concentrations of heavy metals and the distance of the sampling site from the NILZ Company. The calculated correlations (Cu/Pb and Zn/Cd) support a significant effect of NILZ Company activities as a main source of the heavy metal contamination observed in the investigated groundwater samples. In addition, the correlation between Fe and the depth of the wells (0.52) suggests that this metal is largely of pedo-geochemical origin, leached from the upper soil layers. Cluster analysis allows the identification of five clusters or groups of associated metals (Figure 2). On the basis of the similarities found for group 1 (Pb, Cu), an anthropogenic origin of the contamination can be suggested. The presence of iron in group 3 (Cd, Fe) probably indicates a mixed anthropogenic and pedo-geochemical source for the metals in this group. The remaining metals, Zn, Co and Ni, each formed a single-member group.
Also according to Table 6, Component 1 is attributed to lead and copper, both with positive loadings. These elements are important byproducts of the lead industry, which indicates an anthropogenic source. Component 2 explains 24.4% of the total variance and is positively loaded with Zn and negatively loaded with Co. Component 3 explains 17.4% of the total variance, is positively loaded with Ni, and may be attributed to the activities of oil industries near the NILZ Company. Component 4 explains 15.1% of the total variance and is positively loaded with Cd and Fe.
The grouping of the heavy metals was explored in a plot of the first three principal components (Figure 3). The low correlation found among the studied metal ions in the four components defined by FA suggests both anthropogenic and pedo-geochemical sources for the metal contamination.

Conclusion
Overall, the present study has shown that the groundwater source within a radius of 5 km from the National Iranian Lead and Zinc Company (NILZ) in Bonab Industrial Estate (Zanjan province, Iran) is contaminated by iron, cobalt, nickel, copper, zinc, cadmium and lead. This can be considered a hazard for people who daily consume this water, or vegetables and food crops irrigated from the same water source. The higher health risk comes from those elements that are present at levels above those announced by WHO and ISIRI, notably lead, nickel and cadmium. Multivariate statistical techniques have shown correlations and similarities among the investigated heavy metals and have allowed these ions to be classified into groups. Cluster analysis has identified five clusters among the heavy metals. The statistical investigations reveal that the pollution sources influencing water quality in the study area are anthropogenic (with a very high contribution from the NILZ Company) and, for Fe and Cu, pedo-geochemical. The results suggest a significant risk to the population of Zanjan city and its neighborhoods, given the toxicity of the studied metals and the fact that this aquifer is by far the main source of their drinking water and irrigation. This study has also highlighted the need for further research and regular monitoring in order to determine the permitted levels of metals in the studied aquifer.