3D-QSAR CoMFA study of some Heteroarylpyrroles as Possible Anticandida Agents

A three-dimensional quantitative structure-activity relationship study using the comparative molecular field analysis method was performed on a series of 3-aryl-4-[α-(1H-imidazol-1-yl) aryl methyl] pyrroles for their anticandida activity. The study used 40 compounds; comparative molecular field analysis models were developed using a training set of 33 compounds. Database alignment of all 33 compounds was carried out by root-mean-square fit of atoms and field fit of the steric and electrostatic molecular fields. The resulting database was analyzed by partial least squares analysis, first with leave-one-out cross-validation to extract the optimum number of components and then with no validation. The analysis was then repeated with bootstrapping to generate the quantitative structure-activity relationship models. The predictive ability of the comparative molecular field analysis model was evaluated using a test set of 7 compounds. The 3D quantitative structure-activity relationship model demonstrated a good fit, with an r² value of 0.964 and a cross-validated r² value of 0.598. Further comparison of the coefficient contour maps with the steric and electrostatic properties of the receptor has shown a high level of compatibility and good predictive capability.

Fungal infections in humans range from superficial and cutaneous (such as dermatomycosis) to deeply invasive and disseminated (such as candidiasis and cryptococcosis) infections. In the past 20 years, fungal infections have increased dramatically, paradoxically as risk advances. Fungal infections occur more frequently in people whose immune system is suppressed (because of organ transplantation, cancer chemotherapy, or the acquired immune deficiency syndrome) or who have been subject to invasive procedures (catheters, prosthetic devices) 1. Fungal infections are now important causes of morbidity and mortality among immunocompromised hospitalized patients. The frequency of invasive candidiasis has increased ten-fold during the past decade 2.
Candida albicans (CA) has been identified as the major opportunistic pathogen in the etiology of fungal infections; however, the frequency of other Candida species is increasing 3. The current standard therapies are the fungicidal (but toxic) polyene antibiotic amphotericin B and the safer (but fungistatic) azoles. In particular, the azoles are an important antifungal class widely used for AIDS-related mycotic pathologies 4.
Quantitative structure-activity relationship (QSAR) analysis enables investigators to establish reliable quantitative structure-activity and structure-property relationships and to derive in silico QSAR models that predict the activity of novel molecules prior to their synthesis. The overall process of QSAR model development can be divided into three stages, namely data preparation, data analysis, and model validation, which represent standard practice in QSAR modeling. 3D-QSAR methodologies have been applied successfully to generate models for various chemotherapeutic agents 5,6.
We have carried out 3D-QSAR studies employing comparative molecular field analysis 5 (CoMFA) techniques to gain further insight into the correlation between structure and biological activity of 3-aryl-4-[α-(1H-imidazol-1-yl) aryl methyl] pyrroles as potent anticandida agents 7. The MIC data were used as the dependent parameter in the QSAR analysis after conversion to the negative logarithm of MIC (pMIC), with MIC expressed in µM/ml (Table 1).
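The conversion of MIC values to pMIC can be illustrated with a minimal sketch. It assumes that the MIC is first expressed as a molar concentration and that pMIC is its negative base-10 logarithm; the function name and the example MIC values are hypothetical and are not taken from Table 1.

```python
import math


def pmic_from_micromolar(mic_um: float) -> float:
    """Return pMIC = -log10(MIC in mol/L) for an MIC given in micromolar.

    Assumption: the MIC is converted to mol/L before taking the logarithm.
    """
    if mic_um <= 0:
        raise ValueError("MIC must be positive")
    return -math.log10(mic_um * 1e-6)


# Hypothetical MIC values, for illustration only
for mic in (0.5, 4.0, 32.0):
    print(f"MIC = {mic:5.1f} uM -> pMIC = {pmic_from_micromolar(mic):.2f}")
```

On this scale a more potent compound (lower MIC) receives a higher pMIC, which is why the transformed value is convenient as a dependent variable.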

Molecular modeling:
A database of 33 compounds forming the training set was generated by molecular modeling. All molecular modeling and 3D-QSAR studies were performed using SYBYL 6.7 15 with the TRIPOS force field 16 on a Silicon Graphics O2 workstation with the IRIX operating system. Because crystallographic data for these ligand complexes were not available, all the molecules were constructed using a grid with a spacing of 1.54 Å between grid points. This is the default spacing, which represents the sp3 carbon-carbon bond length. The molecules were cleaned up and quick minimized after sketching. Because no experimental data on the biologically relevant conformations of the selected compounds were available (for example, atomic coordinates derived from X-ray crystallographic studies of their complexes with the putative receptor), we resorted to a general molecular modeling approach (AM1) to build the conformational models used for generation of the CoMFA models. After adding hydrogens, a chirality check was performed to identify chiral atoms; it was important to consider all possible enantiomers because the activity was reported for racemic mixtures. The molecules were then subjected to energy minimization (geometry optimization) at a gradient of 1.0 kcal/mol with a delta energy change of 0.001 kcal/mol using the standard TRIPOS force field. Structures were drawn using the default settings of SYBYL.
After conformational analysis using the AM1 Hamiltonian, the lowest-energy conformation was selected, saved, and used for the charge calculation.
In the CoMFA method, introduced by Cramer 8,9, a relationship is established between the biological activities of a set of compounds and their steric and electrostatic properties. An advantage of CoMFA is its ability to predict the biological activity of molecules and to represent the relationship between steric and electrostatic properties and biological activity in the form of contour maps 10. An 'active conformation' of the ligands is generated and superimposed according to predefined rules. These molecules are then placed in a box of predefined grid size. The steric and electrostatic interaction energies between each structure and a probe atom of defined size and charge are calculated at each grid point using molecular mechanics force fields. A multivariate data analysis technique such as partial least squares (PLS) 11-13 is used to derive a linear equation from the resulting matrices. PLS is used in combination with cross-validation to obtain the optimum number of components. This ensures that the QSAR models are selected on their ability to predict the data rather than to fit the data. The advantages of CoMFA studies lie in the ability to predict the target properties of the compounds and to present the QSAR graphically in the form of coefficient contour maps 14.
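The grid-based field calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the SYBYL implementation: the probe charge of +1, the 2 Å lattice spacing, the 30 kcal/mol energy truncation, the constant dielectric, and the single set of Lennard-Jones parameters (`eps`, `r_min`) are all assumed values, and the two-atom "molecule" is hypothetical.

```python
import itertools
import math

COULOMB = 332.06  # kcal*Angstrom/(mol*e^2): converts q1*q2/r to kcal/mol
CUTOFF = 30.0     # assumed energy truncation, kcal/mol


def grid_points(lo, hi, spacing):
    """Return the (x, y, z) lattice points of a cubic box from lo to hi."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    return list(itertools.product(axis, axis, axis))


def clamp(value, limit=CUTOFF):
    """Truncate an energy to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def probe_fields(atoms, point, probe_charge=1.0, eps=0.1, r_min=3.4):
    """Steric (Lennard-Jones 6-12) and electrostatic (Coulomb) energies
    between a probe at `point` and every atom (x, y, z, charge)."""
    steric = 0.0
    electro = 0.0
    for x, y, z, charge in atoms:
        # Avoid division by zero when a lattice point coincides with an atom
        r = max(math.dist(point, (x, y, z)), 1e-3)
        ratio6 = (r_min / r) ** 6
        steric += eps * (ratio6 * ratio6 - 2.0 * ratio6)
        electro += COULOMB * probe_charge * charge / r
    return clamp(steric), clamp(electro)


def comfa_row(atoms, points):
    """One descriptor row: all steric values followed by all electrostatic."""
    fields = [probe_fields(atoms, p) for p in points]
    return [s for s, _ in fields] + [e for _, e in fields]


# Hypothetical two-atom "molecule": (x, y, z, partial charge)
toy = [(0.0, 0.0, 0.0, -0.3), (1.54, 0.0, 0.0, 0.3)]
points = grid_points(-4.0, 4.0, 2.0)
row = comfa_row(toy, points)
print(len(points), len(row))  # 125 250
```

Stacking one such row per aligned molecule gives the descriptor matrix that is passed to PLS, with the pMIC values as the dependent column.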
We present here 3D-QSAR studies using the CoMFA method on a series of 3-aryl-4-[α-(1H-imidazol-1-yl) aryl methyl] pyrroles; the contour maps derived reveal the significance of the steric and electrostatic fields. The structural variations in the molecular fields at particular regions in space indicate the underlying structural requirements, and the 3D-QSAR models generated show good predictive ability and can aid in the design of potent anticandida agents.

Biological activity data:
The antifungal activity data against Candida albicans for a series of 40 3-aryl-4-[α-(1H-imidazol-1-yl) aryl methyl] pyrroles were used in this analysis. The general structure of the compounds is shown in fig. 1. The training set was formed by selecting 33 compounds from the original series. The test set comprised compounds no. 11, 12, 33, 34, 35, 37 and 42 (7 compounds in total), selected randomly. These compounds were not included in the analysis used to generate the CoMFA model. The robustness and predictive ability of the models were evaluated using test compounds whose biological activity and chemical class were similar to those of the training set.
Model B was obtained using MOPAC charges. From the results it can be observed that both models are statistically significant; however, Model B was considered better owing to its higher correlation coefficient and Fisher's statistic (F value).

Prediction of Activity:
The 3-D QSAR model obtained as Model B was used to predict the activity of the 33 compounds in the training set. The results are shown in Table 3. From the table, it can be seen that the predicted activities are very close to the experimental activities, with minimal residuals.

CoMFA contour maps:
The QSAR produced by CoMFA was represented as a 3-D coefficient contour map. For the charge calculation, the selected conformation was assumed to be the active one. We used two different types of charges, calculated using the Gasteiger-Marsili method and the semi-empirical MOPAC method 17.

Partial Least Squares analysis (PLS):
The PLS analyses were performed following standard protocols 18. To speed up the analysis and reduce noise, column filtering was applied, excluding columns with a variance smaller than 2.0. Equal weights were assigned to the steric and electrostatic descriptors using the CoMFA scaling option.
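The column filtering and leave-one-out cross-validation steps can be sketched as follows. This is a simplified illustration, not the SYBYL procedure: the filter uses the population variance (an assumption), the PLS model is restricted to a single component, and the descriptor matrix and activities are hypothetical toy values.

```python
import math


def column_filter(X, min_variance=2.0):
    """Indices of columns whose population variance is >= min_variance."""
    n = len(X)
    keep = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        if sum((v - mean) ** 2 for v in col) / n >= min_variance:
            keep.append(j)
    return keep


def fit_pls1(X, y):
    """One-component PLS1 fit: returns (x_means, y_mean, weights, slope)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector is proportional to X'y on the centered data
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    slope = sum(t[i] * yc[i] for i in range(n)) / (sum(v * v for v in t) or 1.0)
    return x_means, y_mean, w, slope


def predict_pls1(model, x):
    """Predict the activity of one descriptor row with a fitted model."""
    x_means, y_mean, w, slope = model
    score = sum((x[j] - x_means[j]) * w[j] for j in range(len(x)))
    return y_mean + slope * score


def loo_q2(X, y):
    """Leave-one-out cross-validated r2: 1 - PRESS / total sum of squares."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict_pls1(model, X[i])) ** 2
    return 1.0 - press / sum((v - y_mean) ** 2 for v in y)


# Hypothetical descriptors: column 0 varies, column 1 is nearly constant
X = [[0.0, 5.0], [2.0, 5.1], [4.0, 4.9], [6.0, 5.0], [8.0, 5.1]]
y = [1.0 + 0.5 * row[0] for row in X]
keep = column_filter(X)
X_filtered = [[row[j] for j in keep] for row in X]
q2 = loo_q2(X_filtered, y)
print(keep, round(q2, 3))
```

In practice the cross-validation is repeated for an increasing number of components, and the number giving the best cross-validated r² is then used for the final non-validated model.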

CoMFA Results:
Two CoMFA models were generated using different types of partial atomic charges; the results are shown in Table 2. Model A was derived using charges calculated according to the Gasteiger-Marsili method. The CoMFA contour maps generated for Model B were used to explain the structure-activity relationship of the antifungal drugs.
In the CoMFA contour maps, regions of high and low steric tolerance are shown as green and yellow polyhedra, respectively. The CoMFA electrostatic fields are shown as blue and red polyhedra in fig. 2. Low electron density within the inhibitors near the blue polyhedra increases activity, whereas near the red polyhedra it decreases activity.
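Coefficient contour maps of this kind are built from the product of each field column's standard deviation and its PLS coefficient at every lattice point. A minimal sketch of that product is given here; the use of the population standard deviation, the function name, and the toy matrix and coefficients are assumptions made for illustration.

```python
import statistics


def stdev_times_coeff(X, coefficients):
    """Per-column product of standard deviation and PLS coefficient.

    X holds one row per molecule and one column per lattice point of a
    single field; `coefficients` holds the matching PLS coefficients.
    """
    columns = list(zip(*X))
    return [statistics.pstdev(col) * c for col, c in zip(columns, coefficients)]


# Hypothetical field values for three molecules at two lattice points
X = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
coefficients = [0.5, -0.2]
contour_values = stdev_times_coeff(X, coefficients)
print(contour_values)
```

A lattice point where the field does not vary across the series contributes nothing to the map, whatever its coefficient, which is why the product rather than the raw coefficient is contoured.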

RESULTS AND DISCUSSION
The validity of Model B was further supported by bootstrapping. Bootstrapping of 10 runs gave an r² of 0.986±0.005 with a very low standard error of 0.046±0.007, which adds to the confidence in this analysis. It can be seen that both steric and electrostatic fields contributed to the QSAR equation, by 42.8 and 57.2%, respectively. This suggests that variation in the antifungal activity is predominantly determined by electrostatic properties. Thus, the results suggest good internal consistency in the data set underlying Model B.
To visualize the CoMFA steric and electrostatic fields from the PLS analysis, contour maps of the product of the standard deviation associated with each CoMFA column and its coefficient (SD × Coeff) at each lattice point were generated. The contour maps were plotted as percentage contribution to the QSAR equation and were associated with differences in biological activity.
In the statistical summary, r² CV is the cross-validated r², Nopt is the optimum number of components, SEP is the standard error of prediction, r² conventional is the non-cross-validated r², SE is the standard error of estimate, r² BS is from 100 bootstrapping runs, F value is Fisher's statistic, P value is the probability of r² = 0, and SD BS is the bootstrapping standard deviation.
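The bootstrapping step can be illustrated with a small sketch that resamples the data with replacement, refits a model on each resample, and reports the mean and standard deviation of r². For brevity a one-variable least-squares line stands in for the PLS model; the data, the seed, and the function names are hypothetical.

```python
import random
import statistics


def r_squared(xs, ys):
    """r2 of a least-squares line fitted to (xs, ys); None if degenerate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None
    return sxy * sxy / (sxx * syy)


def bootstrap_r2(xs, ys, runs=10, seed=0):
    """Mean and standard deviation of r2 over `runs` bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    values = []
    while len(values) < runs:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        r2 = r_squared([xs[i] for i in idx], [ys[i] for i in idx])
        if r2 is not None:  # skip resamples with no variation
            values.append(r2)
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical descriptor scores and activities
scores = [0.1, 0.9, 2.1, 2.8, 4.2, 5.1, 5.8, 7.2]
activity = [4.9, 5.3, 5.6, 6.1, 6.4, 7.1, 7.2, 7.9]
mean_r2, sd_r2 = bootstrap_r2(scores, activity, runs=10)
print(f"r2 = {mean_r2:.3f} +/- {sd_r2:.3f}")
```

A small standard deviation across resamples indicates that the fit does not depend strongly on any particular subset of compounds.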
Model B performed exceptionally well in predicting the activity of most compounds in the test set. It must be emphasized that the molecular alignment and conformations used in this study were selected in the absence of X-ray crystallographic coordinates of these molecules; even so, the CoMFA model generated in the study showed very good predictive capability. From Table 4, it can be observed that the predictions made using the CoMFA model were satisfactory in most cases. In general, the percentage difference in the predicted activity of the synthesized compounds ranges from 1.2 to 28.5%. The relative difference in the prediction is not unexpected and is within acceptable limits.
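The percentage difference between predicted and experimental activity can be computed as in the sketch below. The exact definition used for Table 4 is not stated, so the formula (absolute residual relative to the experimental value) is an assumption, and the compound labels and activities are hypothetical.

```python
def percent_difference(experimental: float, predicted: float) -> float:
    """Absolute residual as a percentage of the experimental activity."""
    if experimental == 0:
        raise ValueError("experimental activity must be non-zero")
    return abs(predicted - experimental) / abs(experimental) * 100.0


# Hypothetical test-set entries: (label, experimental pMIC, predicted pMIC)
test_set = [("cpd A", 5.00, 4.94), ("cpd B", 4.20, 5.10), ("cpd C", 5.60, 5.45)]
for label, exp_val, pred_val in test_set:
    residual = exp_val - pred_val
    print(f"{label}: residual = {residual:+.2f}, "
          f"difference = {percent_difference(exp_val, pred_val):.1f}%")
```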
From these results, it is inferred that the 3-D QSAR model generated in this study has the potential to predict the activity of diverse compounds belonging to a similar structural class. Investigations concerning the design of new chemical entities based on the proposed CoMFA models, with prediction of their antifungal activity prior to synthesis, will be part of our forthcoming communication.
The data indicate the difference between predicted and experimental activities of the compounds used in the test set, along with percentage residual activities. Predicted activities were obtained using Model B.