Simulation Studies, 3D QSAR and Molecular Docking on a Point Mutation of Protein Kinase B with Flavonoids Targeting Ovarian Cancer

Cancer is among the world's most dreaded diseases, and its prevalence is increasing globally. The study of integrated molecular networks is crucial for understanding the basic mechanisms of cancer cells and their progression. In the present investigation we used in-silico approaches to examine different flavonoids that target the protein kinase B (AKT1) protein, which plays a role in cross-talk cell signalling and metabolic processes, in order to assess their anticancer efficiency. Molecular dynamics simulation (MDS) was performed to analyse and evaluate the stability of the complexes under physiological conditions, and the results were congruent with molecular docking. This investigation revealed the effect on AKT1 of a point mutation (W80R), selected based on its frequency of occurrence. Ligands with high docking scores and favourable behaviour in dynamic simulations are proposed as potential W80R inhibitors. A virtual screening analysis was performed with 12,000 flavonoids satisfying Lipinski's rule of five, which predicts from pharmacological and biological properties whether a compound is likely to be active when taken orally. The pharmacokinetic ADME (absorption, distribution, metabolism and excretion) studies characterised drug-likeness. Subsequently, a statistically significant 3D-QSAR model with a high correlation coefficient (R2) of 0.992 and a cross-validation coefficient (Q2) of 0.6132 at four PLS (partial least squares) components was used to verify the accuracy of the models. The molecular dynamics simulation showed that the compound taxifolin (IUPAC name 2-(3,4-dihydroxyphenyl)-3,4-dihydro-2H-chromene-5,7-diol, C15H14O5, CID 443637) gave the best interaction, with a docking score of -9.63 kcal/mol, and exhibited binding affinity for the W80R mutant protein. This suggests that the natural inhibitor can be considered for experimental evaluation, which may provide targeted insights for new drug combinations in network pharmacology.
RMSD values for trajectory stability and conformational drifts were observed in the W80R protein. The results supported a molecular cause in the mutant form that resulted in a gain of ovarian cancer. However, experimental evaluation or in vivo studies are recommended for further validation.


Introduction
Ovarian cancer is the most lethal gynaecological malignancy and ranks as the fifth leading cause of cancer deaths in females 1. An estimated 22,530 cases and approximately 13,980 deaths occurred in the United States in 2019 1. Ovarian cancers are categorized into three types based on cell origin: epithelial, stromal and germ cell 2. The low survival rate and poor prognosis of ovarian cancer are due to a lack of screening methods at the early stages and ineffective treatments for advanced stages of the disease 3. Moreover, it is crucial to dissect the role of the tumor microenvironment during the early stage, proliferation, and metastasis. It is therefore paramount to understand the root cause from the perspectives of molecular pathogenesis, histological subtypes, hereditary factors, epidemiology, methods of treatment and diagnostics. The Cancer Genome Atlas (TCGA) revealed that the expression of AKT1, AKT2 and AKT3 was associated with poor patient survival 4. The leading cause of the disease is genetic and epigenetic change in the cellular genome. Numerous small drug molecules have therefore been directed at AKT gene mutations and related targets such as FOXO, glucose metabolism (GSK3) and apoptotic proteins (BAD, NF-kB, FKHR); cell cycle arrest, apoptosis and DNA repair (MDM2) are also critical in disease progression. Among the various kinases, overexpression of the AKT1 protein and its associated mutations plays a deciding role in cross-talk cell signalling in cancer. Recent studies have introduced assorted therapeutic agents specific for the cancer-driving factors involved in ovarian cancer development. One such factor of the kinase family is protein kinase B, a serine/threonine kinase that serves as a decisive mediator of the PI3K/AKT/mTOR cell signaling pathway, which has distinct physiological functions such as cell growth, survival, proliferation, and metabolism 5.
Structurally, AKT1 consists of three domains: an N-terminal pleckstrin homology (PH) domain, a central catalytic kinase domain, and a C-terminal domain 6. AKT1 is the kinase that connects upstream signals from PI3K and mammalian target of rapamycin complex 2 (mTORC2) with downstream signals to mTORC1 and effectors such as mTOR and GSK3b, along with a phosphorylation cascade whose substrates induce cell cycle progression, protein synthesis, lipid and protein phosphatases, glucose metabolism and cell growth 7. AKT1 is mutated and AKT2 is amplified in about 40%. AKT1 is inhibited by tumor suppressors including phosphatase and tensin homolog (PTEN) and inositol polyphosphate 4-phosphatase type II (INPP4B) 8,9,10. Therefore, targeting the ATP-binding cleft of AKT with inhibitors (natural or synthetic) has become an attractive strategy for treating patients with ovarian cancer. Interestingly, AKT1 inhibitors showed stronger binding affinity for mutant forms than for the native form. However, the emergence of acquired drug resistance in patients has limited their use in the last phase of clinical trials. In ovarian cancer, overexpression of AKT is associated with advanced-stage platinum resistance 11. As an isoform of the AKT family, AKT1 is overexpressed in a wide assortment of human cancers, including breast and ovarian cancers 12,13. This study examines the substitution mutation from tryptophan to arginine at residue position 80. The underlying molecular mechanism is assumed to cause conformational changes in the native protein structure (AKT1) that modify covalent bond interactions and limit the practical application of inhibitors. There is therefore a need to search for and develop novel agents and regimens that can counteract the drug resistance induced by the AKT1 gene. The molecular interactions and atomic stability of W80R are treated as novel and form the basis of the present study.
W80R results in increased repression of FOXO3 compared with wild-type AKT1 in an in vitro assay, which is predicted to result in a gain of AKT1 protein function. FOXO is a nuclear transcription factor that induces CGN2 transcription in epithelial ovarian cancer cells with enhanced β-catenin activity. The absence of Wnt ligand dissociates β-catenin from the destruction complex, and it translocates to the nucleus, where it acts with the FOXO3 factor, which is known to play a role in the W80R protein pathway. Abnormal activation of this pathway leads to hyperactivation of β-catenin, which has been reported in ovarian cancers. W80R is one of the reported cancer mutants of AKT1. It is a missense driver mutation, c.238T>C in the coding sequence (CDS; the change in the nucleotide sequence resulting from the mutation, written with the same syntax as is used for the peptide sequence); the CDS mutation c.238T>A, at gene location 14q32.33 14, occurs in the uterus and causes endometrial cancer. PolyPhen-2 predicts W80R to be damaging at highly conserved residues; it is a substitution (missense) variant in the PI3K/AKT1/mTOR pathway that affects an exon of the PH protein domain (the UniProt Consortium 2019), with a SIFT prediction of 3 12. The combined mutant W80R-Q79K displayed very strong membrane localization and hyperactivation in transfected HeLa cells, in both the presence and absence of serum, under fluorescence microscopy 15. Previous studies found co-occurring AKT1 mutations (such as Q79K-W80R) to be as hyperactive as the E17K mutant and widely distributed in different tissues, such as endometrium (homozygous and heterozygous), large intestine (caecum), prostate (heterozygous) and breast cancers, involving cross-talk signaling pathways 16.
The deleterious mutations of AKT1 (E17K and W80R) were concluded to be of functional relevance exclusively in myxoid tumors 17. These mutations promote growth factor-independent cell proliferation compared with wild-type AKT1 18. AKT1 gene alterations account for most of the genetic drive contributing to pulmonary sclerosing haemangioma, a benign tumor 19. In patients receiving genomically targeted therapy, the W80R mutant was associated with a clinical benefit of stable disease (SD) for 4+ months, responding to the synthetic drugs temsirolimus and ixabepilone targeting ovarian granulosa cells 20. In line with this, the inhibition of AKT1 or its mutant proteins has been recognized as a compelling strategy for the treatment of cancers 21, as these proteins induce ovarian tumor angiogenesis 22 and take part in immune evasion 23.
Despite enormous progress in anticancer drug discovery, resistance to existing chemotherapeutic drugs and to novel compounds has developed, along with side effects. Hence, more targeted strategies with greater sensitivity and specificity are required. Most successful anticancer compounds originated from natural sources or are analogues of natural products. Flavonoids are naturally occurring polyphenolic secondary metabolites with multiple therapeutic benefits. They are low-molecular-weight, non-nitrogenous compounds with a C6-C3-C6 backbone, divided into different classes 24, and their activities are structure-dependent. Chemically, flavonoid activity depends on structural class, degree of hydroxylation, substitutions and conjugations, and degree of polymerization 25. Several mechanisms have been proposed for the effect of flavonoids at the initiation and promotion stages of carcinogenicity, including influences on development and hormonal activities 26. Flavonoids fall into six categories based on functional group: flavones (luteolin, apigenin), flavonols (quercetin, kaempferol), flavanones (naringenin), flavanonols (taxifolin), isoflavones, and flavan-3-ols (genistein, epicatechin, catechin, wedelolactone, ellagic acid, silibinin, folstein, parthenolide, oridonin, curcumin, resveratrol). This study focuses on the flavonoid family, which has a tremendous variety of pharmacological and biochemical effects, including hepatoprotective, antidiabetic, cardioprotective, anti-tumor, neuroprotective, and anti-inflammatory activity, and plays a role in the prevention of Alzheimer's disease 27. Earlier investigations in this area required a series of chemical methods and animal models to synthesize lead compounds, with greater time, investment, and level of exposure.
To overcome this issue, computational approaches have been developed that reliably predict mutation-induced drug resistance and help design resistance-evading drugs. In view of the shortfalls mentioned above, the present study focuses on molecular dynamics simulation and molecular docking of taxifolin targeting the W80R mutant of protein kinase B (AKT1) in ovarian cancer, with the aim of designing therapeutics. This computational study relies on learning and pattern classification methods (phylogeny, neural systems, vector machines, and FATHMM servers), which can classify mutations and create 3D protein structures.

Sequence retrieval and structure analysis of selected protein
The amino acid sequence of the AKT1 protein was retrieved from the UniProt database with accession number P31749. The primary structure of the protein was analysed using the ProtParam tool 28 29 of the ExPASy server, and the differences between the physical and chemical properties of the wild-type AKT1 protein and the mutant (W80R) were evaluated. Factors such as physicochemical properties, molecular weight, theoretical pI (isoelectric point), half-life, instability index (II), aliphatic index (AI), extinction coefficient (EI), grand average of hydropathy (GRAVY), and site of origin were analyzed. Secondary structure prediction was carried out with the RAMPAGE server, which provides configuration scores such as the total number of helices, turns and coils, and the predicted solvent accessibility, which ranges from 0 (highly buried) to 9 (exposed) depending on residue exposure. The normalized B-factor is reported for the selected protein as a Z score that combines template-based and profile-based prediction; residues with values higher than zero are considered less stable in experimental structures. The mutant protein W80R was created by manually editing the amino acid at position 80 and was submitted to homology modelling.
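As a rough illustration of the sequence editing and two of the ProtParam-style descriptors mentioned above, the following sketch applies a point substitution and computes GRAVY (mean Kyte-Doolittle hydropathy) and the aliphatic index (Ikai's formula). The function names and the five-residue toy peptide are hypothetical; the real AKT1 sequence is not reproduced here, and the ProtParam server itself was the tool used in the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_point_mutation(sequence, wild, position, mutant):
    """Return a copy of `sequence` with `wild` replaced by `mutant`
    at the 1-based `position` (for example W, 80, R for W80R)."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != wild:
        raise ValueError(f"expected {wild} at position {position}, "
                         f"found {sequence[index]}")
    return sequence[:index] + mutant + sequence[index + 1:]


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / len(sequence)
    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptide, used only to show the calls.
toy_wild = "MAWLK"
toy_mutant = apply_point_mutation(toy_wild, "W", 3, "R")
print(toy_mutant, round(gravy(toy_wild), 3), round(gravy(toy_mutant), 3))
```

Replacing tryptophan with the strongly hydrophilic arginine lowers GRAVY, while the aliphatic index is unchanged because neither residue contributes to it.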

Homology modelling
The 480-residue W80R protein sequence was used to identify an appropriate template for structure modelling and functional prediction of the protein. The modelling depends mainly on a sequence alignment between the target and a template sequence whose structure has been experimentally determined. Based on the template-target alignment, the 3D structure of the target protein was visualized with the PyMOL tool. The theoretical structural models of the W80R protein were ranked by normalized discrete RMSD values, and the model with the lowest RMSD score was considered the best model 30.

Evaluation of the structure model
The quality of the AKT1 and mutant W80R models was assessed with several tools to test their stability and reliability. The PROCHECK suite 31, which quantifies the residues in favourable zones of the Ramachandran plot, was used to evaluate the stereochemical quality of the model. The ERRAT tool 32, which finds the overall quality factor of the protein, was used to check the statistics of non-bonded interactions between different atom types. The compatibility of the atomic model (3D) with its amino acid sequence was determined using the VERIFY 3D program. Swiss PDB Viewer 4.1.07 was used for the energy minimization of the predicted AKT1 protein and its mutant form. The W80R model was further submitted to the structural analysis and verification server to evaluate its quality before and after energy minimization. The ProSA tool 33 was employed for the refinement and validation of the minimized structure, to check the native protein folding energy. The superimposition of the proposed models of the AKT1 protein and its mutant form on the closest structural homolog was carried out using Chimera 1.11 34.

Selection and preparation of ligands
A natural compounds database containing more than 12,000 ligands targeting the AKT1 protein family was downloaded from the PubChem library 35 and subjected to ligand preparation with the LigPrep wizard of Maestro 9.3 36. LigPrep was used to prepare high-quality ligands by adding hydrogens, converting 2D to 3D structures, correcting bond angles and bond lengths, and generating low-energy structures, stereochemistries and ring conformations, followed by minimization with the OPLS 2005 (optimized potentials for liquid simulations) force field 37,38. Ionization states were left unchanged, tautomers were not generated, and specified chiralities were retained. Compounds were selected based on the lowest energy.

Preparation of protein molecule and active site prediction
The protein was prepared using the Protein Preparation Wizard of the Schrödinger Suite by adding hydrogen atoms, optimizing hydrogen bonds, and verifying the protonation states of His, Gln, and Asn. Energy minimization was carried out with an RMSD constraint of 0.3 Å and the OPLS 2005 force field. The SiteMap tool was used to identify binding pockets of the W80R protein 39.

Receptor grid generation
Receptor grid generation was done with the Glide application 40. The receptor grid for W80R was generated using the active site residues identified by the SiteMap tool. Once the grid had been generated, the ligands were docked to the protein (W80R) using the Glide version 5.8 (Grid-based Ligand Docking with Energetics) docking protocol. The scaling factor (0.25) and partial charge (1 Å) represent the cut-offs for van der Waals radius scaling.

Molecular docking
Molecular docking was carried out consistently using the protein prepared in Schrödinger 41 and the grid defined on the active site of the protein. The GLIDE molecular docking tool uses computational simulation methods to evaluate particular poses and ligand flexibility. GLIDE is a systematic approach for rapid, accurate molecular docking; its output, the G-score, is an empirical scoring function that combines diverse attributes. The G-score is calculated in kcal/mol and encompasses ligand-protein interaction energies, hydrophobic interactions, hydrogen bonds, internal energy, pi-pi stacking interactions, root mean square deviation (RMSD), and desolvation. The XP visualizer module of GLIDE analyses the specific ligand-protein interactions. The ligands were docked in Extra Precision (XP) mode and conformers were evaluated using the Glide (G) score. The G score is calculated as follows:

$$\text{GScore} = a \cdot \text{vdW} + b \cdot \text{Coul} + \text{Lipo} + \text{Hbond} + \text{Metal} + \text{BuryP} + \text{RotB} + \text{Site}$$

where vdW denotes the van der Waals energy, Coul denotes the Coulomb energy, Lipo denotes lipophilic contact, Hbond indicates hydrogen bonding, Metal indicates metal binding, BuryP indicates the penalty for buried polar groups, RotB indicates the penalty for freezing rotatable bonds, Site denotes polar interactions in the active site, and a = 0.065 and b = 0.130 are the coefficients of vdW and Coul.
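The G-score described above can be read as a weighted sum in which only the van der Waals and Coulomb terms carry coefficients (a = 0.065, b = 0.130) and the remaining terms enter with unit weight. The sketch below assumes that form; the function name and the term values in the example are hypothetical and are not Glide output.

```python
def glide_score(vdw, coul, lipo, hbond, metal, bury_p, rot_b, site,
                a=0.065, b=0.130):
    """Weighted sum of the G-score terms (all in kcal/mol).

    Assumes GScore = a*vdW + b*Coul + Lipo + Hbond + Metal
                     + BuryP + RotB + Site.
    """
    return (a * vdw + b * coul + lipo + hbond + metal
            + bury_p + rot_b + site)


# Hypothetical term values for a single docked pose (illustration only).
pose_terms = dict(vdw=-40.0, coul=-10.0, lipo=-2.0, hbond=-1.5,
                  metal=0.0, bury_p=0.5, rot_b=0.3, site=-0.2)
print(round(glide_score(**pose_terms), 2))
```

More negative totals correspond to more favourable predicted binding, which is why poses are ranked from the lowest score upwards.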

ADME properties studies
The absorption, distribution, metabolism, excretion, and toxicity (ADME/T) properties of the best-docked ligand molecules were calculated with the QikProp software. This software predicts various descriptors such as QPlogPo/w, QPlogBB, SASA, FOSA, FISA, PISA, WPSA, volume, donorHB, acceptorHB, dip^2/V, AC*DN*5, Caco, QPlogS, rotors, rule of 5, rule of 3, and the overall percentage of human oral absorption 42. Lipinski's rule of five 43 measures drug-likeness, predicting whether a chemical compound is likely to be an orally active drug based on its biological and pharmacological properties.
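A minimal sketch of a rule-of-five check is given below, assuming the commonly stated thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) and the usual convention of tolerating one violation. The function names and the descriptor values in the example are hypothetical; in the study these descriptors come from QikProp.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        mol_weight > 500.0,   # molecular weight in Da
        log_p > 5.0,          # octanol/water partition coefficient
        h_donors > 5,         # hydrogen bond donors
        h_acceptors > 10,     # hydrogen bond acceptors
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors,
                        max_violations=1):
    """True if the compound exceeds at most `max_violations` limits."""
    count = lipinski_violations(mol_weight, log_p, h_donors, h_acceptors)
    return count <= max_violations


# Hypothetical descriptor values for a small polyphenol (illustration only).
example = dict(mol_weight=300.0, log_p=1.2, h_donors=5, h_acceptors=7)
print(passes_rule_of_five(**example))
```

Setting `max_violations=0` gives the stricter reading in which every limit must be satisfied.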

Analysis of cancer-associated mutants
The deleterious, cancer-specific W80R mutation was assessed using the FATHMM server (http://fathmm.biocompute.org.uk/) 44, which distinguishes between cancer-promoting (driver) mutations and other germline polymorphisms. The gene identifier (UniProt ID) together with the mutant form, as text, was provided as the input for the prediction.

Molecular alignment and 3D QSAR studies and validation
The key component of 3D QSAR analysis is the alignment of the molecules based on the scaffold they share. The model was generated from a data set of 44 molecular poses with a grid spacing of 1 Å, and the PLS (partial least squares) algorithm was used to establish the relationship between biological activity and different structural features. The training set was adjusted to 50%. Three models were generated with the Gaussian field extension, using Gaussian steric, electrostatic, hydrophobic, hydrogen bond donor, hydrogen bond acceptor, and aromatic ring fields. CoMFA and CoMSIA fields are employed as independent variables in PLS regression analysis. The best model was chosen based on statistical robustness and visualized using the contour map modules. The predictive power and stability of the models were assessed using the leave-one-out (LOO) cross-validation method. The key statistics include RMSE, Q2, SD, R2, R2CV, R2 scramble, stability, F, P, and Pearson's r, which indicate the predictive ability of the model. A scatter plot was generated with predicted activity on the Y-axis and observed activity on the X-axis for the data set 46.
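The leave-one-out procedure can be sketched as follows. To stay dependency-free, the example swaps the multi-component PLS model for a one-descriptor least-squares line, which is an assumption made purely for illustration; the cross-validation loop and the definition of Q2 as 1 - PRESS/SS_total are the same whichever regression is plugged in. All names and the activity values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def loo_q2(xs, ys):
    """Leave-one-out Q2: refit without each compound in turn, predict
    the held-out compound, then compute 1 - PRESS / SS_total."""
    held_out_predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        held_out_predictions.append(slope * xs[i] + intercept)
    return r_squared(ys, held_out_predictions)


# Hypothetical descriptor and activity values (illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(descriptor, activity)
fitted = [slope * x + intercept for x in descriptor]
print(round(r_squared(activity, fitted), 4), round(loo_q2(descriptor, activity), 4))
```

Because each held-out compound is predicted by a model that never saw it, Q2 cannot exceed the fitted R2 for a least-squares model, and a large gap between the two is a common sign of overfitting.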

Contour maps visualisation
The fields can be displayed in different styles, either as contours (surfaces) or as color intensities on the grid. Colors are assigned according to field type, and field intensities are shown for one field at a time. Fields with absolute values greater than the cut-off are shown at maximum brightness.

Molecular dynamics simulation
The simulation of protein-ligand complexes was performed with GROMACS 4.5.5 (Groningen Machine for Chemical Simulations) software 45. The complex with the lowest binding energy was selected for molecular dynamics (MD) simulation. The ligand parameters were generated using the PRODRG online server 47 in the framework of the GROMOS96 43a1 force field as implemented in GROMACS 46. The ligand-enzyme complex was solvated in a simple point charge (SPC) water box under periodic boundary conditions, with a distance of 1.0 nm between the protein and the box faces. The system was then neutralized with Cl- or Na+ counter ions, as required for the W80R complex with the ligand. After energy minimization, the complex was equilibrated under constant number of particles, volume, and temperature (NVT) conditions for 100 ps at 300 K, followed by a further 100 ps. All covalent bonds involving hydrogen atoms were constrained using the linear constraint solver (LINCS) algorithm. The electrostatic interactions were treated using the particle mesh Ewald method 48. MD simulations were then run for 20 ns to check the accuracy and stability of the ligand-protein complexes. The trajectories produced by the MD simulations were analyzed with the g_rms, g_rmsf, and g_hbond utilities of GROMACS 49 to obtain the root mean square deviation (RMSD), the root mean square fluctuation (RMSF), and the hydrogen bonds formed between the ligand and the protein.
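The two trajectory measures can be illustrated with a short sketch. It assumes that every frame has already been superimposed on the reference structure (the GROMACS utilities perform a least-squares fit first, which is omitted here) and that coordinates are plain lists of (x, y, z) tuples in nm. The function names and the two-atom toy trajectory are hypothetical.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two pre-aligned coordinate
    sets with the same atom ordering."""
    squared = sum((a - b) ** 2
                  for ref_atom, atom in zip(reference, frame)
                  for a, b in zip(ref_atom, atom))
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom root mean square fluctuation about the time-averaged
    position, for a list of pre-aligned frames."""
    n_frames = len(trajectory)
    fluctuations = []
    for atom in range(len(trajectory[0])):
        mean = [sum(frame[atom][k] for frame in trajectory) / n_frames
                for k in range(3)]
        mean_sq = sum((frame[atom][k] - mean[k]) ** 2
                      for frame in trajectory for k in range(3)) / n_frames
        fluctuations.append(math.sqrt(mean_sq))
    return fluctuations


# Hypothetical two-atom, two-frame trajectory (coordinates in nm).
frame_0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(frame_0, frame_1), rmsf([frame_0, frame_1]))
```

RMSD gives one value per frame and tracks overall drift from the starting structure, whereas RMSF gives one value per atom (or residue) and highlights the flexible regions.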

Mutant W80R sequence analysis
The development of anticancer compounds with varied pharmacological effects has become an important topic, and flavonoids, a main class of secondary metabolites, both dietary and synthetic, have been subjected to clinical trials 50. Definite beneficial biological activities of dietary flavonoids, including antioxidant 51, anticancer 52 and cardioprotective properties 53, have been identified in a series of previous studies. Flavonoids are known for their chemopreventive and chemotherapeutic activities and for their availability in plant sources that are routinely consumed in the human diet 54.
The mutant W80R protein sequence of AKT1 has 480 amino acid residues; the protein plays a crucial role in metabolism, cell proliferation, cell survival, growth, and angiogenesis. The sequence was downloaded from UniProt (accession number P31750). The W80R sequence was found to have high contents of lysine, leucine, glutamic acid, and alanine. The ProtParam tool was used to compute physicochemical parameters of the W80R protein sequence, such as a molecular weight of 5565.45 kD. W80R had a pI (isoelectric point) of 5.99, indicating its acidic nature (pI < 7.0), and an aliphatic index (AI) of 71.69. The protein volume is occupied by aliphatic side chains such as lysine, leucine, glutamic acid, and alanine. The instability index of W80R was 35.76, which is below the ProtParam threshold of 40 and therefore indicates a stable protein. The grand average of hydropathicity (GRAVY) of the W80R protein was low (-0.583), indicating high affinity for water. The differences between wild-type AKT1 and mutant W80R computed with the ProtParam tool are compared in Table 1. The sequences of the W80R mutant protein and wild-type AKT1 were the same at the nucleotide and protein level apart from a slight difference, thus proving-T, C-G rich region, and properties such as molecular weight, amino acid composition, theoretical pI, aliphatic index, and grand average of hydropathicity (GRAVY) were found to be in an appropriate range for influencing protein stability. The expected value (E-value) obtained for the hits was 2e-60, and percentage identity defines the extent to which two sequences are identical. Modeller 9.13 generated five models of W80R; the model with the lowest score was considered the most stable thermodynamically and was subjected to further refinement. The model with the lowest RMSD score, 0.18, was considered the best one for further validation 30. Finally, the three-dimensional (3D) structure of the selected protein, built using its template, was visualized with the PyMOL tool.

Model assessment and validation
The stability of the protein was assessed based on the backbone torsion angles psi and phi, which were evaluated by the PROCHECK server; it computes the amino acid residues in the zones of the Ramachandran plot for the W80R mutant form (Table 2). Table 2 summarizes the Ramachandran plot obtained through the RAMPAGE server: 79.3% of the amino acids of the W80R mutant protein fall in the most favored region, where the major active binding sites are located, 13.8% fall in the allowed region, and 6.9% fall in the outlier region of the plot, which is of lesser significance. SAVES analysis was conducted to confirm the quality of the protein model, followed by ProSA and RMSD assessment, to obtain a high-quality structural model for virtual screening. The quality of the predicted models of the AKT1 protein and the W80R mutant was supported by a high ERRAT score of 81.99, within the acceptable range. VERIFY 3D showed that 81.88% of the W80R residues had an average 3D-1D score >= 0.2, indicating the stability of the model. The 'WHAT IF' tool, which examines the coarse packing quality of the model protein structure, indicated good quality. The reliability of the W80R form was confirmed by ProSA (Fig. 1), which gave a Z score of -7.92 kcal/mol compared with -7.2 kcal/mol for wild-type AKT1; the negative energy reflects good model quality. Model quality was further evaluated by comparing the predicted structure with the experimentally determined structure through superimposition and atomic RMSD assessment in Chimera 1.11, which showed that the predicted model is good and quite similar to the wild-type protein.
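Each point on a Ramachandran plot is a pair of backbone dihedral angles, and a dihedral is defined by four consecutive atoms (C-N-CA-C for phi, N-CA-C-N for psi). The sketch below computes a signed dihedral from four Cartesian points using the standard atan2 formulation; the function names and the test coordinates are hypothetical, and PROCHECK and RAMPAGE perform this calculation internally.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in (-180, 180]) about the
    p1-p2 bond, defined by four consecutive atoms."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so that the answers are obvious.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
gauche = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, gauche, trans)
```

Applying the function to the C(i-1), N(i), CA(i), C(i) atoms of each residue gives phi, and to N(i), CA(i), C(i), N(i+1) gives psi; the resulting pairs are what the plot regions (favored, allowed, outlier) classify.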

Analysis of cancer associated mutants
The mutation impact for the W80R protein was classified using the FATHMM server, based on the new FATHMM-MKL algorithm. It distinguishes between cancer-promoting (driver) mutations and other germline polymorphisms. This algorithm predicts the functional, molecular, and phenotypic consequences of missense mutations in a functional protein using hidden Markov models (HMMs), which represent the alignment of homologous sequences and conserved protein domains, together with "pathogenicity weights", which represent the overall tolerance of the protein/domain to mutations 44. The gene identifier (UniProt ID) together with the mutant form, as text, was provided as the input for the prediction, and the FATHMM server returned a score of -1.12, responsible for benign cancer. The functional scores for individual mutations obtained from the FATHMM-MKL server are p-values in the range 0-1, where values below 0.5 are predicted as benign and values above 0.5 as deleterious.
Determination of ADME profile

Molecular properties of the selected compounds were studied using QikProp, and compounds were chosen based on Lipinski's rule of five, an important criterion in drug discovery and development. Various in silico techniques have been employed to measure the drug-likeness of a compound based on numerous descriptors. Absorption, distribution, metabolism, excretion, and toxicity (ADME/T) properties were predicted for the best-docked ligand molecules using QikProp. Unlike a fragment-based approach, QikProp computes almost 20 physical descriptors over a wide range of predicted properties; it can be used to screen compound libraries for hits and in lead optimization, to improve predictions by fitting to experimental data, and to generate QSAR models. The detailed analyses of chemical and molecular descriptors and of solubility properties are tabulated in Tables 3, 4, and 5. ADME properties are an important index for checking whether clinical candidates reach the required standard. The compounds in the tables are ranked by their potential drug properties. According to a previous study, ~40% of failures in the development phase of medicines are due to poor biopharmaceutical properties (pKa, the dissociation constant, and bioavailability) 55. An ideal medicine has the following ADME characteristics: hydrogen bond donors < 5; hydrogen bond acceptors < 10; molecular weight < 500 Da; lipid-water partition coefficient < 5; aqueous solubility -6.5 < logS < 0.5; and polar surface area 7.0-20.
*Donor HB: the estimated number of hydrogen bonds that would be donated by the solute to water molecules in an aqueous solution; values are averages taken over many configurations, so they can be non-integer; Acceptor HB: the estimated number of hydrogen bonds that would be accepted by the solute from water molecules in an aqueous solution; dip2/V: square of the dipole moment divided by the molecular volume, the key term in the Kirkwood-Onsager equation for the free energy of solvation of a dipole with volume V; AC*DN: index of cohesive interaction in solids; Volume: total solvent-accessible volume in cubic angstroms using a probe with a 1.4 Å radius.
*QPlogPoct: predicted octanol/gas partition coefficient; QPlogPw: predicted water/gas partition coefficient; QPlogPo/w: predicted octanol/water partition coefficient; CIQPlogS: conformation-independent predicted aqueous solubility, log S, where S in mol dm-3 is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid; QPlogHERG: predicted IC50 value for blockage of HERG K+ channels; QPPCaco: predicted apparent Caco-2 cell permeability in nm/sec; Caco-2 cells are a model for the gut-blood barrier; QPlogKp: predicted skin permeability, log Kp; QPlogS: predicted aqueous solubility, log S, where S in mol dm-3 is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid; QPPMDCK: predicted apparent MDCK cell permeability in nm/sec; MDCK cells are considered to be a good mimic for the blood-brain barrier; QPlogPC16: predicted hexadecane/gas partition coefficient.

QSAR studies and validation
A dataset of 44 ligand compounds was chosen for statistical studies and divided into a training set and a test set of 50% each for 3D QSAR model development. The graphical interface allowed the dataset to be split equally into training and test sets and generated a correlation coefficient. The plots obtained for all compounds, the training set and the test set are shown in Fig. 2A. The ligands were divided into a training set and a test set (Table 6) with parameters such as phase QSAR, phase activity, % extrapolation, prediction error, and predicted activity. The QSAR model was built from docking poses with substructure alignment. The standard deviation of the regression was 10.7913. R2, the coefficient of determination, which always lies between 0 and 1, was 0.8226. The cross-validated R2 (R2 CV), obtained with a leave-N-out approach, was 0.2055. R2 scramble, the average R2 obtained using scrambled activities, which measures the degree to which the molecular fields can fit random data, was 0.4889 (Fig. 2B). The stability, a statistical measure of the sensitivity of the model predictions to changes in the training set composition, was 0.379. F was 92.7, where a higher F value indicates a more statistically significant regression, and P was 5.95e-09. The root mean square error (RMSE) of the predictions was 22.02, Q2 for the predicted activities was 0.2915, and Pearson's r between predicted and observed activity for the test set was 0.7508. The test set lay within the range of the training set.
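Two of the test-set statistics quoted above, RMSE and Pearson's r, can be computed directly from paired observed and predicted activities. The sketch below uses plain Python; the function names and the activity values are hypothetical and are not taken from Table 6.

```python
import math


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical test-set activities (illustration only).
observed_activity = [1.0, 2.0, 3.0]
shifted_prediction = [2.0, 3.0, 4.0]
print(rmse(observed_activity, shifted_prediction),
      pearson_r(observed_activity, shifted_prediction))
```

The toy example shows why both numbers are reported: a prediction that is systematically offset can still correlate perfectly with the observations (r close to 1) while carrying a sizeable RMSE.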

Contour visualisation
The contour maps (Fig. 3) were used to illustrate the fields required for biological activity. The field-based QSAR interface creates electrostatic, hydrophobic, and steric fields for lead optimization and discovery. The green contour indicates a region where a bulky group is favorable. The contour map depicts hydrophobicity in the solvent-accessible hydrophobic pocket; steric fields are considered the most favorable regions, with a high Glide score. The results show that the steric and Gaussian field fractions are much larger than those of the other fields, suggesting that most of the binding energy is contributed by hydrophobic interactions.

Molecular docking studies
Molecular docking is a key computational tool for exploring all possible active conformations of a ligand bound at the active site of the receptor molecule (Fig. 4). Before the docking protocol was run, the co-crystallized ligand was re-docked into the crystal structure of the W80R receptor to evaluate the reliability of the standard precision algorithm of Glide. A dataset of 7000 ligands comprising the flavonoid family and its structural analogs was selected. After Epik generated suitable tautomeric states (a limit of 16 per ligand), 12000 ligand structures were taken as the complete set for virtual screening against the W80R mutant protein. The top three ligands with the best binding energy were considered for further analysis (Fig. 4).
Several hydrogen bond interactions were found in the docking results. The top-scoring compound, CID-443637, had the lowest binding energy, with a Glide score of -9.63 kcal/mol. It formed hydrogen bonds with the hydroxyl group of SER 208, with GLU 198, and with THR 211, indicating the strongest stability with the receptor molecule. These three hydrogen bonds support a stable conformation of the ligand bound to the protein structure, which influences the activity of the ligand. One pi-cation interaction was also observed with LYS 268, an amino acid with a cationic side chain. The pi-cation interaction is recognized as an energetically significant 56 noncovalent binding interaction that is quite strong in both the gas phase and liquid media 57; it is a special hydrophobic interaction whose geometry is biased towards one in which the aromatic group experiences a favorable pi-cation interaction 58. The compound has the IUPAC name 2-(3,4-dihydroxyphenyl)-3,4-dihydro-2H-chromene-5,7-diol and the formula C15H14O5 (Fig. 5). The second-ranked molecule, CID 71424203, has a binding energy of -9.43 kcal/mol and forms three hydrogen bonds with the residues THR 211, MET 227, and SER 205, together with one pi-pi stacking interaction involving the aromatic residue TYR 474. The residue TRP 80 lies between two aromatic groups at a separation of 3.35 Å (vdW). The compound has the IUPAC name 2,5,7-trihydroxy-3-(4-hydroxyphenyl)-2,3-dihydrochromen-4-one and the formula C15H12O6 (Fig. 6). The third compound, CID 44264122, has a binding energy of -9.36 kcal/mol, with hydrophobic contacts to residues such as LYS 268 and THR 291; it has the IUPAC name 3,4-difluoro-8,9-dihydroxybenzo[c]chromen-6-one and the formula C13H6F2O4 (Fig. 7). After comparison of the three models, compound CID 443637, which has the lowest energy, was chosen for the molecular dynamics simulation studies.
A lower (more negative) Glide score represents more favorable binding affinity. Hydrogen bond interactions, pi interactions, and pi stacking of the best poses were visualized and interpreted using XP Visualizer, with the descriptors listed in ascending order (Table 7). The scoring function gives the full hydrogen-bond reward to near-ideal geometry (1.65 Å H-A distance, 180° D-H...A angle) and reduces it for hydrogen bonds whose lengths and angles deviate significantly from this ideal 42. PhobEn measures the hydrophobic enclosure reward on the protein. Lipophilic EvdW is the term for hydrophobic regions of the receptor that lie in proximity to the ligand. For the data obtained, PiCat, ClBr, PhobEnPa, penalties, HB penal, exposed penal, and zprot remained at zero, whereas the other descriptors took the values shown. *G score: the total Glide score, the sum of the XP terms, G score = a*vdW + b*Coul + Lipo + Hbond + Metal + BuryP + RotB + Site, where vdW is the van der Waals energy, Coul is the Coulomb energy, Lipo is the lipophilic contact term, Hbond is the hydrogen-bonding term, Metal is the metal-binding term, BuryP is the penalty for buried polar groups, RotB is the penalty for freezing rotatable bonds, Site is the term for polar interactions in the active site, and a = 0.065 and b = 0.130 are the coefficients of vdW and Coul.
Dock score: van der Waals + Coulombic + H-bond terms, representing the bonding potential. In simple rigid systems, the ligand is searched in a six-dimensional rotational and translational space to fit the binding site, and a ligand that fits can serve as a lead compound for drug design 60. The lipophilic term is derived from the hydrophobic grid potential and a fraction of the total protein-ligand vdW energy. PhobEn is the hydrophobic enclosure reward, HB penal is the penalty for ligands with large hydrophobic contacts and low hydrogen bond scores, phobic penal is the penalty for exposed hydrophobic ligand groups, and Rot Penal is the rotatable bond penalty.
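As a worked illustration of the G score expression given above, the following sketch sums the individual terms using the coefficients stated in the text (a = 0.065, b = 0.130). The function name and the term values in the example are hypothetical; they are not the values of Table 7, and this is not the Glide implementation itself.

```python
def g_score(vdw, coul, lipo, hbond, metal, buryp, rotb, site,
            a=0.065, b=0.130):
    """Sum the XP terms as written in the text:
    G score = a*vdW + b*Coul + Lipo + Hbond + Metal + BuryP + RotB + Site.
    All terms are in kcal/mol; rewards are negative, penalties positive."""
    return a * vdw + b * coul + lipo + hbond + metal + buryp + rotb + site

# Hypothetical term values for a single pose, for illustration only.
example = g_score(vdw=-40.0, coul=-10.0, lipo=-3.0, hbond=-2.0,
                  metal=0.0, buryp=0.5, rotb=0.3, site=-0.1)
print(round(example, 2))
```

Because only the van der Waals and Coulomb energies are scaled, the unscaled reward and penalty terms can dominate the total even when the raw interaction energies are large.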

Molecular dynamics simulation
MD simulations were performed on the W80R protein-ligand complex with the lowest binding energy (Fig. 8a). The MD trajectories were evaluated with root mean square deviation (RMSD) and root mean square fluctuation (RMSF) plots, which provide significant insight into structural changes at atomic detail. The RMSD is an important parameter for analyzing equilibration in MD trajectories and was estimated for the backbone atoms of the W80R protein and the taxifolin ligand complex. For the W80R protein complex, the fluctuations rose to about 0.3 to 0.4 nm during the initial stage (Fig. 8a). Clear and noticeable deviations in the RMSD values were observed as time increased from 200 ps to 600 ps. The majority of residues attained a stable state at 600 ps, between 0.45 nm and 0.5 nm. At the same time, the W80R protein-ligand complex fluctuated at about 0.4 nm from 700 ps to 900 ps and remained stable between 0.4 nm and 0.45 nm until the end of the simulation 61. RMSF results were obtained by averaging over the backbone atoms of each residue to inspect local variations in protein flexibility (Fig. 8b). The fluctuations observed above play an important role in the flexibility of the protein complex and thus affect protein-ligand activity and stability. High RMSF values indicate greater flexibility; the maximum fluctuation, about 6 Å, occurred at backbone residue positions 355 and 405, while minimum RMSF values indicate very limited movement. The RMSF graph for the W80R-ligand complex is shown in Fig. 8b. In the W80R-ligand complex, the amino acid residues at positions 455 and 500 also fluctuated at about 5 Å. Positions 305 and 355, at 4 Å, showed a similarly steep rise towards 5 Å. The amino acid residues in the ranges 15 to 55 and 105 to 155 showed a medial deviation of about 3 Å.
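The two trajectory measures used above can be sketched as follows. This is a minimal illustration that assumes the frames have already been superimposed on the reference structure; the coordinates are hypothetical, and the simulation package's own analysis tools, not this code, produced Fig. 8. The conversion constant is included because the RMSD values above are quoted in nm and the RMSF values in Å.

```python
import math

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Angstrom

def rmsd(reference, frame):
    """RMSD between two conformations given as lists of (x, y, z) tuples.
    Assumes the frame is already aligned to the reference."""
    squared = sum((a - b) ** 2
                  for ref_atom, atom in zip(reference, frame)
                  for a, b in zip(ref_atom, atom))
    return math.sqrt(squared / len(reference))

def rmsf(frames):
    """Per-atom RMSF: fluctuation of each atom about its mean position
    over all frames of the trajectory."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in frames) / n_frames
        values.append(math.sqrt(msd))
    return values

# Hypothetical two-atom, two-frame trajectory in nm, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
trajectory = [reference, frame]

print("RMSD (nm):", rmsd(reference, frame))
print("RMSF (Angstrom):", [v * NM_TO_ANGSTROM for v in rmsf(trajectory)])
```

RMSD gives one value per frame and so tracks equilibration over time, whereas RMSF gives one value per atom or residue and so locates the flexible regions of the chain.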
To determine the residue interaction network, the RING 2.0 software was used; it identifies all types of non-covalent interactions at the atomic level, which span a wide range of energies and lengths. The output was visualized in two ways: (i) as an interaction network drawn with different labels and (ii) as structural contacts using the RING_viz script for PyMOL (Fig. 9). Applications of RING 2.0 include the study of protein folding patterns, domain-domain communication, catalytic activity, and inter- and intra-chain interactions that involve both solvent and ligand atoms. A residue interaction network (RIN) describes single amino acids as nodes and physicochemical interactions as edges, including covalent and non-covalent bonds. RINs have become common practice for exploring the complexity inherent in macromolecular systems 59 .
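The node-and-edge description of a RIN can be made concrete with a small sketch. The residue labels and interaction types below are hypothetical placeholders, and the tuple format is an assumption made for illustration; it is not the RING 2.0 output format.

```python
from collections import defaultdict

def build_rin(interactions):
    """Build an undirected residue interaction network.
    Each interaction is a (residue_a, residue_b, interaction_type) tuple;
    residues become nodes and typed interactions become edges."""
    network = defaultdict(set)
    for res_a, res_b, kind in interactions:
        network[res_a].add((res_b, kind))
        network[res_b].add((res_a, kind))
    return dict(network)

def degree(network, residue):
    """Number of distinct residues in contact with the given residue."""
    return len({neighbour for neighbour, _ in network.get(residue, ())})

# Hypothetical interactions, for illustration only.
interactions = [
    ("RES1", "RES2", "hydrogen bond"),
    ("RES1", "RES3", "pi-cation"),
    ("RES2", "RES3", "van der Waals"),
    ("RES1", "RES2", "van der Waals"),
]
rin = build_rin(interactions)
for residue in sorted(rin):
    print(residue, "degree:", degree(rin, residue))
```

Keeping the interaction type on each edge allows two residues to be linked by more than one kind of contact, while the degree counts each neighbouring residue only once.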

Discussion
The difficulty in ovarian cancer is that in nearly 75% of patients the cancer recurs within the first two years and fails to respond to available therapeutic drugs owing to acquired resistance 62,63 , in addition to late diagnosis at advanced clinical stages and metastasis within the peritoneal cavity 63,64 . There is therefore an immediate need to design novel drugs to deal with this problem. Numerous studies over the past decade have reported flavonoids as candidates to block, retard, or reverse the progression of carcinogenesis 80 . Although various studies have been carried out using flavonoids, their anticancer mechanisms have not been clearly defined. However, flavonoids such as quercetin and silymarin were found to induce anticancer mechanisms in ovarian cancer cells 65,66 . Consequently, the effects of apigenin, luteolin and myricetin on ovarian cancer need to be studied to uncover the potential mechanisms underlying their anticancer effects. Quercetin inhibits proliferation of the ovarian cancer cell line SKOV-3 in a concentration- and time-dependent manner, which agrees with the findings of 67 . A separate study 68 showed that apigenin treatment inhibits UVB-induced skin cancer cell proliferation and induces apoptosis in in vivo models. In vitro studies of taxifolin have shown efficacy, especially for anticancer and antimicrobial activities, but a substantial gap remains in in vivo studies.
The lead compound in the current study was identified as taxifolin, which has the potential to exert anticancer effects on the osteosarcoma cell lines U2OS and Saos-2 by inhibiting proliferation and disrupting colony formation. In in vivo studies, intraperitoneal administration in nude mice bearing U2OS xenografts restrained tumor growth. This activity is known to arrest U2OS and Saos-2 cells in the G1 phase of the cell cycle. Taxifolin is known to inhibit colon carcinogenesis through NF-kB-mediated Wnt/β-catenin signalling, with upregulation of the Nrf2 pathway and downregulation of genes regulated by the NF-kB and Wnt signalling pathways, such as TNF-α, COX-2, β-catenin, and cyclin-D1 69 . Injection of taxifolin has also been reported to reduce proliferative activity in Wistar rats with benign prostatic hyperplasia 70 . Taxifolin has further been reported to have an antiangiogenic effect: it reduced the number of new blood vessels and their branches per area in the chick chorioallantoic membrane assay and inhibited tube formation by human umbilical vein endothelial cells on a Matrigel matrix. It was also evaluated against tachyzoites in vitro, with an IC50 of 1.39 µg/mL (p ≤ 0.05), along with pyrimethamine. Taxifolin is known to exert an anti-proliferative effect on cancer cell types by inhibiting lipogenesis and fatty acid synthesis in cancer cell lines, which prevents the growth of cancer cells 79 .
An extensive animal (rat) study of the antioxidant activity of taxifolin showed decreased lipid peroxidation in the serum and liver. The OH groups at positions 5 and 7, together with the 4-oxo function in the A and C rings, were responsible for the scavenging effect, while the o-dihydroxy group in the B ring provided stability 71 . In vivo studies of taxifolin-induced apoptosis in HCT116 and HT 29 cells revealed that PARP1 overexpression is responsible for ovarian cancer. Taxifolin down-regulated AKT and catenin expression in HCT 116 and HT 29 cells, with a decline in p-AKT and catenin at a dose of 40 µM relative to DMSO, and altered the G2 phase of the cell cycle and its regulators 72 . Taxifolin also reduced the expression levels of AKT, SKP-2, v-myc avian myelocytomatosis viral oncogene homolog (c-myc), and p-Ser 473 of AKT 73 . Although the experimental outcomes above have contributed to the diverse pharmacological activities associated with the AKT1 protein, detailed knowledge of the molecular changes with respect to the W80R mutant of AKT1 is still lacking. Consequently, this overview of the molecular mechanism at the atomic level for the W80R mutation aimed to identify hits for optimization from a large dataset of flavonoid compounds, screened from the PubChem database against the W80R mutant protein of AKT1 targeting ovarian cancer. Table 1, for the W80R receptor sequence of 480 amino acids, provides detailed knowledge about the stability of the protein obtained using the ProtParam tool of the ExPASy server. The extensive evaluation of the W80R sequence at the nucleotide level reveals its density, while other parameters such as A-T and C-G rich regions, molecular weight, amino acid composition, theoretical pI, aliphatic index, instability index and GRAVY support the stability of the protein. The RAMPAGE server assessed 79.3% of residues to lie in the most favoured region (Table 2) with active site binding.
Furthermore, the reliability of the protein model was assessed by 3D (homology) modelling. Generating a 3D protein structure from sequence information through computational approaches, in the absence of an experimentally determined structure in the Protein Data Bank, has been a top priority of the structural biology community for several decades 74,75,76 . The protein was then evaluated with the SAVES server (structural analysis and verification) for a quality check, followed by structural refinement through energy minimization to its lowest-energy stable conformation, ProSA analysis (Fig. 1), and superimposition on the experimentally determined template structure with atom-wise RMSD assessment, to obtain a high-quality structural model for virtual screening 77 . The RMSD of the 3D homology model of the W80R protein was 0.18, and the model was considered the best one for further validation.
3D-QSAR studies were performed on structurally similar compounds to predict the potency of unknown or untested ligands by correlating mathematical and statistical values. QSAR models can prioritize ideas in virtual screening as well as in the optimization of lead compounds, and have thus gained acceptance in in silico drug discovery. The QSAR scatter plot tool (Fig. 2) assessed the molecular fields of the compounds. The stability value, which reflects how the predictions respond to changes in the training set composition, was 0.379, and the F value was 92.7; a higher F indicates a more statistically significant regression. The dataset of 44 ligands was divided randomly into training and test sets using combined mathematical and statistical approaches. The drug candidate showed a Phase activity of 358.477, a % extrapolation of 0.458, a predicted activity of 333.692, and a predicted error of -24.7856, which is a good combination for a lead compound (Table 6).
According to Lipinski's rule of five, a molecule is a good drug candidate if it possesses favourable ADME (absorption, distribution, metabolism, and excretion) properties 43 . All the physicochemical properties and drug-likeness values are listed in Tables 3, 4 and 5; consequently, it becomes easy for the lead compound to enter the mammalian cell, interact with proteins, and regulate gene expression in metabolic pathways. The top 10 hits obtained by molecular docking were docked again into the active binding sites of the protein, which were identified with the SiteMap tool (site score above 1), after grid generation and using the XP protocol (Table 7). A contour map was also used in the present study to determine favourable regions from the field-based QSAR, which depends on steric, electrostatic, and hydrophobic fields in solvent-accessible pockets and on the lowest binding energy. This application plays a vital role in combination therapies for multi-drug-resistant conditions as well as in drug discovery.
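A rule-of-five filter of the kind used to select drug-like ligands can be sketched as follows. The thresholds are the standard Lipinski criteria (molecular weight not above 500, logP not above 5, no more than 5 hydrogen-bond donors, no more than 10 hydrogen-bond acceptors). Allowing at most one violation is a common convention and is an assumption here, as are the function names and the example property values, which are not taken from Tables 3 to 5.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        mol_weight > 500,   # molecular weight in Da
        log_p > 5,          # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)

def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors,
                        max_violations=1):
    """Assumed convention: a compound passes with at most one violation."""
    count = lipinski_violations(mol_weight, log_p, h_donors, h_acceptors)
    return count <= max_violations

# Hypothetical property values, for illustration only.
print(passes_rule_of_five(300.0, 2.0, 3, 6))   # small, polar compound
print(passes_rule_of_five(650.0, 6.2, 2, 4))   # large, lipophilic compound
```

In a screening workflow, such a filter would be applied to every ligand before docking so that only orally plausible candidates reach the more expensive steps.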
The evaluated hydrophobicity provides a check on the druggability of a compound (Fig. 3). The SiteMap tool examines the entire protein to locate binding sites, whose size and extent of solvent exposure are assessed and ranked with a scoring function. Active sites are ranked by their propensity for ligand binding, measured as their ability to bind passively absorbed small molecules tightly. Among the predicted sites, the active site residues with site score 1.128, druggability score − 1.149, volume 384.486, and size 179 (Fig. 4) were taken for further analysis. Taxifolin interacts well with the binding domain of W80R, with the best Glide score of -9.63 kcal/mol, hydrogen bonds with the O-H of SER 208 and with the residues GLU 198 and THR 211, one pi-cation interaction, and one hydrophobic bond with LYS 268 (Fig. 5). The lead molecule satisfied all the QikProp calculations for surface area (SASA, FISA, FOSA, PSA) and for the partition coefficients and related descriptors (QPlogPoct, QPlogPw, QPlogPo, QPlogS, ClQPlog, QPlogHER, QPPCaco, QPPMDCK, QPlogKp). This inhibitor of the PI3K/AKT pathway has therefore shown diverse potential for anticancer activity in both preclinical and clinical experimental studies, which is also supported by the in silico analysis.
Administration of taxifolin has been reported to show excellent antitumor activity in colorectal cancer cell lines and in an HCT 116 xenograft mouse model. These studies showed that taxifolin hindered the mRNA expression of β-catenin, producing anti-proliferative activity that was mediated by PI3K/AKT signalling and that blocked Wnt/β-catenin signal transduction by hampering β-catenin expression 72 . Taxifolin was also shown to suppress nuclear factor-kB, C-Fos, and mitogen-activated protein kinase and to decrease osteoclast-specific gene expression, including Trap, Mmp-9, Cathepsin K, C-Fos, Nfatc1, and Rank; the effect of taxifolin on osteoclastogenesis via regulation of several RANKL signaling pathways was also confirmed 78,79 . Taken together, these studies demonstrated that the Wnt/catenin pathway plays a crucial role in ovarian cancer development, and this idea also laid a strong platform for the development of targeted therapeutics.
CID-44264122 showed 2 hydrogen bonds from hydroxyl groups (-OH) and interactions with LYS 268, THR 291, ILE 290, and THR 211, including -OH contacts with ILE 290 and THR 291 and an oxy bond with residue LYS 268 (Fig. 6), with a Glide XP score of -9.43 kcal/mol. Hydrogen bond interactions with the residues TYR 474, SER 215, and THR 211, one pi-pi interaction at residue TRP 80, and one pi-cation interaction with LYS 265, with a G score of -9.36 kcal/mol, showed good hydrophobic interactions (Fig. 7). The molecular dynamics simulation was performed so as to minimize error and data loss. The fluctuations in the relative positions of the atoms in the protein-ligand complex explain its structural stability, with an RMSD of 0.45 nm to 0.50 nm between 600 and 800 ps (Fig. 8a). The RMSF showed a steep rise at 5 Å with a slight medial deviation, and little structural change in the protein cavity was observed 80,81,82 (Fig. 8b). Residue interaction networks (RINs) treat single amino acids as nodes and physicochemical interactions as edges (Fig. 9); representing the protein structure as a RIN has become common practice for exploring the complexity inherent in macromolecular systems. Taxifolin is therefore suggested as a drug candidate for human use in clinical trials.

Conclusions
Amino acid mutations were found to induce pathological outcomes by disrupting the native conformation of a protein. The W80R mutation in the PH domain of AKT1 has been reported from in vitro studies to cause ovarian cancer and is recorded in the My Cancer Genome database. Synthetic drugs reported in clinical trials are currently in use. To examine the detailed molecular mechanism of W80R, we conducted molecular docking along with dynamics simulation studies to understand the stability of the mutant structure, in which the mutation is known to have a damaging effect. Furthermore, a rise in RMSD values along the trajectory and conformational drifts were observed in the W80R protein. The results supported a molecular cause in the mutant form that leads to ovarian cancer. However, experimental evaluation or in vivo studies are recommended for further validation.

Declarations