Journal of Pharmaceutical Analysis: Original Article
Original article
Article info

Article history:
Received 5 February 2020
Received in revised form 20 March 2020
Accepted 21 March 2020
Available online 26 March 2020

Keywords:
Coronavirus
SARS-CoV-2
COVID-19
Natural products
Protein homology modelling
Molecular docking
Molecular dynamics simulation

Abstract

The recent pandemic of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 has raised global health concerns. The viral 3-chymotrypsin-like cysteine protease (3CLpro) enzyme controls coronavirus replication and is essential for its life cycle. 3CLpro is a proven drug discovery target in the case of severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). Recent studies revealed that the genome sequence of SARS-CoV-2 is very similar to that of SARS-CoV. Therefore, herein, we analysed the 3CLpro sequence, constructed its 3D homology model, and screened it against a medicinal plant library containing 32,297 potential anti-viral phytochemicals/traditional Chinese medicinal compounds. Our analyses revealed that the top nine hits might serve as potential anti-SARS-CoV-2 lead molecules for further optimisation and drug development process to combat COVID-19.

© 2020 Xi'an Jiaotong University. Production and hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
https://doi.org/10.1016/j.jpha.2020.03.009
2095-1779/© 2020 Xi'an Jiaotong University. Production and hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
314 M. Tahir ul Qamar et al. / Journal of Pharmaceutical Analysis 10 (2020) 313–319
Fig. 1. (A) Phylogenetic tree inferred from closest homologs of SARS-CoV-2 3CLpro. The maximum likelihood method was used to construct this tree. (B) Multiple sequence
alignment of closest homologs of SARS-CoV-2 3CLpro sharing 70% sequence identity. (C) Cartoon representation of the SARS-CoV-2 3CLpro homodimer. Chain-A (protomer-A) is in
multicolour and Chain-B (protomer-B) is in dark blue. The N-finger that plays an important role in dimerization maintaining the active conformation is shown in hot pink, domain I
is coloured cyan, domain II is shown in green, and domain III is coloured yellow. The N- and C-termini are labelled. Residues of the catalytic dyad (Cys-145 and His-41) are
highlighted in yellow and labelled. (D) Cartoon representation of the 3CLpro monomer model (chain/protomer-A) of SARS-CoV-2 superimposed with the SARS-CoV 3CLpro structure.
The SARS-CoV 3CLpro template is coloured cyan, the SARS-CoV-2 3CLpro structure is coloured grey, and all identified mutations are highlighted in red. (E) Docking of 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone inside the receptor-binding site of SARS-CoV-2 3CLpro, showing hydrogen bonds with the catalytic dyad (Cys-145 and His-41). The 3CLpro structure is coloured dark blue, the 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone is orange, and hydrogen bonds are coloured maroon.
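As a rough illustration of how pairwise sequence identities of the kind summarised in panels (A) and (B) can be computed, the following Python sketch counts identical residues over ungapped alignment columns. The function name, the choice of denominator (ungapped columns only) and the toy fragments are assumptions made for illustration; they are not taken from T-Coffee or from the authors' workflow. The last line only checks the arithmetic linking 12 substitutions in a 306-residue protein to the reported 96.08% identity with SARS-CoV 3CLpro.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, NOT real 3CLpro sequence data).
toy_a = "SGFRKMAF-PSG"
toy_b = "SGLRKMAQHPSG"
toy_identity = percent_identity(toy_a, toy_b)

# Arithmetic check using figures quoted in the article:
# 306 residues with 12 substitutions relative to SARS-CoV 3CLpro.
identity_from_substitutions = round(100.0 * (306 - 12) / 306, 2)
```

Other conventions divide by the full alignment length or by the shorter sequence, so identities computed with different tools may differ slightly.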
(nsps), a spike protein (S) gene, envelope protein (E) gene, membrane protein (M) gene, a nucleocapsid protein (N) gene, 3′-UTR, and several unidentified non-structural open reading frames [3]. Although SARS-CoV-2 is classified into the beta-coronaviruses group, it is different from MERS-CoV and SARS-CoV. Recent studies highlighted that SARS-CoV-2 genes share <80% nucleotide identity and 89.10% nucleotide similarity with SARS-CoV genes [6,7]. Usually, beta-coronaviruses produce a ~800 kDa polypeptide upon translation of the genome. This polypeptide is proteolytically cleaved to generate various proteins. The proteolytic processing is mediated by papain-like protease (PLpro) and 3-chymotrypsin-like protease (3CLpro). The 3CLpro cleaves the polyprotein at 11 distinct sites to generate various non-structural proteins that are important for viral replication [8]. 3CLpro plays a critical role in the replication of virus particles and unlike structural/accessory protein-encoding genes, it is located at the 3′ end which exhibits excessive variability. Therefore, it is a potential target for anti-coronaviruses inhibitors screening [9]. Structure-based activity analyses and high-throughput studies have identified potential inhibitors for SARS-CoV and MERS-CoV 3CLpro [10–12]. Medicinal plants, especially those employed in traditional Chinese medicine, have attracted significant attention because they include bioactive compounds that could be used to develop formal drugs against several diseases with no or minimal side effects [13]. Therefore, the present study was conducted to gain structural insights into the SARS-CoV-2 3CLpro and to discover potent anti-COVID-19 natural compounds.

2. Materials and methods

2.1. Data collection

Whole-genome sequences of all SARS-CoV-2 isolates available till January 31, 2020, were downloaded from GISAID database
Table 1
Physicochemical parameters of SARS-CoV-2 3CLpro.
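One of the physicochemical parameters of this kind, the grand average of hydropathicity (GRAVY), is the mean Kyte-Doolittle hydropathy value over all residues, with more negative values indicating a more hydrophilic sequence. The Python sketch below shows that calculation under stated assumptions: the function name and the toy peptide are hypothetical, and the code is not the ProtParam implementation, only the same published scale applied directly.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a protein sequence (one-letter codes)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy peptide (hypothetical, for illustration only).
toy_gravy = gravy("SGFRK")
```

Applied to a full 306-residue 3CLpro sequence, the same function would return a single averaged score comparable to the GRAVY entry of the table.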
Table 2
Summary of top ranked phytochemicals screened against SARS-CoV-2 3CLpro receptor binding site with their respective structures, docking score, binding affinity and interacting residues.

IDs | Phytochemical name | Plant source | Phytochemical structure | Docking score | Binding affinity (kcal/mol) | 3CLpro residues interacting with phytochemical through H-bonding and other interactions
PubChem 11610052 | 5,7,3′,4′-Tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone | Psorothamnus arborescens | [structure image] | −16.35 | −29.57 | His41*, Cys145*, Thr24, Thr25, Thr26, Cys44, Thr45, Ser46, Met49, Asn142, Gly143, His164, Glu166, Gln189
NPACT00105 | 3,5,7,3′,4′,5′-hexahydroxy flavanone-3-O-beta-D-glucopyranoside | Phaseolus vulgaris | [structure image] | −14.42 | −19.10 | Met49, Cys145*, His41*, Thr24, Thr25, Thr26, Cys44, Ser46, Asn142, His164, Met165, Glu166, Gln189

*3CLpro catalytic dyad residues (Cys-145 and His-41).
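The interaction lists in Table 2 lend themselves to a simple set-based check of which hits contact both catalytic dyad residues and which binding-site residues the hits have in common. The Python sketch below uses only the two rows and residue labels shown above; the variable and function names are hypothetical and the code is an illustration, not part of the MOE workflow used in the study.

```python
CATALYTIC_DYAD = {"His41", "Cys145"}

# Interacting residues transcribed from the two Table 2 rows shown above.
interacting_residues = {
    "PubChem 11610052": [
        "His41", "Cys145", "Thr24", "Thr25", "Thr26", "Cys44", "Thr45",
        "Ser46", "Met49", "Asn142", "Gly143", "His164", "Glu166", "Gln189",
    ],
    "NPACT00105": [
        "Met49", "Cys145", "His41", "Thr24", "Thr25", "Thr26", "Cys44",
        "Ser46", "Asn142", "His164", "Met165", "Glu166", "Gln189",
    ],
}


def contacts_dyad(residues) -> bool:
    """True if both catalytic dyad residues appear in the interaction list."""
    return CATALYTIC_DYAD.issubset(residues)


def shared_residues(table) -> set:
    """Residues that appear in the interaction list of every compound."""
    residue_sets = [set(residues) for residues in table.values()]
    return set.intersection(*residue_sets)


dyad_hits = [cid for cid, res in interacting_residues.items() if contacts_dyad(res)]
common = shared_residues(interacting_residues)
```

Residues shared by several hits are one way to summarise a common binding pattern; they say nothing about interaction strength, which the docking score and binding affinity columns report.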
(accession numbers and details are given in Table S1) [4]. The genome sequence of BetaCoV/Kanagawa/1/2020 (GISAID: EPI_ISL_402126) was incomplete, and the genome sequence of BetaCoV/bat/Yunnan/RaTG13/2013 (EPI_ISL_402131) was an old sequence (2013); therefore, these sequences were not included in our analyses. Gene sequences of 3CLpro were extracted from the whole-genome sequences and translated into protein sequences using the translate tool of the ExPASy server [14]. The first SARS-CoV-2 sequence (Wuhan-Hu-1; GISAID: EPI_ISL_402125) was used as a reference in our analysis.

2.2. Sequence analyses

In order to identify similar sequences and key/conserved residues, and to infer phylogeny, multiple sequence alignment of SARS-CoV-2 3CLpro followed by phylogenetic tree analyses were
performed using T-Coffee [15] and the alignment figure was generated using ESPript3 [16]. Physicochemical parameters of SARS-CoV-2 3CLpro including isoelectric point, instability index, grand average of hydropathicity (GRAVY), and amino acid and atomic composition were investigated using the ProtParam tool of ExPASy [14].

2.3. Structural analyses

To probe the molecular architecture of SARS-CoV-2 3CLpro, comparative homology modelling was performed using Modeller v9.11 [17]. To select closely-related templates for modelling, PSI-BLAST was performed against all known structures in the Protein Data Bank (PDB) [18]. Chimera v1.8.1 [19] and PyMOL educational version [20] were used for initial quality estimation, energy minimisation, mutation analyses, and image processing.

2.4. Ligand database preparation and molecular docking

A comprehensive medicinal plant library containing 32,297 potential anti-viral phytochemicals and traditional Chinese medicinal compounds was generated from our previously collected data and studies [13,21–23], and screened against the predicted SARS-CoV-2 3CLpro structure. Molecular Operating Environment (MOE) [24] was used for molecular docking, ligand-protein interaction and drug likeness analyses. All analyses were performed using the same protocols that are already described in our previous studies [13,25,26]. The absorption, distribution, metabolism, excretion and toxicity (ADMET) profiles of the selected hits were predicted computationally using the admetSAR server [27].

2.5. Molecular dynamics simulations

Explicit solvent molecular dynamics (MD) simulations were performed to verify docking results and to analyse the binding behaviour and stability of potential compounds using the predicted SARS-CoV-2 3CLpro homology model. GROMACS v5.1.4, GROMOS96 and the PRODRG server were employed to run 50 ns MD simulations [28,29] following the same protocol as described in our previous studies [13,30].

3.1. Sequence and structural analyses

Multiple sequence alignment results revealed that 3CLpro was conserved, with 100% identity among all SARS-CoV-2 genomes. Next, the SARS-CoV-2 3CLpro protein sequence was compared with its closest homologs (Bat-CoV, SARS-CoV, MERS-CoV, Human-CoV and Bovine-CoV). The results revealed that SARS-CoV-2 3CLpro clusters with bat SARS-like coronaviruses and shares 99.02% sequence identity (Fig. 1A). Furthermore, it shares 96.08%, 87.00%, 90.00% and 90.00% sequence identity with SARS-CoV, MERS-CoV, Human-CoV and Bovine-CoV homologs, respectively (Fig. 1B). These findings were consistent with those of initial studies reporting that SARS-CoV-2 is more similar to SARS-CoV than to MERS-CoV, and shares a common ancestor with bat coronaviruses [1,3,31]. Analysis of physicochemical parameters revealed that the SARS-CoV-2 3CLpro polypeptide is 306 amino acids long with a molecular weight of 33,796.64 Da and a GRAVY score of −0.019, categorising the protein as a stable, hydrophilic molecule capable of establishing hydrogen bonds (Table 1).

Next, for comparative modelling, BLAST [32] search identified SARS-CoV 3CLpro (PDB ID: 3M3V) as the best possible match in the PDB, with 100% query coverage, an E-value of 0.00, and 96.08% sequence identity. There were 12 point mutations (Val35Thr, Ser46Ala, Asn65Ser, Val86Leu, Lys88Arg, Ala94Ser, Phe134His, Asn180Lys, Val202Leu, Ser267Ala, Ser284Ala and Leu286Ala) between SARS-CoV and SARS-CoV-2 3CLpro enzymes (Fig. S1). Except for replacement of Leu with Ala at position 286, all other replacements conserve polarity and hydrophobicity. However, these mutations may affect 3CLpro structure and function. Therefore, the 3D structure of SARS-CoV-2 3CLpro was predicted. Firstly, a single chain monomeric model comprising all domains (Domain I = residues 8–100; Domain II = residues 101–183; Domain III = residues 200–303) was built (Fig. S2). N-terminal amino acids 1 to 7 form the N-finger that plays a significant role in dimerization and formation of the active site of 3CLpro. Domains I and II, collectively referred to as the N-terminal domain, include an antiparallel β-sheet structure with 13 β-strands. The binding site for the substrate is situated in a cleft between domains I and II. A loop from residues 184 to 199 joins the N-terminal domain and domain III, which is also called the C-terminal domain and comprises an antiparallel cluster of five α-helices. The overall molecular architecture of SARS-CoV-2 3CLpro was consistent with the crystal structure of SARS-CoV 3CLpro (PDB ID: 3M3V); the root mean square deviation (RMSD) between the homology model and the template was 0.629 Å. Structural and Ramachandran plot analyses revealed that 99% of residues are in favourable regions.

After quality assessment, individual chains were combined to form a homodimeric 3D structure, as shown in Fig. 1C. To facilitate other researchers, the predicted 3D model has been submitted to the Protein Model Database (PMDB) [33], and anyone can download/use the SARS-CoV-2 3CLpro final structure using PMDB ID: PM0082635. Furthermore, mutational analyses showed that none of the mutations affected the overall structure of SARS-CoV-2 3CLpro, which fully superimposed on the SARS-CoV 3CLpro structure (Fig. 1D). The results also revealed that SARS-CoV-2 has a Cys-His catalytic dyad (Cys-145 and His-41), consistent with that of SARS 3CLpro (Cys-145 and His-41), TGEV 3CLpro (Cys-144 and His-41) and HCoV 3CLpro (Cys-144 and His-41) [34]. These results revealed that the SARS-CoV-2 3CLpro receptor-binding pocket conformation resembles that of the SARS-CoV 3CLpro binding pocket and raises the possibility that inhibitors intended for SARS-CoV 3CLpro may also inhibit the activity of SARS-CoV-2 3CLpro.

To test this hypothesis, we docked (R)-N-(4-(tert-butyl)phenyl)-N-(2-(tert-butylamino)-2-oxo-1-(pyridin-3-yl)ethyl)furan-2-carboxamide, a potential noncovalent inhibitor of SARS-CoV 3CLpro named ML188 [35], with the SARS-CoV-2 3CLpro homology model. We also docked ML188 with the SARS-CoV 3CLpro structure (PDB ID: 3M3V) as a reference, and ML188 bound strongly to the receptor binding site of SARS-CoV 3CLpro. The inhibitor targets the Cys-His catalytic dyad (Cys-145 and His-41) along with the other residues, and the docking score (S = −12.27) was relatively high. However, surprisingly, ML188 did not show significant binding to the catalytic dyad (Cys-145 and His-41) of SARS-CoV-2, and the docking score (S = −8.31) was considerably lower (Fig. S3). These results indicated that the 12 point mutations identified in the previous step may disrupt important hydrogen bonds and alter the receptor binding site, thereby affecting its ability to bind with the SARS-CoV inhibitors.

Therefore, it is essential to discover novel compounds that may inhibit SARS-CoV-2 3CLpro and serve as potential anti-COVID-19 drug compounds. We developed a library from our previously published studies that contains numerous natural compounds possessing potential anti-viral activities and screened it against the
Fig. 2. (A) Root mean square deviation (RMSD), (B) root mean square fluctuation (RMSF), (C) potential energy and (D) hydrogen bond interactions for all three complexes over the 50 ns simulation.
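For readers unfamiliar with the quantities in panels (A) and (B), the following Python sketch computes RMSD per frame against a reference and RMSF per atom from a list of coordinate frames. It is a minimal illustration, assuming the frames are already superimposed on the reference (no least-squares fitting is performed) and using a hypothetical two-atom toy trajectory; it is not the GROMACS analysis used in the study.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root mean square deviation between two pre-aligned conformations."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frames must be non-empty and of equal length")
    total = sum((a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q))
    return math.sqrt(total / len(frame))


def rmsf(trajectory: Sequence[Frame]) -> List[float]:
    """Per-atom root mean square fluctuation about each atom's mean position."""
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory must contain at least one frame")
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msd = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: 3 frames of 2 atoms (hypothetical coordinates).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
rmsd_per_frame = [rmsd(frame, toy_trajectory[0]) for frame in toy_trajectory]
rmsf_per_atom = rmsf(toy_trajectory)
```

In the toy data only the second atom moves, so it alone contributes to the RMSD of later frames and has a non-zero RMSF; in a protein trajectory the same logic highlights flexible regions such as chain termini.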
SARS-CoV-2 3CLpro homology model. Recent drug repurposing studies proposed a few drugs that target SARS-CoV-2 3CLpro and suggested that they could be used to treat COVID-19. Herein, we selected the best of these (Nelfinavir, Prulifloxacin and Colistin) from three different drug repurposing studies [36,37] and docked them as controls in the present study (Fig. S4). Our analyses identified nine novel non-toxic, druggable natural compounds that are predicted to bind with the receptor binding site and catalytic dyad (Cys-145 and His-41) of SARS-CoV-2 3CLpro (Table 2; Fig. S5). ADMET profiling of the selected hits is given in Table S2. Among these screened phytochemicals, 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone is an isoflavone extracted from Psorothamnus arborescens [38] that exhibited the highest binding affinity (−29.57 kcal/mol) and docking score (S = −16.35), and formed strong hydrogen bonds with the catalytic dyad residues (Cys-145 and His-41) as well as significant interactions with the receptor-binding residues Thr24, Thr25, Thr26, Cys44, Thr45, Ser46, Met49, Asn142, Gly143, His164, Glu166 and Gln189 (Fig. 1E). A literature review revealed that 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone has been successfully used as an anti-leishmanial agent [38], and it is also found in traditional Chinese medicine records [39]. Our screened phytochemicals displayed higher docking scores, stronger binding energies, and closer interactions with the conserved catalytic dyad residues (Cys-145 and His-41) than Nelfinavir, Prulifloxacin and Colistin. These results suggested that natural products identified in our study may prove more useful candidates for COVID-19 drug therapy.

3.3. MD simulations

To further investigate the molecular docking results, the top three phytochemical complexes, namely 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone, myricitrin, and methyl rosmarinate, were subjected to 50 ns MD simulation. The root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (RoG) and hydrogen bond parameters were calculated. RMSD is an indicator of the stability of ligand-protein complexes. None of the complexes showed any obvious fluctuations, and all three were stable, with average RMSD values of 1.6 ± 0.02 Å, 1.5 ± 0.02 Å and 1.7 ± 0.02 Å for 5,7,3′,4′-tetrahydroxy-2′-(3,3-dimethylallyl) isoflavone, myricitrin, and methyl rosmarinate, respectively (Fig. 2A). RMSF is an indicator of residue flexibility. Minimal fluctuations were observed for myricitrin and methyl rosmarinate, and the overall complexes remained stable throughout the simulations. The functionally important catalytic dyad residues (Cys-145 and His-41) displayed stable behaviour, and fluctuations were observed toward the C-terminal end of the SARS-CoV-2 3CLpro molecule (Fig. 2B). RoG is an indicator of protein compactness, stability, and folding, and the results suggested normal behaviour for all three complexes; all remained compact and stable throughout the 50 ns simulations (Fig. 2C). In addition, analysis of hydrogen bonds, the main stabilising interactions in proteins, suggested that the internal hydrogen bonds of SARS-CoV-2 3CLpro remained stable throughout the simulation, with no obvious fluctuations (Fig. 2D). These results supported our docking findings and further indicated that these compounds may serve as potential anti-COVID-19 drug sources.

4. Conclusion

In conclusion, our study revealed that 3CLpro is conserved in SARS-CoV-2. It is highly similar to bat SARS-like coronavirus 3CLpro,