Santhinissi Addala,1 Madhuri Vissapragada,1 Ramakrishna Meduri,2
Radhakrishna Nagumantri,3 Gurudatt Patra,4 Sonanjali Uddaraju,5,6 Likhitanjali Uddaraju,5,6 Ravikiran S. Yedidi6,*
1Department of Life Sciences, Dr. Lankapalli Bullayya College, Visakhapatnam, AP, India; 2College of Science, George Mason University, Fairfax, VA, U.S.A.; 3Department of Biotechnology, GITAM Institute of Technology, GITAM, Visakhapatnam, AP, India; 4Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, KA, India; 5Department of Life Sciences, Aditya Degree College, Rajamahendravaram, AP, India; 6The Center for Advanced-Applied Biological Sciences & Entrepreneurship (TCABS-E), Rajamahendravaram, AP, India.
*Correspondence should be addressed to:
Dr. Ravikiran S. Yedidi
Founder, Principal Scientist & Lead Instructor
The Center for Advanced-Applied Biological Sciences & Entrepreneurship (TCABS-E)
Danavaipeta, Rajamahendravaram, 533103. Andhra Pradesh, India.
Tel.: 91-8660301662, Email: [email protected]
Coronavirus disease 2019 (COVID-19) is a pandemic that has infected millions of people and claimed hundreds of thousands of lives across the globe. The Government of India managed to flatten the curve of the pandemic's spread by implementing a strict national lockdown on all activities. While the majority of the scientific community was engaged in the discovery of antiviral drugs and vaccines for the treatment of COVID-19, others were working to understand the cross-species evolution of the novel coronavirus 2019 (nCoV-19) from bats to humans. In order to understand the nCoV-19 cross-species evolution, we designed a computational study of the viral DNA and protein sequences using various Bioinformatics and Computational Biology tools. This study was then posted as a free online computational challenge, TOCC-2020 (Take the Online Computational Challenge 2020), to engage undergraduate, postgraduate and research scholar students around the world. The goal was to involve future scientists in COVID-19 research while they stayed safe at home during the national lockdown. Among the registered participants, 7 were highly committed and contributed significantly to TOCC-2020. The data analysis revealed some clues regarding the nCoV-19 cross-species evolution, which are currently under further evaluation. Based on the results of this first round, we are designing a second round of TOCC-2020 to keep students engaged until May 31st, 2020 and to involve them in future drug discovery projects related to COVID-19.
Coronavirus disease 2019 (COVID-19) had infected more than 4.8 million people worldwide as of May 20th, 2020, claiming more than 3 Lakh (300,000) lives across the globe. India had recorded more than 1 Lakh (100,000) cases as of May 20th, 2020, with more than 3,000 deaths. COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also referred to as novel coronavirus 2019 (nCoV-19). The Government of India imposed a national lockdown with severe restrictions from March 25th, 2020 to May 17th, 2020, and then extended it with fewer restrictions until May 31st, 2020. It was predicted that, in the absence of the national lockdown, the pandemic would have infected more than 2 Lakh (200,000) Indian citizens by mid-April 2020. The nCoV-19 typically spreads through aerosols when an infected person sneezes or coughs. The droplets of nasal and/or oral secretions containing nCoV-19 spread into the air and can infect new individuals in the vicinity of the source person. Wearing a mask that covers the nose and mouth largely avoids this risk of spreading COVID-19. However, the virus may persist on surfaces for a few to many hours, depending on the surface. Hence, wiping surfaces with a cloth or wipe soaked in a disinfectant solution such as sodium hypochlorite (bleach) is very important to stop the spread of COVID-19.
The nCoV-19 infects lung epithelial cells in the alveoli by binding to a receptor protein called human angiotensin converting enzyme 2 (hACE-2) (Letko et al. 2020; Zhang et al. 2020; Timens et al. 2020). As shown in Figure 1, the nCoV-19 particles carry long, extended proteins called spikes. As shown in Figure 2, the spike protein aids the virus in docking to the hACE-2 receptor (Walls et al. 2020; Shang et al. 2020; Wrapp et al. 2020). The nCoV-19 is believed to have evolved from bats in China (Chen et al. 2020), where people consume partially cooked bats as food. This implies that the viral spike proteins evolved to bind hACE-2, allowing the virus to jump from one species (bats) to another (humans) (Lu et al. 2020; Zhou et al. 2020). Understanding the details of this cross-species evolution of nCoV-19 would shed light on preventive measures that can be taken in future and would also help from a therapeutic point of view. In this study we proposed generating various possible mutant versions of the nCoV-19 spike protein using computational tools in order to identify exactly what pattern of mutations helped the cross-species evolution of nCoV-19. However, laboratory-based training was not possible during the national lockdown without violating the restrictions implemented by the Government. In this situation, we opted for an online computational training called TOCC-2020 (Take the Online Computational Challenge 2020) that addresses COVID-19 pandemic research questions. We believe that it is crucial to train the future generation of scientists on a real-time problem such as COVID-19 during the pandemic so that they clearly understand the intensity of the problem. Here we report how we achieved the generation of 4000 mutant versions of the nCoV-19 spike protein through TOCC-2020.
Posting tasks online: The initial rounds of computational tasks, along with the relevant tools, were posted online (https://www.tcabse.org/toccc) as a warm-up round so that all the participants could become acquainted with the study and the related tools. The first three exercises were designed to teach the use of online Bioinformatics and Computational Biology servers such as ExPASy, NCBI-BLAST, UniProt, SWISS-MODEL, etc. The last round was the actual exercise of building three-dimensional models of the mutant nCoV-19 spike protein using the online servers. The participants were given the wild type sequence of the natural nCoV-19 strain and were asked to build mutant models of it using the online servers.
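As an illustration of how the input sequences for such mutant models could be prepared, the following Python sketch enumerates all 19 single amino acid substitutions at one position of a wild type sequence and formats them as FASTA records that can be pasted into a modelling server. The function names `point_mutants` and `to_fasta`, the label format and the toy sequence are assumptions made for this example; they are not the files or scripts distributed in TOCC-2020.

```python
from __future__ import annotations

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def point_mutants(wild_type: str, position: int) -> dict[str, str]:
    """Return all 19 single-substitution mutants at a 1-based position.

    Keys follow the common label <wild type residue><position><new residue>,
    for example "V3A".
    """
    if not 1 <= position <= len(wild_type):
        raise ValueError("position is outside the sequence")
    original = wild_type[position - 1]
    mutants = {}
    for residue in AMINO_ACIDS:
        if residue == original:
            continue  # skip the wild type residue itself
        label = f"{original}{position}{residue}"
        mutants[label] = wild_type[:position - 1] + residue + wild_type[position:]
    return mutants


def to_fasta(label: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{label}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy fragment used only for illustration,
    # not the full spike protein sequence.
    toy = "MFVFLVLLPL"
    for name, seq in point_mutants(toy, 3).items():
        print(to_fasta(name, seq), end="")
```

Each FASTA record produced this way corresponds to one mutant sequence that a participant would submit for model building.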
Building the mutant models of spike protein: Template search with BLAST (Camacho et al. 2009) and HHBlits (Remmert et al. 2012) has been performed against the SWISS-MODEL (Waterhouse et al. 2018) template library (SMTL, last update: 2020-04-15, last included PDB release: 2020-04-10). For each identified template, the template’s quality has been predicted from features of the target-template alignment. The templates with the highest quality have then been selected for model building. Models are built based on the target-template alignment using ProMod3. Coordinates which are conserved between the target and the template are copied from the template to the model. Insertions and deletions are remodelled using a fragment library. Side chains are then rebuilt. Finally, the geometry of the resulting model is regularized by using a force field. In case loop modelling with ProMod3 fails, an alternative model is built with PROMOD-II (Guex et al. 2009). The global and per-residue model quality has been assessed using the QMEAN scoring function (Studer et al. 2020).
Structural analysis of mutant spike protein models: The mutant models of the spike protein were analysed using the PyMOL molecular graphics software. Figure 1 was taken from Wikipedia.org (By Alissa Eckert, MS; Dan Higgins, MAM – the Centers for Disease Control and Prevention’s Public Health Image Library (PHIL), #23312). Figure 2 was prepared using PyMOL.
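The comparison of interface hydrogen bonds between wild type and mutant models that underlies this kind of structural analysis can be sketched in plain Python. This is only an illustration under stated assumptions: the 3.5 Å distance-only cutoff, the function names `count_hbonds` and `classify`, the "unchanged" label and the toy coordinates are all hypothetical, and the criteria actually applied in PyMOL for this study are not specified here. A real analysis would also check donor and acceptor atom types and bond angles.

```python
from math import dist
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]

# Assumed donor-acceptor distance cutoff in angstroms (distance-only criterion).
HBOND_CUTOFF = 3.5


def count_hbonds(spike_polar: Sequence[Coordinate],
                 ace2_polar: Sequence[Coordinate],
                 cutoff: float = HBOND_CUTOFF) -> int:
    """Count spike/hACE-2 polar atom pairs that lie within the cutoff."""
    return sum(
        1
        for a in spike_polar
        for b in ace2_polar
        if dist(a, b) <= cutoff
    )


def classify(wild_type_count: int, mutant_count: int) -> str:
    """Label a mutant by its change in interface hydrogen bonds."""
    if mutant_count > wild_type_count:
        return "gain"
    if mutant_count < wild_type_count:
        return "loss"
    return "unchanged"


if __name__ == "__main__":
    # Toy coordinates, not taken from any real structure.
    ace2 = [(3.0, 0.0, 0.0), (10.0, 3.0, 0.0)]
    wild_type = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    mutant = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    wt_count = count_hbonds(wild_type, ace2)
    mut_count = count_hbonds(mutant, ace2)
    print(wt_count, mut_count, classify(wt_count, mut_count))
```

Counting contacts against the same receptor atoms for every model keeps the wild type and mutant numbers directly comparable.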
RESULTS & DISCUSSION
Almost 50 participants enrolled in TOCC-2020 and initially took part in the warm-up exercises. Exercise 1 was based on translating the nCoV-19 spike protein gene (DNA sequence) into the final protein sequence by scanning through all the possible open reading frames. This task was achieved using the ExPASy server. Exercise 2 was based on performing a BLAST search on the nCoV-19 spike protein sequence to confirm its identity. Exercise 3 consisted of uploading the nCoV-19 spike protein sequence to the SWISS-MODEL server to build the three-dimensional model of the nCoV-19 spike protein from the natural strain sequence. Most of the participants were comfortable with the first two exercises but needed some trial and error with the third. Exercise 4 was the real task, in which each student made mutations (amino acid substitutions) at a given position of the nCoV-19 spike protein and then built its three-dimensional model using the SWISS-MODEL server. Among the registered participants, 7 successfully completed the generation of mutant computational models. These 7 participants thus became the co-authors of this report in the order of their contribution and are shown below.
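The idea behind Exercise 1, translating a gene in all six reading frames and picking out the open reading frame, can be sketched offline in Python. This is a minimal illustration using the standard genetic code, not the implementation of the ExPASy server; the function names and the short toy sequence are assumptions made for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon from the first base ('*' = stop, 'X' = unknown)."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3 and -1..-3)."""
    dna = dna.upper().replace("U", "T")
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_orf(dna: str) -> tuple:
    """Return (frame, protein) for the longest Met-to-stop stretch found."""
    best = ("", "")
    for frame, protein in six_frames(dna).items():
        # Dropping the last piece keeps only segments that end at a stop codon.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (frame, segment[start:])
    return best


if __name__ == "__main__":
    toy_gene = "ATGTTTGTTTAA"  # toy sequence for illustration only
    print(six_frames(toy_gene))
    print(longest_orf(toy_gene))
```

Scanning both strands matters because a sequence may be supplied in either orientation; the frame label in the result shows which strand and offset contained the open reading frame.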
A total of 4000 mutant models of the nCoV-19 spike protein were generated using online Bioinformatics and Computational Biology tools. Each of these mutant models was structurally analyzed using the PyMOL molecular graphics software. Some mutations in the binding interface between the nCoV-19 spike protein and hACE-2 showed a significant loss of the hydrogen bonding network, resulting in a loss of binding affinity. Conversely, certain mutations gained extra hydrogen bonds in the interface, increasing the overall binding affinity. Based on this analysis, the mutants were classified into two categories: one in which binding affinity was lost and the other in which binding affinity increased. To address the question of cross-species evolution of nCoV-19, we focused primarily on the mutants with enhanced overall binding affinity. Full details of all 4000 mutants, including their binding affinities, structural changes and the full technical details of the analysis protocol, will be published elsewhere.
In conclusion, structural changes in the mutant models of the nCoV-19 spike protein indicate either loss or gain of binding affinity to the hACE-2 receptor. The better binding affinities helped the cross-species evolution of nCoV-19, resulting in COVID-19. In future, this online challenge strategy will be extended to COVID-19 vaccine design as well as to an open source drug discovery program for COVID-19.
TOCC-2020 was designed and executed by The Center for Advanced-Applied Biological Sciences & Entrepreneurship (TCABS-E), Rajamahendravaram, Andhra Pradesh, India.
We thank all the participants of TOCC-2020 for staying safe at home during the national COVID-19 lockdown and for participating from Bengaluru (BLR), Rajamahendravaram (RJY), Vijayawada (BZA), Visakhapatnam (VSP), Vijayanagaram (VZM) and international locations: Anandi (VSP); Chiranjeevi (RJY); Devi (RJY); Divya (RJY); Divyasekhar (VSP); Durga Aparna (VSP); Durga Bhavani (VSP); Esther (RJY); Gurudatt (BLR); Hema (VZM); Hemsai (VSP); Joycee (RJY); Krishna (BZA); Krishna priya (RJY); Likhitanjali (RJY); Madhumita (VSP); Madhuri (VSP); Meghana (VSP); Monika (VSP); Monisha (VSP); Navya (RJY); Neha (VSP); Niharika (VSP); Priyanka (BLR); Radhakrishna (VSP); Ramakrishna (Virginia, USA); Rishma (VSP); Jhansi (VSP); Sahitya (RJY); Sandhya (VSP); Santhinissi (VSP); Sashi (New York, USA); Ramani (RJY); Sheryl (VSP); Shyam (VSP); Sinduja (VSP); Sonanjali (RJY); Sony (RJY); Sowmya (VSP); Srinayana (RJY); Sripooja (VSP); Subhasirisha (RJY); Swathi (VSP); Swetha (VSP); Thiruvalli (RJY); Uttpal (Beersheba, Israel); Vasanthi (VSP); Vyshnavi (VSP).
- Camacho et al. (2009). BLAST+: architecture and applications. BMC Bioinformatics 10, 421-430.
- Chen et al. (2020). Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. The Lancet 395, 507-513.
- Guex et al. (2009). Automated comparative protein structure modelling with SWISS-MODEL and Swiss-PdbViewer: a historical perspective. Electrophoresis 30, S162-S173.
- Letko et al. (2020). Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nature Microbiology 5, 562-569.
- Lu et al. (2020). Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. The Lancet 395, 565-574.
- Remmert et al. (2012). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods 9, 173-175.
- Shang et al. (2020). Structural basis of receptor recognition by SARS-CoV-2. Nature 581, 221-224.
- Studer et al. (2020). QMEANDisCo-distance constraints applied on model quality estimation. Bioinformatics 36, 1765-1771.
- Timens et al. (2020). Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis. The Journal of Pathology 203, 631-637.
- Walls et al. (2020). Structure, function and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell 181, 281-292.
- Waterhouse et al. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research 46(W1), W296-W303.
- Wrapp et al. (2020). Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 367, 1260-1263.
- Zhang et al. (2020). Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target. Intensive Care Medicine 46, 586-590.
- Zhou et al. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270-273.