Immunological and mutational analysis of SARS-CoV-2 structural proteins from Asian countries

Introduction: The emergence of a novel coronavirus, SARS-CoV-2, the etiologic agent of coronavirus disease (COVID-19), has led to a pandemic of global concern. Given the high morbidity and mortality worldwide, the World Health Organization declared the pandemic an unprecedented public health crisis on 11 March 2020. The virus is a positive-sense RNA virus, a group that can show a high rate of mutation. Ongoing mutations in the structural proteins of the coronavirus drive viral evolution, enabling the virus to evade host immunity and rapidly acquire drug resistance. In the present study, we focused mainly on the prevalence of mutations in the four structural proteins, S (spike), E (envelope), M (membrane), and N (nucleocapsid), that are required for the assembly of a complete virion particle. Further, we estimated the antigenicity and allergenicity of these structural proteins to support the design and development of a candidate vaccine against SARS-CoV-2. Methods: In the present in silico study, envelope protein was found to be the most antigenic, followed by the membrane, nucleocapsid, and spike proteins of SARS-CoV-2. Results: We detected 987 mutations in 729 sequences submitted from Asia in October 2020, using China's first Wuhan isolate sequence as the reference. Among the four structural proteins, spike showed the most mutations with 807 point mutations, followed by nucleocapsid with 151, envelope with 19, and membrane with only 10. Conclusion: Taken together, our study suggests that variations occurring in the structural proteins of SARS-CoV-2 might be altering viral structure and function, and that the envelope protein appears to be a promising vaccine candidate to curb coronavirus infections.


INTRODUCTION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense RNA virus. As the etiologic agent of coronavirus disease 2019 (COVID-19), the virus induces moderate to severe respiratory distress [1]. The pandemic originated from an animal market in Wuhan, China [2]. The ripple effect of this contagious viral disease has created a humanitarian health crisis and has become an enormous challenge to health systems across the globe. SARS-CoV-2 is a member of the Coronaviridae family and the Nidovirales order. It is considered the third zoonotic coronavirus (after SARS-CoV and MERS-CoV) and originated from bats. However, this novel coronavirus has been the only one with pandemic potential [3-6].

SARS-CoV-2, a betacoronavirus, is an enveloped, single-stranded, positive-sense, non-segmented, and genetically diverse RNA virus with one of the largest genomes among known RNA viruses (29,891 base pairs, encoding approximately 9860 amino acids) [2,7,8]. The genome encodes structural proteins, namely spike (S), envelope (E), membrane (M), and nucleocapsid (N), as well as non-structural proteins ranging from NSP1 to NSP16. RNA viruses generally show a mutation rate substantially higher than that of DNA viruses. Owing to the high rate of mutation shown by SARS-CoV-2 over a short period, the virus exhibits genomic variability that enables it to modulate its virulence in the host and subsequently evade host immunity [9,10].

In the present work, we detected 987 mutations in 729 sequences submitted from Asia in October. Among the four structural proteins, spike showed the most mutations with 807 point mutations, followed by nucleocapsid with 151 mutations. Envelope showed 19 mutations and membrane showed only 10 point mutations. The results of our study suggest that mutational analysis of this virus might be considered a new approach to help understand its genomic variability. Similarly, using the predictive tools of immunoinformatics, the antigenicity and allergenicity of the structural proteins of SARS-CoV-2 were determined to support the development of efficacious antiviral therapeutics or vaccines against COVID-19.

Cite this article: Kumar Jha D, Yashvardhini N, Kumar A. Immunological and mutational analysis of SARS-CoV-2 structural proteins from Asian countries. Biomed. Res. Ther.; 8(5):4367-4381.

Data mining
The full-length protein sequences of the SARS-CoV-2 structural proteins, i.e., envelope protein, nucleocapsid phosphoprotein, surface glycoprotein, and membrane glycoprotein, were retrieved from the NCBI Virus database as submitted from Asia in the month of October. In total, 729 structural protein sequences had been submitted from Asia in that month: 165 envelope proteins, 159 nucleocapsid phosphoproteins, 246 surface glycoproteins, and 159 membrane glycoproteins. Four reference sequences, for envelope protein (YP_009724392), nucleocapsid phosphoprotein (YP_009724397), surface glycoprotein (YP_009724390), and membrane glycoprotein (YP_009724393), were also retrieved for the mutational studies.

Multiple sequence alignment (MSA) and mutational identification
Multiple sequence alignment was performed using the Clustal Omega online platform (http://www.clustal.org/), which is based on HMM profiles and seeded guide trees [11]. The envelope, nucleocapsid phosphoprotein, surface glycoprotein, and membrane glycoprotein sequences were each aligned with their respective reference sequence. The aligned files were viewed in Jalview (https://www.jalview.org/) to identify the point mutations occurring in the different structural proteins with respect to the Wuhan-type isolate.
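The comparison against the reference can also be scripted once an alignment is available. The following is a minimal sketch, not the procedure used in this study (mutations here were identified by viewing the alignments in Jalview). It assumes two already-aligned sequences of equal length with `-` for gaps; the function name `point_mutations` and the toy sequences are hypothetical.

```python
from typing import List


def point_mutations(ref_aligned: str, query_aligned: str) -> List[str]:
    """Return substitutions (e.g. 'L8F') between two aligned sequences.

    Positions are numbered along the reference, so gap columns in the
    reference (insertions in the query) do not advance the counter.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    ref_pos = 0
    for ref_res, qry_res in zip(ref_aligned, query_aligned):
        if ref_res == "-":
            continue  # insertion relative to the reference, not a point mutation
        ref_pos += 1
        if qry_res in ("-", "X"):
            continue  # deletion or ambiguous residue, skipped in this sketch
        if qry_res != ref_res:
            mutations.append(f"{ref_res}{ref_pos}{qry_res}")
    return mutations


# Toy aligned fragments (hypothetical, not real SARS-CoV-2 sequences).
reference = "MFVFL-VLLPLV"
isolate = "MFVFLAVLFPLV"
print(point_mutations(reference, isolate))  # ['L8F']
```

Numbering along the reference rather than along the alignment columns keeps the reported positions comparable across isolates, which matters when some isolates carry insertions.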

Antigenicity and allergenicity evaluation
The VaxiJen v2.0 server was used to estimate the antigenicity of all four structural proteins and thus to assess their suitability for vaccine production. This online server predicts antigens from the auto cross-covariance (ACC) transformation of the peptide sequences submitted to it [12]. A good vaccine needs to be non-allergenic to the host. Therefore, the allergenicity of these structural proteins was evaluated with the AllerTOP server, which predicts allergenicity based on size, flexibility, and other parameters [13].
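To illustrate what an ACC transformation does, the sketch below converts sequences of different lengths into fixed-length vectors by averaging products of residue descriptors at a given lag. It is an illustration of the general technique, not VaxiJen's implementation. The descriptor table `TOY_DESCRIPTORS` holds placeholder values for a three-letter toy alphabet (not published amino acid descriptors), and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder descriptor values for a three-letter toy alphabet.
# These are NOT published amino acid descriptors; substitute a real table before use.
TOY_DESCRIPTORS: Dict[str, Tuple[float, ...]] = {
    "A": (1.0, -1.0),
    "G": (2.0, 0.0),
    "L": (-3.0, 1.0),
}


def acc(sequence: str, j: int, k: int, lag: int,
        descriptors: Dict[str, Tuple[float, ...]] = TOY_DESCRIPTORS) -> float:
    """Auto (j == k) or cross (j != k) covariance of descriptors j and k at a lag."""
    n = len(sequence)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(sequence)")
    total = sum(
        descriptors[sequence[i]][j] * descriptors[sequence[i + lag]][k]
        for i in range(n - lag)
    )
    return total / (n - lag)


def acc_vector(sequence: str, max_lag: int,
               descriptors: Dict[str, Tuple[float, ...]] = TOY_DESCRIPTORS) -> List[float]:
    """Fixed-length vector: every (j, k) descriptor pair at every lag 1..max_lag."""
    n_desc = len(next(iter(descriptors.values())))
    return [
        acc(sequence, j, k, lag, descriptors)
        for lag in range(1, max_lag + 1)
        for j in range(n_desc)
        for k in range(n_desc)
    ]


# Sequences of different lengths map to vectors of the same length.
print(len(acc_vector("AGLA", 2)), len(acc_vector("AGLAGL", 2)))
```

The vector length depends only on the number of descriptors and the maximum lag, which is what allows proteins as different in size as the envelope and spike proteins to be compared by the same classifier.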

Mutational identification
A total of 729 structural protein sequences (spike glycoproteins, nucleocapsid phosphoproteins, envelope proteins, and membrane glycoproteins) submitted from Asian countries in the month of October 2020 were retrieved from the NCBI Virus database, along with four reference sequences. The reference spike glycoprotein, nucleocapsid phosphoprotein, envelope protein, and membrane glycoprotein are 1273, 419, 75, and 222 amino acids long, respectively. After alignment, the sequences were viewed in Jalview to detect mutations in the structural proteins of the Asian isolates relative to the Wuhan isolate. Amongst the 729 sequences released from Asia, a total of 987 point mutations were detected across all four structural proteins (Figure 1). Among the 311 mutants, spike showed the most mutations with 807 point mutations (Table 3), followed by nucleocapsid with 151 mutations (Table 2), while envelope showed 19 mutations (Table 1) and membrane showed only 10 point mutations (Table 4).
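As a quick consistency check on the figures reported above, the per-protein counts can be tallied and expressed as shares of the total. This is only an arithmetic sketch; the variable names are illustrative, and the percentage shares are derived here rather than reported in the study.

```python
# Counts as reported in the text.
sequences = {"envelope": 165, "nucleocapsid": 159, "spike": 246, "membrane": 159}
mutations = {"spike": 807, "nucleocapsid": 151, "envelope": 19, "membrane": 10}

total_sequences = sum(sequences.values())
total_mutations = sum(mutations.values())

# The per-protein figures should add up to the reported totals.
assert total_sequences == 729
assert total_mutations == 987

# Share of all detected point mutations contributed by each protein.
share = {name: 100 * count / total_mutations for name, count in mutations.items()}
for name, pct in sorted(share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>12}: {mutations[name]:4d} mutations ({pct:.1f}% of total)")
```

Note that the number of sequences and the reference length differ between proteins, so raw counts are not directly comparable as mutation rates.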

Assessment of antigenicity and allergenicity
The VaxiJen v2.0 server was used to predict the antigenicity of all four structural proteins of SARS-CoV-2. A peptide to be used in vaccine production must bind B-cell and T-cell receptors and stimulate an immune response. The estimation indicated that envelope protein was the most antigenic of the four, with an antigenicity score of 0.6025, followed by membrane glycoprotein with 0.5102. Nucleocapsid phosphoprotein ranked third with a score of 0.5059, and surface glycoprotein was the least antigenic with a score of 0.4696, as shown in the accompanying table. The allergenicity of these proteins was also estimated to determine whether these antigenic peptides were allergenic. A good vaccine should be non-allergenic to the host and should not produce any IgE-mediated immune response. The allergenicity analysis indicated that all four structural proteins were non-allergenic to the host and could be used as potential vaccine targets.
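The ranking described above can be reproduced directly from the reported scores. In the sketch below, the scores are those given in the text, while the cut-off of 0.4 is an assumption (the study does not state which threshold was used) and should be replaced by the threshold actually selected on the server.

```python
# Antigenicity scores as reported in the text.
scores = {
    "envelope protein": 0.6025,
    "membrane glycoprotein": 0.5102,
    "nucleocapsid phosphoprotein": 0.5059,
    "surface glycoprotein": 0.4696,
}

# Assumed cut-off; the study does not state the threshold it used.
THRESHOLD = 0.4

# Sort from most to least antigenic.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for rank, (protein, score) in enumerate(ranked, start=1):
    label = "antigen" if score >= THRESHOLD else "non-antigen"
    print(f"{rank}. {protein}: {score:.4f} ({label})")
```

Under this assumed threshold all four proteins are classified as probable antigens, and the gap between membrane and nucleocapsid (0.5102 versus 0.5059) is small enough that their relative order should be read with caution.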

DISCUSSION
On 11 March 2020, the WHO declared COVID-19 a global pandemic owing to its rapid spread to numerous countries, and described it as a serious threat to public health across the world [14]. Antiviral therapeutics or candidate vaccines were therefore urgently needed to curb SARS-CoV-2 infections.
The virus primarily causes acute to severe respiratory illness and pneumonia in humans. Symptoms of COVID-19 may begin within two days of exposure and can take up to 14 days to appear. The emergence of a large number of novel mutations in the genome of SARS-CoV-2 may interrupt ongoing vaccine development and trial strategies in different parts of the world. Regular monitoring of mutations has been crucial in tracking and tracing the circulation of this virus among individuals and across geographical locations. The complete genome sequence of SARS-CoV-2 (Wuhan-Hu-1) was deposited in NCBI GenBank in January 2020 [15]. RNA viruses, including SARS-CoV-2, exhibit a high frequency of novel point mutations, which are thought to help the viruses adapt and evolve under changing climatic conditions, thereby enhancing their potential for transmission worldwide [16,17]. Mutation supplies the variation on which natural selection acts, and selection most often favours those viral traits that are a prerequisite for survival in the highly dynamic host environment. These characteristics of viruses such as SARS-CoV-2 could complicate ongoing efforts to combat this contagious disease, because a high frequency of viral mutations can lead to drug resistance and rapid immune evasion [18,19].
Our study revealed a total of 987 mutations in 729 sequences from Asian countries in October. Spike glycoprotein was the most frequently mutated of the structural proteins (807 mutations), followed by nucleocapsid (151 mutations). Meanwhile, envelope protein showed 19 mutations and only 10 point mutations were found in the membrane protein. Furthermore, we estimated the antigenicity and allergenicity of these structural proteins to support the design and development of candidate vaccines against SARS-CoV-2. Among the structural proteins, envelope (E) was found to be the most antigenic, followed by membrane (M), nucleocapsid (N), and spike (S), and all four were predicted to be non-allergenic. Antigenicity and allergenicity are essential criteria in developing efficacious antiviral therapeutics or vaccines against COVID-19.
Mutations alter the structural proteins of SARS-CoV-2 and may thereby help the virus evade host immunity. The high frequency of mutations creates a barrier to the development of antiviral drugs. The results presented in this article are a snapshot of a rapidly changing situation.

CONCLUSIONS
In the present study, the novel mutations identified in the different structural proteins of SARS-CoV-2 provide further insight into the virulence properties of virus strains drawn from a large repertoire of strains. Our findings might be useful in the development of effective therapeutic strategies against SARS-CoV-2 strains.

Figure 1: Total number of mutations occurring in the structural proteins. a. Surface glycoprotein; b. Envelope protein; c. Membrane glycoprotein; d. Nucleocapsid phosphoprotein.