Mutation patterns of human SARS-CoV-2 and bat RaTG13 coronavirus genomes are strongly biased towards C>U, indicating rapid evolution in their hosts
Abstract
Background: The worldwide pandemic caused by the spread of SARS-CoV-2 has raised considerable interest in its evolutionary origin and genome structure. Here we analysed mutation patterns in 13 human SARS-CoV-2 isolates and the closely related RaTG13 isolated from the bat Rhinolophus affinis. We also evaluated the CpG dinucleotide content of SARS-CoV-2 and other human and animal coronavirus genomes.
Results: Out of 1107 single nucleotide differences (c. 4% divergence) between human SARS-CoV-2 and bat RaTG13, 672 (61%) can be attributed to C>U and U>C substitutions, significantly (P<0.001) exceeding other types of SNPs. A similar trend was observed among the 13 sequenced SARS-CoV-2 genomes. Accumulation of C>U mutations was also observed in a highly variable subregion encoding the ACE2 receptor contact domain. In contrast to most other coronaviruses, both SARS-CoV-2 and RaTG13 exhibited CpG depletion in their genomes.
Conclusion: The data support the view that C-to-U conversion played a significant role in the evolution of pathogenic RNA coronaviruses, including SARS-CoV-2. These mutations apparently also influenced the amino acid composition of the receptor contact domain of the SARS-CoV-2 spike protein, which is implicated in virus pathogenicity. We propose that SARS-CoV-2 evolved in humans for a relatively long time following its transfer from animals before spreading worldwide.
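The two quantities reported in the abstract, the share of substitutions between C and U and the degree of CpG depletion, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names `substitution_spectrum` and `cpg_obs_exp` are hypothetical, the sequences are toy strings rather than real genome data, the input is assumed to be a pre-aligned, gap-free pair, and the observed/expected CpG ratio is one common normalisation that may differ from the measure used in the study. Note also that the "reference > alternative" direction is only a labelling convention here; without an ancestral sequence the true direction of change is not known.

```python
from collections import Counter


def substitution_spectrum(ref: str, alt: str) -> Counter:
    """Count substitutions labelled 'ref>alt' at aligned positions.

    Assumes both sequences are already aligned to equal length.
    Positions with gaps or ambiguous bases are skipped, and T is
    treated as U so DNA-style and RNA-style input give the same labels.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to equal length")
    valid = set("ACGU")
    counts = Counter()
    ref_rna = ref.upper().replace("T", "U")
    alt_rna = alt.upper().replace("T", "U")
    for r, a in zip(ref_rna, alt_rna):
        if r in valid and a in valid and r != a:
            counts[f"{r}>{a}"] += 1
    return counts


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: N(CG) * L / (N(C) * N(G)).

    Values well below 1 indicate CpG depletion.
    """
    s = seq.upper().replace("T", "U")
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return float("nan")
    return s.count("CG") * len(s) / (c * g)


# Toy aligned pair (illustrative only, not real viral sequence).
ref = "ACGCUACGGCUA"
alt = "AUGUUACGGUCG"

spectrum = substitution_spectrum(ref, alt)
total = sum(spectrum.values())
# Share of all differences that are C>U or U>C.
cu_share = (spectrum["C>U"] + spectrum["U>C"]) / total
```

On real data, `ref` and `alt` would be the aligned genome sequences of the two isolates, and `cu_share` would correspond to the proportion of differences attributed to C and U exchanges.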