BACKGROUND
The coronavirus pandemic has been a wake-up call for the world. A dispute has arisen over the origin of SARS-CoV-2. Studies have shown that all SARS-CoV-2 sequences from around the world share a common ancestor dating towards the end of 2019. Nevertheless, it is hard to reach a conclusion regarding the origin of SARS-CoV-2.
OBJECTIVE
In this study, we compare the divergence of SARS-CoV-2 sequences from three areas: China, the USA, and Europe.
METHODS
We download SARS-CoV-2 sequences from China, the USA, and Europe from the National Center for Biotechnology Information (NCBI). To investigate the diversity of the sequences from these three areas, we apply 17 different nucleotide substitution models. Within each of the three groups, we calculate the pairwise nucleotide substitution distance between every two sequences and then compare the distances across the three groups.
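The per-group calculation described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the sequences are already aligned to equal length, it uses the Jukes-Cantor (JC69) model only as a stand-in for one substitution model (the 17 models are not named here), and the function names and toy sequences are hypothetical.

```python
import math
from itertools import combinations

VALID_BASES = "ACGT"

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in VALID_BASES and y in VALID_BASES]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)

def jc69_distance(a, b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)

def mean_pairwise_distance(seqs, dist=jc69_distance):
    """Mean distance over all unordered pairs of sequences in one group."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

# Toy aligned sequences (hypothetical, not real SARS-CoV-2 data).
toy_groups = {
    "group_a": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"],
    "group_b": ["ACGTACGTAC", "ACGTTCGTAT", "ACGAACGTAC"],
}

# Rank groups from lowest to highest mean pairwise distance.
ranking = sorted(toy_groups, key=lambda g: mean_pairwise_distance(toy_groups[g]))
print(ranking)
```

Passing a different function as `dist` would repeat the same comparison under another substitution model, which is how the ranking of the groups could be checked for consistency across models.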
RESULTS
The results are consistent across most of the 17 substitution models. Fourteen models show that China has the lowest diversity, followed by Europe and then the USA. Of the other three models, one shows China with the lowest diversity, followed by the USA and then Europe; another shows the USA with the lowest diversity, followed by China and then Europe; and the last shows Europe with the lowest diversity, followed by China and then the USA.
CONCLUSIONS
In this study, we compare the diversity of SARS-CoV-2 samples from China, Europe, and the USA by applying different substitution models to the data. Our results show that China has the smallest mean distance value, followed by Europe and then the USA. This is consistent with the chronological order of transmission, in which SARS-CoV-2 emerged first in China, then broke out in Europe, and finally in the USA.