On Context-Tree Prediction of Individual Sequences

2007, Vol 53 (5), pp. 1860-1866
Author(s): Jacob Ziv, Neri Merhav
2021, Vol 11 (1)
Author(s): Suman Pokhrel, Benjamin R. Kraemer, Scott Burkholz, Daria Mochly-Rosen

Abstract In December 2019, a novel coronavirus, termed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified as the cause of pneumonia with severe respiratory distress and outbreaks in Wuhan, China. The rapid and global spread of SARS-CoV-2 resulted in the coronavirus disease 2019 (COVID-19) pandemic. Early in the pandemic, there were limited genetic viral variations. As millions of people became infected, multiple single amino acid substitutions emerged. Many of these substitutions have no consequences. However, some of the new variants show a greater infection rate, more severe disease, and reduced sensitivity to current prophylaxes and treatments. Of particular importance in SARS-CoV-2 transmission are mutations that occur in the Spike (S) protein, the protein on the viral outer envelope that binds to the human angiotensin-converting enzyme receptor (hACE2). Here, we conducted a comprehensive analysis of 441,168 individual virus sequences isolated from humans throughout the world. From the individual sequences, we identified 3540 unique amino acid substitutions in the S protein. Analysis of these different variants in the S protein pinpointed important functional and structural sites in the protein. This information may guide the development of effective vaccines and therapeutics to help arrest the spread of the COVID-19 pandemic.


Author(s): Alexander O'Neill, Marcus Hutter, Wen Shao, Peter Sunehag

2019, Vol 41 (1), pp. 69-76
Author(s): Teresa Jakubczyk

Abstract The paper presents an analysis of the duration of precipitation sequences and the amounts of precipitation in individual sequences in Legnica. The study aimed to identify potential trends and regularities in atmospheric precipitation over the period 1966–2015. On that basis, an attempt was made to predict trends in subsequent years. The analysis was made by fitting the data to suitable distributions (the Weibull distribution for diurnal sums in sequences and the Pascal distribution for sequence durations) and then by analysing the variation of particular indices such as the mean value, variance and quartiles. The analysis was performed for five six-week periods in a year, from spring to late autumn, analysed in consecutive five-year periods. The trends of the analysed indices, observed over the fifty-year period, are not statistically significant, which indicates stability of precipitation conditions over the last half-century.


2021
Author(s): Roshan Rao, Jason Liu, Robert Verkuil, Joshua Meier, John F. Canny, ...

Abstract Unsupervised protein language models trained across millions of diverse sequences learn the structure and function of proteins. Protein language models studied to date have been trained to perform inference from individual sequences. The longstanding approach in computational biology has been to make inferences from a family of evolutionarily related sequences by fitting a model to each family independently. In this work we combine the two paradigms. We introduce a protein language model which takes as input a set of sequences in the form of a multiple sequence alignment. The model interleaves row and column attention across the input sequences and is trained with a variant of the masked language modeling objective across many protein families. The performance of the model surpasses current state-of-the-art unsupervised structure learning methods by a wide margin, with far greater parameter efficiency than prior state-of-the-art protein language models.

