mutation model

2022 ◽  
Author(s):  
Mahmudur Rahman Hera ◽  
N Tessa Pierce-Ward ◽  
David Koslicki

Sketching methods offer computational biologists scalable techniques to analyze data sets that continue to grow in size. MinHash is one such technique that has enjoyed recent broad application. However, traditional MinHash has previously been shown to perform poorly when applied to sets of very dissimilar sizes. FracMinHash was recently introduced as a modification of MinHash to compensate for this lack of performance when set sizes differ. While experimental evidence has been encouraging, FracMinHash has not yet been analyzed from a theoretical perspective. In this paper, we perform such an analysis and prove that while FracMinHash is not unbiased, this bias is easily corrected. Next, we detail how a simple mutation model interacts with FracMinHash and derive confidence intervals for evolutionary mutation distances between pairs of sequences as well as hypothesis tests for FracMinHash. We find that FracMinHash estimates the containment of a genome in a large metagenome more accurately and more precisely than traditional MinHash, and the confidence interval performs significantly better in estimating mutation distances. A Python-based implementation of the theorems we derive is freely available at https://github.com/KoslickiLab/mutation-rate-ci-calculator. The results presented in this paper can be reproduced using the code at https://github.com/KoslickiLab/ScaledMinHash-reproducibles.
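As a rough illustration of the idea in this abstract, the sketch below keeps every k-mer hash that falls under a fixed fraction of the hash space and estimates containment from the resulting sketches. It is not the authors' implementation: the hash function, the names `frac_sketch` and `mutation_rate_point_estimate`, and the assumption that a k-mer survives unmutated with probability (1 - p)^k under the simple mutation model are all illustrative, and the bias correction and confidence intervals derived in the paper are omitted.

```python
import hashlib

H = 2**64  # size of the hash space used by this sketch

def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (illustrative choice, not the authors')."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def frac_sketch(kmer_set, scale):
    """Keep every hash below scale * H, so the sketch size grows with the set size."""
    threshold = scale * H
    return {h for h in map(hash64, kmer_set) if h < threshold}

def containment(sketch_a, sketch_b):
    """Fraction of A's sketch also present in B's sketch (0.0 if A's sketch is empty)."""
    if not sketch_a:
        return 0.0
    return len(sketch_a & sketch_b) / len(sketch_a)

def mutation_rate_point_estimate(c, k):
    """Assumed model: containment c is roughly (1 - p)**k, so p = 1 - c**(1/k)."""
    return 1.0 - c ** (1.0 / k)
```

Because the threshold is a fixed fraction of the hash space rather than a fixed number of hashes, a small genome and a large metagenome are sampled at the same rate, which is the property the abstract credits for the improved containment estimates.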


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Congyu Shi ◽  
Shan Liu ◽  
Xudong Tian ◽  
Xiaoyi Wang ◽  
Pan Gao

Abstract Background Tumor protein p53 (TP53) is the most frequently mutated gene in head and neck squamous cell carcinoma (HNSC), and TP53 mutations are associated with inhibited immune signatures and poor prognosis. We established a TP53 mutation associated risk score model to evaluate the prognosis and therapeutic responses of patients with HNSC. Methods Differentially expressed genes between patients with and without TP53 mutations were determined by using data from the HNSC cohort in The Cancer Genome Atlas database. Patients with HNSC were divided into high- and low-risk groups based on a prognostic risk score that was generated from ten TP53 mutation associated genes via the multivariate Cox regression model. Results TP53 was the most common mutant gene in HNSC, and TP53 mutations were associated with immunogenic signatures, including the infiltration of immune cells and expression of immune-associated genes. Patients in the high-risk group had significantly poorer overall survival than those in the low-risk group. The high-risk group showed less response to anti-programmed cell death protein 1 (PD-1) therapy but high sensitivity to some chemotherapies. Conclusion The risk score based on our TP53 mutation model was associated with poorer survival and could act as a specific predictor for assessing prognosis and therapeutic response in patients with HNSC.


2021 ◽  
Author(s):  
Jennifer L. McCann ◽  
Agnese Cristini ◽  
Emily K. Law ◽  
Seo Yun Lee ◽  
Michael Tellier ◽  
...  

Abstract The single-stranded DNA cytosine-to-uracil deaminase APOBEC3B is an antiviral protein implicated in cancer. However, its substrates in cells are not fully delineated. Here, APOBEC3B proteomics reveal interactions with a surprising number of R-loop factors. Biochemical experiments show APOBEC3B binding to R-loops in human cells and in vitro. Genetic experiments demonstrate R-loop increases in cells lacking APOBEC3B and decreases in cells overexpressing APOBEC3B. Genome-wide analyses show major changes in the overall landscape of physiological and stimulus-induced R-loops with thousands of differentially altered regions as well as binding of APOBEC3B to many of these sites. APOBEC3 mutagenesis impacts overexpressed genes and splice factor mutant tumors preferentially, and APOBEC3-attributed kataegis are enriched in RTCW consistent with APOBEC3B deamination. Taken together with the fact that APOBEC3B binds single-stranded DNA and RNA and preferentially deaminates DNA, these results support a mechanism in which APOBEC3B mediates R-loop homeostasis and contributes to R-loop mutagenesis in cancer.
Highlights
Unbiased proteomics link antiviral APOBEC3B to R-loop regulation
Systematic alterations of APOBEC3B levels trigger corresponding changes in R-loops
APOBEC3B binds R-loops in living cells and in vitro
Bioinformatics analyses support an R-loop deamination and mutation model


2021 ◽  
Author(s):  
Nicolas Alcala ◽  
Noah A Rosenberg

Interpretations of values of the FST measure of genetic differentiation rely on an understanding of its mathematical constraints. Previously, it has been shown that FST values computed from a biallelic locus in a set of multiple populations and FST values computed from a multiallelic locus in a pair of populations are mathematically constrained by the frequency of the allele that is most frequent across populations. We report here the mathematical constraint on FST given the frequency M of the most frequent allele at a multiallelic locus in a set of multiple populations, providing the most general description to date of mathematical constraints on FST in terms of M. Using coalescent simulations of an island model of migration with an infinitely-many-alleles mutation model, we argue that the joint distribution of FST and M helps in disentangling the separate influences of mutation and migration on FST. Finally, we show that our results explain puzzling patterns of microsatellite differentiation, such as the lower FST values in interspecific comparisons between humans and chimpanzees than in the intraspecific comparison of chimpanzee populations. We discuss the implications of our results for the use of FST.
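To make the two quantities in this abstract concrete, the following sketch computes FST and M from per-population allele frequencies at one locus. It assumes the common frequency-based definition FST = (HT - HS) / HT with equally weighted populations, and takes M as the largest mean allele frequency across populations; the function name `fst_and_m` is hypothetical and the abstract itself does not spell out these formulas.

```python
def fst_and_m(freqs):
    """Compute (FST, M) for one locus.

    freqs: list of populations, each a list of allele frequencies summing to 1.
    Assumes equal population weights and FST = (HT - HS) / HT.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    # Mean frequency of each allele across populations
    mean = [sum(pop[i] for pop in freqs) / k for i in range(n_alleles)]
    # HS: mean within-population expected heterozygosity
    h_s = sum(1.0 - sum(p * p for p in pop) for pop in freqs) / k
    # HT: expected heterozygosity of the pooled (mean) frequencies
    h_t = 1.0 - sum(p * p for p in mean)
    if h_t == 0.0:
        raise ValueError("FST is undefined when the locus is monomorphic")
    return (h_t - h_s) / h_t, max(mean)
```

For example, two populations fixed for different alleles give FST = 1 with M = 0.5, while two populations with identical frequencies give FST = 0 whatever the value of M.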


2021 ◽  
Author(s):  
Melih Can

Boosted genetic algorithm with continuous adaptive mutation


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yu Cao ◽  
Da-Yong Zhang ◽  
Yan-Fei Zeng ◽  
Wei-Ning Bai

Abstract Background Accurate inference of demographic histories for temperate tree species can aid our understanding of current climate change as a driver of evolution. Microsatellites are well suited to inferring recent historical events because of their high mutation rates. However, most programs analyzing microsatellite data assume a strict stepwise mutation model (SMM), which could cause false detection of population shrinkage when microsatellite mutation does not follow the SMM. Results This study aims to reconstruct the recent demographic histories of five cool-temperate tree species in Eastern Asia, Quercus mongolica, Q. liaotungensis, Juglans cathayensis, J. mandshurica and J. ailantifolia, by using 19 microsatellite markers with two methods that consider a generalized stepwise mutation model (GSM), MIGRAINE and VarEff. Both programs revealed that all five species experienced expansions after the Last Glacial Maximum (LGM). Within butternuts, J. cathayensis experienced a more severe bottleneck than the other species, and within oaks, Q. mongolica showed a moderate increase in population size and remained stable after the expansion. In addition, the point estimates of the multistep mutation proportion in the GSM (pGSM) for all five species were between 0.50 and 0.65, indicating that when inferring the demographic history of cool-temperate forest species using microsatellite markers, it is better to assume a GSM rather than an SMM. Conclusions This study provides the first direct evidence from microsatellite data that five cool-temperate tree species in East Asia have experienced expansions after the LGM. Because the microsatellite mutation model strongly influences demographic inference, combining multiple programs such as MIGRAINE and VarEff can effectively reduce errors caused by inappropriate model selection and prior setting.
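The difference between the two mutation models named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the step size is taken to be geometric, P(step = k) = (1 - pGSM) * pGSM^(k - 1), so that pGSM = 0 reduces to the strict SMM; the function name `mutate` and the floor of one repeat unit are hypothetical choices, not taken from MIGRAINE or VarEff.

```python
import random

def mutate(repeats, p_gsm, rng):
    """Apply one mutation to a microsatellite allele given as a repeat count.

    p_gsm = 0 gives the strict stepwise model (always one repeat unit).
    For p_gsm > 0 the step size is geometric (assumed parameterization).
    """
    step = 1
    # Each extra repeat unit is added with probability p_gsm
    while rng.random() < p_gsm:
        step += 1
    # Expansion and contraction are assumed equally likely
    direction = rng.choice((-1, 1))
    # Keep at least one repeat unit (assumed floor)
    return max(1, repeats + direction * step)
```

With pGSM in the 0.50 to 0.65 range reported above, more than half of all mutations under this parameterization would change the allele by two or more repeat units, which a strict SMM cannot produce.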


2020 ◽  
pp. 22-27
Author(s):  
A.V. Shelyov ◽  
K.V. Kopylov ◽  
N.P. Prokopenko ◽  
S.S. Kramarenko ◽  
...  

Allelic polymorphism was analyzed in five industrial egg crosses of chickens at five microsatellite DNA loci (ADL0268, MCW216, LEI0094, ADL0278, and MCW248). The loci were chosen according to the recommendations of the International Society for Animal Genetics (ISAG). Mathematical-statistical processing of the data identified the spectra and frequencies of allelic variability, the peculiarities of the allele pools, and unique alleles. In general, the species Gallus gallus is characterized by specific allelic spectra for all investigated microsatellite DNA loci (P < 0.001). The highest allelic variability was recorded in the brown crosses "Lohmann brown" and "Hisex brown" (Na (LimNa) = 9.2 (5-17) and 7.4 (6-11), respectively). The studied crosses were characterized by a shift in allelic spectra towards shorter fragment lengths. "Lohmann white" stands out among the other crosses by high consolidation for individual alleles at all studied microsatellites (allele frequencies from 0.343 for ADL278 allele 114 and 0.485 for ADL268 allele 108 to 0.720 for LEI094 allele 259, 0.785 for MCW0248 allele 213, and 0.920 for MCW0216 allele 137). Unique alleles with the highest frequency were found in brown cross chickens, and none were found in "Hy-Line W-98" birds. The number of unique alleles identified varied from 1 ("Hisex white") to 11 ("Lohmann brown"). Locus LEI094 was the most polymorphic in terms of unique alleles, with 10 such allelic variants identified. No unique alleles were identified at the ADL0268 locus. Pearson's χ2 test indicates significant differences in the frequency distribution of alleles for all studied loci. Analysis with the MICROSATELLITE ANALYSER software showed that the variability of the studied microsatellite DNA loci in the five industrial egg crosses, both in the number of identified alleles and in their distribution, corresponded to the stepwise mutation model (SMM).


2020 ◽  
Vol 16 (4) ◽  
Author(s):  
Andrzej Krajka ◽  
Ireneusz Panasiuk ◽  
Adam Misiura ◽  
Grzegorz M. Wójcik

Abstract Objectives The most common technique for determining biological paternity or another relationship between people is the investigation of DNA polymorphism, known as DNA fingerprinting. The key element of these investigations is the statistical analysis that yields the likelihood ratio (LR), sometimes called the paternity index. Methods One of the assumptions made in these computations is a mutation model (this model is used for all the computations). Results and conclusions Although its influence on the LR is usually negligible, it is crucial in some situations (when a mother–child mutation arises).


2020 ◽  
Author(s):  
Yu Cao ◽  
Da-Yong Zhang ◽  
Yan-Fei Zeng ◽  
Wei-Ning Bai

Abstract Background Accurate inference of demographic histories of temperate tree species can aid our understanding of current climate change as a driver of evolution. Microsatellites are well suited to reflecting recent historical events because of their high mutation rates. However, most programs analyze microsatellite data under a strict stepwise mutation model (SMM), which could cause false detection of population shrinkage when microsatellite mutation does not follow the SMM. Results This study aims to reconstruct the recent demographic histories of five cool-temperate tree species in eastern Asia, Quercus mongolica, Q. liaotungensis, Juglans cathayensis, J. mandshurica and J. ailantifolia, by using 19 microsatellite markers and two methods that consider a generalized stepwise mutation model (GSM), MIGRAINE and VarEff. Both programs revealed that all populations experienced expansions after the Last Glacial Maximum (LGM). In particular, J. cathayensis experienced a more severe bottleneck in its history than the other species, leading to a smaller ancestral effective population size, while Q. mongolica showed only a moderate increase in population size and remained stable after the expansion. In addition, the point estimates of the multistep mutation proportion in the generalized stepwise mutation model (pGSM) in all populations were between 0.50 and 0.65, which indicates that when inferring the demographic history of these forest species using microsatellite markers, it is better to assume a GSM rather than an SMM. Conclusions This study provides the first direct evidence based on microsatellite data that five cool-temperate tree species in East Asia have experienced expansions after the LGM. Moreover, because the microsatellite mutation model has a vital influence on demographic inference, combining multiple software programs such as MIGRAINE and VarEff can effectively reduce errors caused by inappropriate model selection and prior setting.

