RASP 4: Ancestral State Reconstruction Tool for Multiple Genes and Characters

2019 ◽  
Vol 37 (2) ◽  
pp. 604-606 ◽  
Author(s):  
Yan Yu ◽  
Christopher Blair ◽  
Xingjin He

Abstract With the continual progress of sequencing techniques, genome-scale data are increasingly used in phylogenetic studies. With more data from throughout the genome, the relationship between genes and different kinds of characters is receiving more attention. Here, we present version 4 of RASP, a software package for reconstructing ancestral states on phylogenetic trees. RASP can apply generalized statistical ancestral reconstruction methods to phylogenies, explore the phylogenetic signal of characters on particular trees, calculate distances between trees, and cluster trees into groups. RASP 4 has an improved graphical user interface and is freely available from http://mnh.scu.edu.cn/soft/blog/RASP (program) and https://github.com/sculab/RASP (source code).

2018 ◽  
Vol 39 (4) ◽  
pp. 457-470 ◽  
Author(s):  
le Fras Mouton ◽  
Alexander Flemming ◽  
Michael Bates ◽  
Chris Broeckhoven

Abstract To substantiate the claim of a relationship between generation gland morphology and degree of body armour in cordylid lizards, we studied the nine species in the genus Smaug. We predicted that well armoured species in this clade will have multi-layer generation glands, and lightly armoured species two-layer glands. Gland type was determined using standard histological techniques after sectioning a glandular patch of one adult male per species. A total of 133 specimens were examined for data on tail and occipital spine lengths (which were used as indicators of armour). We found that species with multi-layer generation glands (S. giganteus, S. breyeri, and S. vandami) have relatively long tail and occipital spines, while species with two-layer glands (S. mossambicus, S. regius, S. barbertonensis, S. warreni, and an undescribed species) have relatively short spines. Smaug depressus possesses both multi-layer and two-layer glands, and this variation was linked to regional variation in spine length. An ancestral state reconstruction for the Cordylidae showed that the two-layer state always results from the reduction of layers from a multi-layer precursor, and that reduction always culminates in two-layer glands and not in one-layer glands. This finding suggests that the one-layer state in the Ninurta-Chamaesaura-Pseudocordylus clade is most probably plesiomorphic, and therefore the ancestral state at the Cordylidae and Cordylinae nodes. Given the observed relationship between type of generation gland and body armour, this finding would suggest that the most recent common ancestor of the Cordylidae was lightly armoured.


2017 ◽  
Author(s):  
Xiaofan Zhou ◽  
Sarah Lutteropp ◽  
Lucas Czech ◽  
Alexandros Stamatakis ◽  
Moritz von Looz ◽  
...  

Abstract Incongruence, or topological conflict, is prevalent in genome-scale data sets but relatively few measures have been developed to quantify it. Internode Certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internode (or internal branch) among a set of phylogenetic trees and complement regular branch support statistics in assessing the confidence of the inferred phylogenetic relationships. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions (or splits) on a set of trees, the calculation of IC scores requires adjusting the frequencies of bipartitions from these partial gene trees. However, when the proportion of missing data is high, current approaches that adjust bipartition frequencies in partial gene trees tend to overestimate IC scores and alternative adjustment approaches differ substantially from each other in their scores. To overcome these issues, we developed three new measures for calculating internode certainty that are based on the frequencies of quartets, which naturally apply to both comprehensive and partial trees. Our comparison of these new quartet-based measures to previous bipartition-based measures on simulated data shows that: 1) on comprehensive trees, both types of measures yield highly similar IC scores; 2) on partial trees, quartet-based measures generate more accurate IC scores; and 3) quartet-based measures are more robust to the absence of phylogenetic signal and errors in the phylogenetic relationships to be assessed. Additionally, analysis of 15 empirical phylogenomic data sets using our quartet-based measures suggests that numerous relationships remain unresolved despite the availability of genome-scale data.
Finally, we provide an efficient open-source implementation of these quartet-based measures in the program QuartetScores, which is freely available at https://github.com/algomaus/QuartetScores.


2019 ◽  
Vol 69 (2) ◽  
pp. 308-324 ◽  
Author(s):  
Xiaofan Zhou ◽  
Sarah Lutteropp ◽  
Lucas Czech ◽  
Alexandros Stamatakis ◽  
Moritz von Looz ◽  
...  

Abstract Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular branch support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions (or splits) on a set of trees, IC score calculation typically requires adjusting the frequencies of bipartitions from these partial gene trees. However, when the proportion of missing taxa is high, the scores yielded by current approaches that adjust bipartition frequencies in partial gene trees differ substantially from each other and tend to be overestimates. To overcome these issues, we developed three new IC measures based on the frequencies of quartets, which naturally apply to both complete and partial trees. Comparison of our new quartet-based measures to previous bipartition-based measures on simulated data shows that: (1) on complete data sets, both quartet-based and bipartition-based measures yield very similar IC scores; (2) IC scores of quartet-based measures on a given data set with and without missing taxa are more similar than the scores of bipartition-based measures; and (3) quartet-based measures are more robust to the absence of phylogenetic signal and errors in phylogenetic inference than bipartition-based measures. Additionally, the analysis of an empirical mammalian phylogenomic data set using our quartet-based measures reveals the presence of substantial levels of incongruence for numerous internal branches. An efficient open-source implementation of these quartet-based measures is freely available in the program QuartetScores (https://github.com/lutteropp/QuartetScores).
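
As a rough illustration of the entropy idea behind quartet-based internode certainty, the sketch below scores a single quartet from the counts of its three possible topologies across a set of trees. The function name, the base-3 logarithm, and the sign convention (negative when the reference topology is outvoted) are assumptions made for illustration. QuartetScores defines three distinct measures that aggregate per-quartet information, and its exact formulas should be taken from the program's own documentation.

```python
import math

def quartet_certainty(counts, reference=0):
    """Entropy-based certainty for one quartet (hypothetical helper).

    counts: occurrences of the three possible topologies of the quartet
            across a set of trees; trees missing any of the four taxa
            simply contribute nothing.
    reference: index of the topology found in the reference tree.
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # no tree informs this quartet
    probs = [c / total for c in counts]
    # Shannon entropy in base 3, so a uniform split gives entropy 1.
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0)
    score = 1.0 - entropy
    # Assumed convention: flip the sign when the reference topology is outvoted.
    if counts[reference] < max(counts):
        score = -score
    return score
```

Because the score depends only on trees that contain all four taxa of the quartet, partial gene trees need no frequency adjustment in this sketch, which is the property the abstract highlights.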


2016 ◽  
Author(s):  
Manuela Royer-Carenzi ◽  
Gilles Didier

Choosing an ancestral state reconstruction method among the alternatives available for quantitative characters may be puzzling. We present here a comparison of five of them, namely the maximum likelihood, restricted maximum likelihood, generalized least squares, phylogenetic independent contrasts and squared parsimony methods. A review of the relations between these methods shows that the first three infer the same ancestral states and can only be distinguished by the distributions accounting for the reconstruction uncertainty which they provide. The respective accuracy of the methods is assessed over character evolution simulated under a Brownian motion with (and without) drift. We start by giving the general form of ancestral state distributions conditioned on leaf states under the simulation model. Ancestral distributions are used first, to give a theoretical lower bound of the expected reconstruction error, and second, to develop an original evaluation scheme which is more efficient than comparing the reconstructed and the simulated states. Our simulations show that: (i) the methods do not perform well as the evolution drift increases; (ii) the maximum likelihood method is generally the most accurate and (iii) not all the distributions of the reconstruction uncertainty provided by the methods are equally relevant.
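
To make the Brownian-motion setting concrete, here is a minimal sketch of the pruning recursion used with phylogenetic independent contrasts, which produces a point estimate at the root of a binary tree. The nested-tuple tree encoding and the function name are assumptions for illustration. This is not the authors' evaluation code, and it returns only an estimate from descendant leaves, not the uncertainty distributions the abstract compares.

```python
def root_state(node):
    """Return (estimate, extra_variance) for a node of a binary tree.

    A leaf is a number (the observed trait value). An internal node is a
    pair ((left, left_len), (right, right_len)) with positive branch lengths.
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0
    (left, v1), (right, v2) = node
    x1, e1 = root_state(left)
    x2, e2 = root_state(right)
    # Lengthen each branch by the uncertainty accumulated below it.
    v1, v2 = v1 + e1, v2 + e2
    # Precision-weighted mean of the two child estimates.
    estimate = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return estimate, v1 * v2 / (v1 + v2)
```

For example, a cherry with leaf values 1.0 and 3.0 on unit-length branches gives an estimate of 2.0; shortening one branch pulls the estimate toward that leaf.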


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Maulana M. Naji ◽  
Yuri T. Utsunomiya ◽  
Johann Sölkner ◽  
Benjamin D. Rosen ◽  
Gábor Mészáros

Abstract Background In evolutionary theory, divergence and speciation can arise from long periods of reproductive isolation, genetic mutation, selection and environmental adaptation. After divergence, alleles can either persist in their initial state (ancestral allele - AA), co-exist or be replaced by a mutated state (derived allele - DA). In this study, we aligned whole genome sequences of individuals from the Bovinae subfamily to the cattle reference genome (ARS-UCD1.2) to define the ancestral alleles needed for selection signature studies. Results To accommodate the independent divergence of each lineage from the initial ancestral state, AA were defined as alleles fixed in at least two of the three groups (yak, bison and gayal-gaur-banteng), resulting in ~32.4 million variants. Using non-overlapping scanning windows of 10 kb, we counted the AA observed within taurine and zebu cattle. We focused on the extremes: regions in the top 0.1% of counts (high count) and regions without any occurrence of AA (null count). High count regions preserved gene functions from ancestral states that are still beneficial under current conditions, while null count regions were linked to mutated ones. In both taurine and zebu cattle, high count regions were associated with basal lipid metabolism, essential for surviving various environmental pressures. Mutated regions were associated with production traits in taurine cattle (i.e. higher metabolism, cell development and behavior) and with the immune response in zebu. Conclusions Our findings suggest that the retention and loss of AA vary across regions and are species-specific, with possible overlap, depending on the selective pressures each species experienced.
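
The window-scanning step can be sketched as follows: bin positions into non-overlapping 10 kb windows, count ancestral-allele observations per window, then flag windows at or above a top-fraction cutoff and windows with no AA at all. The input format (1-based positions of AA observations on one chromosome), the function name, and the way the cutoff is derived are assumptions; the abstract does not describe the study's actual pipeline.

```python
from collections import Counter

def scan_windows(aa_positions, chrom_length, window=10_000, top_fraction=0.001):
    """Count AA observations per non-overlapping window (illustrative only)."""
    n_windows = -(-chrom_length // window)  # ceiling division
    # Positions are treated as 1-based, so position 10000 falls in window 0.
    counts = Counter((pos - 1) // window for pos in aa_positions)
    per_window = [counts.get(i, 0) for i in range(n_windows)]
    # Keep at least one window in the "high" set, however small the fraction.
    n_top = max(1, int(n_windows * top_fraction))
    cutoff = sorted(per_window, reverse=True)[n_top - 1]
    # Ties at the cutoff are all reported as high-count windows.
    high = [i for i, c in enumerate(per_window) if c >= cutoff and c > 0]
    null = [i for i, c in enumerate(per_window) if c == 0]
    return per_window, high, null
```

On a real genome the function would be called once per chromosome, and the cutoff would more plausibly be taken over all windows genome-wide rather than per chromosome.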


Biomolecules ◽  
2019 ◽  
Vol 9 (10) ◽  
pp. 572 ◽  
Author(s):  
Wang

MicroRNA (miRNA) is a small non-coding RNA that functions in the epigenetic control of gene expression and can serve as a biomarker for diseases. Anti-NMDA receptor (anti-NMDAR) encephalitis is an acute autoimmune disorder. Some patients have been found to have tumors, specifically teratomas. This disease occurs more often in females than in males. Most patients recover significantly after tumor resection, which suggests that the tumor may induce anti-NMDAR encephalitis. In this study, I review microRNA (miRNA) biomarkers that are associated with anti-NMDAR encephalitis and related tumors, respectively. To the best of my knowledge, there has not been any research in the literature investigating the relationship between anti-NMDAR encephalitis and tumors through their miRNA biomarkers. I adopt a phylogenetic analysis to plot the phylogenetic trees of their miRNA biomarkers. From the analyzed results, it may be concluded that (i) there is a relationship between these tumors and anti-NMDAR encephalitis, and (ii) this disease occurs more often in females than in males. This sheds light on this issue through miRNA intervention.


2011 ◽  
Vol 279 (1728) ◽  
pp. 610-618 ◽  
Author(s):  
Benjamin M. Winger ◽  
Irby J. Lovette ◽  
David W. Winkler

Seasonal migration in birds is known to be highly labile and subject to rapid change in response to selection, such that researchers have hypothesized that phylogenetic relationships should neither predict nor constrain the migratory behaviour of a species. Many theories on the evolution of bird migration assume a framework that extant migratory species have evolved repeatedly and relatively recently from sedentary tropical or subtropical ancestors. We performed ancestral state reconstructions of migratory behaviour using a comprehensive, well-supported phylogeny of the Parulidae (the ‘wood-warblers’), a large family of Neotropical and Nearctic migratory and sedentary songbirds, and examined the rates of gain and loss of migration throughout the Parulidae. Counter to traditional hypotheses, our results suggest that the ancestral wood-warbler was migratory and that losses of migration have been at least as prevalent as gains throughout the history of Parulidae. Therefore, extant sedentary tropical radiations in the Parulidae represent losses of latitudinal migration and colonization of the tropics from temperate regions. We also tested for phylogenetic signal in migratory behaviour, and our results indicate that although migratory behaviour is variable within some wood-warbler species and clades, phylogeny significantly predicts the migratory distance of species in the Parulidae.


2017 ◽  
Author(s):  
Tuan Anh Trieu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Different cell types of an organism have the same DNA sequence, but they can function differently because differences in their 3D organization allow them to express different genes and perform different cellular functions. Understanding the 3D organization of the genome is key to understanding the functions of the cell. Chromosome conformation capture techniques like Hi-C and TCC, which capture interactions between proximal chromosome fragments, have allowed the study of 3D genome organization at high resolution and high throughput. My work focuses on developing computational methods to reconstruct 3D genome structures from Hi-C data. I present three methods to reconstruct 3D genome and chromosome structures. The first method builds 3D genome models from soft constraints of contacts and non-contacts. It uses the concept of contact and non-contact to reconstruct 3D models without translating interaction frequencies into physical distances. That translation is commonly used by other methods even though it makes a strong assumption about the relationship between interaction frequencies and physical distances. On a synthetic dataset, where the relationship was known, my method performed comparably with other methods that assume the relationship. This shows the potential of my method for real Hi-C datasets, where the relationship is unknown. The limitation of the method is that it has parameters requiring manual adjustment. I developed the second method to reconstruct 3D genome models. This method uses a commonly used function to translate interaction frequencies into physical distances to build 3D models. I proposed a novel way to derive soft constraints to handle inconsistency in the data and to make the method robust. Building 3D models at high resolution is a more challenging problem as the number of constraints is small and the feasible space is larger.
I introduced a third method to build 3D chromosome models at high resolution. The method reconstructs models at low resolution and then uses them to guide the reconstruction of models at high resolution. The last part of my work is the development of a comprehensive tool with an intuitive graphical user interface to analyze Hi-C data and to reconstruct and analyze 3D models.
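
To show what translating interaction frequencies into distances can look like, the sketch below applies an inverse power law, d = 1 / IF^alpha, which is one widely used choice in distance-based Hi-C reconstruction. Whether the thesis uses this exact function is an assumption, as are the function name, the dictionary input format, and the default exponent.

```python
def frequencies_to_distances(contacts, alpha=1.0):
    """Map interaction frequencies to target distances via d = 1 / IF**alpha.

    contacts: {(bin_i, bin_j): interaction frequency} (hypothetical format)
    alpha: conversion exponent; its value is data-dependent and assumed here.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Pairs with zero frequency carry no distance information and are skipped.
    return {pair: 1.0 / (freq ** alpha) for pair, freq in contacts.items() if freq > 0}
```

The resulting distances would then serve as targets for an optimizer that places genomic bins in 3D; the strong assumption the abstract mentions lies in fixing this functional form and its exponent.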


Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3701 ◽  
Author(s):  
Jin Zheng ◽  
Jinku Li ◽  
Yi Li ◽  
Lihui Peng

Electrical Capacitance Tomography (ECT) image reconstruction has been developed for decades and has made great achievements, but there is still a need for a new theoretical framework to make it better and faster. In recent years, machine learning theory has been introduced in the ECT area to solve the image reconstruction problem. However, there is still no public benchmark dataset in the ECT field for training and testing machine learning-based image reconstruction algorithms. A public benchmark dataset would also provide a standard framework for evaluating and comparing the results of different image reconstruction methods. In this paper, a benchmark dataset for ECT image reconstruction is presented. Much as ImageNet transformed machine learning research, we hope this benchmark dataset will help the community investigate new image reconstruction algorithms, since it allows the relationship between permittivity distribution and capacitance to be mapped more accurately. In addition, different machine learning-based image reconstruction algorithms can be trained and tested on the unified dataset, and the results can be evaluated and compared under the same standard, making ECT image reconstruction research more open and potentially enabling a breakthrough.


2019 ◽  
Author(s):  
Willie Anderson dos Santos Vieira ◽  
Priscila Alves Bezerra ◽  
Anthony Carlos da Silva ◽  
Josiene Silva Veloso ◽  
Marcos Paz Saraiva Câmara ◽  
...  

Abstract Colletotrichum is among the most important genera of fungal plant pathogens. Molecular phylogenetic studies over the last decade have resulted in a much better understanding of the evolutionary relationships and species boundaries within the genus. There are now approximately 200 species accepted, most of which are distributed among 13 species complexes. Given their prominence on agricultural crops around the world, rapid identification of a large collection of Colletotrichum isolates is routinely needed by plant pathologists, regulatory officials, and fungal biologists. However, there is no agreement on the best molecular markers to discriminate species in each species complex. Here we calculate the barcode gap distance and intra/inter-specific distance overlap to evaluate each of the most commonly applied molecular markers for their utility as a barcode for species identification. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH), histone-3 (HIS3), DNA lyase (APN2), intergenic spacer between DNA lyase and the mating-type locus MAT1-2-1 (APN2/MAT-IGS), and intergenic spacer between GAPDH and a hypothetical protein (GAP2-IGS) have the properties of good barcodes, whereas sequences of actin (ACT), chitin synthase (CHS-1) and nuclear rDNA internal transcribed spacers (nrITS) are not able to distinguish most species. Finally, we assessed the utility of these markers for phylogenetic studies using phylogenetic informativeness profiling, the genealogical sorting index (GSI), and Bayesian concordance analyses (BCA). Although GAPDH, HIS3 and β-tubulin (TUB2) were frequently among the best markers, there was not a single set of markers that were best for all species complexes. Eliminating markers with low phylogenetic signal tends to decrease uncertainty in the topology, regardless of species complex, and leads to a larger proportion of markers that support each lineage in the Bayesian concordance analyses.
Finally, we reconstruct the phylogeny of each species complex using a minimal set of phylogenetic markers with the strongest phylogenetic signal and find the majority of species are strongly supported as monophyletic.
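
For readers unfamiliar with the barcode gap, the sketch below computes it for one marker as the smallest interspecific distance minus the largest intraspecific distance; a positive value means the two distance distributions do not overlap. The data layout (pairwise distances keyed by isolate pairs, plus a species lookup) and the function name are assumptions, and the abstract does not specify the paper's distance model or overlap statistic.

```python
def barcode_gap(distances, species_of):
    """Return (gap, max_intra, min_inter) for one marker (illustrative only).

    distances: {(isolate_a, isolate_b): genetic distance}
    species_of: {isolate: species name}
    """
    intra, inter = [], []
    for (a, b), d in distances.items():
        # Same species -> intraspecific pair, otherwise interspecific.
        (intra if species_of[a] == species_of[b] else inter).append(d)
    if not intra or not inter:
        raise ValueError("need at least one intra- and one inter-specific pair")
    max_intra, min_inter = max(intra), min(inter)
    return min_inter - max_intra, max_intra, min_inter
```

Running this per marker and per species complex would give a simple ranking of candidate barcodes, with negative gaps indicating markers that cannot separate at least some species.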

