Power and Weakness of Repetition – Evaluating the Phylogenetic Signal From Repeatomes in the Family Rosaceae With Two Case Studies From Genera Prone to Polyploidy and Hybridization (Rosa and Fragaria)

2021 ◽  
Vol 12 ◽  
Author(s):  
Veit Herklotz ◽  
Aleš Kovařík ◽  
Volker Wissemann ◽  
Jana Lunerová ◽  
Radka Vozárová ◽  
...  

Plant genomes consist, to a considerable extent, of non-coding repetitive DNA. Several studies showed that phylogenetic signals can be extracted from such repeatome data by using among-species dissimilarities from the RepeatExplorer2 pipeline as distance measures. Here, we advanced this approach by adjusting the read input for comparative clustering indirectly proportional to genome size and by summarizing all clusters into a main distance matrix subjected to Neighbor Joining algorithms and Principal Coordinate Analyses. Thus, our multivariate statistical method works as a “repeatomic fingerprint,” and we proved its power and limitations by exemplarily applying it to the family Rosaceae at intrafamilial and, in the genera Fragaria and Rosa, at the intrageneric level. Since both taxa are prone to hybridization events, we wanted to show whether repeatome data are suitable to unravel the origin of natural and synthetic hybrids. In addition, we compared the results based on complete repeatomes with those from ribosomal DNA clusters only, because they represent one of the most widely used barcoding markers. Our results demonstrated that repeatome data contained a clear phylogenetic signal supporting the current subfamilial classification within Rosaceae. Accordingly, the well-accepted major evolutionary lineages within Fragaria were distinguished, and hybrids showed intermediate positions between parental species in data sets retrieved from both complete repeatomes and rDNA clusters. Within the taxonomically more complicated and particularly frequently hybridizing genus Rosa, we detected rather weak phylogenetic signals but surprisingly found a geographic pattern at a population scale. In sum, our method revealed promising results at larger taxonomic scales as well as within taxa with manageable levels of reticulation, but success remained rather taxon specific. 
Since repeatomes can be retrieved with little technical effort and at comparatively low cost, even from samples of rather poor DNA quality, our phylogenomic method serves as a valuable alternative when high-quality genomes are unavailable, for example, in the case of old museum specimens.
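The step from per-cluster dissimilarities to a tree can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the element-wise summation in `main_distance_matrix` is only an assumed way of "summarizing all clusters", both function names are hypothetical, and only the pair-selection criterion of Neighbor Joining is shown rather than a full tree builder.

```python
def main_distance_matrix(cluster_matrices):
    """Add up per-cluster dissimilarity matrices element-wise.

    The summation rule is an assumption made for illustration only.
    """
    size = len(cluster_matrices[0])
    return [
        [sum(m[i][j] for m in cluster_matrices) for j in range(size)]
        for i in range(size)
    ]


def nj_pick_pair(dist):
    """Return (i, j, q) for the pair that minimises the Neighbor Joining criterion

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k).
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            # Strict comparison keeps the first pair found when values tie.
            if best is None or q < best[2]:
                best = (i, j, q)
    return best
```

A full Neighbor Joining run would merge the selected pair into a new node, recompute the distances to that node, and repeat until the tree is resolved. An ordination such as Principal Coordinate Analysis would start from the same main distance matrix.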

2020 ◽  
Vol 69 (4) ◽  
pp. 613-622 ◽  
Author(s):  
Rong Zhang ◽  
Yin-Huan Wang ◽  
Jian-Jun Jin ◽  
Gregory W Stull ◽  
Anne Bruneau ◽  
...  

Abstract Phylogenomic analyses have helped resolve many recalcitrant relationships in the angiosperm tree of life, yet phylogenetic resolution of the backbone of the Leguminosae, one of the largest and most economically and ecologically important families, remains poor due to generally limited molecular data and incomplete taxon sampling of previous studies. Here, we resolve many of the Leguminosae’s thorniest nodes through comprehensive analysis of plastome-scale data using multiple modified coding and noncoding data sets of 187 species representing almost all major clades of the family. Additionally, we thoroughly characterize conflicting phylogenomic signal across the plastome in light of the family’s complex history of plastome evolution. Most analyses produced largely congruent topologies with strong statistical support and provided strong support for resolution of some long-controversial deep relationships among the early diverging lineages of the subfamilies Caesalpinioideae and Papilionoideae. The robust phylogenetic backbone reconstructed in this study establishes a framework for future studies on legume classification, evolution, and diversification. However, conflicting phylogenetic signal was detected and quantified at several key nodes that prevent the confident resolution of these nodes using plastome data alone. [Leguminosae; maximum likelihood; phylogenetic conflict; plastome; recalcitrant relationships; stochasticity; systematic error.]


2019 ◽  
Vol 73 (8) ◽  
pp. 893-901
Author(s):  
Sinead J. Barton ◽  
Bryan M. Hennelly

Cosmic ray artifacts may be present in all photo-electric readout systems. In spectroscopy, they appear as random unidirectional sharp spikes that distort spectra and may have an effect on post-processing, possibly affecting the results of multivariate statistical classification. A number of methods have previously been proposed to remove cosmic ray artifacts from spectra, but the goal of removing the artifacts while making no other change to the underlying spectrum is challenging. One of the most successful and commonly applied methods for the removal of cosmic ray artifacts involves the capture of two sequential spectra that are compared in order to identify spikes. The disadvantage of this approach is that at least two recordings are necessary, which may be problematic for dynamically changing spectra, and which can reduce the signal-to-noise (S/N) ratio when compared with a single recording of equivalent duration due to the inclusion of two instances of read noise. In this paper, a cosmic ray artifact removal algorithm is proposed that works in a similar way to the double acquisition method but requires only a single capture, so long as a data set of similar spectra is available. The method employs normalized covariance in order to identify a similar spectrum in the data set, from which a direct comparison reveals the presence of cosmic ray artifacts, which are then replaced with the corresponding values from the matching spectrum. The advantage of the proposed method over the double acquisition method is investigated in the context of the S/N ratio, and the method is applied to various data sets of Raman spectra recorded from biological cells.
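The single-capture idea described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published algorithm: the function names are hypothetical, the fixed `threshold` is an assumed spike criterion, spectra are assumed to share the same intensity scale, and only positive deviations are treated as spikes because the abstract describes them as unidirectional.

```python
from math import sqrt


def normalized_covariance(a, b):
    """Covariance of two equal-length spectra divided by the product of their
    standard deviations; 1.0 means identical shape up to scale and offset."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape information
    return cov / sqrt(var_a * var_b)


def remove_spikes(spectrum, library, threshold):
    """Replace suspected cosmic ray spikes using the most similar library spectrum.

    `threshold` (hypothetical parameter) is the excess intensity over the
    matching spectrum above which a channel is treated as a spike.
    """
    match = max(library, key=lambda ref: normalized_covariance(spectrum, ref))
    cleaned = [m if (s - m) > threshold else s for s, m in zip(spectrum, match)]
    return cleaned, match
```

Note that a strong spike lowers the normalized covariance with every candidate, so a practical implementation would need to check that the best match is still a close one before trusting the replacement.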


2003 ◽  
Vol 9 (1) ◽  
pp. 1-17 ◽  
Author(s):  
Paul G. Kotula ◽  
Michael R. Keenan ◽  
Joseph R. Michael

Spectral imaging in the scanning electron microscope (SEM) equipped with an energy-dispersive X-ray (EDX) analyzer has the potential to be a powerful tool for chemical phase identification, but the data sets have, in the past, proved too large to analyze efficiently. In the present work, we describe the application of a new automated, unbiased, multivariate statistical analysis technique to very large X-ray spectral image data sets. The method, based in part on principal components analysis, returns physically accurate (all positive) component spectra and images in a few minutes on a standard personal computer. The efficacy of the technique for microanalysis is illustrated by the analysis of complex multi-phase materials, particulates, a diffusion couple, and a single-pixel-detection problem.
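As a rough illustration of the principal components step only, the sketch below unfolds a spectral image into a pixel-by-channel table and estimates the first principal component by power iteration. The function names, the nested-list data layout, and the use of power iteration are assumptions made for this example. The abstract says the method is based only "in part" on principal components analysis, and the positivity constraint it mentions is not reproduced here.

```python
from math import sqrt


def unfold(image):
    """Flatten image[row][col] -> spectrum into a list of per-pixel spectra."""
    return [spectrum for row in image for spectrum in row]


def first_principal_component(spectra, iterations=200):
    """Estimate the leading eigenvector of the channel covariance matrix.

    Power iteration starts from a vector of ones, so it would fail in the
    unlikely case that this start is orthogonal to the leading component.
    """
    n = len(spectra)
    m = len(spectra[0])
    means = [sum(s[j] for s in spectra) / n for j in range(m)]
    centered = [[s[j] - means[j] for j in range(m)] for s in spectra]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
        for i in range(m)
    ]
    vec = [1.0] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break  # no variance in the data
        vec = [v / norm for v in nxt]
    return vec
```

For real spectral images with thousands of channels, a library routine for singular value decomposition would be the practical choice. The pure-Python loop is only meant to make the computation explicit.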


2017 ◽  
Author(s):  
Ross Mounce

In this thesis I attempt to gather together a wide range of cladistic analyses of fossil and extant taxa representing a diverse array of phylogenetic groups. I use this data to quantitatively compare the effect of fossil taxa relative to extant taxa in terms of support for relationships, number of most parsimonious trees (MPTs) and leaf stability. In line with previous studies I find that the effects of fossil taxa are seldom different to extant taxa – although I highlight some interesting exceptions. I also use this data to compare the phylogenetic signal within vertebrate morphological data sets, by choosing to compare cranial data to postcranial data. Comparisons between molecular data and morphological data have been previously well explored, as have signals between different molecular loci. But comparative signal within morphological data sets is much less commonly characterized and certainly not across a wide array of clades. With this analysis I show that there are many studies in which the evidence provided by cranial data appears to be significantly incongruent with the postcranial data – more than one would expect to see from the effect of chance and noise alone. I devise and implement a modification to a rarely used measure of homoplasy that will hopefully encourage its wider usage. Previously it had some undesirable bias associated with the distribution of missing data in a dataset, but my modification controls for this. I also undertake an in-depth and extensive review of the ILD test, noting it is often misused or reported poorly, even in recent studies. Finally, in attempting to collect data and metadata on a large scale, I uncovered inefficiencies in the research publication system that obstruct re-use of data and scientific progress. I highlight the importance of replication and reproducibility – even simple reanalysis of high profile papers can turn up some very different results. Data is highly valuable and thus it must be retained and made available for further re-use to maximize the overall return on research investment.


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0251900
Author(s):  
Alejandro Blanco

Our current knowledge of crocodyliform evolution is strongly biased towards skull morphology, and the postcranial skeleton is neglected in many taxonomic descriptions. However, it is logical to expect that it can contribute its own phylogenetic signal. In this paper, the changes in tree topology caused by the addition of postcranial information are analysed for the family Allodaposuchidae, the most representative eusuchians in the latest Cretaceous of Europe. At present, different phylogenetic hypotheses have been proposed for this group without reaching a consensus. The results of this paper show a shift in phylogenetic position when the postcranium is included in the dataset, pointing to a relevant phylogenetic signal in the postcranial elements. Finally, the phylogenetic relationships of allodaposuchids within Eusuchia are reassessed, and the internal relationships within Allodaposuchidae are also reconsidered after an exhaustive revision of the morphological data. New and improved diagnoses for each species are provided here.


2021 ◽  
Vol 46 (1) ◽  
pp. 162-174
Author(s):  
Ming-Hui Yan ◽  
Chun-Yang Li ◽  
Peter W. Fritsch ◽  
Jie Cai ◽  
Heng-Chang Wang

Abstract—The phylogenetic relationships among 11 out of the 12 genera of the angiosperm family Styracaceae have been largely resolved with DNA sequence data based on all protein-coding genes of the plastome. The only genus that has not been phylogenomically investigated in the family with molecular data is the monotypic genus Parastyrax, which is extremely rare in the wild and difficult to collect. To complete the sampling of the genera comprising the Styracaceae, examine the plastome composition of Parastyrax, and further explore the phylogenetic relationships of the entire family, we sequenced the whole plastome of P. lacei and incorporated it into the Styracaceae dataset for phylogenetic analysis. Similar to most others in the family, the plastome is 158189 bp in length and contains a large single-copy region of 88085 bp and a small single-copy region of 18540 bp separated by two inverted-repeat regions of 25781 bp each. A total of 113 genes was predicted, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Phylogenetic relationships among all 12 genera of the family were constructed with 79 protein-coding genes. Consistent with a previous study, Styrax, Huodendron, and a clade of Alniphyllum + Bruinsmia were successively sister to the remainder of the family. Parastyrax was strongly supported as sister to an internal clade comprising seven other genera of the family, whereas Halesia and Pterostyrax were both recovered as polyphyletic, as in prior studies. However, when we employed either the whole plastome or the large- or small-single copy regions as datasets, Pterostyrax was resolved as monophyletic with 100% support, consistent with expectations based on morphology and indicating that non-coding regions of the Styracaceae plastome contain informative phylogenetic signal. Conversely Halesia was still resolved as polyphyletic but with novel strong support.


2012 ◽  
Vol 81 (3) ◽  
pp. 125-146 ◽  
Author(s):  
Francesca Benzoni ◽  
Roberto Arrigoni ◽  
Fabrizio Stefani ◽  
Bastian T. Reijnen ◽  
Simone Montano ◽  
...  

The scleractinian species Psammocora explanulata and Coscinaraea wellsi were originally classified in the family Siderastreidae, but in a recent morpho-molecular study it appeared that they are more closely related to each other and to the Fungiidae than to any siderastreid taxon. A subsequent morpho-molecular study of the Fungiidae provided new insights regarding the phylogenetic relationships within that family. In the present study existing molecular data sets of both families were analyzed jointly with those of new specimens and sequences of P. explanulata and C. wellsi. The results indicate that both species actually belong to the Cycloseris clade within the family Fungiidae. A reappraisal of their morphologic characters based on museum specimens and recently collected material substantiates the molecular results. Consequently, they are renamed Cycloseris explanulata and C. wellsi. They are polystomatous and encrusting like C. mokai, another species recently added to the genus, whereas all Cycloseris species were initially thought to be monostomatous and free-living. In the light of the new findings, the taxonomy and distribution data of C. explanulata and C. wellsi have been updated and revised. Finally, the ecological implications of the evolutionary history of the three encrusting polystomatous Cycloseris species and their free-living monostomatous congeners are discussed.


1997 ◽  
Vol 3 (S2) ◽  
pp. 931-932 ◽  
Author(s):  
Ian M. Anderson ◽  
Jim Bentley

Recent developments in instrumentation and computing power have greatly improved the potential for quantitative imaging and analysis. For example, products are now commercially available that allow the practical acquisition of spectrum images, where an EELS or EDS spectrum can be acquired from a sequence of positions on the specimen. However, such data files typically contain megabytes of information and may be difficult to manipulate and analyze conveniently or systematically. A number of techniques are being explored for the purpose of analyzing these large data sets. Multivariate statistical analysis (MSA) provides a method for analyzing the raw data set as a whole. The basis of the MSA method has been outlined by Trebbia and Bonnet. MSA has a number of strengths relative to other methods of analysis. First, it is broadly applicable to any series of spectra or images. Applications include characterization of grain boundary segregation (position-), of channeling-enhanced microanalysis (orientation-), or of beam damage (time-variation of spectra).


2020 ◽  
Vol 12 (2) ◽  
pp. 3906-3916 ◽  
Author(s):  
James F Fleming ◽  
Roberto Feuda ◽  
Nicholas W Roberts ◽  
Davide Pisani

Abstract Our ability to correctly reconstruct a phylogenetic tree is strongly affected by both systematic errors and the amount of phylogenetic signal in the data. Current approaches to tackle tree reconstruction artifacts, such as the use of parameter-rich models, do not translate readily to single-gene alignments. This, coupled with the limited amount of phylogenetic information contained in single-gene alignments, makes gene trees particularly difficult to reconstruct. Opsin phylogeny illustrates this problem clearly. Opsins are G-protein coupled receptors utilized in photoreceptive processes across Metazoa and their protein sequences are roughly 300 amino acids long. A number of incongruent opsin phylogenies have been published and opsin evolution remains poorly understood. Here, we present a novel approach, the canary sequence approach, to investigate and potentially circumvent errors in single-gene phylogenies. First, we demonstrate our approach using two well-understood cases of long-branch attraction in single-gene data sets, and simulations. After that, we apply our approach to a large collection of well-characterized opsins to clarify the relationships of the three main opsin subfamilies.

