Gene markers for exon capture and phylogenomics in ray-finned fishes

2017 ◽  
Author(s):  
Jiamei Jiang ◽  
Hao Yuan ◽  
Xin Zheng ◽  
Qian Wang ◽  
Ting Kuang ◽  
...  

Gene capture coupled with next-generation sequencing has become one of the favored methods for subsampling genomes in phylogenomic studies. Many target gene markers have been developed in plants, sharks, frogs, reptiles and others, but few have been reported in the ray-finned fishes. Here, we identified a suite of “single-copy” protein-coding sequence (CDS) markers by comparing eight fish genomes, and tested them empirically in 83 species (33 families and 11 orders) of ray-finned fishes. Sorting the markers according to their completeness and phylogenetic decisiveness in the taxa tested resulted in a selection of 4,434 markers, which proved useful in reconstructing phylogenies of the ray-finned fishes at different taxonomic levels. We also proposed a strategy for refining bait (probe) design a posteriori based on empirical data. The markers that we have developed may fill a gap in the toolkit for phylogenomic studies in vertebrates.
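The completeness-based sorting of markers mentioned in the abstract above can be illustrated with a minimal sketch. It assumes completeness means the fraction of tested taxa in which a marker was recovered. The function names, the toy data, and the 0.5 cutoff are hypothetical, since the abstract does not give the actual criteria. Phylogenetic decisiveness is not modelled here.

```python
from typing import Dict, List, Set


def marker_completeness(captured: Dict[str, Set[str]], taxa: Set[str]) -> Dict[str, float]:
    """Return, for each marker, the fraction of tested taxa in which it was recovered."""
    if not taxa:
        raise ValueError("taxa must not be empty")
    return {marker: len(found & taxa) / len(taxa) for marker, found in captured.items()}


def select_markers(
    captured: Dict[str, Set[str]],
    taxa: Set[str],
    min_completeness: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    """Keep markers whose completeness meets the cutoff, sorted by name."""
    scores = marker_completeness(captured, taxa)
    return sorted(m for m, score in scores.items() if score >= min_completeness)


# Toy data (hypothetical): marker name -> set of taxa in which it was captured.
taxa = {"sp1", "sp2", "sp3", "sp4"}
captured = {
    "m1": {"sp1", "sp2", "sp3", "sp4"},  # recovered in all four taxa
    "m2": {"sp1"},                       # recovered in one of four taxa
    "m3": {"sp2", "sp3"},                # recovered in two of four taxa
}
selected = select_markers(captured, taxa, min_completeness=0.5)
print(selected)
```

With this toy input, `m2` is dropped because it was recovered in only one of the four taxa. A real pipeline would likely combine such a score with further criteria before settling on a final marker set.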


2016 ◽  
Author(s):  
Luisa C. Teasdale ◽  
Frank Köhler ◽  
Kevin D. Murray ◽  
Tim O’Hara ◽  
Adnan Moussalli

The qualification of orthology is a significant challenge when developing large, multilocus phylogenetic datasets from assembled transcripts. Transcriptome assemblies have various attributes, such as fragmentation, frameshifts, and mis-indexing, which pose problems for automated methods of orthology assessment. Here, we identify a set of orthologous single-copy genes from transcriptome assemblies for the land snails and slugs (Eupulmonata) using a thorough approach to orthology determination involving manual alignment curation, gene tree assessment and sequencing from genomic DNA. We qualified the orthology of 500 nuclear, protein-coding genes from the transcriptome assemblies of 21 eupulmonate species to produce the most complete gene data matrix for a major molluscan lineage to date, both in terms of taxon and character completeness. Exon capture targeting 490 of the 500 genes (those with at least one exon > 120 bp) from 22 species of Australian Camaenidae successfully captured sequences of 2,825 exons (representing all targeted genes), with only a 3.7% reduction in the data matrix due to the presence of putative paralogs or pseudogenes. The automated pipeline Agalma retrieved the majority of the manually qualified 500 single-copy gene set and identified a further 375 putative single-copy genes, although it failed to account for fragmented transcripts, resulting in lower data matrix completeness. This could potentially explain the minor inconsistencies we observed in the supported topologies for the 21 eupulmonate species between the manually curated and Agalma-equivalent datasets (sharing 458 genes). Overall, our study confirms the utility of the 500-gene set to resolve phylogenetic relationships at a broad range of evolutionary depths, and highlights the importance of addressing fragmentation at the homolog alignment stage for probe design.



Plants ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 199 ◽  
Author(s):  
Arif Khan ◽  
Sajjad Asaf ◽  
Abdul Latif Khan ◽  
Tariq Shehzad ◽  
Ahmed Al-Rawahi ◽  
...  

Euphorbia is one of the largest genera in the Euphorbiaceae family, comprising 2000 species of commercial, medicinal, and ornamental importance. However, very few data are available on their molecular phylogeny and genomics, and uncertainties still exist at the taxonomic level. Herein, we sequence the complete chloroplast (cp) genomes of two Euphorbia species, E. larica and E. smithii, through next-generation sequencing and perform a comparative analysis with nine related genomes in the family. The results revealed that the cp genomes had a quadripartite structure, gene content, and genome organization similar to those of previously reported genomes from the same family. The size of the cp genomes ranged from 162,172 to 162,358 bp, with 132 and 133 genes, 8 rRNAs, and 39 tRNAs in E. smithii and E. larica, respectively. The numbers of protein-coding genes were 85 and 86, with each containing 19 introns. The four junction regions were studied, and the results reveal that rps19 was present at JLB (the junction of the large single-copy region and inverted repeat b) in E. larica, whereas it was located entirely in the IRb (inverted repeat b) region in E. smithii. The sequence comparison revealed highly divergent regions in rpoC1, rpocB, ycf3, clpP, petD, ycf1, and ndhF of the cp genomes, which might provide a better understanding of phylogenetic inferences in the Euphorbiaceae and the order Malpighiales. Phylogenetic analyses in this study place E. smithii as sister to E. tirucalli, and these species form a monophyletic clade with E. larica. The current study might help us to understand the genome architecture, genetic diversity among populations, and evolution of the genus.



2021 ◽  
Author(s):  
Chengfeng Yang ◽  
Qinzhi Su ◽  
Min Tang ◽  
Shiqi Luo ◽  
Hao Zheng ◽  
...  

An in-depth understanding of microbial function and the division of ecological niches requires accurate delineation and identification of microbes at a fine taxonomic resolution. Microbial phylotypes are typically defined using a 97% small subunit (16S) rRNA threshold. However, increasing evidence has demonstrated the ubiquitous presence of taxonomic units with distinct functions within phylotypes. These so-called sequence-discrete populations (SDPs) have mainly been delineated by disjunct sequence similarity at the whole-genome level. However, gene markers that can accurately identify and quantify SDPs are lacking in microbial community studies. Here we developed a pipeline to screen single-copy protein-coding genes that can accurately characterize SDP diversity via amplicon sequencing of microbial communities. Fifteen candidate marker genes were evaluated using three criteria (extent of sequence divergence, phylogenetic accuracy, and conservation of primer regions), and the selected genes were then tested for their efficiency in differentiating SDPs within Gilliamella, a core honeybee gut microbial phylotype, as a proof of concept. The results showed that the 16S V4 region failed to report accurate SDP diversities due to low taxonomic resolution and varying copy numbers. In contrast, the single-copy genes recommended by our pipeline successfully quantified Gilliamella SDPs in both mock samples and honeybee guts, with results highly consistent with those of metagenomics. The pipeline developed in this study is expected to identify single-copy protein-coding genes capable of accurately quantifying diverse bacterial communities at the SDP level.



2021 ◽  
Vol 3 ◽  
Author(s):  
Jannis Rinne ◽  
Claus-Peter Witte ◽  
Marco Herde

In this study, we describe the establishment of the knockout marker gene MAR1 for selection of CRISPR/Cas9-edited Arabidopsis seedlings and tomato explants in tissue culture. MAR1 encodes a transporter that is located in mitochondria and chloroplasts and is involved in iron homeostasis. It also opportunistically transports aminoglycoside antibiotics into these organelles, and defects in the gene render plants insensitive to those compounds. Here, we show that mutations of MAR1 induced by the CRISPR system confer kanamycin resistance on Arabidopsis plants and tomato tissues. MAR1 is single-copy in a variety of plant species, and the corresponding proteins form a distinct phylogenetic clade, allowing easy identification of MAR1 orthologs in different plants. We demonstrate that in multiplexing approaches, where Arabidopsis seedlings were selected via a CRISPR/Cas9-induced kanamycin resistance mediated by MAR1 mutation, a mutation in a second target gene was observed with higher frequency than in a control population selected only for the presence of the transgene. This so-called co-selection has not previously been shown to occur in plants. The technique can be employed to select for edited plants, which might be particularly useful if editing events are rare.



PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0256861
Author(s):  
Danielle N. Stringer ◽  
Terry Bertozzi ◽  
Karen Meusemann ◽  
Steven Delean ◽  
Michelle T. Guzik ◽  
...  

Transcriptome-based exon capture approaches, along with next-generation sequencing, are allowing for the rapid and cost-effective production of extensive and informative phylogenomic datasets from non-model organisms for phylogenetics and population genetics research. These approaches generally employ a reference genome to infer the intron-exon structure of targeted loci and preferentially select longer exons. However, in the absence of an existing and well-annotated genome, we applied this exon capture method directly, without initially identifying intron-exon boundaries for bait design, to a group of highly diverse Haloniscus (Philosciidae), paraplatyarthrid and armadillid isopods, and examined the performance of our methods and bait design for phylogenetic inference. Here, we identified an isopod-specific set of single-copy protein-coding loci, and a custom bait design to capture targeted regions from 469 genes, and analysed the resulting sequence data with a mapping approach and newly-created post-processing scripts. We effectively recovered a large and informative dataset comprising both short (<100 bp) and longer (>300 bp) exons, with high uniformity in sequencing depth. We were also able to successfully capture exon data from up to 16-year-old museum specimens along with more distantly related outgroup taxa, and efficiently pool multiple samples prior to capture. Our well-resolved phylogenies highlight the overall utility of this methodological approach and custom bait design, which offer enormous potential for application to future isopod, as well as broader crustacean, molecular studies.



1995 ◽  
Vol 31 (2) ◽  
pp. 193-204 ◽  
Author(s):  
Koen Grijspeerdt ◽  
Peter Vanrolleghem ◽  
Willy Verstraete

A comparative study of several recently proposed one-dimensional sedimentation models has been made. This has been achieved by fitting these models to steady-state and dynamic concentration profiles obtained in a down-scaled secondary decanter. The models were evaluated with several a posteriori model selection criteria. Since the purpose of the modelling task is to do on-line simulations, the calculation time was used as one of the selection criteria. Finally, the practical identifiability of the models for the available data sets was also investigated. It could be concluded that the model of Takács et al. (1991) gave the most reliable results.



Plants ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1517
Author(s):  
Se-Hwan Cheon ◽  
Min-Ah Woo ◽  
Sangjin Jo ◽  
Young-Kee Kim ◽  
Ki-Joong Kim

The genus Zoysia Willd. (Chloridoideae) is widely distributed from the temperate regions of Northeast Asia (including China, Japan, and Korea) to the tropical regions of Southeast Asia. Among these, four species (Zoysia japonica Steud., Zoysia sinica Hance, Zoysia tenuifolia Thiele, and Zoysia macrostachya Franch. & Sav.) are naturally distributed in the Korean Peninsula. In this study, we report the complete plastome sequences of these Korean Zoysia species (NCBI acc. nos. MF953592, MF967579~MF967581). The length of Zoysia plastomes ranges from 135,854 to 135,904 bp, and the plastomes have a typical quadripartite structure, which consists of a pair of inverted repeat regions (20,962~20,966 bp) separated by a large (81,348~81,392 bp) and a small (12,582~12,586 bp) single-copy region. In terms of gene order and structure, Zoysia plastomes are similar to the typical plastomes of Poaceae. The plastomes encode 110 genes, of which 76 are protein-coding genes, 30 are tRNA genes, and four are rRNA genes. Fourteen genes contain single introns and one gene has two introns. Three evolutionary hotspot spacer regions (atpB~rbcL, rps16~rps3, and rpl32~trnL-UAG) were recognized among six analyzed Zoysia species. The high divergences in the atpB~rbcL spacer and rpl16~rpl3 region are primarily due to differences in base substitutions and indels. In contrast, the high divergence between rpl32~trnL-UAG spacers is due to a small inversion with a 22 bp stem pair and an 11 bp loop. Simple sequence repeats (SSRs) were identified in 59 different locations in Z. japonica, 63 in Z. sinica, 62 in Z. macrostachya, and 63 in Z. tenuifolia plastomes. Phylogenetic analysis showed that Zoysia (Zoysiinae) forms a monophyletic group, which is sister to Sporobolus (Sporobolinae), with 100% bootstrap support. Within the Zoysia clade, the relationship of (Z. sinica, Z. japonica), (Z. tenuifolia, Z. matrella), (Z. macrostachya, Z. macrantha) was suggested.



2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Anzhen Fu ◽  
Qing Wang ◽  
Jianlou Mu ◽  
Lili Ma ◽  
Changlong Wen ◽  
...  

Chayote (Sechium edule) is an agricultural crop in the Cucurbitaceae family that is rich in bioactive components. To enhance genetic research on chayote, we used Nanopore third-generation sequencing combined with Hi–C data to assemble a draft chayote genome. A chromosome-level assembly anchored on 14 chromosomes (N50 contig and scaffold sizes of 8.40 and 46.56 Mb, respectively) estimated the genome size as 606.42 Mb, which is large for the Cucurbitaceae, with 65.94% (401.08 Mb) of the genome comprising repetitive sequences; 28,237 protein-coding genes were predicted. Comparative genome analysis indicated that chayote and snake gourd diverged from sponge gourd and that a whole-genome duplication (WGD) event occurred in chayote at 25 ± 4 Mya. Transcriptional and metabolic analysis revealed genes involved in fruit texture, pigment, flavor, flavonoids, antioxidants, and plant hormones during chayote fruit development. The analysis of the genome, transcriptome, and metabolome provides insights into chayote evolution and lays the groundwork for future research on fruit and tuber development and genetic improvements in chayote.



2018 ◽  
Vol 50 (10) ◽  
pp. 884-892
Author(s):  
T. M. Casey ◽  
J. F. Walker ◽  
K. Bhide ◽  
J. Thimmapuram ◽  
J. P. Schoonmaker

Steer progeny suckled by cows fed a dried distillers grains and solubles (DDGS) diet during the first 3 mo of lactation were heavier during feedlot finishing and had significantly lower marbling and larger longissimus muscles than steers suckled by cows fed a control diet (CON). These differences were profound in that progeny were managed and fed identically from weaning until finishing, and the findings suggest that the suckling period established the developmental program of muscle composition. Here, transcriptomes of longissimus muscle were measured by next-generation sequencing to investigate whether there were any developmental clues to the differences in marbling scores and muscle content between steers suckled by DDGS (n = 5) vs. control (CON; n = 5) diet-fed cows during lactation. There were 809 genes differentially expressed (P-adj < 0.1) between CON and DDGS muscle. Of these, 636 were upregulated and 173 downregulated in DDGS relative to CON. Overall, the DDGS vs. CON muscle transcriptomic signature was promyogenic and antiadipogenic. In particular, myokines/satellite cell maintenance factors were found among the upregulated genes (LIF, CNTF, FGFB1, EPHB1). The antiadipogenic signature was typified by the upregulation of anti-inflammatory cytokines and receptors (IL1RAP, IL1RL2, IL13RA2, IL1F10) and the downregulation of inflammation/inflammatory cytokines and receptors (TNF, IL6R, CXCL9), which suggests a selection of differentiation pathways away from the adipogenic lineage. The upregulation of TGFB, SPP1, and INHBA supports selection of the fibroblast lineage of cells. Thus, the lactation phase of production can affect meat quality through transcriptional signatures that favor myogenesis and depress inflammation.



2012 ◽  
Vol 287 (15) ◽  
pp. 12405-12416 ◽  
Author(s):  
Tong Zhang ◽  
Jhoanna G. Berrocal ◽  
Jie Yao ◽  
Michelle E. DuMond ◽  
Raga Krishnakumar ◽  
...  

NMNAT-1 and PARP-1, two key enzymes in the NAD+ metabolic pathway, localize to the nucleus where integration of their enzymatic activities has the potential to control a variety of nuclear processes. Using a variety of biochemical, molecular, cell-based, and genomic assays, we show that NMNAT-1 and PARP-1 physically and functionally interact at target gene promoters in MCF-7 cells. Specifically, we show that PARP-1 recruits NMNAT-1 to promoters where it produces NAD+ to support PARP-1 catalytic activity, but also enhances the enzymatic activity of PARP-1 independently of NAD+ production. Furthermore, using two-photon excitation microscopy, we show that NMNAT-1 catalyzes the production of NAD+ in a nuclear pool that may be distinct from other cellular compartments. In expression microarray experiments, depletion of NMNAT-1 or PARP-1 alters the expression of about 200 protein-coding genes each, with about 10% overlap between the two gene sets. NMNAT-1 enzymatic activity is required for PARP-1-dependent poly(ADP-ribosyl)ation at the promoters of commonly regulated target genes, as well as the expression of those target genes. Collectively, our studies link the enzymatic activities of NMNAT-1 and PARP-1 to the regulation of a set of common target genes through functional interactions at target gene promoters.


