Pattern of New Gene Origination in a Special Fish Lineage, the Flatfishes

Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1819
Author(s):  
Haorong Li ◽  
Chunyan Chen ◽  
Zhongkai Wang ◽  
Kun Wang ◽  
Yongxin Li ◽  
...  

The origination of new genes has been of inherent interest to evolutionary geneticists for decades, but few studies have addressed the general pattern in a fish lineage. Using our recently released whole-genome data of flatfishes, which evolved one of the most specialized body plans in vertebrates, we identified 1541 flatfish-lineage-specific genes (6.9% of the starry flounder genes). The origination pattern of these flatfish new genes is largely similar to those observed in other vertebrates, as shown by the proportions of DNA-mediated duplication (1317; 85.5%), RNA-mediated duplication (retrogenes; 96; 6.2%), and de novo origination (128; 8.3%). The emergence rate of species-specific genes is 32.1 per Mya, and the overall average rate for the flatfish-lineage-specific genes is 20.9 per Mya. A large proportion (31.4%) of these new genes have been subjected to selection, in contrast to the 4.0% in primates, while the proportions for old genes are quite similar (66.4% vs. 65.0%). In addition, most of these new genes (70.8%) are found to be expressed, indicating their functionality. This study not only presents one example of systematic new gene identification in a teleost taxon based on comprehensive phylogenomic data, but also shows that new genes may play roles in body planning.
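As a quick check of the arithmetic in this abstract, the sketch below recomputes the share of each origination mechanism from the reported gene counts. The counts come from the abstract; the variable names are illustrative only and are not taken from the authors' analysis.

```python
# Gene counts per origination mechanism, as reported in the abstract.
counts = {
    "DNA-mediated duplication": 1317,
    "RNA-mediated duplication (retrogenes)": 96,
    "de novo origination": 128,
}

# The three categories should add up to the 1541 lineage-specific genes.
total = sum(counts.values())

# Percentage share of each mechanism, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} genes ({pct}%)")
```

The three categories sum to 1541, and the rounded shares match the percentages quoted in the abstract.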

Genes ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1628
Author(s):  
Saara K. Luna ◽  
Frédéric J. J. Chain

Gene duplications generate new genes that can contribute to expression changes and the evolution of new functions. Genomes often consist of gene families that undergo expansions, some of which occur in specific lineages that reflect recent adaptive diversification. In this study, lineage-specific genes and gene family expansions were studied across five dictyostelid species to determine when and how they are expressed during multicellular development. Lineage-specific genes were found to be enriched among genes with biased expression (predominant expression in one developmental stage) in each species and at most developmental time points, suggesting independent functional innovations of new genes throughout the phylogeny. Biased duplicate genes had greater expression divergence than their orthologs and paralogs, consistent with subfunctionalization or neofunctionalization. Lineage-specific expansions in particular had biased genes with both molecular signals of positive selection and high expression, suggesting adaptive genetic and transcriptional diversification following duplication. Our results present insights into the potential contributions of lineage-specific genes and families in generating species-specific phenotypes during multicellular development in dictyostelids.


2016 ◽  
Author(s):  
Peter A. Andrews ◽  
Ivan Iossifov ◽  
Jude Kendall ◽  
Steven Marks ◽  
Lakshmi Muthuswamy ◽  
...  

Motivation: Standard genome sequence alignment tools primarily designed to find one alignment per read have difficulty detecting inversion, translocation and large insertion and deletion (indel) events. Moreover, dedicated split read alignment methods that depend only upon the reference genome may misidentify or find too many potential split read alignments because of reference genome anomalies. Methods: We introduce MUMdex, a Maximal Unique Match (MUM)-based genomic analysis software package consisting of a sequence aligner to the reference genome, a storage-indexing format and analysis software. Discordant reference alignments of MUMs are especially suitable for identifying inversion, translocation and large indel differences in unique regions. Extracted population databases are used as filters for flaws in the reference genome. We describe the concepts underlying MUM-based analysis, the software implementation and its usage. Results: We demonstrate via simulation that the MUMdex aligner and alignment format are able to correctly detect and record genomic events. We characterize alignment performance and output file sizes for human whole genome data and compare to Bowtie 2 and the BAM format. Preliminary results demonstrate the practicality of the analysis approach by detecting de novo mutation candidates in human whole genome DNA sequence data from 510 families. We provide a population database of events from these families for use by others. Availability: http://mumdex.com/ Contact: [email protected] (or [email protected]) Supplementary information: Supplementary data are available online.


2021 ◽  
Author(s):  
Yuan Huang ◽  
Jiahui Chen ◽  
Chuan Dong ◽  
Dylan Sosa ◽  
Shengqian Xia ◽  
...  

Gene duplication is increasingly recognized as an important mechanism for the origination of new genes, as revealed by comparative genomic analysis. However, how new duplicate genes contribute to phenotypic evolution remains largely unknown, especially in plants. Here, we identified the new gene EXOV, derived from a partial gene duplication of its parental gene EXOVL in Arabidopsis thaliana. EXOV is a species-specific gene that originated within the last 3.5 million years and shows strong signals of positive selection. Unexpectedly, RNA-seq analyses revealed that, despite its young age, EXOV has acquired many novel direct and indirect interactions in which the parental gene does not engage. This observation is consistent with the high, selection-driven substitution rate of its encoded protein, in contrast to the slowly evolving EXOVL, suggesting an important role for EXOV in phenotypic evolution. We observed significant differentiation of morphological changes for all phenotypes assessed in genome-edited and T-DNA insertional single mutants and in double T-DNA insertion mutants in EXOV and EXOVL. We discovered a substantial divergence of phenotypic effects by principal component analyses, suggesting neofunctionalization of the new gene. These results reveal a young gene that plays critical roles in biological processes that underlie morphological evolution in A. thaliana.


2021 ◽  
Vol 19 (3) ◽  
pp. e32
Author(s):  
Jeong-An Gim ◽  
Kyung-Wan Baek ◽  
Young-Sool Hah ◽  
Ho Jin Choo ◽  
Ji-Seok Kim ◽  
...  

Semisulcospira libertina, a species of freshwater snail, is widespread in East Asia. It is important as a food source. Additionally, it is a vector of clonorchiasis, paragonimiasis, metagonimiasis, and other parasitic diseases. Although S. libertina has ecological, commercial, and clinical importance, its whole genome has not yet been reported. Here, we revealed the genome of S. libertina through de novo assembly. We assembled the whole genome of S. libertina and determined its transcriptome for the first time using the Illumina NovaSeq 6000 platform. According to the k-mer analysis, the genome size of S. libertina was estimated to be 3.04 Gb. Using RepeatMasker, 53.68% of the genome assembly was identified as repeats. The genome data of S. libertina reported in this study will be useful for the identification and conservation of S. libertina in East Asia.


2017 ◽  
Vol 55 (6) ◽  
pp. 1946-1953 ◽  
Author(s):  
Scott A. Cunningham ◽  
Nicholas Chia ◽  
Patricio R. Jeraldo ◽  
Daniel J. Quest ◽  
Julie A. Johnson ◽  
...  

Whole-genome sequencing (WGS) can provide excellent resolution in global and local epidemiological investigations of Staphylococcus aureus outbreaks. A variety of sequencing approaches and analytical tools have been used; it is not clear which is ideal. We compared two WGS strategies and two analytical approaches to the standard method of SmaI restriction digestion pulsed-field gel electrophoresis (PFGE) for typing S. aureus. Forty-two S. aureus isolates from three outbreaks and 12 reference isolates were studied. Near-complete genomes, assembled de novo with paired-end and long-mate-pair (8 kb) libraries, were first assembled and analyzed utilizing an in-house assembly and analytical informatics pipeline. In addition, paired-end data were assembled and analyzed using a commercial software package. Single nucleotide variant (SNP) analysis was performed using the in-house pipeline. Two assembly strategies were used to generate core genome multilocus sequence typing (cgMLST) data. First, the near-complete genome data generated with the in-house pipeline were imported into the commercial software and used to perform cgMLST analysis. Second, the commercial software was used to assemble paired-end data, and resolved assemblies were used to perform cgMLST. Similar isolate clustering was observed using SNP calling and cgMLST, regardless of data assembly strategy. All methods provided more discrimination between outbreaks than did PFGE. Overall, all of the evaluated WGS strategies yielded statistically similar results for S. aureus typing.


Author(s):  
А.Р. Зарипова ◽  
Л.Р. Нургалиева ◽  
А.В. Тюрин ◽  
И.Р. Минниахметов ◽  
Р.И. Хусаинова

The interferon-induced transmembrane protein 5 gene (IFITM5) was studied in 99 patients with osteogenesis imperfecta (OI) from 86 unrelated families, together with a search for pathogenic gene variants involved in the formation of the disease phenotype. OI is a clinically and genetically heterogeneous hereditary disease of the connective tissue whose main clinical manifestation is multiple fractures, starting from the neonatal period and often leading to disability from childhood. The main clinical signs of OI include blue sclerae, hearing loss, dentin anomalies, increased bone fragility, and impaired growth and posture, with the development of characteristic disabling bone deformities and associated problems, including respiratory, neurological, cardiac, and renal disorders. OI occurs in both men and women. The degree of genetic heterogeneity of the disease has not yet been determined. To date, 20 genes are known to be involved in the pathogenesis of OI, and researchers in different countries continue to search for new genes. In the last decade, it has become known that OI is caused by autosomal recessive, autosomal dominant, and X-linked mutations in a wide range of genes encoding proteins involved in the synthesis of type I collagen, its processing, secretion, and post-translational modification, as well as proteins that regulate the differentiation and activity of bone-forming cells. Mutations in the IFITM5 gene, also called BRIL (bone-restricted IFITM-like protein), which is involved in osteoblast formation, lead to OI type V. Up to 5% of patients have OI type V, which is characterized by the formation of a hyperplastic callus after fractures, calcification of the interosseous membrane of the forearm, and a mesh-like lamellar pattern observed on histological examination of the bone. In 2012, a heterozygous mutation (c.-14C>T) in the 5'-untranslated region (UTR) of the IFITM5 gene was identified as the main cause of OI type V. In the present work, the IFITM5 gene was analyzed and the c.-14C>T mutation, which arose de novo, was identified in one patient with OI who was subsequently diagnosed with type V of the disease. Three known polymorphic variants were also identified: rs57285449, c.80G>C (p.Gly27Ala); rs2293745, c.187-45C>T; and rs755971385, c.279G>A (p.Thr93=), along with one previously undescribed variant, c.128G>A (p.Ser43Asn) AGC>AAC (S/D); none of these is pathogenic. The article focuses on the features of the clinical manifestations of OI type V and recommends testing for the c.-14C>T mutation in the IFITM5 gene when this form of the disease is suspected.


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Brent S. Pedersen ◽  
Joe M. Brown ◽  
Harriet Dashnow ◽  
Amelia D. Wallace ◽  
Matt Velinder ◽  
...  

In studies of families with rare disease, it is common to screen for de novo mutations, as well as recessive or dominant variants that explain the phenotype. However, the filtering strategies and software used to prioritize high-confidence variants vary from study to study. In an effort to establish recommendations for rare disease research, we explore effective guidelines for variant (SNP and INDEL) filtering and report the expected number of candidates for de novo dominant, recessive, and autosomal dominant modes of inheritance. We derived these guidelines using two large family-based cohorts that underwent whole-genome sequencing, as well as two family cohorts with whole-exome sequencing. The filters are applied to common attributes, including genotype-quality, sequencing depth, allele balance, and population allele frequency. The resulting guidelines yield ~10 candidate SNP and INDEL variants per exome, and 18 per genome for recessive and de novo dominant modes of inheritance, with substantially more candidates for autosomal dominant inheritance. For family-based, whole-genome sequencing studies, this number includes an average of three de novo, ten compound heterozygous, one autosomal recessive, four X-linked variants, and roughly 100 candidate variants following autosomal dominant inheritance. The slivar software we developed to establish and rapidly apply these filters to VCF files is available at https://github.com/brentp/slivar under an MIT license, and includes documentation and recommendations for best practices for rare disease analysis.
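To illustrate the kind of trio filter this abstract describes, here is a minimal Python sketch that combines the four named attributes (genotype quality, sequencing depth, allele balance, and population allele frequency) into a single de novo candidate test. It is not slivar's API and does not reproduce the authors' guidelines: the `Sample` class, the function names, and every threshold value are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Per-sample genotype call at one variant site (hypothetical structure)."""
    alts: int       # number of alternate alleles in the genotype: 0, 1, or 2
    gq: int         # genotype quality
    depth: int      # total read depth at the site
    alt_depth: int  # reads supporting the alternate allele

    @property
    def allele_balance(self) -> float:
        # Fraction of reads that support the alternate allele.
        return self.alt_depth / self.depth if self.depth else 0.0


# Placeholder thresholds: assumptions for illustration, NOT values from the paper.
MIN_GQ = 20
MIN_DEPTH = 10
HET_AB_RANGE = (0.2, 0.8)
MAX_POP_AF = 0.001


def passes_quality(sample: Sample) -> bool:
    """Require minimum genotype quality and read depth."""
    return sample.gq >= MIN_GQ and sample.depth >= MIN_DEPTH


def is_de_novo_candidate(kid: Sample, mom: Sample, dad: Sample, pop_af: float) -> bool:
    """True if the child is a well-supported heterozygote, both parents are
    homozygous reference with no alternate reads, and the variant is rare."""
    if not all(passes_quality(s) for s in (kid, mom, dad)):
        return False
    if pop_af > MAX_POP_AF:
        return False
    low, high = HET_AB_RANGE
    kid_is_het = kid.alts == 1 and low <= kid.allele_balance <= high
    parents_are_ref = all(p.alts == 0 and p.alt_depth == 0 for p in (mom, dad))
    return kid_is_het and parents_are_ref


kid = Sample(alts=1, gq=60, depth=30, alt_depth=14)
mom = Sample(alts=0, gq=60, depth=30, alt_depth=0)
dad = Sample(alts=0, gq=60, depth=28, alt_depth=0)
print(is_de_novo_candidate(kid, mom, dad, pop_af=0.0))
```

Keeping the thresholds as named constants makes it easy to see how the number of surviving candidates changes as each filter is tightened or relaxed, which is the kind of comparison the abstract reports.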


Animals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 2226
Author(s):  
Sazia Kunvar ◽  
Sylwia Czarnomska ◽  
Cino Pertoldi ◽  
Małgorzata Tokarska

The European bison is a non-model organism; thus, most of its genetic and genomic analyses have been performed using cattle-specific resources, such as BovineSNP50 BeadChip or Illumina Bovine 800 K HD Bead Chip. The problem with non-specific tools is the potential loss of evolutionarily diversified information (ascertainment bias) and species-specific markers. Here, we have used a genotyping-by-sequencing (GBS) approach for genotyping 256 samples from the European bison population in Bialowieza Forest (Poland) and performed an analysis using two integrated pipelines of the STACKS software: one is de novo (without a reference genome) and the other is a reference pipeline (with a reference genome). Moreover, we used the reference pipeline with two different genomes, i.e., Bos taurus and European bison. GBS is a useful tool for SNP genotyping in non-model organisms due to its cost effectiveness. Our results support GBS with a reference pipeline without PCR duplicates as a powerful approach for studying the population structure and genotyping data of non-model organisms. We found more polymorphic markers in the reference pipeline than in the de novo pipeline. The decreased number of SNPs from the de novo pipeline could be due to the extremely low level of heterozygosity in European bison. We confirmed that all SNPs obtained from the de novo/Bos taurus and Bos taurus reference pipelines were unique and not included in the 800 K BovineHD BeadChip.


Author(s):  
Seyoung Mun ◽  
Songmi Kim ◽  
Wooseok Lee ◽  
Keunsoo Kang ◽  
Thomas J. Meyer ◽  
...  

Advances in next-generation sequencing (NGS) technology have made personal genome sequencing possible, and indeed, many individual human genomes have now been sequenced. Comparisons of these individual genomes have revealed substantial genomic differences between human populations as well as between individuals from closely related ethnic groups. Transposable elements (TEs) are known to be one of the major sources of these variations and act through various mechanisms, including de novo insertion, insertion-mediated deletion, and TE–TE recombination-mediated deletion. In this study, we carried out de novo whole-genome sequencing of one Korean individual (KPGP9) via multiple insert-size libraries. The de novo whole-genome assembly resulted in 31,305 scaffolds with a scaffold N50 size of 13.23 Mb. Furthermore, through computational data analysis and experimental verification, we revealed that 182 TE-associated structural variation (TASV) insertions and 89 TASV deletions contributed 64,232 bp in sequence gain and 82,772 bp in sequence loss, respectively, in the KPGP9 genome relative to the hg19 reference genome. We also verified structural differences associated with TASVs by comparative analysis with TASVs in recent genomes (AK1 and TCGA genomes) and reported their details. Here, we constructed a new Korean de novo whole-genome assembly and provide the first study, to our knowledge, focused on the identification of TASVs in an individual Korean genome. Our findings again highlight the role of TEs as a major driver of structural variations in human individual genomes.
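The scaffold N50 quoted in this abstract is a standard assembly contiguity metric. It is the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the total assembly length. The sketch below shows one common way to compute it. The function name and the toy scaffold lengths are illustrative and are not data from the study.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop once the accumulated length covers at least half the assembly.
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one scaffold length")


# Toy example: total length is 54, so half is 27; 10 + 9 + 8 reaches 27.
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_scaffolds))  # prints 8
```

Because N50 depends only on the length distribution, a few very long scaffolds can yield a high N50 even when an assembly contains many short ones.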


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 300
Author(s):  
Camilla Ceccatelli Berti ◽  
Giulia di Punzio ◽  
Cristina Dallabona ◽  
Enrico Baruffini ◽  
Paola Goffrini ◽  
...  

The increasing application of next-generation sequencing approaches to the analysis of human exome and whole-genome data has enabled the identification of novel variants and new genes involved in mitochondrial diseases. Its ability to survive in the absence of oxidative phosphorylation (OXPHOS) and of the mitochondrial genome makes the yeast Saccharomyces cerevisiae an excellent model system for investigating the role of these new variants in mitochondrial-related conditions and dissecting the molecular mechanisms associated with these diseases. The aim of this review was to highlight the main advantages offered by this model for the study of mitochondrial diseases, from the validation and characterisation of novel mutations to the dissection of the role played by genes in mitochondrial functionality and the discovery of potential therapeutic molecules. The review also provides a summary of the main contributions to the understanding of mitochondrial diseases that have emerged from the study of this simple eukaryotic organism.

