Novel de Novo Genome of Cynopterus brachyotis Reveals Evolutionarily Abrupt Shifts in Gene Family Composition across Fruit Bats

2020 ◽  
Vol 12 (4) ◽  
pp. 259-272
Author(s):  
Balaji Chattopadhyay ◽  
Kritika M Garg ◽  
Rajasri Ray ◽  
Ian H Mendenhall ◽  
Frank E Rheindt

Abstract Major novel physiological or phenotypic adaptations often require accompanying modifications at the genic level. Conversely, the detection of considerable contractions and/or expansions of gene families can be an indicator of fundamental but unrecognized physiological change. We sequenced a novel fruit bat genome (Cynopterus brachyotis) and adopted a comparative approach to reconstruct the evolution of fruit bats, mapping contractions and expansions of gene families along their evolutionary history. Despite a radical change in life history as compared with other bats (e.g., loss of echolocation, large size, and frugivory), fruit bats have undergone surprisingly limited change in their genic composition, perhaps apart from a potentially novel gene family expansion relating to telomere protection and longevity. In sharp contrast, within fruit bats, the new Cynopterus genome bears the signal of unusual gene loss and gene family contraction, despite its similar morphology and lifestyle to two other major fruit bat lineages. Most missing genes are regulatory, immune-related, and olfactory in nature, illustrating the diversity of genomic strategies employed by bats to contend with responses to viral infection and olfactory requirements. Our results underscore that significant fluctuations in gene family composition are not always associated with obvious examples of novel physiological and phenotypic adaptations but may often relate to less-obvious shifts in immune strategies.

2019 ◽  
Author(s):  
Cédric Finet ◽  
Kailey Slavik ◽  
Jian Pu ◽  
Sean B. Carroll ◽  
Henry Chung

Abstract The birth-and-death evolutionary model proposes that some members of a multigene family are phylogenetically stable and persist as a single copy over time, whereas other members are phylogenetically unstable and undergo frequent duplication and loss. Functional studies suggest that stable genes are likely to encode essential functions, while rapidly evolving genes reflect phenotypic differences in traits that diverge rapidly among species. One such class of rapidly diverging traits is insect cuticular hydrocarbons (CHCs), which play dual roles: they act as short-range recognition pheromones in chemical communication and protect the insect from desiccation. Insect CHCs diverge rapidly between related species, leading to ecological adaptation and/or reproductive isolation. Because the CHC and essential fatty acid biosynthetic pathways share common genes, we hypothesized that genes involved in the synthesis of CHCs would be evolutionarily unstable, while those involved in fatty acid-associated essential functions would be evolutionarily stable. To test this hypothesis, we investigated the evolutionary history of the fatty acyl-CoA reductase (FAR) gene family, which encodes enzymes in CHC synthesis. We compiled a unique dataset of 200 FAR proteins across 12 Drosophila species. We uncovered a broad diversity in FAR content, which is generated by gene duplications, subsequent gene losses, and alternative splicing. We also show that FARs expressed in oenocytes and presumably involved in CHC synthesis are more unstable than FARs from other tissues. We suggest that a comparative approach investigating the birth-and-death evolution of gene families can identify candidate genes involved in rapidly diverging traits between species.


2020 ◽  
Vol 12 (3) ◽  
pp. 185-202
Author(s):  
Xia Han ◽  
Jindan Guo ◽  
Erli Pang ◽  
Hongtao Song ◽  
Kui Lin

Abstract How have genes evolved within a well-known genome phylogeny? Many protein-coding genes should have evolved as a whole at the gene level, and some should have evolved partly through fragments at the subgene level. To comprehensively explore such complex homologous relationships and better understand gene family evolution, here, with de novo-identified modules, the subgene units which could consecutively cover proteins within a set of closely related species, we applied a new phylogeny-based approach that considers evolutionary models with partial homology to classify all protein-coding genes in nine Drosophila genomes. Compared with two other popular methods for gene family construction, our approach improved practical gene family classifications with a more reasonable view of homology and provided a much more complete landscape of gene family evolution at the gene and subgene levels. In the case study, we found that most expanded gene families might have evolved mainly through module rearrangements rather than gene duplications and mainly generated single-module genes through partial gene duplication, suggesting that there might be pervasive subgene rearrangement in the evolution of protein-coding gene families. The use of a phylogeny-based approach with partial homology to classify and analyze protein-coding gene families may provide us with a more comprehensive landscape depicting how genes evolve within a well-known genome phylogeny.


2021 ◽  
Author(s):  
Adi Basukriadi ◽  
Erwin Nurdin ◽  
Andri Wibowo ◽  
Jimi Gunawan

Abstract Bats occupy the aerosphere, and fruit bats in particular forage in the space around trees. Fruit bats use either the narrow space below the tree canopy or the edge space at the margin of the canopy. However, how the aerosphere occupancy of fruit bats relates to specific tree species is poorly understood. This paper aims to assess and model the association of the aerosphere occupancy (Ψ) of the fruit bat Cynopterus brachyotis with tree species planted in mountainous paddy fields in West Java. The tree species studied were Ailanthus altissima, Acacia sp., Cocos nucifera, Mangifera indica, Pinus sp., and Swietenia macrophylla. The results show that tree species diversity significantly affected C. brachyotis aerosphere occupancy (χ² = 27.67, P < 0.05). According to the values of Ψ and the occupancy percentages, high occupancy of narrow space by C. brachyotis was observed in Swietenia macrophylla (Ψ = 0.934, 78%), followed by Ailanthus altissima (Ψ = 0.803, 57%) and Mangifera indica (Ψ = 913, 55%). High occupancy of edge space was observed in Mangifera indica (Ψ = 0.685, 41%), followed by Pinus sp. (Ψ = 0.674, 38%) and Ailanthus altissima (Ψ = 0.627, 36%). The best model for explaining C. brachyotis occupancy of narrow space is tree height, with a preference for tall trees (Ψ ~ tree height, AIC = 1.574, R² = 0.5535, Adj. R = 0.4047). For edge space occupancy, the best model is also tree height (Ψ ~ tree height, AIC = −26.1510, R² = 0.7944, Adj. R = 0.7258).


2018 ◽  
Vol 35 (13) ◽  
pp. 2199-2207 ◽  
Author(s):  
Carine Rey ◽  
Philippe Veber ◽  
Bastien Boussau ◽  
Marie Sémon

Abstract Motivation: RNA sequencing (RNA-Seq) is a widely used approach to obtain transcript sequences in non-model organisms, notably for performing comparative analyses. However, current bioinformatic pipelines do not take full advantage of pre-existing reference data in related species for improving RNA-Seq assembly, annotation and gene family reconstruction. Results: We built an automated pipeline named CAARS to combine novel data from RNA-Seq experiments with existing multi-species gene family alignments. RNA-Seq reads are assembled into transcripts by both de novo and assisted assemblies. Then, CAARS incorporates transcripts into gene families, builds gene alignments and trees and uses phylogenetic information to classify the genes as orthologs and paralogs of existing genes. We used CAARS to assemble and annotate RNA-Seq data in rodents and fishes using distantly related genomes as reference, a difficult case for this kind of analysis. We showed CAARS assemblies are more complete and accurate than those assembled by a standard pipeline consisting of de novo assembly coupled with annotation by sequence similarity on a guide species. In addition to annotated transcripts, CAARS provides gene family alignments and trees, annotated with orthology relationships, directly usable for downstream comparative analyses. Availability and implementation: CAARS is implemented in Python and OCaml and is freely available at https://github.com/carinerey/caars. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Eduardo Pérez-Palma ◽  
Patrick May ◽  
Sumaiya Iqbal ◽  
Lisa-Marie Niestroj ◽  
Juanjiangmeng Du ◽  
...  

Abstract Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer disease risk. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 65,034 missense variants from patients. With this gene family approach, we identified 398 regions enriched for patient variants spanning 33,887 amino acids in 1,058 genes. As a comparison, testing the same genes individually, we identified fewer patient variant enriched regions, involving only 2,167 amino acids and 180 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed a 5.56-fold enrichment of patient variants in our identified regions (95% C.I. = 2.76-Inf, p-value = 6.66×10⁻⁸). Using an independent ClinVar variant set, we found that missense variants inside the identified regions are 111-fold more likely to be classified as pathogenic than as benign (OR = 111.48, 95% C.I. = 68.09-195.58, p-value < 2.2×10⁻¹⁶). All patient variant enriched regions identified (PERs) are available online through a user-friendly platform for interactive data mining, visualization and download at http://per.broadinstitute.org. In summary, our gene family burden analysis approach identified novel patient variant enriched regions in protein sequences. This annotation can empower variant interpretation.
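The odds ratio reported in the abstract above comes from a 2×2 comparison (pathogenic vs. benign classification, inside vs. outside the identified regions). The sketch below shows how such a ratio and an approximate confidence interval can be computed. It is an illustration only: the counts are hypothetical, the function name `odds_ratio_wald` is invented here, and the Wald (log-scale) interval is a simple stand-in that is not confirmed to be the method the authors used.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: pathogenic variants inside regions, b: benign variants inside,
    c: pathogenic variants outside,        d: benign variants outside.
    z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the paper.
or_value, ci_low, ci_high = odds_ratio_wald(90, 10, 30, 70)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

The interval is symmetric on the log scale, so it will generally differ from an exact (conditional) interval of the kind often reported alongside Fisher's exact test.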


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Domenico Iaria ◽  
Adriana Chiappetta ◽  
Innocenzo Muzzalupo

In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear, and the molecular basis underlying them is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile, and that many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor, and fibrin gene families and members of the Ca2+-binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcript set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.


2017 ◽  
Author(s):  
Dennis Lal ◽  
Patrick May ◽  
Kaitlin E. Samocha ◽  
Jack A. Kosmicki ◽  
Elise B. Robinson ◽  
...  

Abstract Differentiating risk-conferring from benign missense variants, and therefore optimal calculation of gene-variant burden, represents a major challenge, in particular for rare and genetically heterogeneous disorders. While orthologous gene conservation is commonly employed in variant annotation, approximately 80% of known disease-associated genes are paralogs and belong to gene families. It has not been thoroughly investigated how gene family information can be utilized for disease gene discovery and variant interpretation. We developed a paralog conservation score to empirically evaluate whether paralog conserved or nonconserved sites of in-human paralogs are important for protein function. Using this score, we demonstrate that disease-associated missense variants are significantly enriched at paralog conserved sites across all disease groups and disease inheritance models tested. Next, we assessed whether gene family information could assist in discovering novel disease-associated genes. We subsequently developed a gene family de novo enrichment framework that identified 43 exome-wide enriched gene families including 98 de novo variant carrying genes in more than 10k neurodevelopmental disorder patients. Of these gene family enriched genes, 33 represent novel candidate genes which are brain expressed and variant constrained in neurodevelopmental disorders.


2021 ◽  
Vol 22 (6) ◽  
Author(s):  
MEIS NANGOY ◽  
TILTJE RANSALELEH ◽  
HANDRY LENGKONG ◽  
Roni Koneri ◽  
ALICE LATINNE ◽  
...  

Abstract. Nangoy M, Ransaleleh T, Lengkong H, Koneri R, Latinne A, Kyes RC. 2021. Diversity of fruit bats (Pteropodidae) and their ectoparasites in Batuputih Nature Tourism Park, Sulawesi, Indonesia. Biodiversitas 22: 3075-3082. Bats play an important role in the ecosystem as pollinators, seed dispersers, and predators. This study aims to identify the diversity of fruit bat species and their ectoparasites at Batuputih Nature Tourism Park, Sulawesi, Indonesia. The study was conducted from May to July 2019 in three different habitats: primary forest, secondary forest, and agricultural land. Bats were caught using mist nets, and ectoparasites were collected and identified using morphological criteria. A total of 253 bats were sampled, representing 10 species (all belonging to the family Pteropodidae): Cynopterus brachyotis (24.90%), C. luzoniensis (9.88%), Dobsonia exoleta (1.19%), Macroglossus minimus (3.16%), Nyctimene cephalotes (4.75%), N. minutus (0.79%), Rousettus amplexicaudatus (17%), R. celebensis (20.95%), Thoopterus nigrescens (17%), and Thoopterus sp. (0.4%). Cynopterus brachyotis was the most abundant species (n = 63). A total of 479 ectoparasites were collected and identified as belonging to three families: Nycteribiidae, Streblidae, and Spinturnicidae. Nycteribiidae (genus Leptocyclopodia) was the most abundant ectoparasite taxon (n = 475), while the highest mean abundance and intensity were observed for the genera Thoopterus and Rousettus. This study provides important baseline data for future reference in monitoring bat population status and conservation efforts in the region. Given the close relationship between the local people and bats (e.g. hunting and consumption), more work is needed to address the potential pathogen risks from zoonotic transmission, both from bats and from their ectoparasites.
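The relative abundances in the abstract above can be checked against the stated sample size. The short sketch below back-calculates per-species counts from the reported percentages and the total of 253 bats. Only the total and the count for C. brachyotis (n = 63) are stated in the abstract; every other count is inferred by rounding and should be read as an assumption, not as reported data.

```python
TOTAL_BATS = 253  # stated in the abstract

# Reported relative abundances (percent of all bats sampled).
reported_percent = {
    "C. brachyotis": 24.90,
    "C. luzoniensis": 9.88,
    "D. exoleta": 1.19,
    "M. minimus": 3.16,
    "N. cephalotes": 4.75,
    "N. minutus": 0.79,
    "R. amplexicaudatus": 17.0,
    "R. celebensis": 20.95,
    "T. nigrescens": 17.0,
    "Thoopterus sp.": 0.4,
}

# Inferred counts: nearest whole number of individuals for each percentage.
inferred_counts = {
    species: round(pct / 100 * TOTAL_BATS)
    for species, pct in reported_percent.items()
}

for species, count in inferred_counts.items():
    print(f"{species:20s} {count:4d}  ({100 * count / TOTAL_BATS:.2f}%)")
print("sum of inferred counts:", sum(inferred_counts.values()))
```

If the inferred counts sum to the stated total, the rounded percentages are internally consistent with the sample size.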


Genetics ◽  
1996 ◽  
Vol 142 (3) ◽  
pp. 1021-1031 ◽  
Author(s):  
Jianping Hu ◽  
Beth Anderson ◽  
Susan R Wessler

Abstract R and B genes and their homologues encode basic helix-loop-helix (bHLH) transcriptional activators that regulate the anthocyanin biosynthetic pathway in flowering plants. In maize, R/B genes comprise a very small gene family whose organization reflects the unique evolutionary history and genome architecture of maize. To determine whether the organization of the R gene family could provide information about the origins of the distantly related grass rice, we characterized members of the R gene family from rice, Oryza sativa. Despite being a true diploid, O. sativa has at least two R genes. An active homologue (Ra) with extensive homology with other R genes is located at a position on chromosome 4 previously shown to be in synteny with regions of maize chromosomes 2 and 10 that contain the B and R loci, respectively. A second rice R gene (Rb) of undetermined function was identified on chromosome 1 and found to be present only in rice species with AA genomes. All non-AA species have only one R gene, which is Ra-like. These data suggest that the common ancestor shared by maize and rice had a single R gene and that the small R gene families of grasses have arisen recently and independently.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tongqing Zhang ◽  
Jiawen Yin ◽  
Shengkai Tang ◽  
Daming Li ◽  
Xiankun Gu ◽  
...  

Abstract The Asian Clam (Corbicula fluminea) is a valuable commercial and medicinal bivalve, which is widely distributed in East and Southeast Asia. As a natural nutrient source, the clam is rich in protein, amino acids, and microelements. The genome of C. fluminea has not yet been characterized; therefore, genome-assisted breeding and improvements cannot yet be implemented. In this work, we present a de novo chromosome-scale genome assembly of C. fluminea using PacBio and Hi-C sequencing technologies. The assembled genome comprised 4,728 contigs, with a contig N50 of 521.06 Kb, and 1,215 scaffolds, with a scaffold N50 of 70.62 Mb. More than 1.51 Gb (99.17%) of genomic sequences were anchored to 18 chromosomes, of which 1.40 Gb (92.81%) of genomic sequences were ordered and oriented. The genome contains 38,841 coding genes, 32,591 (83.91%) of which were annotated in at least one functional database. Compared with related species, C. fluminea had 851 expanded gene families and 191 contracted gene families. The phylogenetic tree showed that C. fluminea diverged from Ruditapes philippinarum ~228.89 million years ago (Mya), and the genomes of C. fluminea and R. philippinarum shared 244 syntenic blocks. Additionally, we identified 2 MITF members and 99 NLRP members in the C. fluminea genome. The high-quality, chromosome-level Asian Clam genome will be a valuable resource for a range of development and breeding studies of C. fluminea in future research.
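For readers unfamiliar with the N50 statistic quoted above: it is the length L such that sequences of length L or longer together contain at least half of the total assembled bases. A minimal sketch follows, using the common definition and a toy list of lengths; the function name `n50` and the example values are illustrative assumptions and are unrelated to the C. fluminea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70: 80 + 70 = 150 covers half of the 290 total
```

Because N50 depends only on the length distribution, contig N50 and scaffold N50 are computed the same way, just over different sets of sequences.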

