De novo assembly and annotation of Asiatic lion (Panthera leo persica) genome

2019 ◽  
Author(s):  
Siuli Mitra ◽  
Ara Sreenivas ◽  
Divya Tej Sowpati ◽  
Amitha Sampat Kumar ◽  
Gowri Awasthi ◽  
...  

Abstract We report the first draft of the whole genome assembly of a male Asiatic lion, Atul, and whole transcriptomes of five Asiatic lion individuals. Evaluation of genetic diversity placed the Asiatic lion in the lowest bracket of the genomic diversity index, highlighting the gravity of its conservation status. Comparative analysis with other felid and mammalian genomes unraveled the evolutionary history of the Asiatic lion and its position among other felids. The genome is estimated to be 2.3 Gb (Gigabase) long with 62X sequence coverage and is found to have 20,543 protein-coding genes. About 2.66% of the genome is covered by simple sequence repeats (SSRs) and 0.4% is estimated to consist of segmental duplications. Comparison with seven well annotated genomes indicates the presence of 6,295 single copy orthologs, 4 co-orthologs, 21 paralogs uniquely present in Asiatic lion and 8,024 other orthologs. Assessment of male and female transcriptomes gave a list of genes specifically expressed in the male. Our genomic analyses provide candidates for phenotypes characteristic of felids and lions, inviting further confirmation of their contribution through population genetic studies. An Asiatic lion-specific expansion is detected in the Cysteine Dioxygenase-I (CDO-I) family that is responsible for taurine biosynthesis in cats. The Wilms' tumor-associated protein (WT1) family, a non-Y chromosome genetic factor underlying male-sex determination and differentiation, is found to have undergone expansion, interestingly like that of the human genome. Another protein family, translation machinery-associated protein 7 (TMA7), which has undergone expansion in humans, is also expanded in the Asiatic lion and can be further investigated as a candidate responsible for the mane in lions because of its role in hair follicle morphogenesis.

2018 ◽  
Vol 2018 ◽  
pp. 1-10
Author(s):  
Alexandre Bueno Santos ◽  
Patrícia Silva Costa ◽  
Anderson Oliveira do Carmo ◽  
Gabriel da Rocha Fernandes ◽  
Larissa Lopes Silva Scholte ◽  
...  

Members of the genus Chromobacterium have been isolated from geographically diverse ecosystems and exhibit considerable metabolic flexibility, as well as biotechnological and pathogenic properties in some species. This study reports the draft assembly and detailed sequence analysis of Chromobacterium amazonense strain 56AF. The de novo-assembled genome is 4,556,707 bp in size and contains 4294 protein-coding and 95 RNA genes, including 88 tRNA, six rRNA, and one tmRNA operon. A repertoire of genes implicated in virulence, for example, hemolysin, hemolytic enterotoxins, colicin V, lytic proteins, and Nudix hydrolases, is present. The genome also contains a collection of genes of biotechnological interest, including esterases, lipase, auxins, chitinases, phytoene synthase and phytoene desaturase, polyhydroxyalkanoates, violacein, plastocyanin/azurin, and detoxifying compounds. Importantly, unlike other Chromobacterium species, the 56AF genome contains genes for pore-forming toxin alpha-hemolysin, a type IV secretion system, among others. The analysis of the C. amazonense strain 56AF genome reveals the versatility, adaptability, and biotechnological potential of this bacterium. This study provides molecular information that may pave the way for further comparative genomics and functional studies involving Chromobacterium-related isolates and improves our understanding of the global genomic diversity of Chromobacterium species.


2021 ◽  
Vol 12 ◽  
Author(s):  
Fenghua Tian ◽  
Changtian Li ◽  
Yu Li

Yuanmo [Sarcomyxa edulis (Y.C. Dai, Niemelä & G.F. Qin) T. Saito, Tonouchi & T. Harada] is an important edible and medicinal mushroom endemic to Northeastern China. Here we report the de novo sequencing and assembly of the S. edulis genome using single-molecule real-time sequencing technology. The whole genome was approximately 35.65 Mb, with a G + C content of 48.31%. Genome assembly generated 41 contigs with an N50 length of 1,772,559 bp. The genome comprised 9,364 annotated protein-coding genes, many of which encoded enzymes involved in the modification, biosynthesis, and degradation of glycoconjugates and carbohydrates or enzymes predicted to be involved in the biosynthesis of secondary metabolites such as terpene, type I polyketide, siderophore, and fatty acids, which are responsible for the pharmacodynamic activities of S. edulis. We also identified genes encoding 1,3-β-glucan synthase and endo-1,3(4)-β-glucanase, which are involved in polysaccharide and uridine diphosphate glucose biosynthesis. Phylogenetic and comparative analyses of Basidiomycota fungi based on a single-copy orthologous protein indicated that the Sarcomyxa genus is an independent group that evolved from the Pleurotaceae family. The annotated whole-genome sequence of S. edulis can serve as a reference for investigations of bioactive compounds with medicinal value and the development and commercial production of superior S. edulis varieties.
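The abstract above summarizes contiguity with an N50 length. As a minimal sketch of how that statistic is commonly computed (the function name `n50` and the contig lengths below are hypothetical illustrations, not values from the S. edulis assembly or the authors' pipeline): sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2            # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to >= 50% defines N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), chosen only to illustrate the calculation.
example_contigs = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_contigs)
print(example_n50)
```

A higher N50 for the same total length means more of the assembly sits in long contigs, which is why it is usually reported alongside the contig count.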


2019 ◽  
Vol 42 (4) ◽  
pp. 601-611 ◽  
Author(s):  
Yan Li ◽  
Liukun Jia ◽  
Zhihua Wang ◽  
Rui Xing ◽  
Xiaofeng Chi ◽  
...  

Abstract Saxifraga sinomontana J.-T. Pan & Gornall belongs to Saxifraga sect. Ciliatae subsect. Hirculoideae, a lineage containing ca. 110 species whose phylogenetic relationships are largely unresolved due to recent rapid radiations. Analyses of complete chloroplast genomes have the potential to significantly improve the resolution of phylogenetic relationships in this young plant lineage. The complete chloroplast genome of S. sinomontana was de novo sequenced, assembled and then compared with those of six other Saxifragaceae species. The S. sinomontana chloroplast genome is 147,240 bp in length with a typical quadripartite structure, including a large single-copy region of 79,310 bp and a small single-copy region of 16,874 bp separated by a pair of inverted repeats (IRs) of 25,528 bp each. The chloroplast genome contains 113 unique genes, including 79 protein-coding genes, four rRNAs and 30 tRNAs, with 18 duplicates in the IRs. The gene content and organization are similar to other Saxifragaceae chloroplast genomes. Sixty-one simple sequence repeats were identified in the S. sinomontana chloroplast genome, mostly represented by mononucleotide repeats of polyadenine or polythymine. Comparative analysis revealed 12 highly divergent regions in the intergenic spacers, as well as coding genes of matK, ndhK, accD, cemA, rpoA, rps19, ndhF, ccsA, ndhD and ycf1. Phylogenetic reconstruction of seven Saxifragaceae species based on 66 protein-coding genes received high bootstrap support values for nearly all identified nodes, suggesting a promising opportunity to resolve infrasectional relationships of the most species-rich section Ciliatae of Saxifraga.
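The quadripartite structure described above implies a simple consistency check: the total length should equal the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below applies that arithmetic to the sizes reported in the abstract; the function and variable names are illustrative assumptions, not part of the authors' workflow.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length from LSC, SSC and one IR copy (all in bp)."""
    # The inverted repeat occurs twice, so it is counted twice.
    return lsc + ssc + 2 * ir


# Region sizes as reported in the abstract for S. sinomontana.
s_sinomontana = {"lsc": 79_310, "ssc": 16_874, "ir": 25_528}
total_bp = quadripartite_length(**s_sinomontana)
print(total_bp)  # 147240, matching the reported genome length
```

The same check can be run on any plastome annotation to catch region boundaries that do not add up to the assembled length.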


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 961
Author(s):  
Kevin McKernan ◽  
Liam Kane ◽  
Yvonne Helbert ◽  
Lei Zhang ◽  
Nathan Houde ◽  
...  

The Psilocybe genus is well known for the synthesis of valuable psychoactive compounds such as Psilocybin, Psilocin, Baeocystin and Aeruginascin. The ubiquity of Psilocybin synthesis in Psilocybe has been attributed to horizontal transfer of a ~20 kb gene cluster. A recently published, highly contiguous reference genome derived from long-read single-molecule sequencing has underscored interesting variation in this Psilocybin synthesis gene cluster. This reference genome has also enabled the shotgun sequencing of spores from many Psilocybe strains to better catalog the genomic diversity in the Psilocybin synthesis pathway. Here we present the de novo assembly of 81 Psilocybe genomes compared to the P.envy reference genome. Surprisingly, the genomes of Psilocybe galindoi, Psilocybe tampanensis and Psilocybe azurescens lack sequence coverage over the previously described Psilocybin synthesis pathway but do demonstrate amino acid sequence homology to a less contiguous gene cluster and may illuminate the previously proposed evolution of psilocybin synthesis.


2017 ◽  
Author(s):  
Mariana B. Grizante ◽  
Marc Tollis ◽  
Juan J. Rodriguez ◽  
Ofir Levy ◽  
Michael J. Angilletta ◽  
...  

Abstract Background The eastern fence lizard (Sceloporus undulatus) has been a model species for ecological and evolutionary research. Genomic and transcriptomic resources for this species would promote investigation of genetic mechanisms that underpin plastic responses to environmental stress, such as climate warming. Moreover, such resources would aid comparative studies of complex traits at the molecular level, such as the transition from oviparous to viviparous reproduction, which happened at least four times within Sceloporus. Findings A de novo transcriptome assembly for Sceloporus undulatus, Sund_v1.0, was generated using over 179 million Illumina reads obtained from three tissues (whole brain, skeletal muscle, and embryo) as well as previously reported liver sequences. The Sund_v1.0 assembly had an average contig length of 782 nucleotides and an E90N50 statistic of 2,550 nucleotides. Comparing S. undulatus transcripts with the benchmarking universal single-copy orthologs (BUSCO) for tetrapod species yielded 97.2% gene representation. A total of 13,422 protein-coding orthologs were identified in comparison to the genome of the green anole lizard, Anolis carolinensis, which is the closest related species with genomic data available. Conclusions The multi-tissue transcriptome of S. undulatus is the first for a member of the family Phrynosomatidae, offering an important resource to advance studies of adaptation in this species and genomic research in reptiles.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9552
Author(s):  
Furrukh Mehmood ◽  
Abdullah ◽  
Zartasha Ubaid ◽  
Iram Shahzadi ◽  
Ibrar Ahmed ◽  
...  

Species of the genus Nicotiana (Solanaceae), commonly referred to as tobacco plants, are often cultivated as non-food crops and garden ornamentals. In addition to the worldwide production of tobacco leaves, they are also used as evolutionary model systems due to their complex development history tangled by polyploidy and hybridization. Here, we assembled the plastid genomes of five tobacco species: N. knightiana, N. rustica, N. paniculata, N. obtusifolia and N. glauca. De novo assembled tobacco plastid genomes had the typical quadripartite structure, consisting of a pair of inverted repeat (IR) regions (25,323–25,369 bp each) separated by a large single-copy (LSC) region (86,510–86,716 bp) and a small single-copy (SSC) region (18,441–18,555 bp). Comparative analyses of Nicotiana plastid genomes with currently available Solanaceae genome sequences showed similar GC and gene content, codon usage, simple sequence and oligonucleotide repeats, RNA editing sites, and substitutions. We identified 20 highly polymorphic regions, mostly belonging to intergenic spacer regions (IGS), which could be suitable for the development of robust and cost-effective markers for inferring the phylogeny of the genus Nicotiana and family Solanaceae. Our comparative plastid genome analysis revealed that the maternal parent of the tetraploid N. rustica was the common ancestor of N. paniculata and N. knightiana, and the latter species is more closely related to N. rustica. Relaxed molecular clock analyses estimated that the speciation event between N. rustica and N. knightiana occurred 0.56 Ma (HPD 0.65–0.46). Biogeographical analysis supported a south-to-north range expansion and diversification for N. rustica and related species, where N. undulata and N. paniculata evolved in North/Central Peru, while N. rustica developed in Southern Peru and separated from N. knightiana, which adapted to the Southern coastal climatic regimes.
We further inspected selective pressure on protein-coding genes among tobacco species to determine if this adaptation process affected the evolution of plastid genes. These analyses indicate that four genes involved in different plastid functions, including DNA replication (rpoA) and photosynthesis (atpB, ndhD and ndhF), came under positive selective pressure as a result of specific environmental conditions. Genetic mutations in these genes might have contributed to better survival and superior adaptations during the evolutionary history of tobacco species.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mingzheng Duan ◽  
Haiying Bao ◽  
Tolgor Bau

Abstract In this study, we report a de novo assembly of the first high-quality genome for a wild mushroom species, Leucocalocybe mongolica (LM). We performed high-throughput transcriptome sequencing to analyze the genetic basis for the life history of LM. Our results show that the genome size of LM is 46.0 Mb, including 26 contigs with a contig N50 size of 3.6 Mb. In total, we predicted 11,599 protein-coding genes, of which 65.7% (7630) could be aligned with high confidence to annotated homologous genes in other species. We performed phylogenetic analyses using genes from 3269 single-copy gene families and showed support for distinguishing LM from the genus Tricholoma (L.) P.Kumm., in which it is sometimes circumscribed. We believe that one reason for limited wild occurrences of LM may be the loss of key metabolic genes, especially carbohydrate-active enzymes (CAZymes), based on comparisons with other closely related species. The results of our transcriptome analyses between vegetative (mycelia) and reproductive (fruiting bodies) organs indicated that changes in gene expression among some key CAZyme genes may help to determine the switch from asexual to sexual reproduction. Taken together, our genomic and transcriptome data for LM comprise a valuable resource for understanding both the evolutionary and life history of this species.


Diversity ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 403
Author(s):  
Umar Rehman ◽  
Nighat Sultana ◽  
Abdullah ◽  
Abbas Jamal ◽  
Maryam Muzaffar ◽  
...  

Family Phyllanthaceae belongs to the eudicot order Malpighiales, and its species are herbs, shrubs, and trees that are mostly distributed in tropical regions. Here, we elucidate the molecular evolution of the chloroplast genome in Phyllanthaceae and identify the polymorphic loci for phylogenetic inference. We de novo assembled the chloroplast genomes of three Phyllanthaceae species, i.e., Phyllanthus emblica, Flueggea virosa, and Leptopus cordifolius, and compared them with six other previously reported genomes. All species comprised two inverted repeat regions (size range 23,921–27,128 bp) that separated large single-copy (83,627–89,932 bp) and small single-copy (17,424–19,441 bp) regions. Chloroplast genomes contained 111–112 unique genes, including 77–78 protein-coding, 30 tRNAs, and 4 rRNAs. The deletion/pseudogenization of rps16 genes was found in only two species. High variability was seen in the number of oligonucleotide repeats, while guanine-cytosine contents, codon usage, amino acid frequency, simple sequence repeats, synonymous and non-synonymous substitutions, and transition and transversion substitutions were similar. The transition substitutions were higher in coding sequences than in non-coding sequences. Phylogenetic analysis revealed the polyphyletic nature of the genus Phyllanthus. The polymorphic protein-coding genes, including rpl22, ycf1, matK, ndhF, and rps15, were also determined, which may be helpful for reconstructing the high-resolution phylogenetic tree of the family Phyllanthaceae. Overall, the study provides insight into the chloroplast genome evolution in Phyllanthaceae.


2019 ◽  
Author(s):  
Mengyang Xu ◽  
Xiaoshan Su ◽  
Mengqi Zhang ◽  
Ming Li ◽  
Xiaoyun Huang ◽  
...  

Abstract The long-spine porcupinefish, Diodon holocanthus (Diodontidae, Tetraodontiformes, Actinopterygii), also known as the freckled porcupinefish, is of great ecological and economic interest. Its distinct characteristics, including inflation reaction, spiny skin and tetrodotoxin, however, have not been fully studied for lack of a complete genome assembly. In this study, the whole genome of a single individual was sequenced using single tube-Long Fragment Read co-barcode reads, generating 154.3 Gb of paired-end data (219.8× depth). Gaps were then filled using a small Oxford Nanopore MinION long-read dataset (11.4 Gb, 15.9× depth). Making full use of long-, medium- and short-range genome assembly information, the final assembly had a total length of 650.02 Mb, with contig and scaffold N50 sizes of 2.15 Mb and 8.13 Mb, respectively, despite high repetitive content. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis, used to assess completeness, captured 95.7% (2,474) of core genes. In addition, 206.5 Mb (32.10%) of repetitive sequences were identified, and 20,840 protein-coding genes were annotated, among which 18,281 (87.72%) proteins were assigned possible functions. This is the first de novo genome assembly of the porcupinefish, which will benefit downstream analysis of ontogeny, phylogeny, and evolution, and improve the exploration of its unique defensive mechanism.
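The data volumes and depths quoted above can be related by the usual definition of sequencing depth, total sequenced bases divided by genome size. The sketch below rearranges that relation to back out the genome size implied by each dataset. It assumes this simple definition applies; the abstract does not state which genome-size estimate the authors used as the denominator, and the function name is hypothetical.

```python
def implied_genome_size_mb(data_gb, depth):
    """Genome size (Mb) implied by a data volume (Gb) and a fold depth.

    Assumes depth = total sequenced bases / genome size.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return data_gb * 1000 / depth  # convert Gb to Mb before dividing


# Figures quoted in the abstract for the two sequencing datasets.
stlfr_size_mb = implied_genome_size_mb(154.3, 219.8)    # co-barcoded reads
nanopore_size_mb = implied_genome_size_mb(11.4, 15.9)   # long reads
print(round(stlfr_size_mb, 1), round(nanopore_size_mb, 1))
```

Under this assumption both datasets imply a genome somewhat larger than the 650.02 Mb assembled, which would be consistent with an assembly that does not capture the entire genome.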


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Myung-Shin Kim ◽  
Geun Young Chae ◽  
Soohyun Oh ◽  
Jihyun Kim ◽  
Hyunggon Mang ◽  
...  

Abstract Background Peppers (Capsicum annuum L.) containing distinct capsaicinoids are the most widely cultivated spices in the world. However, extreme genomic diversity among species represents an obstacle to breeding pepper. Results Here, we report de novo genome assemblies of Capsicum annuum ‘Early Calwonder (non-pungent, ECW)’ and ‘Small Fruit (pungent, SF)’ along with their annotations. In total, we assembled 2.9 Gb of ECW and SF genome sequences, representing over 91% of the estimated genome sizes. Structural and functional annotation of the two pepper genomes generated about 35,000 protein-coding genes each, of which 93% were assigned putative functions. Comparison between newly and publicly available pepper gene annotations revealed both shared and specific gene content. In addition, a comprehensive analysis of nucleotide-binding and leucine-rich repeat (NLR) genes through whole-genome alignment identified five significant regions of NLR copy number variation (CNV). Detailed comparisons of those regions revealed that these CNVs were generated by intra-specific genomic variations that accelerated diversification of NLRs among peppers. Conclusions Our analyses unveil an evolutionary mechanism responsible for generating CNVs of NLRs among pepper accessions, and provide novel genomic resources for functional genomics and molecular breeding of disease resistance in Capsicum species.

