De novo assembly and analysis of crow lungs transcriptome

The jungle crow (Corvus macrorhynchos) belongs to the avian order Passeriformes and is important for avian ecological and evolutionary genetics studies. However, transcriptome data for this species are limited. In the present study, we report the characterization of the lung transcriptome of the jungle crow using GS FLX Titanium XLR70. Altogether, 1 510 303 high-quality sequence reads with 581 198 230 bases were de novo assembled into 22 169 isotigs (an isotig represents an individual transcript) and 784 009 singletons. Using these isotigs and 581 681 length-filtered (greater than 300 bp) singletons, 20 010 unique protein-coding genes were identified by BLASTx comparison against a nonredundant (nr) protein sequence database. Comparative analysis revealed that 46 604 (70.29%) and 51 642 (72.48%) of the assembled transcripts have significant similarity to zebra finch and chicken RefSeq proteins, respectively. As determined by GO annotation and KEGG pathway mapping, functional annotation of the unigenes recovered diverse biological functions and processes. Transcripts putatively involved in the immune response were identified. Furthermore, 20 599 single nucleotide polymorphisms (SNPs) and 7525 simple sequence repeats (SSRs) were retrieved from the assembled transcript database. This resource should provide an important foundation for future ecological, evolutionary, and conservation genetic studies on this and other related species.


The Draft Genome of the Endangered Sichuan Partridge (Arborophila rufipectus) with Evolutionary Implications

Genes ◽

10.3390/genes10090677 ◽

2019 ◽

Vol 10 (9) ◽

pp. 677 ◽

Cited By ~ 1

Author(s):

Chuang Zhou ◽

Hongmei Tu ◽

Haoran Yu ◽

Shuai Zheng ◽

Bo Dai ◽

...

Keyword(s):

De Novo ◽

Genome Structure ◽

Draft Genome ◽

Enrichment Analysis ◽

Phylogenetic Position ◽

Nucleotide Polymorphisms ◽

Protein Coding ◽

Go Enrichment ◽

And Behavior ◽

Or Genes

The Sichuan partridge (Arborophila rufipectus, Phasianidae, Galliformes) is distributed in south-west China and is classified as endangered. To examine the evolution and genomic features of the Sichuan partridge, we de novo assembled a Sichuan partridge reference genome. The final draft assembly consisted of approximately 1.09 Gb and had a scaffold N50 of 4.57 Mb. About 1.94 million heterozygous single-nucleotide polymorphisms (SNPs) were detected, 17,519 protein-coding genes were predicted, and 9.29% of the genome was identified as repetitive elements. A total of 56 olfactory receptor (OR) genes were found in the Sichuan partridge, and conserved motifs were detected. Comparisons between the Sichuan partridge genome and the chicken genome revealed a conserved genome structure, and phylogenetic analysis demonstrated that Arborophila occupies a basal phylogenetic position within Phasianidae. Gene Ontology (GO) enrichment analysis of positively selected genes (PSGs) in the Sichuan partridge showed over-represented GO functions related to environmental adaptation, such as energy metabolism and behavior. Pairwise sequentially Markovian coalescent analysis revealed the recent demographic trajectory of the Sichuan partridge. Our data and findings provide valuable genomic resources not only for studying evolutionary adaptation, but also for facilitating the long-term conservation and the maintenance of genetic diversity of this endangered species.
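For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the paper.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

A larger N50 indicates that more of the assembly sits in long scaffolds, but it says nothing about correctness, so it is usually reported alongside other quality measures.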


Transcriptome analysis of colored calla lily (Zantedeschia rehmannii Engl.) by Illumina sequencing: de novo assembly, annotation and EST-SSR marker development

PeerJ ◽

10.7717/peerj.2378 ◽

2016 ◽

Vol 4 ◽

pp. e2378 ◽

Cited By ~ 13

Author(s):

Zunzheng Wei ◽

Zhenzhen Sun ◽

Binbin Cui ◽

Qixiang Zhang ◽

Min Xiong ◽

...

Keyword(s):

Molecular Markers ◽

De Novo ◽

Average Length ◽

Orthologous Group ◽

Sequence Information ◽

Nucleotide Polymorphisms ◽

Protein Database ◽

Illumina Hiseq ◽

Significant Similarity ◽

Calla Lily

Colored calla lily is the short name for the species or hybrids in section Aestivae of genus Zantedeschia. It is currently one of the most popular flower plants in the world due to its beautiful flower spathe and long postharvest life. However, little genomic information and few molecular markers are available for its genetic improvement. Here, de novo transcriptome sequencing was performed to produce large transcript sequences for Z. rehmannii cv. ‘Rehmannii’ using an Illumina HiSeq 2000 instrument. More than 59.9 million cDNA sequence reads were obtained and assembled into 39,298 unigenes with an average length of 1,038 bp. Among these, 21,077 unigenes showed significant similarity to protein sequences in the non-redundant protein database (Nr) and in the Swiss-Prot, Gene Ontology (GO), Cluster of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Moreover, a total of 117 unique transcripts were then defined that might regulate the flower spathe development of colored calla lily. Additionally, 9,933 simple sequence repeats (SSRs) and 7,162 single nucleotide polymorphisms (SNPs) were identified as putative molecular markers. High-quality primers for 200 SSR loci were designed and selected, of which 58 amplified reproducible amplicons were polymorphic among 21 accessions of colored calla lily. The sequence information and molecular markers in the present study will provide valuable resources for genetic diversity analysis, germplasm characterization and marker-assisted selection in the genus Zantedeschia.
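The abstract does not say which tool or thresholds were used to call SSRs. As an illustration only, a regular-expression scan for perfect tandem repeats can be sketched as below; the function name `find_ssrs`, the motif-length range (2 to 6 bp) and the minimum of five repeats are all assumptions.

```python
import re


def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumed thresholds, not those of the study. The lazy quantifier
    tries the shortest motif first, so a run such as ACACACACAC is
    reported with motif AC rather than ACAC. Limitation: a long
    homopolymer run (e.g. AAAAAAAAAA) is reported with motif AA.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical sequence, not taken from the transcriptome.
print(find_ssrs("TTGACACACACACAGG"))
```

Dedicated SSR finders additionally handle mononucleotide and compound repeats and use motif-specific repeat thresholds, which this sketch does not attempt.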


Analysis of Inter-Chromosomal Distribution of Disease-Related Genes in Human Genome

Current Protein and Peptide Science ◽

10.2174/1389203721666200426233158 ◽

2020 ◽

Vol 21 (11) ◽

pp. 1068-1077

Author(s):

Xiaochao Sun ◽

Bin Yang ◽

Qunye Zhang

Keyword(s):

Spatial Distribution ◽

Model Organisms ◽

Nucleotide Polymorphisms ◽

Chromosomal Distribution ◽

Single Nucleotide ◽

Protein Coding ◽

Single Chromosome ◽

Deletion Mutations ◽

Protein Coding Genes ◽

Disease Related Genes

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. Moreover, we also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genes, including disease-related genes, across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.


PSI-40 Two mitochondrial lineages revealed in North American yak

Journal of Animal Science ◽

10.1093/jas/skaa278.833 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 477-477

Author(s):

Leah K Treffer ◽

Edward S Rice ◽

Anna M Fuller ◽

Samuel Cutler ◽

Jessica L Petersen

Keyword(s):

Sequence Data ◽

Haplotype Network ◽

Ovis Aries ◽

Similar Species ◽

Nucleotide Polymorphisms ◽

Mt Dna ◽

Protein Coding ◽

Sister Clade ◽

Mtdna Sequence ◽

The Impact

Domestic yak (Bos grunniens) are bovids native to the Asian Qinghai-Tibetan Plateau. Studies of Asian yak have revealed that introgression with domestic cattle has contributed to the evolution of the species. When yak were imported to North America (NA), some hybridization with B. taurus did occur. The objective of this study was to use mitochondrial (mt) DNA sequence data to better understand the mtDNA origin of NA yak and their relationship to Asian yak and related species. The complete mtDNA sequence of 14 individuals (12 NA yak, 1 Tibetan yak, 1 Tibetan B. indicus) was generated and compared with sequences of similar species from GenBank (B. indicus, B. grunniens (Chinese), B. taurus, B. gaurus, B. primigenius, B. frontalis, Bison bison, and Ovis aries). Individuals were aligned to the B. grunniens reference genome (ARS_UNL_BGru_maternal_1.0), which was also included in the analyses. The mtDNA genes were annotated using the ARS-UCD1.2 cattle sequence as a reference. Ten unique NA yak haplotypes were identified, which a haplotype network separated into two clusters. Variation among the NA haplotypes included 93 nonsynonymous single nucleotide polymorphisms. A maximum likelihood tree including all taxa was made using IQ-TREE after the data were partitioned into twenty-two subgroups using PartitionFinder2. Notably, six NA yak haplotypes formed a clade with B. indicus; the other four haplotypes grouped with B. grunniens and fell as a sister clade to bison, gaur and gayal. These data demonstrate two mitochondrial origins of NA yak with genetic variation in protein-coding genes. Although these data suggest yak introgression with B. indicus, it appears to date prior to importation into NA. In addition to contributing to our understanding of the species history, these results suggest the two major mtDNA haplotypes in NA yak may functionally differ. Characterization of the impact of these differences on cellular function is currently underway.


De Novo SNP Discovery and Genotyping of Iranian Pimpinella Species Using ddRAD Sequencing

Agronomy ◽

10.3390/agronomy11071342 ◽

2021 ◽

Vol 11 (7) ◽

pp. 1342

Author(s):

Shaghayegh Mehravi ◽

Gholam Ali Ranjbar ◽

Ghader Mirzaghaderi ◽

Anita Alice Severn-Ellis ◽

Armin Scheben ◽

...

Keyword(s):

De Novo ◽

Genetic Relationships ◽

Nucleotide Polymorphisms ◽

High Quality ◽

Genomic Resources ◽

High Quality Snps ◽

The Family ◽

Double Digestion ◽

Flanking Sequences ◽

Downstream Analysis

The species of Pimpinella, one of the largest genera of the family Apiaceae, are traditionally cultivated for medicinal purposes. In this study, high-throughput double digest restriction-site associated DNA sequencing technology (ddRAD-seq) was used to identify single nucleotide polymorphisms (SNPs) in eight Pimpinella species from Iran. After double-digestion with the enzymes HpyCH4IV and HinfI, a total of 334,702,966 paired-end reads were de novo assembled into 1,270,791 loci with an average of 28.8 reads per locus. After stringent filtering, 2440 high-quality SNPs were identified for downstream analysis. Analysis of genetic relationships and population structure, based on these retained SNPs, indicated the presence of three major groups. Gene ontology and pathway analyses were performed by comparing SNP-associated flanking sequences against a public non-redundant database. Owing to the lack of genomic resources in this genus, the present study is the first report to provide high-quality SNPs in Pimpinella based on a de novo analysis pipeline using ddRAD-seq. These data will enhance the molecular knowledge of the genus Pimpinella and will provide an important source of information for breeders and the research community to enhance breeding programs and support the management of Pimpinella genomic resources.


Characterization of metabolic responses, genetic variations, and microsatellite instability in ammonia-stressed CHO cells grown in fed-batch cultures

BMC Biotechnology ◽

10.1186/s12896-020-00667-2 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Dylan G. Chitwood ◽

Qinghua Wang ◽

Kathryn Elliott ◽

Aiyana Bullock ◽

Dwon Jordana ◽

...

Keyword(s):

Microsatellite Instability ◽

Microsatellite Loci ◽

Cho Cells ◽

Genome Stability ◽

De Novo ◽

Genome Instability ◽

Chinese Hamster ◽

Functional Genes ◽

Nucleotide Polymorphisms ◽

De Novo Mutations

Background: As bioprocess intensification has increased over the last 30 years, yields from mammalian cell processes have increased from tens of milligrams to over tens of grams per liter. Most of these gains in productivity can be attributed to increasing cell densities within bioreactors. As such, strategies have been developed to minimize accumulation of metabolic wastes, such as lactate and ammonia. Unfortunately, neither cell growth nor biopharmaceutical production can occur without some waste metabolite accumulation. Inevitably, metabolic waste accumulation leads to decline and termination of the culture. While it is understood that the accumulation of these unwanted compounds imparts a suboptimal culture environment, little is known about the genotoxic properties of these compounds that may lead to global genome instability. In this study, we examined the effects of high and moderate extracellular ammonia on the physiology and genomic integrity of Chinese hamster ovary (CHO) cells.
Results: Through whole genome sequencing, we discovered 2394 variant sites within functional genes, comprising both single nucleotide polymorphisms and insertion/deletion mutations, that arose as a result of ammonia stress and have high or moderate impact on functional genes. Furthermore, several of these de novo mutations were found in genes whose functions are to maintain genome stability, such as Tp53, Tnfsf11, Brca1, as well as Nfkb1. In addition, we characterized the microsatellite content of the cultures using the CriGri-PICR Chinese hamster genome assembly and discovered an abundance of microsatellite loci that are not replicated faithfully in the ammonia-stressed cultures. Unfaithful replication of these loci is a signature of microsatellite instability. With rigorous filtering, we found 124 candidate microsatellite loci that may be suitable for further investigation to determine whether they may be reliable biomarkers to predict genome instability in CHO cultures.
Conclusion: This study advances our knowledge with regard to the effects of ammonia accumulation on CHO cell culture performance by identifying ammonia-sensitive genes linked to genome stability and lays the foundation for the development of a new diagnostic tool for assessing genome stability.


Insights into the Genome Sequence of Chromobacterium amazonense Isolated from a Tropical Freshwater Lake

International Journal of Genomics ◽

10.1155/2018/1062716 ◽

2018 ◽

Vol 2018 ◽

pp. 1-10

Author(s):

Alexandre Bueno Santos ◽

Patrícia Silva Costa ◽

Anderson Oliveira do Carmo ◽

Gabriel da Rocha Fernandes ◽

Larissa Lopes Silva Scholte ◽

...

Keyword(s):

De Novo ◽

Genomic Diversity ◽

Protein Coding ◽

Biotechnological Potential ◽

Draft Assembly ◽

Functional Studies ◽

Alpha Hemolysin ◽

Type Iv ◽

Nudix Hydrolases ◽

Colicin V

Members of the genus Chromobacterium have been isolated from geographically diverse ecosystems and exhibit considerable metabolic flexibility, as well as biotechnological and pathogenic properties in some species. This study reports the draft assembly and detailed sequence analysis of Chromobacterium amazonense strain 56AF. The de novo-assembled genome is 4,556,707 bp in size and contains 4294 protein-coding and 95 RNA genes, including 88 tRNA, six rRNA, and one tmRNA operon. A repertoire of genes implicated in virulence, for example, hemolysin, hemolytic enterotoxins, colicin V, lytic proteins, and Nudix hydrolases, is present. The genome also contains a collection of genes of biotechnological interest, including esterases, lipase, auxins, chitinases, phytoene synthase and phytoene desaturase, polyhydroxyalkanoates, violacein, plastocyanin/azurin, and detoxifying compounds. Importantly, unlike other Chromobacterium species, the 56AF genome contains genes for pore-forming toxin alpha-hemolysin, a type IV secretion system, among others. The analysis of the C. amazonense strain 56AF genome reveals the versatility, adaptability, and biotechnological potential of this bacterium. This study provides molecular information that may pave the way for further comparative genomics and functional studies involving Chromobacterium-related isolates and improves our understanding of the global genomic diversity of Chromobacterium species.


CALINCA—A Novel Pipeline for the Identification of lncRNAs in Podocyte Disease

Cells ◽

10.3390/cells10030692 ◽

2021 ◽

Vol 10 (3) ◽

pp. 692

Author(s):

Sweta Talyan ◽

Samantha Filipów ◽

Michael Ignarski ◽

Magdalena Smieszek ◽

He Chen ◽

...

Keyword(s):

Cell Biology ◽

Mammalian Cells ◽

De Novo ◽

Depth Information ◽

Gene Products ◽

Classical Analysis ◽

Protein Coding ◽

Bioinformatic Pipeline ◽

Non Coding Rnas ◽

Filtration Unit

Diseases of the renal filtration unit, the glomerulus, are the most common cause of chronic kidney disease. Podocytes are the pivotal cell type for the function of this filter, and focal-segmental glomerulosclerosis (FSGS) is a classic example of a podocytopathy leading to proteinuria and glomerular scarring. Currently, no targeted treatment of FSGS is available. This lack of therapeutic strategies is explained by a limited understanding of the defects in podocyte cell biology leading to FSGS. To date, most studies in the field have focused on protein-coding genes and their gene products. However, more than 80% of all transcripts produced by mammalian cells are actually non-coding. Among these, long non-coding RNAs (lncRNAs) are a relatively novel class of transcripts and have not been systematically studied in FSGS to date. Appropriate tools to facilitate lncRNA research for the renal scientific community are urgently required because lncRNA analysis poses a number of challenges for classical analysis pipelines, which are optimized for coding RNA expression analysis. Here, we present the bioinformatic pipeline CALINCA as a solution for this problem. CALINCA automatically analyzes datasets from murine FSGS models and quantifies both annotated and de novo assembled lncRNAs. In addition, the tool provides in-depth information on podocyte specificity of these lncRNAs, as well as evolutionary conservation and expression in human datasets, making this pipeline a crucial basis for lncRNA studies in FSGS.


Draft Genome Sequence of Photorhabdus luminescens Strain DSPV002N Isolated from Santa Fe, Argentina

Genome Announcements ◽

10.1128/genomea.00744-16 ◽

2016 ◽

Vol 4 (4) ◽

Author(s):

Leopoldo Palma ◽

Eleodoro E. Del Valle ◽

Laureano Frizzo ◽

Colin Berry ◽

Primitivo Caballero

Keyword(s):

Genome Sequence ◽

Draft Genome ◽

Draft Genome Sequence ◽

Photorhabdus Luminescens ◽

Santa Fe ◽

Significant Similarity ◽

Insecticidal Toxin ◽

Protein Coding ◽

Protein Coding Genes

Here, we report the draft genome sequence of Photorhabdus luminescens strain DSPV002N, which consists of 177 contig sequences accounting for 5,518,143 bp, with a G+C content of 42.3% and 4,701 predicted protein-coding genes (CDSs). From these, 27 CDSs exhibited significant similarity with insecticidal toxin proteins from Photorhabdus luminescens subsp. laumondii TT01.


Molecular marker information from de novo assembled transcriptomes of chilli pepper (Capsicum annuum L.) varieties based on next-generation sequencing technology

Plant Genetic Resources ◽

10.1017/s147926211400032x ◽

2014 ◽

Vol 12 (S1) ◽

pp. S83-S86 ◽

Cited By ~ 1

Author(s):

Yul-Kyun Ahn ◽

Swati Tripathi ◽

Young-Il Cho ◽

Jeong-Ho Kim ◽

Hye-Eun Lee ◽

...

Keyword(s):

Molecular Markers ◽

Next Generation Sequencing ◽

De Novo ◽

Transcriptome Assembly ◽

Sequence Variant ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Chilli Pepper ◽

Next Generation Sequencing Technology ◽

Generation Sequencing

Next-generation sequencing is known to be a useful tool for de novo transcriptome assembly, functional annotation of genes and identification of molecular markers. This study was carried out to mine molecular markers from de novo assembled transcriptomes of four chilli pepper varieties: the highly pungent ‘Saengryeg 211’, the non-pungent ‘Saengryeg 213’, and the variably pigmented ‘Mandarin’ and ‘Blackcluster’. Pyrosequencing of the complementary DNA library resulted in 361,671, 274,269, 279,221, and 316,357 raw reads, which were assembled into 23,607, 19,894, 18,340 and 20,357 contigs for the four varieties, respectively. Detailed sequence variant analysis identified numerous potential single-nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) for all the varieties, for which primers were designed. The transcriptome information and SNP/SSR markers generated in this study provide valuable resources for high-density molecular genetic mapping in chilli pepper and for quantitative trait loci analysis related to fruit qualities. These markers for pepper will be highly valuable for marker-assisted breeding and other genetic studies.
