Genome and transcriptome analysis of the latent pathogen Lasiodiplodia theobromae, an emerging threat to the cacao industry

Genome ◽  
2020 ◽  
Vol 63 (1) ◽  
pp. 37-52 ◽  
Author(s):  
Shahin S. Ali ◽  
Asman Asman ◽  
Jonathan Shao ◽  
Johnny F. Balidion ◽  
Mary D. Strem ◽  
...  

Lasiodiplodia theobromae (Pat.) Griffon & Maubl., a member of the family Botryosphaeriaceae, is becoming a significant threat to crops and woody plants in many parts of the world, including the major cacao growing areas. While attempting to isolate Ceratobasidium theobromae, a causal agent of vascular streak dieback (VSD), from symptomatic cacao stems, 74% of isolated fungi were Lasiodiplodia spp. Sequence-based identification of 52 putative isolates of L. theobromae indicated that diverse species of Lasiodiplodia were associated with cacao in the studied areas, and the isolates showed variation in aggressiveness when assayed using cacao leaf discs. The present study reports a 43.75 Mb de novo assembled genome of an isolate of L. theobromae from cacao. Ab initio gene prediction generated 13 061 protein-coding genes, of which 2862 are unique to L. theobromae, when compared with other closely related Botryosphaeriaceae. Transcriptome analysis revealed that 11 860 predicted genes were transcriptionally active and 1255 were more highly expressed in planta compared with cultured mycelia. The predicted genes differentially expressed during infection were mainly those involved in carbohydrate, pectin, and lignin catabolism, cytochrome P450, necrosis-inducing proteins, and putative effectors. These findings significantly expand our knowledge of the genome of L. theobromae and the genes involved in virulence and pathogenicity.

2020 ◽  
Vol 110 (9) ◽  
pp. 1503-1506
Author(s):  
Olufemi A. Akinsanmi ◽  
Lilia C. Carvalhais

Pseudocercospora macadamiae causes husk spot in macadamia in Australia. Lack of genomic resources for this pathogen has restricted acquiring knowledge on the mechanism of disease development, spread, and its role in fruit abscission. To address this gap, we sequenced the genome of P. macadamiae. The sequence was de novo assembled into a draft genome of 40 Mb, which is comparable to closely related species in the family Mycosphaerellaceae. The draft genome comprises 212 scaffolds, of which 99 scaffolds are over 50 kb. The genome has a 49% GC content and is predicted to contain 15,430 protein-coding genes. This draft genome sequence is the first for P. macadamiae and represents a valuable resource for understanding genome evolution and plant disease resistance.


2018 ◽  
Vol 6 (16) ◽  
pp. e00265-18 ◽  
Author(s):  
Stewart T. G. Burgess ◽  
Kathryn Bartley ◽  
Edward J. Marr ◽  
Harry W. Wright ◽  
Robert J. Weaver ◽  
...  

ABSTRACT Sheep scab, caused by infestation with Psoroptes ovis, is highly contagious, results in intense pruritus, and represents a major welfare and economic concern. Here, we report the first draft genome assembly and gene prediction of P. ovis based on PacBio de novo sequencing. The ∼63.2-Mb genome encodes 12,041 protein-coding genes.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Alexandre Lomsadze ◽  
Christophe Bonny ◽  
Francesco Strozzi ◽  
Mark Borodovsky

Abstract Computational reconstruction of nearly complete genomes from metagenomic reads may identify thousands of new uncultured candidate bacterial species. We have shown that reconstructed prokaryotic genomes along with genomes of sequenced microbial isolates can be used to support more accurate gene prediction in novel metagenomic sequences. We have proposed an approach that used three types of gene prediction algorithms and found for all contigs in a metagenome nearly optimal models of protein-coding regions either in libraries of pre-computed models or constructed de novo. The model selection process and gene annotation were done by the new GeneMark-HM pipeline. We have created a database of the species level pan-genomes for the human microbiome. To create a library of models representing each pan-genome we used a self-training algorithm GeneMarkS-2. Genes initially predicted in each contig served as queries for a fast similarity search through the pan-genome database. The best matches led to selection of the model for gene prediction. Contigs not assigned to pan-genomes were analyzed by crude, but still accurate models designed for sequences with particular GC compositions. Tests of GeneMark-HM on simulated metagenomes demonstrated improvement in gene annotation of human metagenomic sequences in comparison with the current state-of-the-art gene prediction tools.


Diversity ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 403
Author(s):  
Umar Rehman ◽  
Nighat Sultana ◽  
Abdullah ◽  
Abbas Jamal ◽  
Maryam Muzaffar ◽  
...  

Family Phyllanthaceae belongs to the eudicot order Malpighiales, and its species are herbs, shrubs, and trees that are mostly distributed in tropical regions. Here, we elucidate the molecular evolution of the chloroplast genome in Phyllanthaceae and identify the polymorphic loci for phylogenetic inference. We de novo assembled the chloroplast genomes of three Phyllanthaceae species, i.e., Phyllanthus emblica, Flueggea virosa, and Leptopus cordifolius, and compared them with six other previously reported genomes. All species comprised two inverted repeat regions (size range 23,921–27,128 bp) that separated large single-copy (83,627–89,932 bp) and small single-copy (17,424–19,441 bp) regions. Chloroplast genomes contained 111–112 unique genes, including 77–78 protein-coding, 30 tRNAs, and 4 rRNAs. The deletion/pseudogenization of rps16 genes was found in only two species. High variability was seen in the number of oligonucleotide repeats, while guanine-cytosine contents, codon usage, amino acid frequency, simple sequence repeats, synonymous and non-synonymous substitutions, and transition and transversion substitutions were similar. The transition substitutions were higher in coding sequences than in non-coding sequences. Phylogenetic analysis revealed the polyphyletic nature of the genus Phyllanthus. The polymorphic protein-coding genes, including rpl22, ycf1, matK, ndhF, and rps15, were also determined, which may be helpful for reconstructing the high-resolution phylogenetic tree of the family Phyllanthaceae. Overall, the study provides insight into the chloroplast genome evolution in Phyllanthaceae.


2021 ◽  
Author(s):  
Kenta Shirasawa ◽  
Ryohei Arimoto ◽  
Hideki Hirakawa ◽  
Motoyuki Ishimorai ◽  
Andrea Ghelfi ◽  
...  

Abstract Eustoma grandiflorum (Raf.) Shinn. is an annual herbaceous plant native to the southern United States, Mexico, and the Greater Antilles. It has large flowers in a variety of colors and is an important flower crop. In this study, we established a chromosome-scale de novo assembly of E. grandiflorum by integrating four genomic and genetic approaches: (1) Pacific Biosciences (PacBio) Sequel deep sequencing, (2) error correction of the assembly by Illumina short reads, (3) scaffolding by chromatin conformation capture sequencing (Hi-C), and (4) genetic linkage maps derived from an F2 mapping population. The 36 pseudomolecules and 64 unplaced scaffolds were created with a total length of 1,324.8 Mb. Full-length transcript sequences were obtained by PacBio Iso-Seq sequencing for gene prediction on the assembled genome, Egra_v1. A total of 36,619 genes were predicted on the genome as high-confidence (HC) genes. Of the 36,619, 25,936 were assigned functional annotations by ZenAnnotation. Genetic diversity analysis was also performed for nine commercial E. grandiflorum varieties bred in Japan, and 254,205 variants were identified. This is the first report of the construction of reference genome sequences in E. grandiflorum as well as in the family Gentianaceae.


GigaScience ◽  
2019 ◽  
Vol 8 (7) ◽  
Author(s):  
Chang-Ming Bai ◽  
Lu-Sheng Xin ◽  
Umberto Rosani ◽  
Biao Wu ◽  
Qing-Chen Wang ◽  
...  

Abstract Background The blood clam, Scapharca (Anadara) broughtonii, is an economically and ecologically important marine bivalve of the family Arcidae. Efforts to study their population genetics, breeding, cultivation, and stock enrichment have been somewhat hindered by the lack of a reference genome. Herein, we report the complete genome sequence of S. broughtonii, a first reference genome of the family Arcidae. Findings A total of 75.79 Gb clean data were generated with the Pacific Biosciences and Oxford Nanopore platforms, which represented approximately 86× coverage of the S. broughtonii genome. De novo assembly of these long reads resulted in an 884.5-Mb genome, with a contig N50 of 1.80 Mb and scaffold N50 of 45.00 Mb. Genome Hi-C scaffolding resulted in 19 chromosomes containing 99.35% of bases in the assembled genome. Genome annotation revealed that nearly half of the genome (46.1%) is composed of repeated sequences, while 24,045 protein-coding genes were predicted and 84.7% of them were annotated. Conclusions We report here a chromosomal-level assembly of the S. broughtonii genome based on long-read sequencing and Hi-C scaffolding. The genomic data can serve as a reference for the family Arcidae and will provide a valuable resource for the scientific community and aquaculture sector.
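The contig N50 and scaffold N50 values quoted above are length-weighted medians: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length until
    # at least half of the total assembly length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, not data from the study).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Applied to the real contig or scaffold lengths of an assembly, the same function would yield figures comparable to the reported 1.80 Mb and 45.00 Mb, provided the same sequences are counted.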


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Guillermo Friis ◽  
Joel Vizueta ◽  
Edward G Smith ◽  
David R Nelson ◽  
Basel Khraiwesh ◽  
...  

Abstract The gray mangrove [Avicennia marina (Forsk.) Vierh.] is the most widely distributed mangrove species, ranging throughout the Indo-West Pacific. It presents remarkable levels of geographic variation both in phenotypic traits and habitat, often occupying extreme environments at the edges of its distribution. However, subspecific evolutionary relationships and adaptive mechanisms remain understudied, especially across populations of the West Indian Ocean. High-quality genomic resources accounting for such variability are also sparse. Here we report the first chromosome-level assembly of the genome of A. marina. We used a previously released draft assembly and proximity ligation libraries Chicago and Dovetail HiC for scaffolding, producing a 456,526,188-bp long genome. The largest 32 scaffolds (22.4–10.5 Mb) accounted for 98% of the genome assembly, with the remaining 2% distributed among 3,759 much shorter scaffolds (62.4–1 kb). We annotated 45,032 protein-coding genes using tissue-specific RNA-seq data in combination with de novo gene prediction, of which 34,442 were associated with GO terms. The genome assembly and the annotated gene set yield completeness scores of 96.7% and 95.1%, respectively, when compared with the eudicots BUSCO dataset. Furthermore, an FST survey based on resequencing data successfully identified a set of candidate genes potentially involved in local adaptation and revealed patterns of adaptive variability correlating with a temperature gradient in Arabian mangrove populations. Our A. marina genomic assembly provides a highly valuable resource for genome evolution analysis, as well as for identifying functional genes involved in adaptive processes and speciation.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Nikoletta A. Nagy ◽  
Rita Rácz ◽  
Oliver Rimington ◽  
Szilárd Póliska ◽  
Pablo Orozco-terWengel ◽  
...  

Abstract Background The lack of an understanding of the genomic architecture underpinning parental behaviour in subsocial insects displaying simple parental behaviours prevents the development of a full understanding of the evolutionary origin of sociality. Lethrus apterus is one of the few insect species that has biparental care. Division of labour can be observed between parents during the reproductive period in order to provide food and protection for their offspring. Results Here, we report the draft genome of L. apterus, the first genome in the family Geotrupidae. The final assembly consisted of 286.93 Mbp in 66,933 scaffolds. Completeness analysis found the assembly contained 93.5% of the Endopterygota core BUSCO gene set. Ab initio gene prediction resulted in 25,385 coding genes, whereas homology-based analyses predicted 22,551 protein-coding genes. After merging, 20,734 were found during functional annotation. Compared to other publicly available beetle genomes, 23,528 genes among the predicted genes were assigned to orthogroups, of which 1664 were in species-specific groups. Additionally, reproduction-related genes were found among the predicted genes, based on which a reduction in the number of odorant- and pheromone-binding proteins was detected. Conclusions These genes can be used in further comparative and functional genomic research, which can advance our understanding of the genetic basis and hence the evolution of parental behaviour.


2019 ◽  
Author(s):  
Hsin-Yen Larry Wu ◽  
Gaoyuan Song ◽  
Justin W. Walley ◽  
Polly Yingshan Hsu

mRNA translation is a critical step in gene expression, but our understanding of the landscape and control of translation in diverse crops remains lacking. Here, we combined de novo transcriptome assembly and ribosome profiling to study global mRNA translation in tomato roots. Taking advantage of the 3-nucleotide periodicity displayed by translating ribosomes, we identified 354 novel small ORFs (sORFs) translated from previously unannotated transcripts, as well as 1329 upstream ORFs (uORFs) translated within the 5-prime UTRs of annotated protein-coding genes. Proteomic analysis confirmed that some of these novel uORFs and sORFs generate stable proteins in planta. Compared with the annotated ORFs, the uORFs use more flexible Kozak sequences around translation start sites. Interestingly, uORF-containing genes are enriched for protein phosphorylation/dephosphorylation and signaling transduction pathways, suggesting a regulatory role for uORFs in these processes. We also demonstrated that ribosome profiling is useful to facilitate the annotation of translated ORFs and noncanonical translation initiation sites. In addition to defining the translatome, our results revealed the global control of mRNA translation by uORFs and microRNAs in tomato. In summary, our approach provides a high-throughput method to discover unannotated ORFs, elucidates evolutionarily conserved translational features, and identifies new regulatory mechanisms hidden in a crop genome.
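The 3-nucleotide periodicity mentioned above can be illustrated with a short sketch: ribosome footprints from a translated ORF tend to place their P-sites in one reading frame far more often than in the other two. The code below is a simplified illustration under stated assumptions, not the authors' method; the function name `frame_fractions`, the input format (a list of already inferred P-site coordinates), and the toy positions are all hypothetical.

```python
from collections import Counter


def frame_fractions(p_site_positions, orf_start):
    """Fraction of P-site positions falling in frames 0, 1 and 2 of an ORF.

    p_site_positions: transcript coordinates of inferred ribosome P-sites
                      (assumed to be computed upstream of this function).
    orf_start:        transcript coordinate of the first nucleotide of the
                      candidate start codon.
    """
    # Positions upstream of the ORF start are ignored; the rest are
    # binned by their offset from the start codon modulo 3.
    counts = Counter(
        (pos - orf_start) % 3 for pos in p_site_positions if pos >= orf_start
    )
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


# Toy example (made-up positions): three P-sites in frame 0, one in frame 1.
toy_fractions = frame_fractions([100, 103, 106, 107], orf_start=100)
```

A strong enrichment in frame 0 relative to frames 1 and 2 is the kind of signal that supports calling a candidate sORF or uORF as translated; real analyses also need read-length-specific P-site offsets and a statistical test, which this sketch omits.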


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Shahin S. Ali ◽  
Asman Asman ◽  
Jonathan Shao ◽  
Amanda P. Firmansyah ◽  
Agung W. Susilo ◽  
...  

Abstract Background Ceratobasidium theobromae, a member of the Ceratobasidiaceae family, is the causal agent of vascular-streak dieback (VSD) of cacao, a major threat to the chocolate industry in South-East Asia. The fastidious pathogen is very hard to isolate and maintain in pure culture, which is a major bottleneck in the study of its genetic diversity and genome. Result This study describes, for the first time, a 33.90 Mbp de novo assembled genome of a putative C. theobromae isolate from cacao. Ab initio gene prediction identified 9264 protein-coding genes, of which 800 are unique to C. theobromae when compared to Rhizoctonia spp., a closely related group. Transcriptome analysis using RNA isolated from 4 independent VSD symptomatic cacao stems identified 3550 transcriptionally active genes when compared to the assembled C. theobromae genome, while transcripts for only 4 C. theobromae genes were detected in 2 asymptomatic stems. De novo assembly of the non-cacao associated reads from the VSD symptomatic stems uniformly produced genes with high identity to predicted genes in the C. theobromae genome as compared to Rhizoctonia spp. or genes found in GenBank. Further analysis of the predicted C. theobromae transcriptome was carried out, identifying CAZy gene classes, KEGG-pathway associated genes, and 138 putative effector proteins. Conclusion These findings put forth, for the first time, a predicted genome for the fastidious basidiomycete C. theobromae causing VSD on cacao, providing a model for testing and comparison in the future. The C. theobromae genome predicts a pathogenesis model involving secreted effector proteins to suppress plant defense mechanisms and plant cell wall degrading enzymes.

