Review on the Development and Applications of Medicinal Plant Genomes

2021 ◽  
Vol 12 ◽  
Author(s):  
Qi-Qing Cheng ◽  
Yue Ouyang ◽  
Zi-Yu Tang ◽  
Chi-Chou Lao ◽  
Yan-Yu Zhang ◽  
...  

With the development of sequencing technology, research on medicinal plants is no longer limited to chemistry, pharmacology, and pharmacodynamics, but also examines these plants at the genetic level. As next-generation sequencing has become affordable and long-read sequencing has become established, large medicinal plant genomes have been sequenced and assembled more easily. Although plant genomes have been reviewed several times, no review has given a systematic and comprehensive introduction to the development and application of the medicinal plant genomes reported so far. Here, we provide a historical perspective on the current state of genomics in medicinal plant biology, highlight the use of rapidly developing sequencing technologies, and comprehensively summarize how genomes are applied to practical problems in medicinal plants, such as genomics-assisted herb breeding, reconstruction of evolutionary history, herbal synthetic biology, and geoherbal research, all of which are important for the effective utilization, rational use, and sustainable protection of medicinal plants.

2019 ◽  
Vol 47 (1) ◽  
pp. 23-32 ◽  
Author(s):  
Yann Fichou ◽  
Isabelle Berlivet ◽  
Gaëlle Richard ◽  
Christophe Tournamille ◽  
Lilian Castilho ◽  
...  

Background: In the novel era of blood group genomics, (re-)defining reference gene/allele sequences of blood group genes has become an important goal to achieve, both for diagnostic and research purposes. As novel potent sequencing technologies are available, we sought to investigate the variability encountered in the three most common alleles of ACKR1, the gene encoding the clinically relevant Duffy antigens, at the haplotype level by a long-read sequencing approach. Materials and Methods: After long-range PCR amplification spanning the whole ACKR1 gene locus (∼2.5 kilobases), amplicons generated from 81 samples with known genotypes were sequenced in a single read by using the Pacific Biosciences (PacBio) single molecule, real-time (SMRT) sequencing technology. Results: High-quality sequencing reads were obtained for the 162 alleles (accuracy >0.999). Twenty-two nucleotide variations reported in databases were identified, defining 19 haplotypes: four, eight, and seven haplotypes in 46 ACKR1*01, 63 ACKR1*02, and 53 ACKR1*02N.01 alleles, respectively. Discussion: Overall, we have defined a subset of reference alleles by third-generation (long-read) sequencing. This technology, which provides a “longitudinal” overview of the loci of interest (several thousand base pairs) and is complementary to the second-generation (short-read) next-generation sequencing technology, is of critical interest for resolving novel, rare, and null alleles.
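The counts reported in the Results can be cross-checked with simple arithmetic. The short Python sketch below only restates numbers from the abstract. The variable names are illustrative and not taken from the paper, and the factor of two alleles per sample is an assumption based on the samples being diploid.

```python
# Numbers copied from the abstract; variable names are illustrative only.
samples = 81
alleles_per_sample = 2  # assumption: each diploid sample contributes two alleles

# Alleles and haplotypes reported per allele group.
allele_counts = {"ACKR1*01": 46, "ACKR1*02": 63, "ACKR1*02N.01": 53}
haplotype_counts = {"ACKR1*01": 4, "ACKR1*02": 8, "ACKR1*02N.01": 7}

total_alleles = samples * alleles_per_sample
total_haplotypes = sum(haplotype_counts.values())

# The per-group allele counts should add up to the total number of alleles.
alleles_consistent = sum(allele_counts.values()) == total_alleles

print(total_alleles, total_haplotypes, alleles_consistent)
```

Under these assumptions, the per-group allele counts sum to the 162 alleles and the per-group haplotype counts sum to the 19 haplotypes stated in the abstract.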


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Patrick Driguez ◽  
Salim Bougouffa ◽  
Karen Carty ◽  
Alexander Putra ◽  
Kamel Jabbari ◽  
...  

Abstract: Currently, different sequencing platforms are used to generate plant genomes, and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days, using a single long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies. Finally, we report the diploid genomes of Eucalyptus rudis and E. camaldulensis and the allotetraploid genome of Arachis hypogaea.


2020 ◽  
Vol 71 (18) ◽  
pp. 5313-5322 ◽  
Author(s):  
Kathryn Dumschott ◽  
Maximilian H-W Schmidt ◽  
Harmeet Singh Chawla ◽  
Rod Snowdon ◽  
Björn Usadel

Abstract: DNA sequencing was dominated by Sanger’s chain termination method until the mid-2000s, when it was progressively supplanted by new sequencing technologies that can generate much larger quantities of data in a shorter time. At the forefront of these developments, long-read sequencing technologies (third-generation sequencing) can produce reads that are several kilobases in length. This greatly improves the accuracy of genome assemblies by spanning the highly repetitive segments that cause difficulty for second-generation short-read technologies. Third-generation sequencing is especially appealing for plant genomes, which can be extremely large with long stretches of highly repetitive DNA. Until recently, the low basecalling accuracy of third-generation technologies meant that accurate genome assembly required expensive, high-coverage sequencing followed by computational analysis to correct for errors. However, today’s long-read technologies are more accurate and less expensive, making them the method of choice for the assembly of complex genomes. Oxford Nanopore Technologies (ONT), a third-generation platform for the sequencing of native DNA strands, is particularly suitable for the generation of high-quality assemblies of highly repetitive plant genomes. Here we discuss the benefits of ONT, especially for the plant science community, and describe the issues that remain to be addressed when using ONT for plant genome sequencing.


Gigabyte ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10 ◽  
Author(s):  
Priyanka Sharma ◽  
Othman Al-Dossary ◽  
Bader Alsubaie ◽  
Ibrahim Al-Mssallem ◽  
Onkar Nath ◽  
...  

Advances in DNA sequencing have made it easier to sequence and assemble plant genomes. Here, we extend an earlier study, and compare recent methods for long read sequencing and assembly. Updated Oxford Nanopore Technology software improved assemblies. Using more accurate sequences produced by repeated sequencing of the same molecule (Pacific Biosciences HiFi) resulted in less fragmented assembly of sequencing reads. Using data for increased genome coverage resulted in longer contigs, but reduced total assembly length and improved genome completeness. The original model species, Macadamia jansenii, was also compared with three other Macadamia species, as well as avocado (Persea americana) and jojoba (Simmondsia chinensis). In these angiosperms, increasing sequence data volumes caused a linear increase in contig size, decreased assembly length and further improved already high completeness. Differences in genome size and sequence complexity influenced the success of assembly. Advances in long read sequencing technology continue to improve plant genome sequencing and assembly. However, results were improved by greater genome coverage, with the amount needed to achieve a particular level of assembly being species dependent.
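Genome coverage, which this abstract ties to assembly quality, is conventionally estimated as the total number of sequenced bases divided by the genome size. The Python sketch below illustrates that relationship only. The function names are hypothetical, and the genome size and sequencing yield are made-up round numbers, not values from the study.

```python
def coverage(total_bases: int, genome_size: int) -> float:
    """Estimate mean coverage as sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def bases_needed(target_coverage: float, genome_size: int) -> float:
    """Return the sequencing yield required to reach a target coverage."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return target_coverage * genome_size


# Hypothetical values for illustration only (not taken from the paper).
hypothetical_genome_size = 800_000_000   # 800 Mb genome
hypothetical_yield = 40_000_000_000      # 40 Gb of long reads

depth = coverage(hypothetical_yield, hypothetical_genome_size)
print(f"Estimated coverage: {depth:.1f}x")
```

Because the abstract reports that the coverage needed for a given assembly quality is species dependent, a calculation like this gives only the sequencing yield for a chosen target, not the target itself.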


2020 ◽  
Vol 53 (2) ◽  
pp. 217-232
Author(s):  
M. H. SHAHRAJABIAN ◽  
W. SUN ◽  
Q. CHENG

Chinese medicinal herbs and fruits have grown rapidly and significantly in importance in recent years and have positively influenced people’s attention to their health and to an organic lifestyle. Owing to advances in sequencing technologies and reduced costs, genome sequencing data for medicinal plants are accumulating rapidly. Our aim was to review the genomes of three important medicinal plants in China. There is ample genetic diversity among plants of medicinal importance around the globe, and this pool of genetic variation serves as the base for selection as well as for plant improvement. Plant genomes are characterized by large variations in genome size and ploidy level. Comparative genomics provides a method to unravel the relationships between genomes by describing conserved chromosomes or chromosomal regions between related species. It is also clear that plant genomes can be used as a tool for improving breeding strategies. However, certain limitations pose challenges for the generation and utilization of genomic resources in many important medicinal plant species. This review focuses on the genomes of some important horticultural plants that are well known in traditional Chinese medicine, namely ginger, ginseng, and goji berry. However, more research is needed to advance the genome research of medicinal plants.


2017 ◽  
Author(s):  
Xuefang Zhao ◽  
Alexandra M. Weber ◽  
Ryan E. Mills

Abstract: Although numerous algorithms have been developed to identify structural variants (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region they span. However, current efforts to use these data in this manner require large computational resources to assemble these sequences as well as manual inspection of each region. Here, we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assess the performance of VaPoR on both simulated and real SVs and report a high-fidelity rate for various features including overall accuracy, sensitivity of breakpoint precision, and predicted genotype.


2020 ◽  
Vol 15 (2) ◽  
pp. 165-172
Author(s):  
Chaithra Pradeep ◽  
Dharam Nandan ◽  
Arya A. Das ◽  
Dinesh Velayutham

Background: The standard approach for transcriptomic profiling involves high-throughput short-read sequencing technology, mainly dominated by Illumina. However, short reads have limitations in transcriptome assembly and in obtaining full-length transcripts, due to the complex nature of transcriptomes with variable lengths and multiple alternatively spliced isoforms. Recent advances in long-read sequencing by Oxford Nanopore Technologies (ONT) offer both cDNA and direct RNA sequencing and have brought a paradigm change in sequencing technology, greatly improving assembly and expression estimates. ONT enables molecules to be sequenced without fragmentation, resulting in ultra-long reads that allow entire genes and transcripts to be fully characterized. The direct RNA sequencing method, in addition, circumvents the reverse transcription and amplification steps. Objective: In this study, RNA sequencing methods were assessed by comparing data from Illumina (ILM), ONT cDNA (OCD) and ONT direct RNA (ODR). Methods: The sensitivity and specificity of isoform detection were determined from the data generated by Illumina, ONT cDNA and ONT direct RNA sequencing technologies using Saccharomyces cerevisiae as a model. Comparative studies were conducted with two pipelines to detect isoforms, novel genes and variable gene length. Results: Mapping metrics and qualitative profiles for the different pipelines are presented to understand these disruptive technologies. The variability in sequencing technology and in the analysis pipeline was studied.


2019 ◽  
Vol 3 (1) ◽  
pp. 1
Author(s):  
Roxana Guillen

Sequencing technologies have undergone improvements in performance over the last few years. Next-generation sequencing (NGS) is being used more frequently to control infectious diseases, to identify and anticipate antimicrobial resistance (AMR), and in surveillance against possible infectious outbreaks. Molecular assays used to detect pathogenic or antibiotic-resistant agents take considerable time and effort, and often do not collect enough information to support decisions. Next-generation sequencing can elucidate the whole DNA sequence in the least possible time and provide enough data on resistance, virulence, and typing to be analyzed and to be of great help in research and decision making. NGS is a very promising technology; for it to be used extensively, it requires the development of data analysis platforms and a reduction in testing costs, which are still too high for widespread use.


2019 ◽  
Author(s):  
Krithika Arumugam ◽  
Caner Bağci ◽  
Irina Bessarab ◽  
Sina Beier ◽  
Benjamin Buchfink ◽  
...  

Background: Short-read sequencing technologies have long been the workhorse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible. Results: We demonstrate that whole bacterial chromosomes can be obtained from a complex community by application of MinION sequencing to a sample from an EBPR bio-reactor, producing 6 Gb of sequence that assembles into multiple closed bacterial chromosomes. We provide a simple pipeline for processing such data, which includes a new approach to correcting erroneous frame-shifts. Conclusions: Advances in long-read sequencing technology and corresponding algorithms will allow the routine extraction of whole chromosomes from environmental samples, providing a more detailed picture of individual members of a microbiome.


Author(s):  
Pierre Morisse ◽  
Thierry Lecroq ◽  
Arnaud Lefebvre

Abstract: The third-generation sequencing technologies Pacific Biosciences and Oxford Nanopore Technologies were made available in 2011 and 2014, respectively. In contrast with second-generation sequencing technologies such as Illumina, these new technologies allow the sequencing of long reads of tens to hundreds of kbps. These so-called long reads are particularly promising, and are especially expected to solve various problems such as contig and haplotype assembly or scaffolding. However, these reads are also much more error prone than second-generation reads, and display error rates reaching 10 to 30%, depending on the sequencing technology and the version of the chemistry. Moreover, these errors are mainly insertions and deletions, whereas most errors in Illumina reads are substitutions. As a result, long reads require efficient error correction, and a plethora of error correction tools directly targeted at these reads were developed in the past nine years. These methods can adopt a hybrid approach, using complementary short reads to perform correction, or a self-correction approach, making use only of the information contained in the long-read sequences. Both approaches make use of various strategies such as multiple sequence alignment, de Bruijn graphs, or hidden Markov models, or even combine different strategies. In this paper, we present a complete survey of long-read error correction, reviewing all the methodologies and tools available to date, for both hybrid and self-correction. Moreover, long-read characteristics such as sequencing depth, length, error rate, or even sequencing technology can have an impact on how well a given tool or strategy performs, and can thus drastically reduce the correction quality. We thus also present an in-depth benchmark of available long-read error correction tools on a wide variety of datasets, composed of both simulated and real data, with various error rates, coverages, and read lengths, ranging from small bacterial to large mammalian genomes.
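One of the strategies named in this abstract, the de Bruijn graph, can be illustrated with a toy Python sketch. This is a generic textbook construction, not the implementation of any surveyed tool. The function name, the reads, and the choice of k = 3 are all hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a toy de Bruijn graph: each k-mer adds an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Hypothetical error-free short reads that overlap each other.
reads = ["ACGTAC", "CGTACG"]
graph = de_bruijn_graph(reads, k=3)

for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

In this sketch, an edge seen in several reads appears several times in a node's successor list. Real correctors rely on more elaborate graph handling than this toy example shows.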

