A high-continuity and annotated tomato reference genome

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xiao Su ◽  
Baoan Wang ◽  
Xiaolin Geng ◽  
Yuefan Du ◽  
Qinqin Yang ◽  
...  

Abstract Background: Genetic and functional genomics studies require a high-quality genome assembly. Tomato (Solanum lycopersicum), an important horticultural crop, is an ideal model species for the study of fruit development. Results: Here, we assembled an updated reference genome of S. lycopersicum cv. Heinz 1706 that was 799.09 Mb in length, containing 34,384 predicted protein-coding genes and 65.66% repetitive sequences. By comparing the genomes of S. lycopersicum and S. pimpinellifolium LA2093, we found a large number of genomic fragments probably associated with human selection, which may have had crucial roles in the domestication of tomato. We also used a recombinant inbred line (RIL) population to generate a high-density genetic map with high resolution and accuracy. Using these resources, we identified a number of candidate genes that were likely to be related to important agronomic traits in tomato. Conclusion: Our results offer opportunities for understanding the evolution of the tomato genome and will facilitate the study of genetic mechanisms in tomato biology.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Haolin Wu ◽  
Tao Ma ◽  
Minghui Kang ◽  
Fandi Ai ◽  
Junlin Zhang ◽  
...  

Abstract Actinidia chinensis (kiwifruit) is a perennial horticultural crop species of the Actinidiaceae family with high nutritional and economic value. Two versions of the A. chinensis genomes have been previously assembled, based mainly on relatively short reads. Here, we report an improved chromosome-level reference genome of A. chinensis (v3.0), based mainly on PacBio long reads and Hi-C data. The high-quality assembled genome is 653 Mb long, with 0.76% heterozygosity. At least 43% of the genome consists of repetitive sequences, and the most abundant long terminal repeats were further identified and account for 23.38% of our novel genome. It has clear improvements in contiguity, accuracy, and gene annotation over the two previous versions and contains 40,464 annotated protein-coding genes, of which 94.41% are functionally annotated. Moreover, further analyses of genetic collinearity revealed that the kiwifruit genome has undergone two whole-genome duplications: one affecting all Ericales families near the K-T extinction event and a recent genus-specific duplication. The reference genome presented here will be highly useful for further molecular elucidation of diverse traits and for the breeding of this horticultural crop, as well as evolutionary studies with related taxa.


2021 ◽  
Author(s):  
Dong Gao ◽  
Wenyu Fang ◽  
Joanna Collins ◽  
James Torrance ◽  
Ying Yan ◽  
...  

The yellowfin seabream, Acanthopagrus latus, is widely distributed throughout the Indo-West Pacific. This fish is an ideal model species in which to study the mechanism of sex reversal because it exhibits a specific feature: sequential hermaphroditism. Here, we report a chromosome-scale assembly of A. latus based on PacBio and Hi-C data. A total of 22,485 protein-coding genes were annotated at the whole-genome level using transcriptome data. Taken together, this highly accurate, chromosome-level reference genome can provide a valuable resource to elucidate the mechanism of sex reversal in A. latus.


Author(s):  
Alaina Shumate ◽  
Steven L Salzberg

Abstract Motivation: Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of these genomes, annotation of gene features and other functional elements is essential; however, for most species, only the reference genome is well-annotated. Results: One strategy to annotate new or improved genome assemblies is to map or ‘lift over’ the genes from a previously-annotated reference genome. Here we describe Liftoff, a new genome annotation lift-over tool capable of mapping genes between two assemblies of the same or closely-related species. Liftoff aligns genes from a reference genome to a target genome and finds the mapping that maximizes sequence identity while preserving the structure of each exon, transcript, and gene. We show that Liftoff can accurately map 99.9% of genes between two versions of the human reference genome with an average sequence identity >99.9%. We also show that Liftoff can map genes across species by successfully lifting over 98.3% of human protein-coding genes to a chimpanzee genome assembly with 98.2% sequence identity. Availability and Implementation: Liftoff can be installed via bioconda and PyPI. Additionally, the source code for Liftoff is available at https://github.com/agshumate/Liftoff. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Qiuju Xia ◽  
Ru Zhang ◽  
Xuemei Ni ◽  
Lei Pan ◽  
Yangzi Wang ◽  
...  

Abstract Asparagus bean (Vigna unguiculata ssp. sesquipedalis), known for its very long and tender green pods, is an important vegetable crop widely grown in developing countries. Despite its agricultural and economic value, asparagus bean does not have a high-quality genome assembly for breeding novel agronomic traits. In this study, we report a high-quality 632.8 Mb assembly of asparagus bean based on the whole-genome shotgun sequencing strategy. We also generated a high-density linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources of asparagus bean will facilitate the investigation of economically valuable traits in a variety of legume species, so that the cultivation of these plants can help combat protein and energy malnutrition in the developing world.


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Qingzhen Wei ◽  
Jinglei Wang ◽  
Wuhong Wang ◽  
Tianhua Hu ◽  
Haijiao Hu ◽  
...  

Abstract Eggplant (Solanum melongena L.) is an economically important vegetable crop in the Solanaceae family, with extensive diversity among landraces and close relatives. Here, we report a high-quality reference genome for the eggplant inbred line HQ-1315 (S. melongena-HQ) using a combination of Illumina, Nanopore and 10X genomics sequencing technologies and Hi-C technology for genome assembly. The assembled genome has a total size of ~1.17 Gb and 12 chromosomes, with a contig N50 of 5.26 Mb, consisting of 36,582 protein-coding genes. Repetitive sequences comprise 70.09% (811.14 Mb) of the eggplant genome, most of which are long terminal repeat (LTR) retrotransposons (65.80%), followed by long interspersed nuclear elements (LINEs, 1.54%) and DNA transposons (0.85%). The S. melongena-HQ eggplant genome carries a total of 563 accession-specific gene families containing 1009 genes. In total, 73 expanded gene families (892 genes) and 34 contraction gene families (114 genes) were functionally annotated. Comparative analysis of different eggplant genomes identified three types of variations, including single-nucleotide polymorphisms (SNPs), insertions/deletions (indels) and structural variants (SVs). Asymmetric SV accumulation was found in potential regulatory regions of protein-coding genes among the different eggplant genomes. Furthermore, we performed QTL-seq for eggplant fruit length using the S. melongena-HQ reference genome and detected a QTL interval of 71.29–78.26 Mb on chromosome E03. The gene Smechr0301963, which belongs to the SUN gene family, is predicted to be a key candidate gene for eggplant fruit length regulation. Moreover, we anchored a total of 210 linkage markers associated with 71 traits to the eggplant chromosomes and finally obtained 26 QTL hotspots. The eggplant HQ-1315 genome assembly can be accessed at http://eggplant-hq.cn. 
In conclusion, the eggplant genome presented herein provides a global view of genomic divergence at the whole-genome level and powerful tools for the identification of candidate genes for important traits in eggplant.
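
Several abstracts on this page report a contig N50 (for example, 5.26 Mb for the eggplant assembly). As a reading aid, the statistic can be sketched as follows: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name `n50` and the toy contig lengths are illustrative assumptions and do not come from any of the cited papers.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration only; not taken from any
    assembly pipeline described in the abstracts.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until the cumulative
    # length reaches at least half of the total assembly size.
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty input")


if __name__ == "__main__":
    # Toy assembly: total length 20, so half is 10.
    # Longest-first cumulative sums are 6, then 11; N50 is 5.
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about base-level accuracy or whether contigs are correctly joined.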


2020 ◽  
Author(s):  
Tingting Song ◽  
Mengyan Zhou ◽  
Yuying Yuan ◽  
Jinqiu Yu ◽  
Hua Cai ◽  
...  

Abstract Amphicarpaea edgeworthii, an annual twining herb, is a widely distributed species and an ideal model for studying complex flowering types and evolutionary mechanisms of species. Herein, we generated a high-quality assembly of A. edgeworthii by using a combination of PacBio, 10× Genomics libraries, and Hi-C mapping technologies. The final 11 chromosome-level scaffolds covered 90.61% of the estimated genome (343.78 Mb), which is the first chromosome-scale assembled genome of an amphicarpic plant. These data will be beneficial for the discovery of genes that control major agronomic traits, spur genetic improvement of and functional genetic studies in legumes, and supply comparative genetic resources for other amphicarpic plants.


2019 ◽  
Author(s):  
Xiaolei Liu ◽  
Yayan Feng ◽  
Xue Bai ◽  
Xuelin Wang ◽  
Rui Qin ◽  
...  

Abstract Understanding the roles of repetitive sequences in the genomes of parasites could offer insights into their evolution, speciation, and parasitism. As a unique intracellular nematode, Trichinella consists of two clades, encapsulated and non-encapsulated. The genomic basis of the distinct differences between the two clades is still unclear. Here we report an annotated draft reference genome of non-encapsulated Trichinella, T. pseudospiralis, and perform comparative analyses with encapsulated T. spiralis. Genome analysis revealed that, during Trichinella evolution, repetitive sequence insertions played an important role in gene family expansion in synergy with DNA methylation, especially for the DNase II members of the phospholipase D superfamily and glutathione S-transferases. We further identify the genomic and epigenomic regulation of excretory/secretory products in relation to differences in parasitism, pathology and immunology between the two clades of Trichinella. The present study provides a foundation for further elucidation of the mechanisms of nurse cell formation and immunoevasion, as well as identification of pharmacological and diagnostic targets for trichinellosis.


Author(s):  
Ying-Feng Niu ◽  
Guo-Hua Li ◽  
Shu-Bang Ni ◽  
Xi-Yong He ◽  
Cheng Zheng ◽  
...  

Abstract Macadamia is a genus of evergreen nut trees belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. Catherine et al. reported the M. integrifolia genome using NGS sequencing technology. However, the lack of a high-quality assembly for M. tetraphylla hinders progress in biological research and breeding programs. In this study, we report a high-quality genome sequence of M. tetraphylla using Oxford Nanopore Technologies (ONT) sequencing. We generated an assembly of 750.54 Mb with a contig N50 length of 1.18 Mb, which is close to the size estimated by flow cytometry and k-mer analysis. Repetitive sequences represent 58.57% of the genome sequence, which is strikingly higher than in M. integrifolia. A total of 31,571 protein-coding genes were annotated with an average length of 6,055 bp, of which 92.59% were functionally annotated. The genome sequence of M. tetraphylla will provide novel insights into the breeding of novel strains and genetic improvement of agronomic traits.


Author(s):  
Saptarathi Deb ◽  
Suvratha J ◽  
Samathmika Ravi ◽  
Raksha Rao K ◽  
Saurabh Whadgar ◽  
...  

ABSTRACT In the age of genomics-based crop improvement, a high-quality genome of a local landrace adapted to the local environmental conditions is critically important. Grain amaranths produce highly nutritional grains with a multitude of desirable properties including C4 photosynthesis highly sought-after in other crops. For improving the agronomic traits of grain amaranth and for the transfer of desirable traits to dicot crops, a reference genome of a local landrace is necessary. Towards this end, our lab had initiated sequencing the genome of Amaranthus (A.) hypochondriacus (A.hyp_K_white) and had reported a draft genome in 2014. We selected this landrace because it is well adapted for cultivation in India during the last century and is currently a candidate for TILLING-based crop improvement. More recently, a high-quality chromosome-level assembly of A. hypochondriacus (PI558499, Plainsman) was reported. Here, we report a chromosome-level assembly of A.hyp_K_white (AhKP) using low-coverage PacBio reads, contigs from the reported draft genome of A.hyp_K_white, raw HiC data and reference genome of Plainsman. The placement of A.hyp_K_white on the phylogenetic tree of grain amaranths of known accessions clearly suggests that A.hyp_K_white is genetically distal from Plainsman and is most closely related to the accession PI619259 from Nepal (Ramdana). Furthermore, the classification of another accession, Suvarna, adapted to the local environment and selected for yield and other desirable traits, is clearly A. cruentus. A classification based on hundreds of thousands of SNPs validated taxonomy-based classification for a majority of the accessions providing the opportunity for reclassification of a few.

