Genome sequence of the model rice variety KitaakeX

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein-coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high-quality de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
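The contig N50 quoted in this abstract is a contiguity statistic: the length L such that contigs of length L or longer together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the KitaakeX assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the
    combined length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Applied to real data, the lengths would come from the assembly's contig (or scaffold) records, which is why contig N50 and scaffold N50 can differ for the same assembly.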


High-quality genome assembly and high-density genetic map of asparagus bean

10.1101/521179 ◽

2019 ◽

Author(s):

Qiuju Xia ◽

Ru Zhang ◽

Xuemei Ni ◽

Lei Pan ◽

Yangzi Wang ◽

...

Keyword(s):

Genome Assembly ◽

Agronomic Traits ◽

High Density ◽

High Quality ◽

Protein Coding ◽

Protein Coding Genes ◽

Economically Valuable Traits ◽

Sequencing Strategy ◽

Asparagus Bean ◽

High Quality Genome

Abstract Asparagus bean (Vigna unguiculata ssp. sesquipedalis), known for its very long and tender green pods, is an important vegetable crop widely grown in developing countries. Despite its agricultural and economic value, asparagus bean lacks a high-quality genome assembly to support breeding for novel agronomic traits. In this study, we report a high-quality 632.8 Mb assembly of asparagus bean based on a whole-genome shotgun sequencing strategy. We also generated a high-density linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources for asparagus bean will facilitate the investigation of economically valuable traits in a variety of legume species, so that the cultivation of these plants can help combat protein and energy malnutrition in the developing world.


A high-quality chromosomal genome assembly of Diospyros oleifera Cheng

GigaScience ◽

10.1093/gigascience/giz164 ◽

2020 ◽

Vol 9 (1) ◽

Author(s):

Yujing Suo ◽

Peng Sun ◽

Huihui Cheng ◽

Weijuan Han ◽

Songfeng Diao ◽

...

Keyword(s):

Molecular Mechanisms ◽

De Novo ◽

Phylogenetic Analyses ◽

Draft Genome ◽

Diospyros Kaki ◽

High Quality ◽

Phylogenetic Tree Analysis ◽

Protein Coding ◽

Protein Coding Genes ◽

Anthocyanin Pathway

Abstract Background: Diospyros oleifera Cheng, of the family Ebenaceae, is an economically important tree. Phylogenetic analyses indicate that D. oleifera is closely related to Diospyros kaki Thunb. and could be used as a model plant for studies of D. kaki. Therefore, development of genomic resources for D. oleifera will facilitate auxiliary assembly of the hexaploid persimmon genome and elucidate the molecular mechanisms of important traits. Findings: The D. oleifera genome was assembled with 443.6 Gb of raw reads using the Pacific Biosciences Sequel and Illumina HiSeq X Ten platforms. The final draft genome was ∼812.3 Mb and had a high level of continuity, with an N50 of 3.36 Mb. Fifteen scaffolds corresponding to the 15 chromosomes were assembled to a final size of 721.5 Mb using 332 scaffolds, accounting for 88.81% of the genome. Repeat sequences accounted for 54.8% of the genome. By de novo sequencing and analysis of homology with other plant species, 30,530 protein-coding genes with an average transcript size of 7,105.40 bp were annotated; of these, 28,580 protein-coding genes (93.61%) had conserved functional motifs or terms. In addition, 171 candidate genes involved in tannin synthesis and deastringency in persimmon were identified; of these, chalcone synthase (CHS) genes were expanded in the D. oleifera genome compared with Diospyros lotus, Camellia sinensis, and Vitis vinifera. Moreover, 186 positively selected genes were identified, including the chalcone isomerase (CHI) gene, which encodes a key enzyme in the flavonoid-anthocyanin pathway. Phylogenetic tree analysis indicated that the split of D. oleifera and D. lotus likely occurred 9.0 million years ago. In addition to the ancient γ event, a second whole-genome duplication event occurred in D. oleifera and D. lotus. Conclusions: We generated a high-quality chromosome-level draft genome for D. oleifera, which will facilitate assembly of the hexaploid persimmon genome and further studies of major economic traits in the genus Diospyros.


Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v1 ◽

2019 ◽

Author(s):

Rashmi Jain ◽

Jerry Jenkins ◽

Shengqiang Shu ◽

Mawsheng Chern ◽

Joel A. Martin ◽

...

Keyword(s):

Functional Genomics ◽

De Novo ◽

Rice Variety ◽

Rice Genome ◽

Life Cycles ◽

Model Organisms ◽

High Quality ◽

Japonica Variety ◽

Rice Varieties ◽

Protein Coding

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein-coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high-quality de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.


Complete Genome Sequence of Pseudomonas sp. Strain SGAir0191, Isolated from Tropical Air Collected in Singapore

Microbiology Resource Announcements ◽

10.1128/mra.00617-19 ◽

2019 ◽

Vol 8 (34) ◽

Author(s):

Anthony Wong ◽

Ana Carolina M. Junqueira ◽

Ankur Chaturvedi ◽

Akira Uchida ◽

Rikky W. Purbojati ◽

...

Keyword(s):

Genome Sequence ◽

Genome Assembly ◽

Complete Genome Sequence ◽

Complete Genome ◽

Pseudomonas Sp ◽

High Quality ◽

Protein Coding ◽

Protein Coding Genes ◽

Air Sample ◽

High Quality Genome

Pseudomonas sp. strain SGAir0191 was isolated from an air sample collected in Singapore, and its genome was sequenced using a combination of long and short reads to generate a high-quality genome assembly. The complete genome is approximately 5.07 Mb with 4,370 protein-coding genes, 19 rRNAs, and 73 tRNAs.


Comparative analysis of de novo genomes reveals dynamic intra-species divergence of NLRs in pepper

BMC Plant Biology ◽

10.1186/s12870-021-03057-8 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Myung-Shin Kim ◽

Geun Young Chae ◽

Soohyun Oh ◽

Jihyun Kim ◽

Hyunggon Mang ◽

...

Keyword(s):

Capsicum Annuum ◽

De Novo ◽

Genomic Diversity ◽

Specific Gene ◽

Protein Coding ◽

Genomic Variations ◽

Gene Annotations ◽

Small Fruit ◽

Number Variation ◽

Genome Assemblies

Abstract Background: Peppers (Capsicum annuum L.) containing distinct capsaicinoids are the most widely cultivated spices in the world. However, extreme genomic diversity among species represents an obstacle to breeding pepper. Results: Here, we report de novo genome assemblies of Capsicum annuum ‘Early Calwonder (non-pungent, ECW)’ and ‘Small Fruit (pungent, SF)’ along with their annotations. In total, we assembled 2.9 Gb of ECW and SF genome sequences, representing over 91% of the estimated genome sizes. Structural and functional annotation of the two pepper genomes generated about 35,000 protein-coding genes each, of which 93% were assigned putative functions. Comparison between the new and publicly available pepper gene annotations revealed both shared and specific gene content. In addition, a comprehensive analysis of nucleotide-binding and leucine-rich repeat (NLR) genes through whole-genome alignment identified five significant regions of NLR copy number variation (CNV). Detailed comparisons of those regions revealed that these CNVs were generated by intra-specific genomic variations that accelerated diversification of NLRs among peppers. Conclusions: Our analyses unveil an evolutionary mechanism responsible for generating CNVs of NLRs among pepper accessions, and provide novel genomic resources for functional genomics and molecular breeding of disease resistance in Capsicum species.


Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v3 ◽

2019 ◽

Author(s):

Rashmi Jain ◽

Jerry Jenkins ◽

Shengqiang Shu ◽

Mawsheng Chern ◽

Joel A. Martin ◽

...

Keyword(s):

Functional Genomics ◽

De Novo ◽

Rice Variety ◽

Rice Genome ◽

Life Cycles ◽

Model Organisms ◽

High Quality ◽

Japonica Variety ◽

Rice Varieties ◽

Protein Coding



Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v2 ◽

2019 ◽

Author(s):

Rashmi Jain ◽

Jerry Jenkins ◽

Shengqiang Shu ◽

Mawsheng Chern ◽

Joel A. Martin ◽

...

Keyword(s):

Functional Genomics ◽

De Novo ◽

Rice Variety ◽

Rice Genome ◽

Life Cycles ◽

Model Organisms ◽

High Quality ◽

Japonica Variety ◽

Rice Varieties ◽

Protein Coding



From de novo to ‘de nono’: The majority of novel protein coding genes identified with phylostratigraphy are old genes or recent duplicates

Genome Biology and Evolution ◽

10.1093/gbe/evy231 ◽

2018 ◽

Cited By ~ 2

Author(s):

Claudio Casola

Keyword(s):

De Novo ◽

Protein Coding ◽

Protein Coding Genes ◽

Novel Protein


Whole-Genome Sequencing of Chinese Yellow Catfish Provides a Valuable Genetic Resource for High-Throughput Identification of Toxin Genes

Toxins ◽

10.3390/toxins10120488 ◽

2018 ◽

Vol 10 (12) ◽

pp. 488 ◽

Cited By ~ 5

Author(s):

Shiyong Zhang ◽

Jia Li ◽

Qin Qin ◽

Wei Liu ◽

Chao Bian ◽

...

Keyword(s):

High Throughput ◽

Genome Assembly ◽

Raw Materials ◽

Pelteobagrus Fulvidraco ◽

Yellow Catfish ◽

High Quality ◽

Protein Coding ◽

Toxin Genes ◽

Sequencing Platforms ◽

High Quality Genome

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its considerable economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish. We identified 207 toxin genes and classified them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ≈6 Mb) in the resulting genome assembly, and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. The limited number of toxin sequences in public databases should therefore improve markedly as multi-omics data from more sequenced species are integrated.
