Genome sequence of Hydrangea macrophylla and its application in analysis of the double flower phenotype

2020 ◽  
Author(s):  
K Nashima ◽  
K Shirasawa ◽  
A Ghelfi ◽  
H Hirakawa ◽  
S Isobe ◽  
...  

Abstract Owing to its high ornamental value, the double flower phenotype of hydrangea (Hydrangea macrophylla) is one of its most important traits. In this study, genome sequence information was obtained to explore effective DNA markers and the causative genes for double flower production in hydrangea. Single-molecule real-time sequencing data followed by a Hi-C analysis were employed. The resultant haplotype-phased sequences consisted of 3,779 sequences (2.256 Gb in length and N50 of 1.5 Mb), and 18 pseudomolecules comprising 1.08 Gb scaffold sequences along with a high-density SNP genetic linkage map. Using the genome sequence data obtained from two breeding populations, the SNPs linked to double flower loci (Djo and Dsu) were discovered for each breeding population. DNA markers J01 linked to Djo and S01 linked to Dsu were developed, and these could be used successfully to distinguish the recessive double flower allele for each locus, respectively. The LEAFY gene was suggested as the causative gene for Dsu, since a frameshift was specifically observed in the double flower accession with dsu. The genome information obtained in this study will facilitate a wide range of genomic studies on hydrangea in the future.

DNA Research ◽  
2020 ◽  
Author(s):  
Kenji Nashima ◽  
Kenta Shirasawa ◽  
Andrea Ghelfi ◽  
Hideki Hirakawa ◽  
Sachiko Isobe ◽  
...  

Abstract Owing to its high ornamental value, the double flower phenotype of hydrangea (Hydrangea macrophylla) is one of its most important traits. In this study, genome sequence information was obtained to explore effective DNA markers and the causative genes for double flower production in hydrangea. Single-molecule real-time sequencing data followed by a Hi-C analysis were employed. Two haplotype-phased sequences were obtained from the heterozygous genome of hydrangea. One assembly consisted of 3,779 scaffolds (2.256 Gb in length and N50 of 1.5 Mb), the other also contained 3,779 scaffolds (2.227 Gb in length, and N50 of 1.4 Mb). A total of 36,930 genes were predicted in the sequences, of which 32,205 and 32,222 were found in each haplotype. A pair of 18 pseudomolecules was constructed along with a high-density single-nucleotide polymorphism (SNP) genetic linkage map. Using the genome sequence data, and two F2 populations, the SNPs linked to double flower loci (djo and dsu) were discovered. DNA markers linked to djo and dsu were developed, and these could distinguish the recessive double flower allele for each locus, respectively. The LEAFY gene is a very likely candidate as the causative gene for dsu, since frameshift was specifically observed in the double flower accession with dsu.
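The N50 values quoted in the abstract above are a standard assembly contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not taken from the assembly.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: the two longest scaffolds cover 150 of 300
```

Applied to a real assembly, the input would be the list of scaffold lengths read from the FASTA file.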


2005 ◽  
Vol 69 (2) ◽  
pp. 306-325 ◽  
Author(s):  
Elvira Khalikova ◽  
Petri Susi ◽  
Timo Korpela

SUMMARY Dextran is a chemically and physically complex polymer, breakdown of which is carried out by a variety of endo- and exodextranases. Enzymes in many groups can be classified as dextranases according to function: such enzymes include dextranhydrolases, glucodextranases, exoisomaltohydrolases, exoisomaltotriohydrases, and branched-dextran exo-1,2-α-glucosidases. Cycloisomalto-oligosaccharide glucanotransferase does not formally belong to the dextranases even though its side reaction produces hydrolyzed dextrans. A new classification system for glycosylhydrolases and glycosyltransferases, which is based on amino acid sequence similarities, divides the dextranases into five families. However, this classification is still incomplete since sequence information is missing for many of the enzymes that have been biochemically characterized as dextranases. Dextran-degrading enzymes have been isolated from a wide range of microorganisms. The major characteristics of these enzymes, the methods for analyzing their activities and biological roles, analysis of primary sequence data, and three-dimensional structures of dextranases have been dealt with in this review. Dextranases are promising for future use in various scientific and biotechnological applications.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5895 ◽  
Author(s):  
Thomas Andreas Kohl ◽  
Christian Utpatel ◽  
Viola Schleusener ◽  
Maria Rosaria De Filippo ◽  
Patrick Beckert ◽  
...  

Analyzing whole-genome sequencing data of Mycobacterium tuberculosis complex (MTBC) isolates in a standardized workflow enables both comprehensive antibiotic resistance profiling and outbreak surveillance with highest resolution up to the identification of recent transmission chains. Here, we present MTBseq, a bioinformatics pipeline for next-generation genome sequence data analysis of MTBC isolates. Employing a reference mapping based workflow, MTBseq reports detected variant positions annotated with known association to antibiotic resistance and performs a lineage classification based on phylogenetic single nucleotide polymorphisms (SNPs). When comparing multiple datasets, MTBseq provides a joint list of variants and a FASTA alignment of SNP positions for use in phylogenomic analysis, and identifies groups of related isolates. The pipeline is customizable, expandable and can be used on a desktop computer or laptop without any internet connection, ensuring mobile usage and data security. MTBseq and accompanying documentation are available from https://github.com/ngs-fzb/MTBseq_source.
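To illustrate how an alignment of SNP positions can be turned into groups of related isolates, here is a minimal sketch that counts pairwise SNP differences and joins isolates by single linkage under a distance threshold. This is not MTBseq's code, and the abstract does not state how the pipeline forms its groups. The isolate names, sequences, threshold, and function names are all hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned SNP strings differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def group_isolates(alignment, max_distance):
    """Single-linkage grouping: isolates end up in the same group when a
    chain of pairwise distances <= max_distance connects them."""
    parent = {name: name for name in alignment}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(alignment, 2):
        if snp_distance(alignment[a], alignment[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in alignment:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical SNP alignment: one character per variable position.
toy_alignment = {
    "iso1": "ACGTAC",
    "iso2": "ACGTAT",
    "iso3": "ACGTTT",
    "iso4": "TTTTGG",
}
print(group_isolates(toy_alignment, max_distance=1))
# [['iso1', 'iso2', 'iso3'], ['iso4']]
```

Note that iso1 and iso3 differ at two positions but still share a group, because iso2 lies within one SNP of each; this chaining is the defining property of single linkage.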


2016 ◽  
Vol 4 (5) ◽  
Author(s):  
Kai Bernd Stadermann ◽  
Daniela Holtgräwe ◽  
Bernd Weisshaar

A publicly available data set from Pacific Biosciences was used to create an assembly of the chloroplast genome sequence of the Arabidopsis thaliana genotype Landsberg erecta. The assembly is solely based on single-molecule, real-time sequencing data and hence provides high resolution of the two inverted repeat regions typically contained in chloroplast genomes.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 297 ◽  
Author(s):  
Jason R. Miller ◽  
Sergey Koren ◽  
Kari A. Dilley ◽  
Derek M. Harkins ◽  
Timothy B. Stockwell ◽  
...  

Background: The tick cell line ISE6, derived from Ixodes scapularis, is commonly used for amplification and detection of arboviruses in environmental or clinical samples. Methods: To assist with sequence-based assays, we sequenced the ISE6 genome with single-molecule, long-read technology. Results: The draft assembly appears near complete based on gene content analysis, though it appears to lack some instances of repeats in this highly repetitive genome. The assembly appears to have separated the haplotypes at many loci. DNA short read pairs, used for validation only, mapped to the cell line assembly at a higher rate than they mapped to the Ixodes scapularis reference genome sequence. Conclusions: The assembly could be useful for filtering host genome sequence from sequence data obtained from cells infected with pathogens.


2019 ◽  
Author(s):  
Antonis Kioukis ◽  
Vassiliki A. Michalopoulou ◽  
Laura Briers ◽  
Stergios Pirintsos ◽  
David J. Studholme ◽  
...  

Abstract Crop wild relatives contain great levels of genetic diversity, representing an invaluable resource for crop improvement. Many of their traits have the potential to help crops become more resistant and resilient, and adapt to the new conditions that they will experience due to climate change. An impressive global effort is under way to conserve various crop wild relatives and to facilitate their use in crop breeding for food security. The genus Brassica is listed in Annex I of the International Treaty on Plant Genetic Resources for Food and Agriculture. Brassica oleracea (or wild cabbage) is a species native to coastal southern and western Europe that has become established as an important human food crop plant because of its large reserves stored over the winter in its leaves. Brassica cretica Lam. is a crop wild relative in the brassica group, and B. cretica subsp. nivea has been suggested as a separate subspecies. The species B. cretica has been proposed as a potential gene donor to a number of crops in the brassica group, including broccoli, Brussels sprout, cabbage, cauliflower, kale, swede, turnip and oilseed rape. Here, we present the draft de novo genome assemblies of four B. cretica individuals, including two B. cretica subsp. nivea and two B. cretica. De novo assembly of Illumina MiSeq genomic shotgun sequencing data yielded 243,461 contigs totalling 412.5 Mb in length, corresponding to 122% of the estimated genome size of B. cretica (339 Mb). According to synteny mapping and phylogenetic analysis of conserved genes, the B. cretica genome, based on our sequence data, encodes approximately 30,360 proteins. Furthermore, our demographic analysis based on whole genome data suggests that distinct populations of B. cretica are not isolated. Our findings suggest that the classification of B. cretica into distinct subspecies is not supported by the genome sequence data we analyzed.


2017 ◽  
Author(s):  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
David L Wilson ◽  
Gregor Gorjanc ◽  
John M Hickey

Abstract In this paper we extend multi-locus iterative peeling to be a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chromosome segments between parents and their offspring, and then uses single-locus iterative peeling to aggregate genomic information across multiple generations. Using a synthetic dataset, we first analysed the performance of hybrid peeling for calling and phasing alleles in disconnected families, that is, families which contained only a focal individual and its parents and grandparents. Second, we analysed the performance of hybrid peeling for calling and phasing alleles in the context of the full pedigree. Third, we analysed the performance of hybrid peeling for imputing whole genome sequence data to the remaining individuals in the population. We found that hybrid peeling substantially increased the number of genotypes that were called and phased by leveraging sequence information on related individuals. The calling rate and accuracy increased when the full pedigree was used compared to a reduced pedigree of just parents and grandparents. Finally, hybrid peeling accurately imputed whole genome sequence information to non-sequenced individuals. We believe that this algorithm will enable the generation of low cost and high accuracy whole genome sequence data in many pedigreed populations. We are making this algorithm available as a standalone program called AlphaPeel.


2016 ◽  
Vol 113 (40) ◽  
pp. E5925-E5933 ◽  
Author(s):  
Stilianos Louca ◽  
Alyse K. Hawley ◽  
Sergei Katsev ◽  
Monica Torres-Beltran ◽  
Maya P. Bhatia ◽  
...  

Microorganisms are the most abundant lifeform on Earth, mediating global fluxes of matter and energy. Over the past decade, high-throughput molecular techniques generating multiomic sequence information (DNA, mRNA, and protein) have transformed our perception of this microcosmos, conceptually linking microorganisms at the individual, population, and community levels to a wide range of ecosystem functions and services. Here, we develop a biogeochemical model that describes metabolic coupling along the redox gradient in Saanich Inlet—a seasonally anoxic fjord with biogeochemistry analogous to oxygen minimum zones (OMZs). The model reproduces measured biogeochemical process rates as well as DNA, mRNA, and protein concentration profiles across the redox gradient. Simulations make predictions about the role of ubiquitous OMZ microorganisms in mediating carbon, nitrogen, and sulfur cycling. For example, nitrite “leakage” during incomplete sulfide-driven denitrification by SUP05 Gammaproteobacteria is predicted to support inorganic carbon fixation and intense nitrogen loss via anaerobic ammonium oxidation. This coupling creates a metabolic niche for nitrous oxide reduction that completes denitrification by currently unidentified community members. These results quantitatively improve previous conceptual models describing microbial metabolic networks in OMZs. Beyond OMZ-specific predictions, model results indicate that geochemical fluxes are robust indicators of microbial community structure and reciprocally, that gene abundances and geochemical conditions largely determine gene expression patterns. The integration of real observational data, including geochemical profiles and process rate measurements as well as metagenomic, metatranscriptomic and metaproteomic sequence data, into a biogeochemical model, as shown here, enables holistic insight into the microbial metabolic network driving nutrient and energy flow at ecosystem scales.


2017 ◽  
Author(s):  
Harun Mustafa ◽  
Ingo Schilken ◽  
Mikhail Karasikov ◽  
Carsten Eickhoff ◽  
Gunnar Rätsch ◽  
...  

Abstract Motivation: Technological advancements in high-throughput DNA sequencing have led to an exponential growth of sequencing data being produced and stored as a byproduct of biomedical research. Despite its public availability, a majority of this data remains hard for the research community to query due to a lack of efficient data representation and indexing solutions. One of the available techniques to represent read data is a condensed form as an assembly graph. Such a representation contains all sequence information but does not store contextual information and metadata. Results: We present two new approaches for a compressed representation of a graph coloring: a lossless compression scheme based on a novel application of wavelet tries as well as a highly accurate lossy compression based on a set of Bloom filters. Both strategies retain a coloring with dynamically changing graph topology. We present construction and merge procedures for both methods and evaluate their performance on a wide range of different datasets. By dropping the requirement of a fully lossless compression and using the topological information of the underlying graph, we can reduce memory requirements by up to three orders of magnitude. Representing individual colors as independently stored modules, our approaches are fully dynamic and can be efficiently parallelized. These properties allow for an easy upscaling to the problem sizes common to the biomedical domain. Availability: We provide prototype implementations in C++, summaries of our experiments as well as links to all datasets publicly at https://github.com/ratschlab/
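The lossy scheme described above stores each color as its own Bloom filter. The sketch below shows only that basic idea, assuming one filter per color keyed on k-mers. It omits the graph-topology-based correction the abstract mentions and is not the authors' C++ prototype. The class names, filter sizes, and sample labels are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(sequence, k):
    """Yield every substring of length k."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


class LossyColoring:
    """One independently stored Bloom filter per color (e.g. per sample)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.filters = {}

    def annotate(self, color, sequence, k):
        if color not in self.filters:
            self.filters[color] = BloomFilter(self.num_bits, self.num_hashes)
        for kmer in kmers(sequence, k):
            self.filters[color].add(kmer)

    def colors_of(self, kmer):
        # May include false-positive colors, but never misses a true one.
        return {c for c, bf in self.filters.items() if kmer in bf}


coloring = LossyColoring()
coloring.annotate("sampleA", "ACGTACGT", k=4)  # hypothetical labels and reads
coloring.annotate("sampleB", "TTTTGGGG", k=4)
print("sampleA" in coloring.colors_of("ACGT"))  # True
```

Because each color lives in a separate filter, colors can be added or merged without rebuilding the others, which mirrors the modular storage the abstract describes.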


DNA Research ◽  
2019 ◽  
Vol 26 (5) ◽  
pp. 379-389 ◽  
Author(s):  
Kenta Shirasawa ◽  
Tomoya Esumi ◽  
Hideki Hirakawa ◽  
Hideyuki Tanaka ◽  
Akihiro Itai ◽  
...  

Abstract We report the phased genome sequence of an interspecific hybrid, the flowering cherry ‘Somei-Yoshino’ (Cerasus × yedoensis). The sequence data were obtained by single-molecule real-time sequencing technology, split into two subsets based on genome information of the two probable ancestors, and assembled to obtain two haplotype phased genome sequences of the interspecific hybrid. The resultant genome assembly consisting of the two haplotype sequences spanned 690.1 Mb with 4,552 contigs and an N50 length of 1.0 Mb. We predicted 95,076 high-confidence genes, including 94.9% of the core eukaryotic genes. Based on a high-density genetic map, we established a pair of eight pseudomolecule sequences, with highly conserved structures between the two haplotype sequences with 2.4 million sequence variants. A whole genome resequencing analysis of flowering cherries suggested that ‘Somei-Yoshino’ might be derived from a cross between C. spachiana and either C. speciosa or its relatives. A time-course transcriptome analysis of floral buds and flowers suggested comprehensive changes in gene expression in floral bud development towards flowering. These genome and transcriptome data are expected to provide insights into the evolution and cultivation of flowering cherry and the molecular mechanism underlying flowering.

