PCIR: a database of Plant Chloroplast Inverted Repeats

Abstract Inverted repeats (IRs) serve as potential biomarkers for genomic instability, DNA replication and other genetic processes. However, little information can be found in databases to help researchers recognize potential IR nucleotides, explore junction sites and annotate related functional genes. Plant Chloroplast Inverted Repeats (PCIR) is an interactive, web-based platform containing various sequenced chloroplast genomes that enables detection, searching and visualization of large-scale detailed information on IRs. PCIR contains many datasets, including 21 433 IRs, 113 plant chloroplast genomes, 16 948 functional genes and 21 659 visual maps. This database offers an online prediction tool for detecting IRs based on DNA sequences. PCIR can also analyze phylogenetic relationships using IR information among different species and provide users with high-quality marker maps. This database will be a valuable resource for IR distribution patterns, related genes and architectural features.

Complex Analyses of Short Inverted Repeats in All Sequenced Chloroplast DNAs

BioMed Research International ◽

10.1155/2018/1097018 ◽

2018 ◽

Vol 2018 ◽

pp. 1-10 ◽

Cited By ~ 9

Author(s):

Václav Brázda ◽

Jiří Lýsek ◽

Martin Bartas ◽

Miroslav Fojta

Keyword(s):

DNA Sequences ◽

Average Frequency ◽

Levenshtein Distance ◽

Inverted Repeats ◽

Repeat Region ◽

Genome Database ◽

Stem Loop ◽

Loop Region ◽

Chloroplast Genomes ◽

Chloroplast DNAs

Chloroplasts are key organelles in the management of oxygen in algae and plants and are therefore crucial for all living beings that consume oxygen. Chloroplasts typically contain a circular DNA molecule with nucleus-independent replication and heredity. Using “palindrome analyser”, we performed complete analyses of short inverted repeats (S-IRs) in all chloroplast DNAs (cpDNAs) available from the NCBI genome database. Our results provide basic parameters of cpDNAs including comparative information on localization, frequency, and differences in S-IR presence. Across the 2,565 cpDNA sequences available, the average frequency of S-IRs in cpDNA genomes is 45 S-IRs per kbp, significantly higher than that found in mitochondrial DNA sequences. The frequency of S-IRs in cpDNAs generally decreased with S-IR length, but not for S-IRs 15, 22, 24, or 27 bp long, which are significantly more abundant than S-IRs of other lengths. These results point to the importance of specific S-IRs in cpDNA genomes. Moreover, comparison by Levenshtein distance of S-IR similarities showed that a limited number of S-IR sequences are shared in the majority of cpDNAs. S-IRs are not located randomly in cpDNAs, but are length-dependently enriched in specific locations, including the repeat region, stem, introns, and tRNA regions. The highest enrichment was found for 12 bp and longer S-IRs in the stem-loop region followed by 12 bp and longer S-IRs located before the repeat region. On the other hand, S-IRs are relatively rare in rRNA sequences and around introns. These data show nonrandom and conserved arrangements of S-IRs in chloroplast genomes.
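
As an illustration of the two ideas this abstract relies on, the following is a minimal Python sketch of perfect inverted-repeat detection and of Levenshtein distance. It is not the algorithm used by “palindrome analyser”; the function names, the restriction to perfect (mismatch-free) arms, and the `max_spacer` parameter are assumptions made here for illustration only.

```python
# Hypothetical sketch: not the "palindrome analyser" implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len, max_spacer):
    """Find perfect inverted repeats with a fixed arm length.

    Returns a list of (start, arm_len, spacer) tuples, where the right
    arm is the reverse complement of the left arm and the two arms are
    separated by `spacer` bases (0 <= spacer <= max_spacer).
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        target = reverse_complement(seq[start:start + arm_len])
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            right = seq[right_start:right_start + arm_len]
            if len(right) < arm_len:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((start, arm_len, spacer))
    return hits


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(find_inverted_repeats("ACGTTTTACGT", arm_len=4, max_spacer=3))
    print(levenshtein("GATTACA", "GATTTCA"))
```

A production tool would also need to handle circular genomes, mismatches within the arms, and ambiguous bases, none of which this sketch attempts.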

Rapid, large-scale species discovery in hyperdiverse taxa using 1D MinION sequencing

BMC Biology ◽

10.1186/s12915-019-0706-9 ◽

2019 ◽

Vol 17 (1) ◽

Cited By ~ 17

Author(s):

Amrita Srivathsan ◽

Emily Hartop ◽

Jayanthi Puniamoorthy ◽

Wan Ting Lee ◽

Sujatha Narayanan Kutty ◽

...

Keyword(s):

DNA Sequences ◽

Large Scale ◽

Low Cost ◽

Small Body ◽

Small Subset ◽

Similar Species ◽

Large Species ◽

Morphological Examination ◽

Species Discovery ◽

Short Period

Abstract Background More than 80% of all animal species remain unknown to science. Most of these species live in the tropics and belong to animal taxa that combine small body size with high specimen abundance and large species richness. For such clades, using morphology for species discovery is slow because large numbers of specimens must be sorted based on detailed microscopic investigations. Fortunately, species discovery could be greatly accelerated if DNA sequences could be used for sorting specimens to species. Morphological verification of such “molecular operational taxonomic units” (mOTUs) could then be based on dissection of a small subset of specimens. However, this approach requires cost-effective and low-tech DNA barcoding techniques because well-equipped, well-funded molecular laboratories are not readily available in many biodiverse countries. Results We here document how MinION sequencing can be used for large-scale species discovery in a specimen- and species-rich taxon like the hyperdiverse fly family Phoridae (Diptera). We sequenced 7059 specimens collected in a single Malaise trap in Kibale National Park, Uganda, over the short period of 8 weeks. We discovered > 650 species, which exceeds the number of phorid species currently described for the entire Afrotropical region. The barcodes were obtained using an improved low-cost MinION pipeline that increased the barcoding capacity sevenfold from 500 to 3500 barcodes per flowcell. This was achieved by adopting 1D sequencing, resequencing weak amplicons on a used flowcell, and improving demultiplexing. Comparison with Illumina data revealed that the MinION barcodes were very accurate (99.99% accuracy, 0.46% Ns) and thus yielded very similar species units (match ratio 0.991). Morphological examination of 100 mOTUs also confirmed good congruence with morphology (93% of mOTUs; > 99% of specimens) and revealed that 90% of the putative species belong to the neglected, megadiverse genus Megaselia. We demonstrate for one Megaselia species how the molecular data can guide the description of a new species (Megaselia sepsioides sp. nov.). Conclusions We document that one field site in Africa can be home to an estimated 1000 species of phorids and speculate that the Afrotropical diversity could exceed 200,000 species. We furthermore conclude that low-cost MinION sequencers are very suitable for reliable, rapid, and large-scale species discovery in hyperdiverse taxa. MinION sequencing could quickly reveal the extent of the unknown diversity and is especially suitable for biodiverse countries with limited access to capital-intensive sequencing facilities.

Biological Impact of a Large-Scale Genomic Inversion That Grossly Disrupts the Relative Positions of the Origin and Terminus Loci of the Streptococcus pyogenes Chromosome

Journal of Bacteriology ◽

10.1128/jb.00090-19 ◽

2019 ◽

Vol 201 (17) ◽

Cited By ~ 1

Author(s):

Dragutin J. Savic ◽

Scott V. Nguyen ◽

Kimberly McCullor ◽

W. Michael McShan

Keyword(s):

DNA Sequences ◽

Parental Strain ◽

Large Scale ◽

Galleria Mellonella ◽

Acute Infection ◽

Relative Length ◽

Published Data ◽

Rich Medium ◽

Content Type

ABSTRACT A large-scale genomic inversion encompassing 0.79 Mb of the 1.816-Mb-long Streptococcus pyogenes serotype M49 strain NZ131 chromosome spontaneously occurs in a minor subpopulation of cells, and in this report genetic selection was used to obtain a stable lineage with this chromosomal rearrangement. This inversion, which drastically displaces the ori site relative to the terminus, changes the relative length of the replication arms so that one replichore is approximately 0.41 Mb while the other is about 1.40 Mb in length. Genomic reversion to the original chromosome constellation is not observed in PCR-monitored analyses after 180 generations of growth in rich medium. Compared to the parental strain, the inversion surprisingly demonstrates a nearly identical growth pattern in the first phase of the exponential phase, but differences do occur when resources in the medium become limited. When cultured separately in rich medium during prolonged stationary phase or in an experimental acute infection animal model (Galleria mellonella), the parental strain and the invertant have equivalent survival rates. However, when they are coincubated together, both in vitro and in vivo, the survival of the invertant declines relative to the level for the parental strain. The accompanying aspect of the study suggests that inversions taking place near oriC always happen to secure the linkage of oriC to DNA sequences responsible for chromosome partition. The biological relevance of large-scale inversions is also discussed. IMPORTANCE Based on our previous work, we created to our knowledge the largest asymmetric inversion, covering 43.5% of the S. pyogenes genome. In spite of a drastic replacement of origin of replication and the unbalanced size of replichores (1.4 Mb versus 0.41 Mb), the invertant, when not challenged with its progenitor, showed impressive vitality for growth in vitro and in pathogenesis assays. The mutant supports the existing idea that slightly deleterious mutations can provide the setting for secondary adaptive changes. Furthermore, comparative analysis of the mutant with previously published data strongly indicates that even large genomic rearrangements survive provided that the integrity of the oriC and the chromosome partition cluster is preserved.
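
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only the numbers stated above (genome length, inversion length, and the two replichore lengths); the variable names are chosen here for illustration.

```python
# Values taken from the abstract; variable names are illustrative.
GENOME_MB = 1.816     # chromosome length of strain NZ131
INVERSION_MB = 0.79   # length of the inverted segment
SHORT_ARM_MB = 0.41   # shorter replichore after inversion
LONG_ARM_MB = 1.40    # longer replichore after inversion

# Share of the genome covered by the inversion, in percent.
inverted_percent = INVERSION_MB / GENOME_MB * 100

# The two replichores together should span roughly the whole chromosome.
arm_sum_mb = SHORT_ARM_MB + LONG_ARM_MB

# How unbalanced the replication arms are after the inversion.
arm_ratio = LONG_ARM_MB / SHORT_ARM_MB

print(f"inverted fraction: {inverted_percent:.1f}%")
print(f"replichore sum: {arm_sum_mb:.2f} Mb of {GENOME_MB} Mb")
print(f"long/short replichore ratio: {arm_ratio:.1f}")
```

The inverted fraction reproduces the 43.5% stated under IMPORTANCE, and the rounded replichore lengths sum to within rounding error of the 1.816 Mb chromosome.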

Chloroplast genome sequence of the moss Tortula ruralis: gene content, polymorphism, and structural arrangement relative to other green plant chloroplast genomes

BMC Genomics ◽

10.1186/1471-2164-11-143 ◽

2010 ◽

Vol 11 (1) ◽

pp. 143 ◽

Cited By ~ 36

Author(s):

Melvin J Oliver ◽

Andrew G Murdock ◽

Brent D Mishler ◽

Jennifer V Kuehl ◽

Jeffrey L Boore ◽

...

Keyword(s):

Chloroplast Genome ◽

Genome Sequence ◽

Green Plant ◽

Gene Content ◽

Structural Arrangement ◽

Chloroplast Genomes ◽

Plant Chloroplast ◽

Tortula Ruralis ◽

Chloroplast Genome Sequence

Fungarium specimens: a largely untapped source in global change biology and beyond

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2017.0392 ◽

2018 ◽

Vol 374 (1763) ◽

pp. 20170392 ◽

Cited By ~ 7

Author(s):

Carrie Andrew ◽

Jeffrey Diez ◽

Timothy Y. James ◽

Håvard Kauserud

Keyword(s):

Global Change ◽

DNA Sequences ◽

Large Scale ◽

DNA Analysis ◽

Sampling Location ◽

Sources Of Information ◽

Dispersal Patterns ◽

Global Change Biology ◽

Richness Patterns ◽

Collection Date

For several hundred years, millions of fungal sporocarps have been collected and deposited in worldwide collections (fungaria) to support fungal taxonomy. Owing to large-scale digitization programs, metadata associated with the records are now becoming publicly available, including information on taxonomy, sampling location, collection date and habitat/substrate information. These metadata, as well as data extracted from the physical fungarium specimens themselves, such as DNA sequences and biochemical characteristics, provide a rich source of information not only for taxonomy but also for other lines of biological inquiry. Here, we highlight and discuss how this information can be used to investigate emerging topics in fungal global change biology and beyond. Fungarium data are a prime source of knowledge on fungal distributions and richness patterns, and for assessing red-listed and invasive species. Information on collection dates has been used to investigate shifts in fungal distributions as well as phenology of sporocarp emergence in response to climate change. In addition to providing material for taxonomy and systematics, DNA sequences derived from the physical specimens provide information about fungal demography and dispersal patterns, and are emerging as a source of genomic data. As DNA analysis technologies develop further, the importance of fungarium specimens as easily accessible sources of information will likely continue to grow. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the Anthropocene’.

ABySS 2.0: Resource-Efficient Assembly of Large Genomes using a Bloom Filter

10.1101/068338 ◽

2016 ◽

Cited By ~ 4

Author(s):

Shaun D Jackman ◽

Benjamin P Vandervalk ◽

Hamid Mohamadi ◽

Justin Chu ◽

Sarah Yeo ◽

...

Keyword(s):

Human Genome ◽

DNA Sequences ◽

Message Passing ◽

Large Scale ◽

De Novo ◽

Bloom Filter ◽

Genomic Variation ◽

De Bruijn Graph ◽

Single Individual ◽

Probabilistic Data Structure

Abstract The assembly of DNA sequences de novo is fundamental to genomics research. It is the first of many steps towards elucidating and characterizing whole genomes. Downstream applications, including analysis of genomic variation between species and between or within individuals, critically depend on robustly assembled sequences. In the span of a single decade, the sequence throughput of leading DNA sequencing instruments has increased drastically, and coupled with established and planned large-scale, personalized medicine initiatives to sequence genomes in the thousands and even millions, the development of efficient, scalable and accurate bioinformatics tools for producing high-quality reference draft genomes is timely. With ABySS 1.0, we originally showed that assembling the human genome using short 50 bp sequencing reads was possible by aggregating the half terabyte of compute memory needed over several computers using a standardized message-passing system (MPI). We present here its re-design, which departs from MPI and instead implements algorithms that employ a Bloom filter, a probabilistic data structure, to represent a de Bruijn graph and reduce memory requirements. We present assembly benchmarks of human Genome in a Bottle 250 bp Illumina paired-end and 6 kbp mate-pair libraries from a single individual, yielding an NG50 (NGA50) scaffold contiguity of 3.5 (3.0) Mbp using less than 35 GB of RAM, a modest memory requirement by today’s standard that is often available on a single computer. We also investigate the use of BioNano Genomics and 10x Genomics’ Chromium data to further improve the scaffold contiguity of this assembly to 42 (15) Mbp.
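
To make the central idea concrete, here is a minimal Python sketch of how a Bloom filter can stand in for the node set of a de Bruijn graph: k-mers are inserted into a bit array, and graph edges are recovered implicitly by querying the four possible one-base extensions of a k-mer. This is not the ABySS 2.0 implementation; the class, the hash construction (salted BLAKE2b from the standard library), and all parameter values are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash functions.

    Membership tests never give false negatives but may give false
    positives, which is the price paid for the small memory footprint.
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive independent hash values by salting BLAKE2b with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(item.encode(), digest_size=8,
                                     salt=seed.to_bytes(8, "little")).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_graph(reads, k, num_bits=1 << 16, num_hashes=3):
    """Insert all k-mers of all reads into a Bloom filter."""
    bloom = BloomFilter(num_bits, num_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bloom.add(kmer)
    return bloom


def successors(kmer, bloom):
    """Neighbours of a k-mer: the one-base extensions present in the filter."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


if __name__ == "__main__":
    graph = build_graph(["ACGTAC"], k=3)
    print(successors("ACG", graph))
```

Because the graph is never stored explicitly, memory use depends on the size of the bit array rather than on the number of edges; the trade-off is that occasional false-positive k-mers can create spurious branches that an assembler has to detect and prune.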

Global biogeography of living brachiopods: Bioregionalization patterns and possible controls

PLoS ONE ◽

10.1371/journal.pone.0259004 ◽

2021 ◽

Vol 16 (11) ◽

pp. e0259004

Author(s):

Facheng Ye ◽

G. R. Shi ◽

Maria Aleksandra Bitner

Keyword(s):

Large Scale ◽

Coastal Upwelling ◽

Seawater Temperature ◽

Regional Scale ◽

Distribution Patterns ◽

Global Scale ◽

Biodiversity Hotspots ◽

Network Analyses ◽

Ocean Gyres ◽

Ecological Variable

The global distribution patterns of 14918 geo-referenced occurrences from 394 living brachiopod species were mapped in 5° grid cells, which enabled the visualization and delineation of distinct bioregions and biodiversity hotspots. Further investigation using cluster and network analyses allowed us to propose the first systematically and quantitatively recognized global bioregionalization framework for living brachiopods, consisting of five bioregions and thirteen bioprovinces. No single environmental or ecological variable accounts for the newly proposed global bioregionalization patterns of living brachiopods. Instead, the combined effects of large-scale ocean gyres, climatic zonation as well as some geohistorical factors (e.g., formation of land bridges and geologically recent closure of ancient seaways) are considered the main drivers at the global scale. At the regional scale, however, the faunal composition, diversity and biogeographical differentiation appear to be mainly controlled by seawater temperature variation, regional ocean currents and coastal upwelling systems.
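
The mapping step described here, assigning geo-referenced occurrences to 5° grid cells, can be sketched in a few lines of Python. This is an illustrative assumption about how such binning might be done, not the authors' workflow; the function names are hypothetical and the occurrence records in the example are invented placeholders, not data from the study.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, size=5):
    """Return the south-west corner (lat, lon) of the grid cell
    of `size` degrees that contains the given coordinate."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def richness_per_cell(occurrences, size=5):
    """Count distinct species per grid cell.

    `occurrences` is an iterable of (species, latitude, longitude).
    """
    cells = defaultdict(set)
    for species, lat, lon in occurrences:
        cells[grid_cell(lat, lon, size)].add(species)
    return {cell: len(species_set) for cell, species_set in cells.items()}


if __name__ == "__main__":
    # Invented placeholder records, for illustration only.
    toy_occurrences = [
        ("species_a", -37.8, 144.9),
        ("species_b", -36.2, 143.1),
        ("species_a", -38.9, 141.0),
        ("species_a", 51.5, -0.1),
    ]
    print(richness_per_cell(toy_occurrences))
```

Using `math.floor` rather than integer truncation keeps negative latitudes and longitudes in the correct cell; the resulting cell-by-species table is the kind of input that cluster or network analyses would then operate on.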

Antigen Presentation of mRNA-Based and Virus-Vectored SARS-CoV-2 Vaccines

Vaccines ◽

10.3390/vaccines9080848 ◽

2021 ◽

Vol 9 (8) ◽

pp. 848

Author(s):

Ger T. Rijkers ◽

Nynke Weterings ◽

Andres Obregon-Henao ◽

Michaëla Lepolder ◽

Taru S. Dutt ◽

...

Keyword(s):

Antigen Presentation ◽

Conformational Changes ◽

DNA Sequences ◽

Neutralizing Antibodies ◽

Large Scale ◽

Viral Vector ◽

Adenovirus Vector ◽

Platelet Factor ◽

Immune Mediated ◽

Protein Encoding

Infection with Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) causes Coronavirus Disease 2019 (COVID-19), which has reached pandemic proportions. A number of effective vaccines have been produced, including mRNA vaccines and viral vector vaccines, which are now being implemented on a large scale in order to control the pandemic. The mRNA vaccines are composed of viral Spike S1 protein encoding mRNA incorporated in a lipid nanoparticle and stabilized by polyethylene glycol (PEG). The mRNA vaccines are novel in many respects, including cellular uptake and the intracellular routing, processing, and secretion of the viral protein. Viral vector vaccines have incorporated DNA sequences encoding the SARS-CoV-2 Spike protein into (attenuated) adenoviruses. The antigen presentation routes in MHC class I and class II, in relation to the induction of virus-neutralizing antibodies and cytotoxic T-lymphocytes, will be reviewed. In rare cases, mRNA vaccines induce unwanted immune-mediated side effects. The mRNA-based vaccines may lead to an anaphylactic reaction. This reaction may be triggered by PEG. The intracellular routing of PEG and potential presentation in the context of CD1 will be discussed. Adenovirus vector-based vaccines have been associated with thrombocytopenic thrombosis events. The anti-platelet factor 4 antibodies found in these patients could be generated due to conformational changes of relevant epitopes presented to the immune system.

A systematic comparison of eight new plastome sequences from Ipomoea L.

PeerJ ◽

10.7717/peerj.6563 ◽

2019 ◽

Vol 7 ◽

pp. e6563

Author(s):

Jianying Sun ◽

Xiaofeng Dong ◽

Qinghe Cao ◽

Tao Xu ◽

Mingku Zhu ◽

...

Keyword(s):

Comparative Analysis ◽

Next Generation Sequencing ◽

Chloroplast Genome ◽

DNA Sequences ◽

Repetitive Sequences ◽

Single Copy ◽

Next Generation ◽

Variable Regions ◽

Chloroplast Genomes ◽

Generation Sequencing

Background Ipomoea is the largest genus in the family Convolvulaceae. The species in this genus have been widely used in many fields, such as agriculture, nutrition, and medicine. With the development of next-generation sequencing, more than 50 chloroplast genomes of Ipomoea species have been sequenced. However, the repeats and divergence regions in Ipomoea have not been well investigated. In the present study, we sequenced and assembled eight chloroplast genomes from sweet potato’s close wild relatives. By combining these with 32 published chloroplast genomes, we conducted a detailed comparative analysis of a broad range of Ipomoea species. Methods Eight chloroplast genomes were assembled using short DNA sequences generated by next-generation sequencing technology. By combining these chloroplast genomes with 32 other published Ipomoea chloroplast genomes downloaded from GenBank and the Oxford Research Archive, we conducted a comparative analysis of the repeat sequences and divergence regions across the Ipomoea genus. In addition, separate analyses of the Batatas group and Quamoclit group were also performed. Results The eight newly sequenced chloroplast genomes ranged from 161,225 to 161,721 bp in length and displayed the typical circular quadripartite structure, consisting of a pair of inverted repeat (IR) regions (30,798–30,910 bp each) separated by a large single copy (LSC) region (87,575–88,004 bp) and a small single copy (SSC) region (12,018–12,051 bp). The average guanine-cytosine (GC) content was approximately 40.5% in the IR region, 36.1% in the LSC region, 32.2% in the SSC region, and 37.5% in the complete sequence for all the generated plastomes. The eight chloroplast genome sequences from this study included 80 protein-coding genes, four rRNAs (rrn23, rrn16, rrn5, and rrn4.5), and 37 tRNAs. The boundaries of single copy regions and IR regions were highly conserved in the eight chloroplast genomes. In Ipomoea, 57–89 pairs of repetitive sequences and 39–64 simple sequence repeats were found. By conducting a sliding window analysis, we found six relatively high variable regions (ndhA intron, ndhH-ndhF, ndhF-rpl32, rpl32-trnL, rps16-trnQ, and ndhF) in the Ipomoea genus, eight (trnG, rpl32-trnL, ndhA intron, ndhF-rpl32, ndhH-ndhF, ccsA-ndhD, trnG-trnR, and pasA-ycf3) in the Batatas group, and eight (ndhA intron, petN-psbM, rpl32-trnL, trnG-trnR, trnK-rps16, ndhC-trnV, rps16-trnQ, and trnG) in the Quamoclit group. Our maximum-likelihood tree based on whole chloroplast genomes confirmed the phylogenetic topology reported in previous studies. Conclusions The chloroplast genome sequence and structure were highly conserved in the eight newly-sequenced Ipomoea species. Our comparative analysis included a broad range of Ipomoea chloroplast genomes, providing valuable information for Ipomoea species identification and enhancing the understanding of Ipomoea genetic resources.
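
Two of the quantities reported in this abstract lend themselves to a short worked example: GC content, and the relationship between the quadripartite regions and total plastome length (LSC + SSC + two IR copies). The Python sketch below is illustrative only; the function names are assumptions, and the region lengths are the range endpoints quoted above rather than per-genome values.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Range endpoints quoted in the abstract (bp).
LSC_RANGE = (87_575, 88_004)
SSC_RANGE = (12_018, 12_051)
IR_RANGE = (30_798, 30_910)
REPORTED_TOTAL_RANGE = (161_225, 161_721)

# Smallest and largest totals that the region ranges allow.
lower_bound = plastome_length(LSC_RANGE[0], SSC_RANGE[0], IR_RANGE[0])
upper_bound = plastome_length(LSC_RANGE[1], SSC_RANGE[1], IR_RANGE[1])

if __name__ == "__main__":
    print(f"GC content of GGCCAATT: {gc_content('GGCCAATT'):.1%}")
    print(f"possible total length: {lower_bound}-{upper_bound} bp")
    print(f"reported total length: {REPORTED_TOTAL_RANGE[0]}-"
          f"{REPORTED_TOTAL_RANGE[1]} bp")
```

The reported genome lengths fall inside the interval implied by the region ranges, as expected, since each genome's regions need not all sit at the same end of their respective ranges.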
