illumina sequencing data
Recently Published Documents


TOTAL DOCUMENTS

21
(FIVE YEARS 6)

H-INDEX

5
(FIVE YEARS 1)

2021 ◽  
Vol 7 (8) ◽  
Author(s):  
Stephen J. Bush

Minimizing false positives is a critical issue when variant calling as no method is without error. It is common practice to post-process a variant-call file (VCF) using hard filter criteria intended to discriminate true-positive (TP) from false-positive (FP) calls. These are applied on the simple principle that certain characteristics are disproportionately represented among the set of FP calls and that a user-chosen threshold can maximize the number detected. To provide guidance on this issue, this study empirically characterized all false SNP and indel calls made using real Illumina sequencing data from six disparate species and 166 variant-calling pipelines (the combination of 14 read aligners with up to 13 different variant callers, plus four ‘all-in-one’ pipelines). We did not seek to optimize filter thresholds but instead to draw attention to those filters of greatest efficacy and the pipelines to which they may most usefully be applied. In this respect, this study acts as a coda to our previous benchmarking evaluation of bacterial variant callers, and provides general recommendations for effective practice. The results suggest that, of the pipelines analysed in this study, the most straightforward way of minimizing false positives would simply be to use Snippy. We also find that a disproportionate number of false calls, irrespective of the variant-calling pipeline, are located in the vicinity of indels, and highlight this as an issue for future development.
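As a rough illustration of what hard-filtering a VCF involves, the sketch below keeps only records whose QUAL and INFO/DP values meet user-chosen cut-offs. The function names (`parse_info`, `passes_hard_filters`, `hard_filter`) and the default thresholds are assumptions made for this example; they are not taken from the study, which explicitly did not seek to optimize thresholds.

```python
from typing import Dict, Iterable, List


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column such as "DP=35;MQ=60" into a dict.

    Flag-style entries without "=" (e.g. "INDEL") map to an empty string.
    """
    fields: Dict[str, str] = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def passes_hard_filters(line: str, min_qual: float = 30.0, min_depth: int = 10) -> bool:
    """Return True if a tab-separated VCF record meets both thresholds.

    The thresholds are illustrative placeholders, not recommended values.
    """
    cols = line.rstrip("\n").split("\t")
    qual = cols[5]                # column 6 of a VCF record is QUAL
    info = parse_info(cols[7])    # column 8 of a VCF record is INFO
    if qual == "." or float(qual) < min_qual:
        return False
    depth = info.get("DP")
    if depth is None or int(depth) < min_depth:
        return False
    return True


def hard_filter(lines: Iterable[str], **thresholds) -> List[str]:
    """Keep header lines unchanged and drop records that fail the filters."""
    kept = []
    for line in lines:
        if line.startswith("#") or passes_hard_filters(line, **thresholds):
            kept.append(line)
    return kept
```

Records with a missing QUAL or no DP annotation are dropped here, which is a conservative design choice; a real workflow might prefer to flag them in the FILTER column instead.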


2020 ◽  
Vol 9 (30) ◽  
Author(s):  
Victoria N. Lydick ◽  
Douglas B. Rusch ◽  
Blake Ushijima ◽  
Julia C. van Kessel

Here, we report the complete genome sequence of Vibrio coralliilyticus OCN008, a marine bacterium that infects reef-building coral. Previous sequencing efforts yielded an incomplete sequence (210 contigs). We used Nanopore and Illumina sequencing data to obtain complete sequences of the two circular chromosomes (3.48 and 1.91 Mb) and one megaplasmid (244.69 kb).


Author(s):  
Alexandre Hassanin

Viruses closely related to SARS-CoV-2, the virus responsible for the COVID-19 pandemic, were sequenced in several Sunda pangolins (Manis javanica) seized in the Guangdong and Guangxi provinces of China between 2017 and 2019 [1-3]. These viruses belong to two lineages: one from Guangdong (GD/P) and the other from Guangxi (GX/P). The GD/P viruses are particularly intriguing because the amino-acid sequence of the receptor-binding domain of their spike protein is very similar to that of human SARS-CoV-2 (97.4%) [2]. This characteristic suggests that GD/P viruses are capable of binding the human ACE2 receptor and may therefore be able to mediate infection of human cells. Whereas all six GX/P genomes were deposited as annotated sequences in GenBank, neither of the two GD/P genomes assembled in previous studies [2,3] is currently available. To overcome this absence, I assembled these genomes from the Sequence Read Archive (SRA) data available for SARS-CoV-2-like viruses detected in five captive pangolins from Guangdong. I found the GD/P genome assemblies to be of poor quality, with high levels of missing data. Additionally, unexpected reads were identified in the Illumina sequencing data. The GD/P2S dataset [2] contains reads that are identical to SARS-CoV-2, suggesting either the coexistence of two SARS-CoV-2-like viruses in the same pangolin or contamination by the human virus. In the four other GD/P datasets [1], many mitochondrial reads from pangolin were identified, as well as reads from three other species: human, mouse and tiger. Importantly, I identified only three polymorphic nucleotide sites among the five GD/P sequences. Such low levels of polymorphism may reasonably be accounted for by sequencing errors alone, thus raising the possibility that the five pangolins seized in Guangdong in March 2019 were infected by the same virus strain, most probably during their captivity.
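Counting polymorphic sites among a handful of genome sequences can be sketched as follows. This is a minimal illustration, not the author's method: it assumes the sequences are already aligned to equal length, and the helper name `polymorphic_sites` and the choice to treat `N`, `-` and `?` as missing data are assumptions made for the example.

```python
from typing import List, Sequence

# Characters treated as missing data rather than as alleles (an assumption).
MISSING = set("N-?")


def polymorphic_sites(aligned: Sequence[str]) -> List[int]:
    """Return 0-based alignment columns with more than one observed base.

    Missing-data characters are ignored, so a column covered by a single
    called base plus gaps or Ns is not counted as polymorphic.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos in range(length):
        observed = {seq[pos].upper() for seq in aligned} - MISSING
        if len(observed) > 1:
            sites.append(pos)
    return sites
```

Ignoring missing positions matters for assemblies with many gaps, because otherwise every gapped column would look variable.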


2019 ◽  
Author(s):  
A. Marieke Oudelaar ◽  
Jim R. Hughes ◽  
Damien J. Downes

Tri-C is a Chromosome Conformation Capture (3C) approach, which can very efficiently identify multi-way chromatin interactions at individual alleles with selected viewpoints of interest at high resolution. Tri-C allows for multiplexing both viewpoints and samples. As identification of multi-way interactions relies on Illumina sequencing, data can be generated at great depth and PCR duplicates (based on identical sonication ends) can accurately be removed, allowing for high-throughput, quantitative analysis of multi-way chromatin interactions.
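The idea of removing PCR duplicates by their sonication ends can be sketched generically: two fragments with the same chromosome, start and end coordinates are treated as copies of one original molecule, and only the first is kept. The `Fragment` type and `remove_duplicates` function are hypothetical names for this illustration and do not reproduce the Tri-C analysis pipeline itself.

```python
from typing import Iterable, List, NamedTuple


class Fragment(NamedTuple):
    """A mapped fragment; the fields are assumptions for this sketch."""
    read_id: str
    chrom: str
    start: int  # coordinate of one sonication end
    end: int    # coordinate of the other sonication end


def remove_duplicates(fragments: Iterable[Fragment]) -> List[Fragment]:
    """Keep the first fragment seen for each (chrom, start, end) combination."""
    seen = set()
    unique = []
    for frag in fragments:
        key = (frag.chrom, frag.start, frag.end)
        if key not in seen:
            seen.add(key)
            unique.append(frag)
    return unique
```

Because sonication breaks DNA at effectively random positions, independent molecules rarely share both ends, which is why identical end coordinates are a reasonable duplicate signal.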


2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Mahdi Heydari ◽  
Giles Miclotte ◽  
Yves Van de Peer ◽  
Jan Fostier

Author(s):  
Pierre Taberlet ◽  
Aurélie Bonin ◽  
Lucie Zinger ◽  
Eric Coissac

DNA metabarcoding generates huge amounts of data containing noise introduced by molecular methods. Chapter 8 “DNA metabarcoding data analysis” discusses the analytic steps and available software to curate and evaluate DNA metabarcoding data prior to final ecological analyses. It provides command lines to perform primary analyses of Illumina sequencing data with the OBITools, ranging from read assignment to samples to the formation of molecular operational taxonomic units (MOTUs) and their assignment to a taxon through comparison against reference databases. Chapter 8 also develops several methods to further curate sequencing data from contaminants or dysfunctional PCRs by using DNA extraction, PCR, and sequencing blank controls as well as PCR/biological replicates. It also presents several classical analyses to ensure that the diversity of the sample or the study site is appropriately covered. Finally, this chapter considers what conclusions on biodiversity and ecological processes can be really drawn from metabarcoding data.


2018 ◽  
Author(s):  
Shengcai Liu ◽  
Liyun Peng ◽  
Junfei Pan ◽  
Xiao Wang ◽  
Chunli Zhao ◽  
...  

Betalains are abundant in amaranth plants. Additionally, the betalain molecular structure and metabolic pathway differ from those of betanin in beet plants. To date, only a few studies have examined the regulatory roles of miRNAs in betalain biosynthesis in plants. Thus, we constructed small RNA libraries for the red and green sectors of amaranth leaves to identify miRNAs associated with betalain biosynthesis. We identified 198 known and 41 novel miRNAs. Moreover, 216 miRNAs were distributed in 44 miRNA families, including miR156, miR159, miR160, miR166, miR172, miR319, miR167, miR396, and miR398. An analysis of all unigene sequences in an amaranth transcriptome database resulted in the detection of 493 target genes for the 239 screened miRNAs. The targets included SPL2, ARF18, ARF6, and NAC. A quantitative real-time polymerase chain reaction validation of 20 miRNAs and nine target genes revealed expression-level differences between the red and green sectors of amaranth leaves. This study involved the application of an Illumina sequencing platform to identify miRNAs regulating betalain metabolism in amaranth plants. The data presented herein may provide insights into the molecular mechanisms underlying the regulation of betalain biosynthesis in amaranth and other plant species.

