illumina data
Recently Published Documents

2021 ◽  
Vol 10 (1) ◽  
pp. 17
Author(s):  
Silvia Schiavon ◽  
Mauro Paolini ◽  
Raffaele Guzzon ◽  
Andrea Mancini ◽  
Roberto Larcher ◽  
...  

Bacteria can play different roles affecting flavors and food characteristics. Few studies have described the bacterial microbiota of butter. In the present paper, next-generation sequencing was used to determine bacterial diversity, together with aromatic characteristics, in raw cow milk butter processed by traditional fermentation, in fourteen small farms called “Malga”, located in the Trentino province (Alpine region, North-East of Italy). The physicochemical and aromatic characterization of traditional mountain butter (TMB) showed a low moisture level depending on the Malga producing the butter. Counts of lactic acid bacteria, Staphylococci, and coliforms, as well as diacetyl/acetoin concentrations exhibited changes according to the geographical origin of Malga and the residual humidity of butter. MiSeq Illumina data analysis revealed that the relative abundance of Lactococcus was higher in TMB samples with the highest values of acetoin (acetoin higher than 10 mg/kg). The traditional mountain butter bacterial community was characterized by a “core dominance” of psychrotrophic genera, mainly Acinetobacter and Pseudomonas, but according to ANCOM analysis, a complex bacterial population emerged and specific bacterial genera were able to characterize the TMB bacteria community, with their high abundance, based on the Malga producing the butter.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Wentao Ye ◽  
Wei Xu ◽  
Nan Xu ◽  
Rong Chen ◽  
Changhu Lu ◽  
...  

Abstract The red-crowned crane (Grus japonensis) is an endangered species distributed across southeast Russia, northeast China, Korea, and Japan. Here, we sequenced for the first time the full-length unreferenced transcriptome of red-crowned crane mixed samples using a PacBio Sequel platform. A total of 359,136 circular consensus sequences (CCS) were obtained via clustering to remove redundancy. A total of 303,544 full-length non-chimeric sequences were identified by judging whether CCS contained 5′ and 3′ adapters, and the poly(A) tail. Eight samples were sequenced using Illumina, and PacBio sequencing data were corrected according to the collected Illumina data to obtain more accurate full-length transcripts. A total of 4,100 long non-coding RNAs (lncRNAs), 13,115 simple sequence repeat loci and 29 transcription factor (TF) families were identified. The expression of lncRNAs and TFs was lowest in the pancreas compared with other tissues. Many enriched immune-related transmission pathways (MHC and IL receptors) were identified in the spleen. This study will contribute to a better understanding of the gene structure and post-transcriptional regulatory network, and provide references for future studies on red-crowned cranes.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Rhys A. Farrer

Abstract Background Identifying haplotypes is central to sequence analysis in diploid or polyploid genomes. Despite this, there remains a lack of research and tools designed for physical phasing and its downstream analysis. Results HaplotypeTools is a new toolset to phase variant sites using VCF and BAM files and to analyse phased VCFs. Phasing is achieved via the identification of reads overlapping ≥ 2 heterozygous positions and then extended by additional reads, a process that can be parallelized across a computer cluster. HaplotypeTools includes various utility scripts for downstream analysis including crossover detection and phylogenetic placement of haplotypes to other lineages or species. HaplotypeTools was assessed for accuracy against WhatsHap using simulated short and long reads, demonstrating higher accuracy, albeit with reduced haplotype length. HaplotypeTools was also tested on real Illumina data to determine the ancestry of hybrid fungal isolate Batrachochytrium dendrobatidis (Bd) SA-EC3, finding 80% of haplotypes across the genome phylogenetically cluster with parental lineages BdGPL (39%) and BdCAPE (41%), indicating those are the parental lineages. Finally, ~ 99% of phasing was conserved between overlapping phase groups between SA-EC3 and either parental lineage, indicating mitotic gene conversion/parasexuality as the mechanism of recombination for this hybrid isolate. HaplotypeTools is open source and freely available from https://github.com/rhysf/HaplotypeTools under the MIT License. Conclusions HaplotypeTools is a powerful resource for analyzing hybrid or recombinant diploid or polyploid genomes and identifying parental ancestry for sub-genomic regions.
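The phasing idea described in this abstract (reads overlapping ≥ 2 heterozygous positions link those positions, and further reads extend the group) can be illustrated with a minimal Python sketch. This is not HaplotypeTools' implementation: the function name `phase_groups`, the representation of a read as a dict of position to allele, and the use of a union-find structure are all assumptions made for illustration.

```python
from collections import defaultdict


def phase_groups(reads, het_positions):
    """Group heterozygous positions that are linked by shared reads.

    reads: list of dicts mapping genomic position -> observed allele
           (hypothetical representation, not a BAM record).
    het_positions: set of positions called heterozygous.
    Returns a sorted list of sorted position lists, one per phase group.
    """
    parent = {}

    def find(x):
        # Follow parent links to the group representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for read in reads:
        covered = sorted(p for p in read if p in het_positions)
        # Only reads spanning at least two heterozygous sites carry phase information.
        if len(covered) < 2:
            continue
        for p in covered:
            parent.setdefault(p, p)
        # Linking neighbours is enough to join every site on the read.
        for a, b in zip(covered, covered[1:]):
            union(a, b)

    groups = defaultdict(list)
    for p in parent:
        groups[find(p)].append(p)
    return sorted(sorted(g) for g in groups.values())


example_reads = [
    {100: "A", 200: "C"},   # links 100 and 200
    {200: "C", 300: "G"},   # extends the group to 300
    {500: "T"},             # single site: no phase information
    {600: "A", 700: "G"},   # a separate phase group
]
example_groups = phase_groups(example_reads, {100, 200, 300, 500, 600, 700})
```

In this toy input the second read extends the first group through the shared position 200, while position 500 stays unphased because no read connects it to another heterozygous site. The sketch only groups positions; assigning alleles to each haplotype within a group is left out.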


2021 ◽  
Author(s):  
Mantas Sereika ◽  
Rasmus Hansen Kirkegaard ◽  
Søren Michael Karst ◽  
Thomas Yssing Michaelsen ◽  
Emil Aarre Sørensen ◽  
...  

Short-read DNA sequencing has led to a massive growth of genome databases but mainly with highly fragmented metagenome assembled genomes from environmental systems. The fragmentation is a result of closely related species, strains, and genome repeats that cannot be resolved with short reads. To confidently explore the functional potential of a microbial community, high-quality reference genomes are needed. In this study, we evaluated the use of different combinations of short (Illumina) and long-read technologies (Nanopore R9.4, R10.3, and PacBio CCS) for recovering high-quality metagenome assembled genomes (HQ MAGs) from a complex microbial community (anaerobic digester). Depending on the sequencing approach, 33 to 86 HQ MAGs (encompassing up to 34 % of the assembly and 49 % of the reads) were recovered using long reads, with Nanopore R9 featuring the lowest sequencing costs per HQ MAG recovered. PacBio CCS was also found to be an effective platform for genome-centric metagenomics (74 HQ MAGs) and produced HQ MAGs with the lowest fragmentation (median of 9 contigs) as a stand-alone technology. Using PacBio CCS MAGs as reference, we show that, although a high number of high-quality MAGs can be generated using Nanopore R9, systematic indel errors are still present, which can lead to truncated gene calling. However, polishing the Nanopore MAGs with short-read Illumina data enabled recovery of MAGs of similar quality to MAGs from PacBio CCS.


2021 ◽  
Author(s):  
James K Bonfield

Motivation: CRAM has established itself as a high-compression alternative to the BAM file format for DNA sequencing data. We describe updates to further improve this on modern sequencing instruments. Results: With Illumina data, CRAM 3.1 is 7 to 15% smaller than the equivalent CRAM 3.0 file, and 50 to 70% smaller than the corresponding BAM file. Long-read technology shows more modest compression due to the presence of high-entropy signals. Availability: The CRAM 3.0 specification is freely available from https://samtools.github.io/hts-specs/CRAMv3.pdf. The CRAM 3.1 improvements are available from https://github.com/samtools/hts-specs/pull/433, with open-source implementations in HTSlib and HTScodecs.
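To make the quoted percentages concrete, the following Python sketch turns them into rough size estimates. The function names and the 100 GB input are hypothetical, and the ranges are simply the figures from the abstract applied as multipliers; real results depend on the data and the encoder settings.

```python
def cram31_size_range(bam_size_gb):
    """Estimated CRAM 3.1 size, given that it is 50 to 70% smaller than BAM."""
    smallest = bam_size_gb * (1 - 0.70)
    largest = bam_size_gb * (1 - 0.50)
    return smallest, largest


def implied_cram30_range(cram31_size_gb):
    """CRAM 3.0 size implied by CRAM 3.1 being 7 to 15% smaller than it."""
    smallest = cram31_size_gb / (1 - 0.07)
    largest = cram31_size_gb / (1 - 0.15)
    return smallest, largest


# Hypothetical 100 GB Illumina BAM file, used only as an illustration.
cram31_low, cram31_high = cram31_size_range(100.0)
cram30_low, cram30_high = implied_cram30_range(cram31_low)
```

For the hypothetical 100 GB BAM file, the estimate for CRAM 3.1 falls between roughly 30 and 50 GB.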


2021 ◽  
Author(s):  
Marc-André Lemay ◽  
Jonas A. Sibbesen ◽  
Davoud Torkamaneh ◽  
Jérémie Hamel ◽  
Roger C. Levesque ◽  
...  

Background: Structural variant (SV) discovery based on short reads is challenging due to their complex signatures and tendency to occur in repeated regions. The increasing availability of long-read technologies has greatly facilitated SV discovery; however, these technologies remain too costly to apply routinely to population-level studies. Here, we combined short-read and long-read sequencing technologies to provide a comprehensive population-scale assessment of structural variation in a panel of Canadian soybean cultivars. Results: We used Oxford Nanopore sequencing data (~12X mean coverage) for 17 samples to both benchmark SV calls made from the Illumina data and predict SVs that were subsequently genotyped in a population of 102 samples using Illumina data. Benchmarking results show that variants discovered using Oxford Nanopore can be accurately genotyped from the Illumina data. We first use the genotyped SVs for population structure analysis and show that results are comparable to those based on single-nucleotide variants. We observe that the population frequency and distribution within the genome of SVs are constrained by the location of genes. Gene Ontology and PFAM domain enrichment analyses also confirm previous reports that genes harboring high-frequency SVs are enriched for functions in defense response. Finally, we discover polymorphic transposable elements from the SVs and report evidence of the recent activity of a Stowaway MITE. Conclusions: Our results demonstrate that long-read and short-read sequencing technologies can be efficiently combined to enhance SV analysis in large populations, providing a reusable framework for their study in a wider range of samples and non-model species.


2021 ◽  
Vol 10 (21) ◽  
Author(s):  
Jason E. Stajich ◽  
Andrea L. Vu ◽  
Howard S. Judelson ◽  
Gregory M. Vogel ◽  
Michael A. Gore ◽  
...  

The oomycete Phytophthora capsici is a destructive pathogen of a wide range of vegetable hosts, especially peppers and cucurbits. A 94.17-Mb genome assembly was constructed using PacBio and Illumina data and annotated with support from transcriptome sequencing (RNA-Seq) reads.


2021 ◽  
Vol 10 (7) ◽  
Author(s):  
Ashley V. Baugh ◽  
Thomas M. Howarth ◽  
Katrina L. West ◽  
Lydia E. J. Kerr ◽  
John Love ◽  
...  

ABSTRACT Weissella paramesenteroides has potential as an industrial biocatalyst due to its ability to produce lactic acid. A novel strain of W. paramesenteroides was isolated from ensiled sorghum. The genome was sequenced using a hybrid assembly of Oxford Nanopore and Illumina data to produce a 2-Mbp genome and 22-kbp plasmid sequence.


Author(s):  
Xinghua He ◽  
Zhilin Yuan

Abstract The novel DSE Laburnicola rhizohalophila (Pleosporales, Ascomycota) is frequently found in the halophytic seepweed (Suaeda salsa). In this paper, we report a near-chromosome-level hybrid assembly of this fungus, in which short-read Illumina data were used to polish assemblies generated from long-read Nanopore data. The reference genome for L. rhizohalophila was assembled into 26 scaffolds with a total length of 64.0 Mb and an N50 length of 3.15 Mb. Of them, 17 scaffolds approached the length of intact chromosomes, and 5 had telomeres at one end only. A total of 10,891 gene models were predicted. Intriguingly, 27.5 Mb of repeat sequences, accounting for 42.97% of the genome, were identified, and long terminal repeat retrotransposons were the most frequent known transposable elements (TEs), indicating that TE proliferation contributes to its increased genome size. BUSCO analyses using the Fungi_odb10 dataset showed that 95.0% of genes were complete. In addition, 292 carbohydrate active enzymes, 33 secondary metabolite clusters, and 84 putative effectors were identified in silico. The resulting high-quality assembly and genome features are not only an important resource for further research on understanding the mechanism of root-fungi symbiotic interactions, but will also contribute to comparative analyses of genome biology and evolution within Pleosporalean species.
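The N50 figure reported in this abstract follows a standard definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly length. A minimal Python sketch is given below; the scaffold lengths in the example are made-up toy values, not the L. rhizohalophila scaffolds, and the helper names are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # The first scaffold that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


def repeat_percentage(repeat_mb, genome_mb):
    """Share of the genome made up of repeats, in percent."""
    return repeat_mb / genome_mb * 100


toy_n50 = n50([10, 8, 6, 4, 2])  # toy lengths: total 30, half 15
reported_repeat_pct = round(repeat_percentage(27.5, 64.0), 2)
```

With the toy lengths, the two longest scaffolds sum to 18, which passes the halfway mark of 15, so the N50 is 8. Applying the second helper to the figures quoted in the abstract (27.5 Mb of repeats in a 64.0 Mb assembly) reproduces the reported 42.97%.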

