paired end sequencing
Recently Published Documents


TOTAL DOCUMENTS

199
(FIVE YEARS 68)

H-INDEX

27
(FIVE YEARS 4)

2021 ◽  
Author(s):  
Zhi-Jin Liu ◽  
Xiong-Fei Zhang ◽  
Hua-Mei Wen ◽  
Ling Han ◽  
Jiang Zhou

Abstract Background Loaches from the superfamily Cobitoidea (Cypriniformes, Nemacheilidae) are small, elongated, bottom-dwelling freshwater fishes with several barbels near the mouth, and some loach species inhabit underground drainage systems. The genus Oreonectes, with 18 currently recognized loach species, represents the three key stages of the evolutionary process (a surface-dwelling lifestyle, facultative cave persistence, and permanent cave dwelling). Some Oreonectes species show typical cave dwelling-related traits, such as partial or complete leucism and regression of the eyes, making them suitable study objects for micro-evolution. Genome information for Oreonectes species is therefore an indispensable resource for research on the evolution of cavefishes. Result We assembled the genome sequence of O. shuilongensis, a surface-dwelling species, using an integrated approach that combined PacBio single-molecule real-time sequencing and Illumina X-ten paired-end sequencing. The genome assembly contains 803 contigs with an N50 of 5.58 Mb. A total of 25,247 protein-coding genes were predicted, of which 95.65% have been functionally annotated. We also found that dozens of genes related to eye development and melanogenesis were pseudogenised during evolution in the cave environment, providing novel insights into complex phenotypic adaptations of animals to specific environments. Conclusion Here we report the first draft genome assembly of Oreonectes fishes, which is also the first genome reference for Cobitidea fishes. This genome assembly will contribute to the study of the evolution and adaptation of cavefishes within Oreonectes and beyond (Cobitidea) and provide valuable genomic resources for studies on the evolutionary history of the rapid speciation processes of the family Nemacheilidae.
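
The contig N50 quoted in this abstract is a standard assembly-contiguity statistic: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The following is a minimal sketch of that calculation; the function name and the toy contig lengths are hypothetical and are not data from the O. shuilongensis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bases), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

For a real assembly the lengths would normally be read from the FASTA file or its index rather than typed in by hand.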


2021 ◽  
Author(s):  
Preeti P ◽  
Robin Sinha ◽  
Kamal Rawal

Background: Mobile genetic elements (MGEs) comprise a major portion of the human genome and are essential for genetic diversity. These elements are known to be capable of inducing mutations in the human genome. To date, several MGE insertions have been reported to be associated with cancer. We aim to use next-generation genome sequencing data and appropriate bioinformatics tools to accurately identify the insertion sites of MGEs in the human genome. Results: Herein, we introduce the MeX pipeline for the localization and annotation of MGEs in paired-end sequencing data. It requires the reference genome sequence, MGE sequences and paired-end sequencing reads. We evaluated MeX on high-depth (>75×) Illumina HiSeq data produced at the Broad Institute (NA12878) against human genome build 38 (including only chromosomes 1, 2 and 3) and Alu elements. We identified 78 reference and 1 non-reference Alu insertions in the NA12878 sample. Upon annotation, the non-reference Alu element was found to lie in the 3' UTR region of the RNF2 gene. Of the 78 reference insertions, 42 were in intronic regions, 7 in upstream regions, 5 in downstream regions and 1 in a 3' UTR region; the rest were not associated with any gene. MeX showed high performance for the identification and annotation of MGEs in genome samples. Conclusion: This study showed that MeX is a robust and powerful tool for the identification and annotation of MGE insertions. It may also serve as a valuable tool to study the phenotypic changes resulting from transpositional events in cancer genomics.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Rune Nielsen ◽  
Yaxin Xue ◽  
Inge Jonassen ◽  
Ingvild Haaland ◽  
Øyvind Kommedal ◽  
...  

Abstract Objective Little is known concerning the stability of the lower airway microbiome. We have compared the microbiota identified by repeated bronchoscopy in healthy subjects and patients with obstructive lung diseases (OLD). Methods 21 healthy controls and 41 patients with OLD completed two bronchoscopies. In addition to negative controls (NCS) and oral wash (OW) samples, we gathered protected bronchoalveolar lavage in two fractions (PBAL1 and PBAL2) and protected specimen brushes (PSB). After DNA extraction, we amplified the V3V4 region of the 16S rRNA gene and performed paired-end sequencing (Illumina MiSeq). Initial bioinformatic processing was carried out in the QIIME-2 pipeline, identifying amplicon sequence variants (ASVs) with the DADA2 algorithm. Potentially contaminating ASVs were identified and removed using the decontam package in R and the sequenced NCS. Results A final table of 551 ASVs consisted of 19 × 10^6 sequences. Alpha diversity was lower in the second exam for OW samples, and borderline lower for PBAL1, with larger differences in subjects who had not received intercurrent antibiotics. Permutational tests of beta diversity indicated that within-individual changes were significantly lower than between-individual changes. A non-parametric trend test showed that differences in composition between the two exams (beta diversity) were largest in the PSBs, and that these differences followed a pattern of PSB > PBAL2 > PBAL1 > OW. Time between procedures was not associated with increased diversity. Conclusion The airways microbiota varied between examinations. However, there is compositional microbiota stability within a person, beyond that of chance, supporting the notion of a transient airways microbiota with a possibly more stable individual core microbiome.


2021 ◽  
Vol 71 (1) ◽  
Author(s):  
Meixiao Wu ◽  
Yuehua Wang ◽  
Yijing Wang ◽  
Xuefei Wang ◽  
Ming Yu ◽  
...  

Abstract Purpose To investigate the diversity of the epiphytic bacteria on corn (Zea mays) and alfalfa (Medicago sativa) collected in Hengshui City and Xingtai City, Hebei Province, China, and explore crops suitable for natural silage. Methods The Illumina MiSeq/NovaSeq high-throughput sequencing system was used to conduct paired-end sequencing of the community DNA fragments from the surface of corn and alfalfa collected in Hengshui and Xingtai. QIIME2 and R software were used to sort and calculate the number of sequences and taxonomic units for each sample. Thereafter, the alpha and beta diversity indices at the species level were calculated, and the abundance and distribution of taxa were analyzed and compared between samples. Result At the phylum level, the dominant groups were Proteobacteria (70%), Firmicutes (13%), Actinobacteria (9%), and Bacteroidetes (7%). Meanwhile, the dominant genera were Pseudomonas (8%), Acinetobacter (4%), Chryseobacterium (3%), and Hymenobacter (1%). Enterobacteriaceae (24%) were the most predominant bacteria in both the corn and alfalfa samples. Alpha and beta diversity analyses revealed that the diversity of epiphytic microbial communities was significantly affected by plant species but not by region. The diversity and richness of the epiphytic bacterial community of alfalfa were significantly higher than those of corn. Conclusion This study contributes to the expanding knowledge on the diversity of epiphytic bacteria in corn and alfalfa silage and provides a basis for the selection of raw materials.
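
The abstract does not say which alpha and beta diversity indices were used. As an illustration only, the sketch below computes two common choices from per-sample count vectors: the Shannon index (alpha diversity within one sample) and Bray-Curtis dissimilarity (beta diversity between two samples). The function names and the toy counts are hypothetical; the study's own values came from QIIME2 and R.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    if len(a) != len(b):
        raise ValueError("samples must list the same taxa in the same order")
    denom = sum(a) + sum(b)
    if denom == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / denom


# Hypothetical taxon counts for two samples (same taxon order in both).
corn = [120, 30, 10, 0]
alfalfa = [60, 50, 40, 30]
print(shannon_index(corn), shannon_index(alfalfa))
print(bray_curtis(corn, alfalfa))
```

Bray-Curtis is 0 for identical samples and 1 for samples that share no taxa, which is why it suits comparisons between plant species or regions.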


PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009758
Author(s):  
Sofie Claerhout ◽  
Paulien Verstraete ◽  
Liesbeth Warnez ◽  
Simon Vanpaemel ◽  
Maarten Larmuseau ◽  
...  

Male-specific Y-chromosome (chrY) polymorphisms are interesting components of the DNA for population genetics. While single nucleotide polymorphisms (Y-SNPs) indicate distant evolutionary ancestry, short tandem repeats (Y-STRs) are able to identify close familial kinships. Detailed chrY analysis thus provides both biogeographical background information and paternal lineage identification. The rapid advancement of high-throughput massive parallel sequencing (MPS) technology in the past decade has revolutionized genetic research. Using MPS, single-base information of both Y-SNPs and Y-STRs can be analyzed in a single assay typing multiple samples at once. In this study, we present the first extensive chrY-specific targeted resequencing panel, the ‘CSYseq’, which simultaneously identifies slow mutating Y-SNPs as evolution markers and rapid mutating Y-STRs as patrilineage markers. The panel was validated by paired-end sequencing of 130 males, distributed over 65 deep-rooted pedigrees covering 1,279 generations. The CSYseq successfully targets 15,611 Y-SNPs, including 9,014 phylogenetically informative Y-SNPs, to identify 1,443 human evolutionary Y-subhaplogroup lineages worldwide. In addition, the CSYseq properly targets 202 Y-STRs, including 81 slow, 68 moderate, 27 fast and 26 rapid mutating Y-STRs, to individualize close paternal relatives. The targeted chrY markers combine a high average number of reads (Y-SNP = 717, Y-STR = 150) with easy interpretation, powerful discrimination capacity and chrY specificity. The CSYseq is interesting for research on different time scales: to identify evolutionary ancestry, to find distant family and to discriminate closely related males. Therefore, this panel serves as a unique tool valuable for a wide range of genetic-genealogical applications in interdisciplinary research within evolutionary, population, molecular, medical and forensic genetics.


2021 ◽  
Vol 7 (9) ◽  
pp. 701
Author(s):  
Kanti Kiran ◽  
Hukam C. Rawal ◽  
Himanshu Dubey ◽  
Rajdeep Jaswal ◽  
Subhash C. Bhardwaj ◽  
...  

Diseases caused by Puccinia graminis are some of the most devastating diseases of wheat. Extensive genomic understanding of the pathogen has proven helpful not only in understanding host-pathogen interaction but also in finding appropriate control measures. In the present study, whole-genome sequencing of four diverse P. graminis pathotypes was performed to understand their genetic variation and evolution. An average of 63.5 Gb of data per pathotype, with about 100× average genomic coverage, was obtained by 100-base paired-end sequencing on the Illumina HiSeq 1000. Genome structural annotations collectively predicted 9273 functional proteins, including ~583 extracellular secreted proteins. Approximately 7.4% of the genes showed similarity with entries in the PHI database, which suggests their significance in pathogenesis. Genome-wide analysis indicated that pathotype 117-6 is likely distinct and descended through a different lineage. Its 3–6% more SNPs in the regulatory regions, and 154 genes under positive selection whose orthologs are under negative selection in the other three pathotypes, further supported pathotype 117-6 being highly diverse in nature. The genomic information generated in the present study could serve as an important source for comparative genomic studies across the genus Puccinia and lead to better rust management in wheat.


Life ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 803
Author(s):  
Huapu Chen ◽  
Xiaomeng Li ◽  
Yaorong Wang ◽  
Chunhua Zhu ◽  
Hai Huang ◽  
...  

The wild populations of the commercially valuable ornamental fish species, Betta splendens, and its germplasm resources have long been threatened by habitat degradation and contamination with artificially bred fish. Because of the lack of effective marker resources, population genetics research projects are severely hampered. To generate genetic data for developing polymorphic simple sequence repeat (SSR) markers and identifying functional genes, transcriptomic analysis was performed. Illumina paired-end sequencing yielded 105,505,486 clean reads, which were then de novo assembled into 69,836 unigenes. Of these, 35,751 were annotated in the non-redundant, EuKaryotic Orthologous Group, Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology databases. A total of 12,751 SSR loci were identified from the transcripts and 7970 primer pairs were designed. One hundred primer pairs were randomly selected for PCR validation and 53 successfully generated target amplification products. Further validation demonstrated that 36% (n = 19) of the 53 amplified loci were polymorphic. These data could not only enrich the genetic information for the identification of functional genes but also effectively facilitate the development of SSR markers. Such knowledge would accelerate further studies on the genetic variation and evolution, comparative genomics, linkage mapping and molecular breeding in B. splendens.
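
The abstract does not name the tool or thresholds used to identify SSR loci. As a rough illustration of what SSR identification involves, the sketch below scans a sequence for perfect tandem repeats of 2 to 6 bp motifs with a regular expression. The function name, the minimum repeat counts and the toy sequence are all assumptions made for this example, not the authors' settings.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds).
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(thresholds.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ACAC"), so each locus is reported once.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical toy transcript containing an (AC)7 and an (ATG)5 repeat.
toy = "TTG" + "AC" * 7 + "GGT" + "ATG" * 5 + "C"
print(find_ssrs(toy))
```

Dedicated SSR tools additionally handle compound and imperfect repeats, which this sketch ignores.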


Author(s):  
Hussein Migdadi ◽  
Nizar Haddad ◽  
Ruba AlOmari ◽  
Mohammad Brake ◽  
Mustafa AlShdaifat ◽  
...  

Background: Jordanian Awassi sheep (Ovis aries) is the dominant fat-tailed sheep breed and appeals to customers because of its various production systems, including fiber, meat and milk. This report is the first whole-genome sequence (WGS) of an O. aries ewe from Jordan submitted to the NCBI database. Methods: 64 paired-end sequencing libraries were constructed and sequenced on the Illumina HiSeq 2500 system. High-quality reads were aligned against the reference sheep genome to detect comprehensive sources of genetic variation (SNPs, InDels, SVs, CNVs). We have deposited the sequence data at the NCBI Sequence Read Archive (SRA) under the accession numbers SRR11128863, PRJNA574879. Result: Genome resequencing of the Jordanian Awassi ewe yielded approximately 93.88 Gb of data, with a mapping rate of 99.28% and an effective mapping depth of 36.32. Around 19 million SNPs, 3.6 million InDels, 35,180 structural variants and 13,524 copy number variants were detected in the Jordanian ewe genome. This wide range of genetic variation provides a framework for further genetic studies that will help understand the molecular basis underlying phenotypic variation of economically important traits in sheep and improve intrinsic defects in domestic sheep breeds.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel L. Cameron ◽  
Jonathan Baber ◽  
Charles Shale ◽  
Jose Espejo Valle-Inclan ◽  
Nicolle Besselink ◽  
...  

Abstract GRIDSS2 is the first structural variant caller to explicitly report single breakends, that is, breakpoints in which only one side can be unambiguously determined. By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32–100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants, with 16% of somatic calls phasable using paired-end sequencing.


Plants ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1362
Author(s):  
Fahad Al-Qurainy ◽  
Abdel-Rhman Z. Gaafar ◽  
Salim Khan ◽  
Mohammad Nadeem ◽  
Aref M. Alshameri ◽  
...  

Genome size is one of the fundamental cytogenetic features of a species; it is critical for the design and initiation of any genome sequencing project and can provide essential insights for taxonomic, cytogenetic, phylogenetic, and evolutionary studies. However, this key cytogenetic information is almost entirely lacking for the endemic species Reseda pentagyna and the locally rare species Reseda lutea in Saudi Arabia. Therefore, genome size was analyzed by propidium iodide (PI) flow cytometry and compared with k-mer analysis methods. With the standard method for genome size measurement (flow cytometry) and MB01 nuclei isolation buffer, the genome sizes of R. lutea and R. pentagyna were estimated to be 1.91 ± 0.02 and 2.09 ± 0.03 pg/2C, respectively, corresponding to haploid genome sizes of approximately 934 and 1,022 Mbp. For validation, k-mer analysis was performed on Illumina paired-end sequencing data from both species. Five k-mer analysis approaches were examined for biocomputational estimation of genome size: a general formula and four well-known programs (CovEST, Kmergenie, FindGSE, and GenomeScope). The parameter preferences had a significant impact on the GenomeScope and Kmergenie estimates, whereas the general formula estimations did not differ considerably, with average genome sizes of 867.7 and 896. Mbp. The differences between flow cytometry and biocomputational predictions may be due to the high repeat content, particularly long repetitive regions, in both genomes (71% and 57%), which interfered with k-mer analysis. GenomeScope allowed quantification of the high heterozygosity levels (1.04 and 1.37%) of the R. lutea and R. pentagyna genomes, respectively. Based on our observations, R. lutea may have a tetraploid genome or higher. Our results revealed fundamental cytogenetic information for R. lutea and R. pentagyna, which should be used in future taxonomic studies and whole-genome sequencing.
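
The flow cytometry figures in this abstract can be checked with the widely used conversion 1 pg of DNA ≈ 978 Mbp, applied to half of the 2C value. The abstract does not spell out its "general formula" for k-mer estimation; the sketch below assumes the common form in which genome size is the total number of counted k-mers divided by the depth of the main peak in the k-mer histogram. The function names and the k-mer numbers are hypothetical.

```python
MBP_PER_PG = 978.0  # 1 pg of DNA is approximately 978 Mbp


def haploid_mbp_from_2c_pg(pg_2c):
    """Convert a 2C DNA content in picograms to a haploid (1C) size in Mbp."""
    return pg_2c / 2 * MBP_PER_PG


def kmer_genome_size(total_kmers, peak_depth):
    """Assumed general formula: genome size = total k-mers / peak k-mer depth."""
    if peak_depth <= 0:
        raise ValueError("peak depth must be positive")
    return total_kmers / peak_depth


# 2C values reported in the abstract for R. lutea and R. pentagyna.
print(round(haploid_mbp_from_2c_pg(1.91)))
print(round(haploid_mbp_from_2c_pg(2.09)))

# Hypothetical k-mer histogram summary, for illustration only.
print(kmer_genome_size(total_kmers=9_000_000, peak_depth=30))
```

Repeat-rich or highly heterozygous genomes distort the k-mer histogram, which is consistent with the abstract's explanation for the gap between the two kinds of estimate.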

