representative sequence
Recently Published Documents


2021 ◽  
Author(s):  
Luke R Thompson ◽  
Sean R Anderson ◽  
Paul A Den Uyl ◽  
Nastassia V Patin ◽  
Grant Sanderson ◽  
...  

Background: Amplicon sequencing (metabarcoding) is a common method to survey diversity of environmental communities whereby a single genetic locus is amplified and sequenced from the DNA of whole or partial organisms, organismal traces (e.g., skin, mucus, feces), or microbes in an environmental sample. Several software packages exist for analyzing amplicon data, among which QIIME 2 has emerged as a popular option because of its broad functionality, plugin architecture, provenance tracking, and interactive visualizations. However, each new analysis requires the user to keep track of input and output file names, parameters, and commands; this lack of automation and standardization is inefficient and creates barriers to meta-analysis and sharing of results.
Findings: We developed Tourmaline, a Python-based workflow for QIIME 2 built using the Snakemake workflow management system. Starting from a configuration file that defines parameters and input files (a reference database, a sample metadata file, and a manifest or archive of FASTQ sequences), it runs either the DADA2 or Deblur denoising algorithm, assigns taxonomy to the resulting representative sequences, performs analyses of taxonomic, alpha, and beta diversity, and generates an HTML report summarizing and linking to the output files. Features include automatic determination of trimming parameters using quality scores, representative sequence filtering (taxonomy, length, abundance, prevalence, or identifier), support for multiple taxonomic classification and sequence alignment methods, outlier detection, and automated initialization of a new analysis using previous settings. The workflow runs natively on Linux and macOS or via a Docker container. We ran Tourmaline on a 16S rRNA amplicon dataset from Lake Erie surface water, showing its utility for parameter optimization and the ability to easily evaluate results through the HTML report, QIIME 2 viewer, and R- and Python-based Jupyter notebooks.
Conclusions: Reproducible workflows like Tourmaline enable rapid analysis of environmental and biomedical amplicon data, decreasing the time from data generation to actionable results. Tourmaline is available for download at github.com/aomlomics/tourmaline.
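
The representative sequence filtering mentioned above (by length, abundance, or prevalence) can be illustrated with a minimal sketch. This is not Tourmaline's or QIIME 2's code or API; the `RepSeq` record, the `filter_repseqs` function, and all parameter names are hypothetical and only show the general idea of threshold-based filtering.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RepSeq:
    """Hypothetical record: one representative sequence and its per-sample counts."""
    seq_id: str
    sequence: str
    counts: Dict[str, int]  # sample name -> read count


def filter_repseqs(
    repseqs: List[RepSeq],
    min_len: int = 0,
    max_len: Optional[int] = None,
    min_abundance: int = 0,
    min_prevalence: int = 0,
) -> List[RepSeq]:
    """Keep sequences that pass every length, abundance, and prevalence threshold."""
    kept = []
    for rs in repseqs:
        length = len(rs.sequence)
        total = sum(rs.counts.values())                           # total reads across samples
        prevalence = sum(1 for c in rs.counts.values() if c > 0)  # samples where it occurs
        if length < min_len:
            continue
        if max_len is not None and length > max_len:
            continue
        if total < min_abundance or prevalence < min_prevalence:
            continue
        kept.append(rs)
    return kept


# Toy data (invented for illustration only).
toy = [
    RepSeq("a", "ACGTACGT", {"s1": 10, "s2": 5}),
    RepSeq("b", "ACG", {"s1": 100, "s2": 50}),   # too short
    RepSeq("c", "ACGTACGTAC", {"s1": 1, "s2": 0}),  # too rare
]
print([rs.seq_id for rs in filter_repseqs(toy, min_len=5, min_abundance=2, min_prevalence=2)])
```

In a real workflow the thresholds would come from the configuration file rather than function arguments.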


2021 ◽  
Author(s):  
Xin Bai ◽  
Xin Guo ◽  
Linjun Wang

Diabatization of one-electron states in flexible molecular aggregates is a great challenge due to the presence of surface crossings between molecular orbital (MO) levels and the complex interaction between MOs of neighboring molecules. In this work, we present an efficient machine learning approach to calculate electronic couplings between quasi-diabatic MOs without the need for nonadiabatic coupling calculations. Using MOs of rigid molecules as references, the MOs that can be directly regarded as quasi-diabatic in molecular dynamics are selected, state tracked, and phase corrected. On the basis of this information, artificial neural networks are trained to characterize the structure-dependent onsite energies of quasi-diabatic MOs and the inter-molecular electronic couplings. A representative sequence of DNA is systematically studied as an illustration. Smooth time evolution of electronic couplings in all base pairs is obtained with quasi-diabatic MOs. In particular, our method can calculate electronic couplings between different quasi-diabatic MOs independently, and thus possesses unique advantages in many applications.


2021 ◽  
Vol 8 ◽  
Author(s):  
Jung Min Choi ◽  
Jae Ho Jung ◽  
Ki Hong Kim ◽  
D. Wayne Coats ◽  
Young Ok Kim

A tintinnid species, Helicostomella longa, infected by the parasitic dinoflagellate Euduboscquella triangula n. sp. was discovered off the southern coast of Korea in August of 2015 and 2016. Parasite morphology and development were analyzed by observation of live cells and protargol-stained specimens. The parasite was determined to be a new species in the genus Euduboscquella based on morphological and molecular data. A representative sequence of the novel species clustered in Euduboscquella group I. The morphological and developmental features of E. triangula were distinguished from those of its congeners by: (1) numerous shallow and intertwining grooves on an inconspicuous shield; (2) sporocytes initially forming a short chain, but separating after the second or third division regardless of spore type; (3) production of motile mushroom-shaped dinospores, non-motile spherical spores, and non-motile triangular spores. Dinospores were formed by ca. 28% of infections, while both non-motile spherical and triangular spores occurred at a frequency of ca. 36%. All spore types showed identical 18S rDNA sequences. Parasite prevalence was 15.5 and 8.3% on 17 and 24 August of 2015, respectively, with infection intensity on both dates being 1.3.


2019 ◽  
Vol 25 (4) ◽  
pp. 402-406
Author(s):  
Mariana Cunha Stutz ◽  
Renato Carrer Filho ◽  
Geisiane Alves Rocha ◽  
Érico de Campos Dianese ◽  
Marcos Gomes da Cunha

Zamioculcas zamiifolia (Araceae) is one of the most widely grown exotic species in Brazil, both as an ornamental plant and in landscape design. Despite tolerating transport and being well adapted to low-light environments, this ornamental is attacked by different pathogens. Thus, the aim was to detect and identify the pathogen that causes stem rot in commercial Z. zamiifolia crops. Z. zamiifolia plants exhibiting stem rot symptoms were sent for phytosanitary diagnosis. In a culture medium, the fungal isolate obtained (SR-001) displayed the following morphological characteristics: cotton-like aerial mycelium, septate hyaline hyphae with no spore production, and the formation of small brown spherical sclerotia. To confirm pathogenicity, Z. zamiifolia plants were inoculated with the SR-001 isolate and, after fifteen days, the fungus was re-isolated when the same rot symptoms emerged. The SR-001 isolate was identified as Sclerotium rolfsii and its representative sequence was deposited in GenBank (accession MG694322). This fungus has not previously been associated with diseases in Z. zamiifolia in Brazil, and this is the first report of the fungus infecting this ornamental plant species in a cultivated area.


2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Ting Ji ◽  
Hao-Xuan Cao ◽  
Ran Wu ◽  
Lin-Lin Cui ◽  
Guo-Ming Su ◽  
...  

Parasitic Entamoeba spp. can infect many classes of vertebrates including humans and pigs. Entamoeba suis and zoonotic Entamoeba polecki have been identified in pigs, and swine are implicated as potential reservoirs for Entamoeba histolytica. However, the prevalence of Entamoeba spp. in pigs in southeastern China has not been reported. In this study, 668 fecal samples collected from 6 different regions in Fujian Province, southeastern China, were analyzed to identify three Entamoeba species by nested PCR and sequencing analysis. The overall prevalence of Entamoeba spp. was 55.4% (370/668; 95% CI 51.6% to 59.2%), and the infection rate of E. polecki ST1 was the highest (302/668; 45.2%, 95% CI 41.4% to 49.0%), followed by E. polecki ST3 (228/668; 34.1%, 95% CI 30.5% to 37.7%) and E. suis (87/668; 13.0%, 95% CI 10.5% to 15.6%). E. histolytica was not detected in any samples. Moreover, the coinfection rate of E. polecki ST1 and ST3 was 25.1% (168/668; 95% CI 21.9% to 28.4%), the coinfection rate of E. polecki ST1 and E. suis was 3.7% (25/668; 95% CI 2.3% to 5.2%), the coinfection rate of E. polecki ST3 and E. suis was 0.3% (2/668), and the coinfection rate of E. polecki ST1, E. polecki ST3, and E. suis was 4.0% (27/668; 95% CI 2.5% to 5.5%). A representative sequence (MK347346) was identical to the sequence of E. suis (DQ286372). Two subtype-specific sequences (MK357717 and MK347347) were almost identical to the sequences of E. polecki ST1 (FR686383) and ST3 (AJ566411), respectively. This is the first study to survey the occurrence and to conduct molecular identification of three Entamoeba species in southeastern China. This is the first report regarding mixed infections with E. suis, E. polecki ST1, and E. polecki ST3 in China. More research studies are needed to better understand the transmission and zoonotic potential of Entamoeba spp.
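
The prevalence figures above can be checked with a short calculation. The abstract does not say how the 95% confidence intervals were computed; the sketch below assumes a normal-approximation (Wald) interval, which happens to reproduce the reported bounds for the counts tested here. The function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(positives: int, n: int, z: float = 1.96):
    """Return (prevalence, lower, upper) as percentages using a Wald interval.

    Assumption: normal approximation, p +/- z * sqrt(p * (1 - p) / n).
    """
    p = positives / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return tuple(round(100 * x, 1) for x in (p, p - margin, p + margin))


# Counts taken from the abstract (668 fecal samples in total).
print("Entamoeba spp.:", wald_ci(370, 668))
print("E. polecki ST1:", wald_ci(302, 668))
print("E. suis:", wald_ci(87, 668))
```

For small counts such as 2/668, a Wald interval is unreliable, which may be why no interval is reported for that coinfection rate; an exact or Wilson interval would be a more cautious choice there.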


2019 ◽  
Vol 8 (43) ◽  
Author(s):  
Jennifer K. Spinler ◽  
Sabeen Raza ◽  
Jessica K. Runge ◽  
Ruth Ann Luna

Hybrid de novo assembly of Illumina/Nanopore sequence data produced complete circular sequences of the chromosome and a plasmid for the multidrug-resistant Pseudomonas aeruginosa Houston-1 strain. This provides a high-quality representative sequence for a lineage endemic to a pediatric cystic fibrosis care center at Texas Children’s Hospital.


2019 ◽  
Vol 8 (42) ◽  
Author(s):  
Jennifer K. Spinler ◽  
Anne J. Gonzales-Luna ◽  
Sabeen Raza ◽  
Jessica K. Runge ◽  
Ruth Ann Luna ◽  
...  

Hybrid de novo assembly of Illumina/Nanopore sequence data produced a complete circular sequence of the chromosome for a Clostridioides difficile ribotype 255 (RT255) isolate from an elderly patient with recurrent C. difficile infection (CDI). This provides a high-quality representative sequence for the RT255 lineage.


Author(s):  
Yuxin Chen ◽  
Tengjiao Wang ◽  
Wei Chen ◽  
Qiang Li ◽  
Zhen Qiu

Lacking a sequence-preserving mechanism, existing heterogeneous information network (HIN) embedding methods discard the essential type sequence information during embedding. We propose a Type Sequence Preserving HIN Embedding model (SeqHINE), which extends HIN embedding to the sequence level. SeqHINE incorporates the type sequence information via a type-aware GRU and preserves representative sequence information through a decay function. Extensive experiments show that SeqHINE can outperform state-of-the-art methods even with 50% less labeled data.


2019 ◽  
Vol 17 (03) ◽  
pp. 1940008 ◽  
Author(s):  
Yoshiaki Sota ◽  
Shigeto Seno ◽  
Hironori Shigeta ◽  
Naoki Osato ◽  
Masafumi Shimoda ◽  
...  

Fusion genes are involved in cancer, but their detection using RNA-Seq is limited by the relatively short read length. Therefore, we proposed a shifted short-read clustering (SSC) method, which focuses on overlapping reads from the same loci and extends them into a representative sequence. To verify its usefulness, we applied the SSC method to RNA-Seq data from four cell lines (BT-474, MCF-7, SKBR-3, and T-47D). As the slide width of the SSC method increased to one, two, five, or ten bases, the read length was extended from 201 bases to 217 (108%), 234 (116%), 282 (140%), or 317 (158%) bases, respectively. Furthermore, fusion genes were investigated using STAR-Fusion, a fusion gene detection tool, with and without the SSC method. When reads were shifted by one base, the proportion of reads mapped to multiple loci decreased from 9.7% to 4.6%, and the sensitivity of fusion gene detection improved from 47% to 54% on average (BT-474: 48% to 57%, MCF-7: 49% to 53%, SKBR-3: 50% to 57%, and T-47D: 43% to 50%) compared with the original data. When the reads were shifted further, the positive predictive value also improved. The SSC method could be an effective method for fusion gene detection.
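
The idea of extending overlapping reads into a longer representative sequence can be illustrated with a toy sketch. This is not the authors' SSC implementation: the function name `extend_reads`, the exact-match requirement, and the greedy chaining strategy are all assumptions made for illustration, and real reads would need error tolerance.

```python
from typing import List


def extend_reads(seed: str, reads: List[str], shift: int) -> str:
    """Greedily extend `seed` with equal-length reads offset by `shift` bases.

    A read is appended when its first (read_len - shift) bases exactly match
    the current end of the contig; only its last `shift` bases are new.
    """
    read_len = len(seed)
    if not 0 < shift < read_len:
        raise ValueError("shift must be between 1 and read length - 1")
    contig = seed
    remaining = list(reads)
    extended = True
    while extended:
        extended = False
        tail = contig[-(read_len - shift):]  # suffix the next read must start with
        for read in remaining:
            if len(read) == read_len and read.startswith(tail):
                contig += read[-shift:]      # add only the non-overlapping bases
                remaining.remove(read)
                extended = True
                break
    return contig


# Toy example: 8-base reads tiled every 2 bases along a 14-base sequence.
source = "ACGTTGCAAGGCTA"
reads = [source[6:14], source[2:10], source[4:12]]  # deliberately unordered
print(extend_reads(source[0:8], reads, shift=2))
```

Each successful step lengthens the contig by `shift` bases, so three overlapping reads extend the 8-base seed to 14 bases in this toy case.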


2019 ◽  
Vol 400 (3) ◽  
pp. 367-381 ◽  
Author(s):  
Kristina Straub ◽  
Mona Linde ◽  
Cosimo Kropp ◽  
Samuel Blanquart ◽  
Patrick Babinger ◽  
...  

For evolutionary studies, but also for protein engineering, ancestral sequence reconstruction (ASR) has become an indispensable tool. The first step of every ASR protocol is the preparation of a representative sequence set containing at most a few hundred recent homologs, whose composition decisively determines the outcome of a reconstruction. A common approach for sequence selection consists of several rounds of manual recompilation driven by embedded phylogenetic analyses of the varied sequence sets. For ASR of a geranylgeranylglyceryl phosphate synthase, we additionally utilized FitSS4ASR, which replaces this time-consuming protocol with an efficient and more rational approach. FitSS4ASR applies orthogonal filters to a set of homologs to eliminate outlier sequences and those bearing only a weak phylogenetic signal. To demonstrate the usefulness of FitSS4ASR, we determined experimentally the oligomerization state of eight predecessors, which is a delicate and taxon-specific property. Corresponding ancestors deduced in a manual approach and by means of FitSS4ASR had the same dimeric or hexameric conformation; this concordance testifies to the efficiency of FitSS4ASR for sequence selection. FitSS4ASR-based results of two other ASR experiments were added to the Supporting Information. Program and documentation are available at https://gitlab.bioinf.ur.de/hek61586/FitSS4ASR.

