Sequence Identification with Trees and Co-Occurrence Graphs

Author(s):  
Maximilian Knoll ◽  
Herwig Unger
BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yiren Wang ◽  
Mashari Alangari ◽  
Joshua Hihath ◽  
Arindam K. Das ◽  
M. P. Anantram

Abstract. Background: The all-electronic Single Molecule Break Junction (SMBJ) method is an emerging alternative to traditional polymerase chain reaction (PCR) techniques for genetic sequencing and identification. Existing work indicates that the current spectra recorded from SMBJ experiments contain unique signatures that can identify known sequences from a dataset. However, the spectra are typically extremely noisy due to the stochastic and complex interactions between the substrate, sample, environment, and the measuring system, necessitating hundreds or thousands of experiments to obtain reliable and accurate results. Results: This article presents a DNA sequence identification system based on the current spectra of ten short strand sequences, including a pair that differs by a single mismatch. By employing a gradient boosted tree classifier model trained on conductance histograms, we demonstrate that extremely high accuracy, ranging from approximately 96% for molecules differing by a single mismatch to 99.5% otherwise, is possible. Further, such accuracy metrics are achievable in near real time with just twenty or thirty SMBJ measurements instead of hundreds or thousands. We also demonstrate that a tandem classifier architecture, where the first stage is a multiclass classifier and the second stage is a binary classifier, can be employed to boost the single-mismatch pair’s identification accuracy to 99.5%. Conclusions: A monolithic classifier, or more generally, a multistage classifier with model-specific parameters that depend on experimental current spectra, can be used to successfully identify DNA strands.
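The tandem architecture described in this abstract (a multiclass first stage, followed by a binary second stage for the pair that is hard to separate) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses gradient boosted trees on conductance histograms, whereas the sketch below swaps in a nearest-centroid classifier for stage 1 and a one-split decision stump for stage 2 so that it runs with the standard library only. The class names, the labels "A", "B", and "B_mm", and the two-feature toy data are all hypothetical and are not SMBJ measurements.

```python
from math import dist


class NearestCentroid:
    """Stand-in multiclass stage: predicts the label of the closest class mean."""

    def fit(self, X, y):
        groups = {}
        for features, label in zip(X, y):
            groups.setdefault(label, []).append(features)
        # Mean of each feature column, computed per class.
        self.centroids = {
            label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, features):
        return min(self.centroids, key=lambda c: dist(features, self.centroids[c]))


class Stump:
    """Stand-in binary stage: a single feature/threshold split with fewest training errors."""

    def fit(self, X, y):
        labels = sorted(set(y))
        assert len(labels) == 2, "the second stage is binary"
        best = None
        for j in range(len(X[0])):
            values = sorted({row[j] for row in X})
            for a, b in zip(values, values[1:]):
                t = (a + b) / 2  # candidate threshold between neighbouring values
                for low, high in (labels, labels[::-1]):
                    errors = sum(
                        (low if row[j] <= t else high) != label
                        for row, label in zip(X, y)
                    )
                    if best is None or errors < best[0]:
                        best = (errors, j, t, low, high)
        _, self.feature, self.threshold, self.low, self.high = best
        return self

    def predict_one(self, features):
        return self.low if features[self.feature] <= self.threshold else self.high


class TandemClassifier:
    """Stage 1 decides among all classes; stage 2 re-decides only within the hard pair."""

    def __init__(self, hard_pair):
        self.hard_pair = set(hard_pair)
        self.stage1 = NearestCentroid()
        self.stage2 = Stump()

    def fit(self, X, y):
        self.stage1.fit(X, y)
        # The binary stage is trained only on samples from the confusable pair.
        pair = [(f, l) for f, l in zip(X, y) if l in self.hard_pair]
        self.stage2.fit([f for f, _ in pair], [l for _, l in pair])
        return self

    def predict_one(self, features):
        first = self.stage1.predict_one(features)
        if first in self.hard_pair:
            return self.stage2.predict_one(features)
        return first


# Invented toy data: "B" and "B_mm" overlap on feature 0 and differ mainly on feature 1.
X = [[0.0, 1.4], [1.0, 1.6], [8.0, 1.0], [12.0, 1.1], [9.0, 2.0], [13.0, 1.9]]
y = ["A", "A", "B", "B", "B_mm", "B_mm"]
tandem = TandemClassifier(hard_pair=("B", "B_mm")).fit(X, y)

sample = [12.5, 1.1]
print("stage 1 alone:", tandem.stage1.predict_one(sample))
print("tandem:", tandem.predict_one(sample))
```

In this toy setup the first stage is distracted by the noisy feature shared by the two similar classes, while the second stage, trained only on that pair, is free to split on the one feature that separates them. That routing idea is the only part of the sketch meant to mirror the abstract.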


1989 ◽  
Vol 9 (9) ◽  
pp. 3614-3620 ◽  
Author(s):  
S M Aldritt ◽  
J T Joseph ◽  
D F Wirth

We have identified a gene that encodes the polypeptide cytochrome b in the avian malarial parasite Plasmodium gallinaceum. The gene containing the open reading frame was found to be located on a 6.2-kilobase multimeric extrachromosomal element. The amino acid translation from this gene demonstrated significant similarities to cytochrome b sequences from yeast, mammal, and fungus genomes. We present evidence that the P. gallinaceum cytochrome b transcript is part of a larger primary transcript from the element that is subsequently processed. The message for P. gallinaceum cytochrome b was found to be 1.2 kilobases in size. This is the first report identifying a mitochondrial nucleic acid sequence in malaria-causing organisms and suggests that a functional cytochrome system may exist in these parasites.


2012 ◽  
Vol 102 (10) ◽  
pp. 937-947 ◽  
Author(s):  
S. H. De Boer ◽  
X. Li ◽  
L. J. Ward

Pectobacterium atrosepticum, P. carotovorum subsp. brasiliensis, P. carotovorum subsp. carotovorum, and P. wasabiae were detected in potato stems with blackleg symptoms using species- and subspecies-specific polymerase chain reaction (PCR). The tests included a new assay for P. wasabiae based on the phytase gene sequence. Identification of isolates from diseased stems by biochemical or physiological characterization, PCR, and multi-locus sequence typing (MLST) largely confirmed the PCR detection of Pectobacterium spp. in stem samples. P. atrosepticum was most commonly present but was the sole Pectobacterium sp. detected in only 52% of the diseased stems. P. wasabiae was most frequently present in combination with P. atrosepticum and was the sole Pectobacterium sp. detected in 13% of diseased stems. Pathogenicity of P. wasabiae on potato and its capacity to cause blackleg disease were demonstrated by stem inoculation and its isolation as the sole Pectobacterium sp. from field-grown diseased plants produced from inoculated seed tubers. Incidence of P. carotovorum subsp. brasiliensis was low in diseased stems, and the ability of Canadian strains to cause blackleg in plants grown from inoculated tubers was not confirmed. Canadian isolates of P. carotovorum subsp. brasiliensis differed from Brazilian isolates in diagnostic biochemical tests but conformed to the subspecies in PCR specificity and typing by MLST.


10.5219/892 ◽  
2018 ◽  
Vol 12 (1) ◽  
Author(s):  
Jana Žiarovská ◽  
Lucia Zeleňáková ◽  
Miroslava Kačániová ◽  
Eloy Fernández Cusimamani

1994 ◽  
Vol 116 (4) ◽  
pp. 282-289 ◽  
Author(s):  
Yu-Wen Huang ◽  
K. Srihari ◽  
Jim Adriance ◽  
George Westby

The placement of surface mount components is a time-consuming and critical task in the assembly of surface mount Printed Circuit Boards (PCBs). The focus of this research was the identification of “near optimal” solutions for the placement sequence identification problem. The factors considered include the placement machine and the specific PCB, the feeder space available, the need for tooling and nozzle changes, and the actual traveling path of the placement head. Expert (or knowledge-based) systems were used as the solution method for this problem. The system developed can cope with single PCBs, panels, 180 deg offset boards (panels), and multiple PCB batches. The prototype knowledge-based system developed in this research identifies solutions in (almost) real time.

