Demultiplexing barcoded Oxford Nanopore reads with deep convolutional neural networks

2018 ◽  
Author(s):  
Ryan R. Wick ◽  
Louise M. Judd ◽  
Kathryn E. Holt

Abstract Multiplexing, the simultaneous sequencing of multiple barcoded DNA samples on a single flow cell, has made Oxford Nanopore sequencing cost-effective for small genomes. However, it depends on the ability to sort the resulting sequencing reads by barcode, and current demultiplexing tools fail to classify many reads. Here we present Deepbinner, a tool for Oxford Nanopore demultiplexing that uses a deep neural network to classify reads based on the raw electrical read signal. This ‘signal-space’ approach allows for greater accuracy than existing ‘base-space’ tools (Albacore and Porechop) for which signals must first be converted to DNA base calls, itself a complex problem that can introduce noise into the barcode sequence. To assess Deepbinner and existing tools, we performed multiplex sequencing on 12 amplicons chosen for their distinguishability. This allowed us to establish a ground truth classification for each read based on internal sequence alone. Deepbinner had the lowest rate of unclassified reads (7.8%) and the highest demultiplexing precision (98.5% of classified reads were correctly assigned). It can be used alone (to maximise the number of classified reads) or in conjunction with other demultiplexers (to maximise precision and minimise false positive classifications). We also found cross-sample chimeric reads (0.3%) and evidence of barcode switching (0.3%) in our dataset, which likely arise during library preparation and may be detrimental for quantitative studies that use multiplexing. Deepbinner is open source (GPLv3) and available at https://github.com/rrwick/Deepbinner.
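The two headline figures in this abstract, the unclassified-read rate and the precision of classified reads, can be computed directly from ground-truth and predicted barcode bins. The following is a minimal sketch of that evaluation; the function name and the convention that `None` marks an unclassified read are illustrative, not taken from Deepbinner.

```python
# Sketch: given ground-truth barcodes (established from internal amplicon
# sequence) and a demultiplexer's calls (None = unclassified), compute the
# unclassified rate and the precision among classified reads.

def demux_metrics(truth, calls):
    """Return (unclassified_rate, precision_of_classified_reads)."""
    assert len(truth) == len(calls)
    classified = [(t, c) for t, c in zip(truth, calls) if c is not None]
    unclassified_rate = 1 - len(classified) / len(truth)
    if not classified:
        return unclassified_rate, float("nan")
    correct = sum(t == c for t, c in classified)
    return unclassified_rate, correct / len(classified)

truth = ["BC01", "BC02", "BC03", "BC01", "BC02"]
calls = ["BC01", "BC02", None,  "BC01", "BC03"]
rate, prec = demux_metrics(truth, calls)  # 0.2 unclassified, 0.75 precision
```

Using one tool to maximise classified reads and another to veto disagreements, as the abstract suggests, simply means treating any read where the two demultiplexers disagree as `None` before computing these metrics.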


2018 ◽  
Author(s):  
Yifei Xu ◽  
Kuiama Lewandowski ◽  
Sheila Lumley ◽  
Steven Pullan ◽  
Richard Vipond ◽  
...  

Abstract Metagenomic sequencing with the Oxford Nanopore MinION sequencer offers potential for point-of-care testing of infectious diseases in clinical settings. To improve cost-effectiveness, multiplexing of several barcoded samples on a single flow cell will be required during sequencing. We generated a unique sequencing dataset to assess the extent and source of cross-barcode contamination caused by multiplex MinION sequencing. Sequencing libraries for three different viruses, influenza A, dengue, and chikungunya, were prepared separately and sequenced on individual flow cells. We also pooled the respective libraries and performed multiplex sequencing. We identified 0.056% of total reads in the multiplex sequencing data that were assigned to incorrect barcodes. Chimeric reads were the predominant source of this error. Our findings highlight the need for careful filtering of multiplex sequencing data before downstream analysis, and the trade-off between sensitivity and specificity that applies to barcode demultiplexing methods.
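Because each virus was also sequenced alone, a read's true source library is known from which virus it aligns to, so the cross-barcode rate reduces to counting reads whose barcode bin disagrees with their alignment. A minimal sketch, with hypothetical barcode-to-virus pairings and counts:

```python
# Illustrative cross-barcode error-rate calculation: each read is a
# (barcode_bin, aligned_virus) pair, and the expected pairing is fixed by
# how the libraries were barcoded. All names and numbers are hypothetical.

EXPECTED = {"BC01": "influenza_A", "BC02": "dengue", "BC03": "chikungunya"}

def cross_barcode_rate(assignments):
    """Fraction of reads whose barcode bin disagrees with their alignment."""
    wrong = sum(EXPECTED[bc] != virus for bc, virus in assignments)
    return wrong / len(assignments)

reads = [("BC01", "influenza_A")] * 998 + [("BC02", "influenza_A"),
                                           ("BC03", "dengue")]
rate = cross_barcode_rate(reads)  # 2 misassigned out of 1000 -> 0.002
```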



2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Mehta ◽  
S Niklitschek ◽  
F Fernandez ◽  
C Villagran ◽  
J Avila ◽  
...  

Abstract Background: EKG interpretation is slowly transitioning to a physician-free, Artificial Intelligence (AI)-driven endeavor. Our continued efforts to innovate follow a carefully laid stepwise approach: 1) create an AI algorithm that accurately identifies STEMI against non-STEMI using a 12-lead EKG; 2) challenge said algorithm by adding different EKG diagnoses to the previous experiment; and now 3) further validate the accuracy and reliability of our algorithm while also improving performance in prehospital and hospital settings.
Purpose: To provide an accurate, reliable, and cost-effective tool for STEMI detection with the potential to redirect human resources into other clinically relevant tasks.
Methods: Database: EKG records obtained from the Latin America Telemedicine Infarct Network (Mexico, Colombia, Argentina, and Brazil) from April 2014 to December 2019. Dataset: a total of 11,567 12-lead EKG records of 10-second length with a sampling frequency of 500 Hz, including the following balanced classes: unconfirmed and angiographically confirmed STEMI, branch blocks, non-specific ST-T abnormalities, normal, and abnormal (200+ CPT codes, excluding the ones included in other classes). The label of each record was manually checked by cardiologists to ensure precision (ground truth). Pre-processing: the first and last 250 samples were discarded as they may contain a standardization pulse; an order-5 digital low-pass filter with a 35 Hz cut-off was applied; and the mean of each individual lead was subtracted. Classification: the determined classes were STEMI (STEMI in different locations of the myocardium: anterior, inferior, and lateral) and Not-STEMI (a combination of randomly sampled normal, branch block, non-specific ST-T abnormality, and abnormal records, 25% of each subclass).
Training & Testing: A 1-D Convolutional Neural Network was trained and tested with a 90/10 dataset split. The last dense layer outputs a probability for each record of being STEMI or Not-STEMI. Additional testing was performed with a subset of the original dataset of angiographically confirmed STEMI.
Results: See attached figure. Preliminary STEMI dataset: accuracy 96.4%, sensitivity 95.3%, specificity 97.4%. Confirmed STEMI dataset: accuracy 97.6%, sensitivity 98.1%, specificity 97.2%.
Conclusions: Our results remain consistent with our previous experience. By further increasing the amount and complexity of the data, the performance of the model improves. Future implementations of this technology in clinical settings look promising, not only in performing swift screening and diagnostic steps but also partaking in complex STEMI management triage.
Funding Acknowledgement: Type of funding source: None
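The pre-processing steps in Methods are concrete enough to sketch. The following assumes records shaped (12 leads, 5000 samples) at 500 Hz; the 35 Hz order-5 low-pass is left as a comment because its implementation (e.g. a Butterworth filter from a signal-processing library) is not specified in the abstract.

```python
import numpy as np

def preprocess(record, trim=250):
    """record: np.ndarray of shape (12, 5000), one 10 s 12-lead EKG at 500 Hz."""
    # 1) Discard the first and last 250 samples (possible standardization pulse).
    x = record[:, trim:-trim]
    # 2) An order-5 digital low-pass filter with a 35 Hz cut-off would be
    #    applied here, e.g. scipy.signal.butter(5, 35, fs=500) + filtfilt.
    # 3) Subtract the mean of each individual lead.
    return x - x.mean(axis=1, keepdims=True)

ekg = np.random.default_rng(0).normal(size=(12, 5000))
clean = preprocess(ekg)  # shape (12, 4500), each lead zero-mean
```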



2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Xiang Li ◽  
Jianzheng Liu ◽  
Jessica Baron ◽  
Khoa Luu ◽  
Eric Patterson

Abstract Recent attention to facial alignment and landmark detection methods, particularly with the application of deep convolutional neural networks, has yielded notable improvements. Neither these neural-network methods nor more traditional ones, though, have been tested directly for performance differences due to camera-lens focal length or camera viewing angle of subjects systematically across the viewing hemisphere. This work uses photo-realistic, synthesized facial images with varying parameters and corresponding ground-truth landmarks to enable comparison of alignment and landmark detection techniques relative to general performance, performance across focal length, and performance across viewing angle. Recently published high-performing methods, along with traditional techniques, are compared with regard to these aspects.
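A comparison like this rests on a per-image error measure for detected landmarks. A minimal sketch of the common choice, mean point-to-point error normalized by inter-ocular distance, follows; the eye-corner indices are illustrative and not tied to any specific annotation scheme.

```python
import numpy as np

def normalized_mean_error(pred, truth, left_eye=36, right_eye=45):
    """pred, truth: (n_landmarks, 2) arrays of 2-D landmark coordinates."""
    per_point = np.linalg.norm(pred - truth, axis=1)          # pixel error per landmark
    inter_ocular = np.linalg.norm(truth[left_eye] - truth[right_eye])
    return per_point.mean() / inter_ocular

truth = np.zeros((68, 2))
truth[45] = [100.0, 0.0]              # 100 px inter-ocular distance
pred = truth + np.array([1.0, 0.0])   # every landmark off by 1 px
nme = normalized_mean_error(pred, truth)  # 1 px / 100 px = 0.01
```

Averaging this value within bins of focal length or viewing angle gives the per-condition curves such a study compares across methods.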



2020 ◽  
Vol 12 (13) ◽  
pp. 2137 ◽  
Author(s):  
Ilinca-Valentina Stoica ◽  
Marina Vîrghileanu ◽  
Daniela Zamfir ◽  
Bogdan-Andrei Mihai ◽  
Ionuț Săvulescu

Monitoring uncontained built-up area expansion remains a complex challenge for the development and implementation of a sustainable planning system. In this regard, proper planning requires accurate monitoring tools and up-to-date information on rapid territorial transformations. The purpose of the study was to assess built-up area expansion, comparing two freely available and widely used datasets, Corine Land Cover and Landsat, both to each other and to the ground truth, with the goal of identifying the most cost-effective and reliable tool. The analysis was based on the largest post-socialist city in the European Union, the capital of Romania, Bucharest, and its neighboring Ilfov County, from 1990 to 2018. This study represents a new approach to measuring the process of urban expansion, offering insights into the strengths and limitations of the two datasets through a multi-level territorial perspective. The results point out discrepancies between the datasets, both at the macro-scale level and at the administrative-unit level. On the macro-scale level, despite the noticeable differences, the two datasets revealed the spatiotemporal magnitude of the expansion of the built-up area and can be a useful tool for supporting the decision-making process. On the smaller territorial scale, detailed comparative analyses through five case studies were conducted, indicating that, if used alone, limitations on the information that can be derived from the datasets would lead to inaccuracies, thus significantly limiting their potential to be used in the development of enforceable regulation in urban planning.
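The dataset-versus-ground-truth comparison underlying such an assessment is, at the pixel level, a confusion-matrix calculation over boolean built-up masks. A minimal sketch, with hypothetical array sizes and error patterns:

```python
import numpy as np

def builtup_agreement(mask, truth):
    """Producer's and user's accuracy for the built-up class of a boolean mask."""
    tp = np.sum(mask & truth)    # built-up pixels correctly mapped
    fp = np.sum(mask & ~truth)   # false alarms
    fn = np.sum(~mask & truth)   # missed built-up pixels
    producers = tp / (tp + fn)   # share of true built-up pixels detected
    users = tp / (tp + fp)       # share of mapped built-up pixels correct
    return producers, users

truth = np.zeros((10, 10), dtype=bool); truth[:5] = True      # 50 built-up pixels
mask = truth.copy(); mask[0, :2] = False; mask[9, :2] = True  # 2 misses, 2 false alarms
pa, ua = builtup_agreement(mask, truth)  # 48/50 each
```

Running the same computation with a rasterized Corine mask and a classified Landsat mask against common reference data quantifies the discrepancies the abstract reports.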



Author(s):  
Hao Zhang ◽  
Liangxiao Jiang ◽  
Wenqiang Xu

Crowdsourcing services provide a fast, efficient, and cost-effective means of obtaining large labeled data for supervised learning. Ground truth inference, also called label integration, designs proper aggregation strategies to infer the unknown true label of each instance from the multiple noisy label set provided by ordinary crowd workers. However, to the best of our knowledge, nearly all existing label integration methods focus solely on the multiple noisy label set of each individual instance while entirely ignoring the intercorrelation among the multiple noisy label sets of different instances. To solve this problem, a multiple noisy label distribution propagation (MNLDP) method is proposed in this study. MNLDP first transforms the multiple noisy label set of each instance into its multiple noisy label distribution and then propagates its multiple noisy label distribution to its nearest neighbors. Consequently, each instance absorbs a fraction of the multiple noisy label distributions from its nearest neighbors and yet simultaneously maintains a fraction of its own original multiple noisy label distribution. Promising experimental results on simulated and real-world datasets validate the effectiveness of our proposed method.
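The two steps described, converting each noisy label set into a label distribution and mixing it with the neighbors' distributions, can be sketched as follows. The mixing weight `alpha` and the neighbor lists are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def to_distribution(noisy_labels, n_classes):
    """Turn a multiple noisy label set into label frequencies."""
    counts = np.bincount(noisy_labels, minlength=n_classes)
    return counts / counts.sum()

def propagate(dists, neighbors, alpha=0.5):
    """dists: (n_instances, n_classes); neighbors: list of index lists.
    Each instance keeps alpha of its own distribution and absorbs
    (1 - alpha) of its neighbors' average distribution."""
    out = np.empty_like(dists)
    for i, nbrs in enumerate(neighbors):
        out[i] = alpha * dists[i] + (1 - alpha) * dists[nbrs].mean(axis=0)
    return out

dists = np.array([to_distribution([0, 0, 1], 2),   # [2/3, 1/3]
                  to_distribution([1, 1, 1], 2)])  # [0, 1]
new = propagate(dists, neighbors=[[1], [0]])
```

After propagation each row is still a valid distribution, and the integrated label of an instance can be read off as the argmax of its row.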



Cells ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2577 ◽  
Author(s):  
Imogen A. Wright ◽  
Kayla E. Delaney ◽  
Mary Grace K. Katusiime ◽  
Johannes C. Botha ◽  
Susan Engelbrecht ◽  
...  

HIV-1 proviral single-genome sequencing by limiting-dilution polymerase chain reaction (PCR) amplification is important for differentiating sequence-intact from defective proviruses that persist during antiretroviral therapy (ART). Intact proviruses may rebound if ART is interrupted and are the barrier to an HIV cure. Oxford Nanopore Technologies (ONT) sequencing offers a promising, cost-effective approach to the sequencing of long amplicons such as near-full-length HIV-1 proviruses, but the high diversity of HIV-1 and the ONT sequencing error rate render analysis of the generated data difficult. NanoHIV is a new tool that uses an iterative consensus-generation approach to construct accurate, near-full-length HIV-1 proviral single-genome sequences from ONT data. To validate the approach, single-genome sequences generated using NanoHIV consensus building were compared to Illumina® consensus building of the same nine single-genome near-full-length amplicons, and an average agreement of 99.4% was found between the two sequencing approaches.
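The agreement figure quoted above is, at its simplest, a position-wise percent identity between two consensus sequences. A minimal sketch follows; a real comparison would first align the two consensuses, whereas here they are assumed already aligned to the same length, and the sequences are toy examples.

```python
def percent_identity(seq_a, seq_b):
    """Position-wise percent identity between two aligned sequences."""
    assert len(seq_a) == len(seq_b), "sequences must be aligned"
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

ont = "ATGCATGCAT"                 # toy ONT consensus
ilm = "ATGCATGGAT"                 # toy Illumina consensus, one mismatch
pid = percent_identity(ont, ilm)   # 9/10 matches -> 90.0
```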





Aerospace ◽  
2020 ◽  
Vol 7 (3) ◽  
pp. 31 ◽  
Author(s):  
Dario Modenini ◽  
Anton Bahu ◽  
Giacomo Curzi ◽  
Andrea Togni

To enable a reliable verification of attitude determination and control systems for nanosatellites, the environment of low Earth orbits with almost disturbance-free rotational dynamics must be simulated. This work describes the design solutions adopted for developing a dynamic nanosatellite attitude simulator testbed at the University of Bologna. The facility integrates several subsystems, including: (i) an air-bearing three-degree-of-freedom platform with an automatic balancing system, (ii) a Helmholtz cage for geomagnetic field simulation, (iii) a Sun simulator, and (iv) a metrology vision system for ground-truth attitude generation. Apart from the commercial off-the-shelf Helmholtz cage, the other subsystems required substantial development efforts. The main purpose of this manuscript is to offer some cost-effective solutions for their in-house development, and to show through experimental verification that adequate performance can be achieved. The proposed approach may thus be preferred to the procurement of turn-key solutions when required by budget constraints. The main outcomes of the commissioning phase of the facility are: a residual disturbance torque affecting the air-bearing platform of less than 5 × 10⁻⁵ Nm, an attitude determination rms accuracy of the vision system of 10 arcmin, and a divergence of the Sun simulator light beam of less than 0.5° in a 35 cm diameter area.
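The vision system's accuracy figure is an rms of per-frame angular errors expressed in arcminutes. A minimal sketch of that conversion, using synthetic errors in radians between measured and ground-truth attitudes:

```python
import numpy as np

ARCMIN_PER_RAD = 180.0 * 60.0 / np.pi  # radians -> arcminutes

def rms_arcmin(angular_errors_rad):
    """Root-mean-square attitude error in arcminutes."""
    errs = np.asarray(angular_errors_rad) * ARCMIN_PER_RAD
    return np.sqrt(np.mean(errs ** 2))

# Synthetic data: every frame's attitude is off by 10 arcmin.
errors = np.full(100, np.deg2rad(10.0 / 60.0))
rms = rms_arcmin(errors)  # 10.0 arcmin
```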



1981 ◽  
Vol 103 (2) ◽  
pp. 92-97 ◽  
Author(s):  
C. H. Kostors ◽  
S. P. Vincent

During the optimization studies of a 10-MWe (net) OTEC power system, the ammonia turbine design study led to the selection of a four-stage, double-flow axial turbine yielding 89.6 percent efficiency at 1800 rpm. To maximize power at off-nominal conditions, variable nozzles are used for the first stage. The turbine is directly connected to a four-pole, 60-Hz synchronous generator having 97.2 percent efficiency. This study was limited to state-of-the-art hardware designs for axial- and radial-flow turbines, single- and double-flow designs, and variations in the base diameter and number of turbine stages. The gasdynamic calculation procedure and the effects of the turbine-generator control scheme, turbine blade design, materials (annealed AISI Type 403 steel is chosen for the blading), and seals and bearings are addressed in this paper. The estimated mass-production cost of the turbine-generator set is approximately $150/kWe at 10-MWe (net) size in 1979 dollars. Single-flow axial and radial-inflow turbines would have lower performance without sufficiently lower cost to be cost-effective, since each percentage point of turbine efficiency is estimated to be worth $17 to $28/kWe.
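The cost figures above scale straightforwardly with plant size; a worked example using only the numbers quoted in the abstract (1979 dollars):

```python
NET_KWE = 10_000            # 10 MWe (net) plant size in kWe
COST_PER_KWE = 150          # estimated mass-production cost, $/kWe
EFF_POINT_VALUE = (17, 28)  # value of 1 point of turbine efficiency, $/kWe

# Total cost of the turbine-generator set: $1.5M in 1979 dollars.
set_cost = COST_PER_KWE * NET_KWE

# One percentage point of turbine efficiency is worth $170k-280k
# on the same plant, which is why the cheaper single-flow axial and
# radial-inflow designs do not pay for their lower performance.
eff_point_worth = tuple(v * NET_KWE for v in EFF_POINT_VALUE)
```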



2012 ◽  
Vol 10 (6) ◽  
pp. 635-645 ◽  
Author(s):  
Mahmood Gholami ◽  
Wubishet A. Bekele ◽  
Joerg Schondelmaier ◽  
Rod J. Snowdon

