A machine learning model to determine the accuracy of variant calls in capture-based next generation sequencing

BMC Genomics ◽  
2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Jeroen van den Akker ◽  
Gilad Mishne ◽  
Anjali D. Zimmer ◽  
Alicia Y. Zhou

Electronics ◽  
2019 ◽  
Vol 8 (6) ◽  
pp. 607 ◽  
Author(s):  
Ihab Ahmed Najm ◽  
Alaa Khalaf Hamoud ◽  
Jaime Lloret ◽  
Ignacio Bosch

The 5G network is a next-generation form of wireless communication and the latest mobile technology. In practice, 5G utilizes the Internet of Things (IoT) to work in high-traffic networks in which multiple nodes/sensors attempt to transmit their packets to a destination simultaneously, which is characteristic of IoT applications. To support this, 5G offers vast bandwidth, low delay, and extremely high data transfer speed. Thus, 5G presents opportunities and motivations for utilizing next-generation protocols, especially the stream control transmission protocol (SCTP). However, the congestion control mechanisms of conventional SCTP negatively influence overall performance, and existing mechanisms contribute to reducing 5G and IoT performance. Thus, a new machine learning model based on a decision tree (DT) algorithm is proposed in this study to predict the optimal enhancement of congestion control in the wireless sensors of 5G IoT networks. The model was implemented on a training dataset to determine the optimal parametric setting in a 5G environment. The dataset was used to train the machine learning model and enable the prediction of optimal alternatives that can enhance the performance of the congestion control approach. The DT approach can also be used for other functions, especially prediction and classification. DT algorithms provide graphs that any user can read to understand the prediction approach. The C4.5 DT provided promising results, with more than 92% precision and recall.
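The abstract names C4.5, which selects splits by gain ratio (information gain divided by split information). The following is a minimal sketch of that criterion only, not the authors' model: the feature names (`queue_level`, `link_type`), the `congested` label, and all data values are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of splitting `labels` by a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (NOT from the paper).
queue_level = ["high", "high", "low", "low"]
link_type = ["a", "b", "a", "b"]
congested = ["yes", "yes", "no", "no"]

ratio_queue = gain_ratio(queue_level, congested)
ratio_link = gain_ratio(link_type, congested)
```

In this toy setup `queue_level` separates the two classes perfectly while `link_type` carries no information, so a tree builder using this criterion would split on `queue_level` first.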


2019 ◽  
Vol 66 (1) ◽  
pp. 239-246 ◽  
Author(s):  
Chao Wu ◽  
Xiaonan Zhao ◽  
Mark Welsh ◽  
Kellianne Costello ◽  
Kajia Cao ◽  
...  

Abstract

BACKGROUND Molecular profiling has become essential for tumor risk stratification and treatment selection. However, cancer genome complexity and technical artifacts make identification of real variants a challenge. Currently, clinical laboratories rely on manual screening, which is costly, subjective, and not scalable. We present a machine learning–based method to distinguish artifacts from bona fide single-nucleotide variants (SNVs) detected by next-generation sequencing from nonformalin-fixed paraffin-embedded tumor specimens.

METHODS A cohort of 11278 SNVs identified through clinical sequencing of tumor specimens was collected and divided into training, validation, and test sets. Each SNV was manually inspected and labeled as either real or artifact as part of clinical laboratory workflow. A 3-class (real, artifact, and uncertain) model was developed on the training set, fine-tuned with the validation set, and then evaluated on the test set. Prediction intervals reflecting the certainty of the classifications were derived during the process to label “uncertain” variants.

RESULTS The optimized classifier demonstrated 100% specificity and 97% sensitivity over 5587 SNVs of the test set. Overall, 1252 of 1341 true-positive variants were identified as real, 4143 of 4246 false-positive calls were deemed artifacts, whereas only 192 (3.4%) SNVs were labeled as “uncertain,” with zero misclassification between the true positives and artifacts in the test set.

CONCLUSIONS We presented a computational classifier to identify variant artifacts detected from tumor sequencing. Overall, 96.6% of the SNVs received definitive labels and thus were exempt from manual review. This framework could improve quality and efficiency of the variant review process in clinical laboratories.
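The test-set counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the assumption that every variant not given a definitive label falls in the "uncertain" class follows from the stated zero misclassification.

```python
# Counts quoted in the abstract (test set).
total_snvs = 5587
true_positives = 1341       # variants manually labeled as real
false_positives = 4246      # calls manually labeled as artifacts
called_real = 1252          # true positives the model labeled "real"
called_artifact = 4143      # false positives the model labeled "artifact"
reported_uncertain = 192

# With zero misclassification, anything not given a definitive label
# must have been labeled "uncertain".
uncertain_from_real = true_positives - called_real
uncertain_from_artifact = false_positives - called_artifact
uncertain_total = uncertain_from_real + uncertain_from_artifact

uncertain_pct = round(100 * uncertain_total / total_snvs, 1)
definitive_pct = round(100 * (called_real + called_artifact) / total_snvs, 1)

print(uncertain_total, uncertain_pct, definitive_pct)
```

The derived uncertain count and both percentages agree with the reported 192 SNVs, 3.4%, and 96.6%.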


Genes ◽  
2018 ◽  
Vol 9 (10) ◽  
pp. 505
Author(s):  
Manfred Grabherr ◽  
Bozena Kaminska ◽  
Jan Komorowski

The massive increase in computational power over recent years and wider applications of machine learning methods, coincidental or not, were paralleled by remarkable advances in high-throughput DNA sequencing technologies.[...]


BMC Genomics ◽  
2016 ◽  
Vol 17 (1) ◽  
Author(s):  
Jean-François Spinella ◽  
Pamela Mehanna ◽  
Ramon Vidal ◽  
Virginie Saillour ◽  
Pauline Cassart ◽  
...  

2019 ◽  
Vol 19 (7) ◽  
Author(s):  
Gang Li ◽  
Boyang Ji ◽  
Jens Nielsen

ABSTRACT Understanding the genotype–phenotype relationship is fundamental in biology. Thanks to next-generation sequencing and high-throughput phenotyping methodologies, a large amount of genome and phenome data has been generated for Saccharomyces cerevisiae. This makes it an excellent model system for understanding the genotype–phenotype relationship. In this paper, we present the reconstruction and application of the yeast pan-genome in resolving the genotype–phenotype relationship by a machine learning-assisted approach.

