Deep learning for HGT insertion sites recognition

BMC Genomics ◽  
2020 ◽  
Vol 21 (S11) ◽  
Author(s):  
Chen Li ◽  
Jiaxing Chen ◽  
Shuai Cheng Li

Abstract Background: Horizontal Gene Transfer (HGT) refers to the sharing of genetic material between distant species that are not in a parent-offspring relationship. HGT insertion sites are important for understanding HGT mechanisms. Recent studies of the main agents of HGT, such as transposons and plasmids, demonstrate that insertion sites usually hold specific sequence features. This motivates us to find a method to infer HGT insertion sites according to sequence features. Results: In this paper, we propose a deep residual network, DeepHGT, to recognize HGT insertion sites. To train DeepHGT, we extracted about 1.55 million sequence segments as training instances from 262 metagenomic samples, where the ratio between positive and negative instances is about 1:1. These segments were randomly partitioned into three subsets: 80% as the training set, 10% as the validation set, and the remaining 10% as the test set. The training loss of DeepHGT is 0.4163 and the validation loss is 0.423. On the test set, DeepHGT achieved an area under the curve (AUC) value of 0.8782. To further evaluate the generalization of DeepHGT, we constructed an independent test set containing 689,312 sequence segments from another 147 gut metagenomic samples. DeepHGT achieved an AUC value of 0.8428, which approaches the previous test AUC value. As a comparison, the gradient boosting classifier model implemented in PyFeat achieved AUC values of 0.694 and 0.686 on the above two test sets, respectively. Furthermore, DeepHGT could learn discriminative sequence features; for example, it learned a sequence pattern of palindromic subsequences as a significant (P-value = 0.0182) local feature. Hence, DeepHGT is a reliable model for recognizing HGT insertion sites. Conclusion: DeepHGT is the first deep learning model that can accurately recognize HGT insertion sites on genomes according to the sequence pattern.
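As a rough illustration of the kind of model described above, the sketch below builds a small 1-D residual network over one-hot encoded DNA segments for binary insertion-site classification. The segment length, filter counts, and depth are illustrative assumptions, not the published DeepHGT architecture.

```python
# A minimal 1-D residual network over one-hot encoded DNA segments
# (binary classification: insertion site vs. background).
# SEG_LEN, filter sizes, and depth are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

SEG_LEN = 100  # assumed segment length in bp

def residual_block(x, filters=64):
    shortcut = x
    y = layers.Conv1D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv1D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])
    return layers.Activation("relu")(y)

inputs = layers.Input(shape=(SEG_LEN, 4))             # A/C/G/T one-hot channels
x = layers.Conv1D(64, 7, padding="same", activation="relu")(inputs)
x = residual_block(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)    # P(insertion site)

model = Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])       # AUC, as reported above
```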

2021 ◽  
Vol 10 (8) ◽  
pp. 1772
Author(s):  
Hyun-Doo Moon ◽  
Han-Gyeol Choi ◽  
Kyong-Joon Lee ◽  
Dong-Jun Choi ◽  
Hyun-Jin Yoo ◽  
...  

A weight-bearing whole-leg radiograph (WLR) is essential to assess lower limb alignment, such as the weight-bearing line (WBL) ratio. The purpose of this study was to develop a deep learning (DL) model that predicts the WBL ratio using a standing knee AP radiograph alone. A total of 3997 knee AP radiographs and WLRs were used. The WBL ratio was used for labeling and for analysis of prediction accuracy, and was divided into seven categories (0, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6). After training, the performance of the DL model was evaluated; final performance was assessed using 386 subjects as a test set. The cumulative score (CS) within an error range of 0.1 was adopted because it showed the maximum CS in the validation set (95% CI, 0.924–0.970). In the test set, the mean absolute error was 0.054 (95% CI, 0.048–0.061) and the CS was 0.951 (95% CI, 0.924–0.970). The developed DL algorithm could predict the WBL ratio from a standing knee AP radiograph alone, with accuracy comparable to the degree to which a primary physician can assess alignment. It can serve as the basis for an automated lower limb alignment assessment tool that can be used easily and cost-effectively in primary clinics.
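For concreteness, the snippet below computes the two evaluation metrics reported above, mean absolute error and the cumulative score within a 0.1 error band, on placeholder WBL-ratio values (not the study data).

```python
import numpy as np

y_true = np.array([0.4, 0.5, 0.3, 0.6, 0.2])       # ground-truth WBL ratios (placeholders)
y_pred = np.array([0.45, 0.48, 0.35, 0.50, 0.22])  # model predictions (placeholders)

errors = np.abs(y_pred - y_true)
mae = errors.mean()                  # mean absolute error
cs = np.mean(errors <= 0.1)          # cumulative score within error range 0.1
print(f"MAE = {mae:.3f}, CS(0.1) = {cs:.3f}")
```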


2018 ◽  
pp. 1-8 ◽  
Author(s):  
Okyaz Eminaga ◽  
Nurettin Eminaga ◽  
Axel Semjonow ◽  
Bernhard Breil

Purpose: The recognition of cystoscopic findings remains challenging for young colleagues and depends on the examiner’s skills. Computer-aided diagnosis tools using feature extraction and deep learning show promise as instruments to perform diagnostic classification. Materials and Methods: Our study considered 479 patient cases that represented 44 urologic findings. Image color was linearly normalized and was equalized by applying contrast-limited adaptive histogram equalization. Because these findings can be viewed via cystoscopy from every possible angle and side, we ultimately generated images rotated in 10-degree grades and flipped them vertically or horizontally, which resulted in 18,681 images. After image preprocessing, we developed deep convolutional neural network (CNN) models (ResNet50, VGG-19, VGG-16, InceptionV3, and Xception) and evaluated these models using F1 scores. Furthermore, we proposed two CNN concepts: 90%-previous-layer filter size and harmonic-series filter size. A training set (60%), a validation set (10%), and a test set (30%) were randomly generated from the study data set. All models were trained on the training set, validated on the validation set, and evaluated on the test set. Results: The Xception-based model achieved the highest F1 score (99.52%), followed by models based on ResNet50 (99.48%) and the harmonic-series concept (99.45%). All images with cancer lesions were correctly identified by these models. When the focus was on the images misclassified by the best-performing model, 7.86% of images showing bladder stones with an indwelling catheter and 1.43% of images showing bladder diverticulum were falsely classified. Conclusion: The results of this study show the potential of deep learning for the diagnostic classification of cystoscopic images. Future work will focus on the integration of artificial intelligence–aided cystoscopy into clinical routines and possibly expansion to other clinical endoscopy applications.
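The augmentation scheme described above (rotations in 10-degree grades plus vertical and horizontal flips) can be sketched as follows; the file path is hypothetical and the exact preprocessing order may differ from the authors' pipeline.

```python
from PIL import Image, ImageOps

def augment(img):
    """Rotate in 10-degree grades and add horizontal/vertical flips."""
    variants = []
    for angle in range(0, 360, 10):              # 36 rotations
        rotated = img.rotate(angle)
        variants.append(rotated)
        variants.append(ImageOps.mirror(rotated))   # horizontal flip
        variants.append(ImageOps.flip(rotated))     # vertical flip
    return variants

# Hypothetical usage:
# variants = augment(Image.open("cystoscopy_finding.png"))
```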


2019 ◽  
Author(s):  
Jungirl Seok ◽  
Jae-Jin Song ◽  
Ja-Won Koo ◽  
Hee Chan Kim ◽  
Byung Yoon Choi

Abstract Objectives: The purpose of this study was to create a deep learning model for the detection and segmentation of major structures of the tympanic membrane. Methods: A total of 920 stored tympanic endoscopic images were obtained retrospectively. We constructed a detection and segmentation model using Mask R-CNN with a ResNet-50 backbone, targeting three clinically meaningful structures: (1) the tympanic membrane (TM); (2) the malleus with the side of the tympanic membrane; and (3) the suspected perforation area. The images were randomly divided into three sets – training set, validation set, and test set – at a ratio of 0.6:0.2:0.2, resulting in 548, 187, and 185 images, respectively. After assignment, the 548 training-set images were augmented 50 times each, reaching 27,400 images. Results: At its most optimized point, the model achieved a mean average precision of 92.9% on the test set. When an Intersection over Union (IoU) score greater than 0.5 was used as the reference point, the tympanic membrane was detected in 100% of images, the accuracy of the side of the tympanic membrane based on the malleus segmentation was 88.6%, and the detection accuracy of suspected perforation was 91.4%. Conclusions: Anatomical segmentation may allow the inclusion of an explanation provided by deep learning as part of the results. This method is applicable not only to tympanic endoscopy but also to sinus endoscopy, laryngoscopy, and stroboscopy. Finally, it can be the starting point for the development of an automated medical record descriptor for endoscopic images.
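The IoU criterion used above can be illustrated with a short sketch that computes intersection over union between a predicted and a ground-truth binary mask and applies the 0.5 detection threshold; the toy masks are for illustration only.

```python
import numpy as np

def mask_iou(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 0.0

pred = np.zeros((64, 64)); pred[10:40, 10:40] = 1   # toy predicted mask
gt = np.zeros((64, 64));   gt[15:45, 15:45] = 1     # toy ground-truth mask
iou = mask_iou(pred, gt)
print(f"IoU = {iou:.2f}, counted as detected: {iou > 0.5}")
```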


Author(s):  
Jeremy Irvin ◽  
Pranav Rajpurkar ◽  
Michael Ko ◽  
Yifan Yu ◽  
Silviana Ciurea-Ilcus ◽  
...  

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models.
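Two of the simplest uncertainty-handling policies that can be applied to such labels, mapping the uncertain value to negative ("U-Zeros") or to positive ("U-Ones") before training, are sketched below. The column names follow the released CheXpert CSVs, but the data frame here is illustrative, and the paper evaluates additional policies as well.

```python
import pandas as pd

# 1.0 = positive, 0.0 = negative, -1.0 = uncertain, NaN = not mentioned
labels = pd.DataFrame({"Cardiomegaly": [1.0, 0.0, -1.0, None],
                       "Edema":        [-1.0, 1.0, 0.0, -1.0]})

u_zeros = labels.fillna(0.0).replace(-1.0, 0.0)   # U-Zeros: uncertain -> negative
u_ones  = labels.fillna(0.0).replace(-1.0, 1.0)   # U-Ones:  uncertain -> positive
print(u_ones)
```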


2021 ◽  
Vol 11 ◽  
Author(s):  
Yong Tang ◽  
Yingjun Zheng ◽  
Xinpei Chen ◽  
Weijia Wang ◽  
Qingxi Guo ◽  
...  

Background: Development and validation of a deep learning method to automatically segment the peri-ampullary (PA) region in magnetic resonance imaging (MRI) images. Methods: A group of patients with or without periampullary carcinoma (PAC) was included. The PA regions were manually annotated in MRI images by experts. Patients were randomly divided into a training set, a validation set, and a test set. Deep learning methods were developed to automatically segment the PA region in MRI images. The segmentation performance of the methods was compared on the validation set, and the model with the highest intersection over union (IoU) was evaluated on the test set. Results: The deep learning algorithm achieved optimal accuracies in the segmentation of the PA regions in both T1 and T2 MRI images. The IoU values were 0.68, 0.68, and 0.64 for T1, T2, and the combination of T1 and T2 images, respectively. Conclusions: The deep learning algorithm is promising, achieving segmentation of the PA region in MRI images concordant with manual human assessment. This automated, non-invasive method helps clinicians identify and locate the PA region on preoperative MRI scans.
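A minimal sketch of the model-selection step described above, picking the candidate with the highest mean validation IoU before evaluating it on the test set, might look like the following; the model names and scores are placeholders, not the study's results.

```python
import numpy as np

val_ious = {                      # per-case validation IoU for each candidate (placeholders)
    "model_t1":   np.array([0.70, 0.66, 0.69]),
    "model_t2":   np.array([0.71, 0.64, 0.68]),
    "model_t1t2": np.array([0.65, 0.62, 0.66]),
}
best = max(val_ious, key=lambda name: val_ious[name].mean())
print("selected for the test set:", best, round(val_ious[best].mean(), 3))
```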


Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1943
Author(s):  
Diego R. Cervera ◽  
Luke Smith ◽  
Luis Diaz-Santana ◽  
Meenakshi Kumar ◽  
Rajiv Raman ◽  
...  

The aim of this study was to develop and validate a deep learning-based system to detect diabetic peripheral neuropathy (DN) from retinal colour images in people with diabetes. Retinal images from 1561 people with diabetes were used to predict DN, diagnosed on vibration perception threshold. A total of 189 had diabetic retinopathy (DR), 276 had DN, and 43 had both DR and DN. 90% of the images were used for training and validation and 10% for testing. Deep neural networks, including SqueezeNet, Inception, and DenseNet, were utilized, and the architectures were tested with and without pre-trained weights. Random transforms of the images were applied during training. The algorithm was trained and tested using three sets of data: all retinal images, images without DR, and images with DR. The area under the ROC curve (AUC) was used to evaluate performance. The AUC for predicting DN on the whole cohort was 0.8013 (±0.0257) on the validation set and 0.7097 (±0.0031) on the test set. The AUC increased to 0.8673 (±0.0088) in the presence of DR. Retinal images can be used to identify individuals with DN and provide an opportunity to educate patients about their DN status when they attend DR screening.
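A minimal sketch of the transfer-learning setup described above, loading a DenseNet backbone with or without pre-trained weights and attaching a single DN/no-DN output, is shown below; the input size and the choice of DenseNet121 are assumptions, and the training loop is omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet121

def build_dn_classifier(pretrained=True):
    base = DenseNet121(weights="imagenet" if pretrained else None,
                       include_top=False, pooling="avg",
                       input_shape=(224, 224, 3))            # assumed input size
    output = layers.Dense(1, activation="sigmoid")(base.output)  # P(DN)
    return Model(base.input, output)

model = build_dn_classifier(pretrained=True)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```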


2004 ◽  
Vol 186 (21) ◽  
pp. 7280-7289 ◽  
Author(s):  
Danielle A. Garsin ◽  
Jonathan Urbach ◽  
Jose C. Huguet-Tapia ◽  
Joseph E. Peters ◽  
Frederick M. Ausubel

ABSTRACT Sequencing the insertion sites of 8,865 Tn917 insertions in Enterococcus faecalis strain OG1RF identified a hot spot in the replication terminus region corresponding to 6% of the genome where 65% of the transposons had inserted. In E. faecalis, Tn917 preferentially inserted at a 29-bp consensus sequence centered on TATAA, a 5-bp sequence that is duplicated during insertion. The regional insertion site preference at the chromosome terminus was not observed in another low-G+C gram-positive bacterium, Listeria monocytogenes, although the consensus insertion sequence was the same. The 8,865 Tn917 insertion sites sequenced in E. faecalis corresponded to only ∼610 different open reading frames, far fewer than the predicted number of 2,400, assuming random insertion. There was no significant preference in orientation of the Tn917 insertions with either transcription or replication. Even though OG1RF has a smaller genome than strain V583 (2.8 Mb versus 3.2 Mb), the only E. faecalis strain whose sequence is in the public domain, over 10% of the Tn917 insertions appear to be in an OG1RF-specific sequence, suggesting that there are significant genomic differences among E. faecalis strains.
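The hot-spot enrichment reported above follows from simple arithmetic on the stated figures (65% of 8,865 insertions falling in 6% of the genome), sketched below.

```python
total_insertions = 8865
frac_in_hotspot = 0.65      # fraction of insertions in the terminus hot spot
frac_of_genome = 0.06       # fraction of the genome the hot spot covers

observed = frac_in_hotspot * total_insertions        # ~5762 insertions observed
expected_uniform = frac_of_genome * total_insertions # ~532 expected under uniform insertion
fold_enrichment = frac_in_hotspot / frac_of_genome   # ~10.8-fold over random
print(round(observed), round(expected_uniform), round(fold_enrichment, 1))
```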


Biology ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1140
Author(s):  
Woohyuk Jang ◽  
Eui Chul Lee

Owing to climate change and indiscriminate human development, the populations of endangered species have been decreasing. To protect endangered species, many countries worldwide have adopted the CITES treaty to prevent the extinction of endangered plants and animals. Moreover, research has been conducted using diverse approaches, particularly deep learning-based animal and plant image recognition methods. In this paper, we propose an automated image classification method for 11 endangered parrot species included in CITES. The 11 species include subspecies that are very similar in appearance. Images were collected from the Internet and, in cooperation with Seoul Grand Park Zoo, used to build an indigenous database. The dataset for deep learning training consisted of a 70% training set, 15% validation set, and 15% test set. In addition, a data augmentation technique was applied to mitigate the limited data collection and prevent overfitting. The performance of various backbone CNN architectures (i.e., VGGNet, ResNet, and DenseNet) was compared using the SSD model. The trained models were evaluated on the test set images, and the results show that DenseNet18 had the best performance, with an mAP of approximately 96.6% and an inference time of 0.38 s.
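The 70/15/15 split described above can be reproduced with two consecutive stratified splits, as sketched below on hypothetical image paths (not the actual parrot dataset).

```python
from sklearn.model_selection import train_test_split

image_paths = [f"parrot_{i:04d}.jpg" for i in range(1000)]   # hypothetical files
species_ids = [i % 11 for i in range(1000)]                  # 11 species labels

train_x, rest_x, train_y, rest_y = train_test_split(
    image_paths, species_ids, test_size=0.30, stratify=species_ids, random_state=42)
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.50, stratify=rest_y, random_state=42)

print(len(train_x), len(val_x), len(test_x))   # 700 / 150 / 150
```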


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 913
Author(s):  
Johannes Fahrmann ◽  
Ehsan Irajizad ◽  
Makoto Kobayashi ◽  
Jody Vykoukal ◽  
Jennifer Dennison ◽  
...  

MYC is an oncogenic driver in the pathogenesis of ovarian cancer. We previously demonstrated that MYC regulates polyamine metabolism in triple-negative breast cancer (TNBC) and that a plasma polyamine signature is associated with TNBC development and progression. We hypothesized that a similar plasma polyamine signature may be associated with ovarian cancer (OvCa) development. Using mass spectrometry, four polyamines were quantified in plasma from 116 OvCa cases and 143 controls (71 healthy controls + 72 subjects with benign pelvic masses) (Test Set). Findings were validated in an independent plasma set from 61 early-stage OvCa cases and 71 healthy controls (Validation Set). Complementarity of polyamines with CA125 was also evaluated. The receiver operating characteristic area under the curve (AUC) of individual polyamines for distinguishing cases from healthy controls ranged from 0.74 to 0.88. A polyamine signature consisting of diacetylspermine + N-(3-acetamidopropyl)pyrrolidin-2-one in combination with CA125, developed in the Test Set, yielded improved sensitivity at >99% specificity relative to CA125 alone (73.7% vs. 62.2%; McNemar exact test, 2-sided P: 0.019) in the Validation Set and captured 30.4% of cases that were missed with CA125 alone. Our findings reveal a MYC-driven plasma polyamine signature associated with OvCa that complemented CA125 in detecting early-stage ovarian cancer.
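The "sensitivity at >99% specificity" comparison above can be illustrated as follows: choose the score threshold that keeps at least 99% of controls below it, then measure the fraction of cases above it. The scores below are simulated placeholders, not the study data.

```python
import numpy as np

def sens_at_spec(case_scores, control_scores, spec=0.99):
    thr = np.quantile(control_scores, spec)   # threshold keeping `spec` of controls negative
    return np.mean(case_scores > thr)

rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, 200)   # scores for healthy controls (simulated)
cases = rng.normal(1.5, 1.0, 100)      # scores for early-stage OvCa cases (simulated)
print(f"sensitivity at 99% specificity: {sens_at_spec(cases, controls):.2f}")
```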

