Single-sequence and profile-based prediction of RNA solvent accessibility using dilated convolutional neural network

Author(s): Anil Kumar Hanumanthappa, Jaswinder Singh, Kuldip Paliwal, Jaspreet Singh, Yaoqi Zhou

Motivation: RNA solvent accessibility, similar to protein solvent accessibility, reflects the structural regions that are accessible to solvents or other functional biomolecules, and plays an important role in structural and functional characterization. Unlike protein solvent accessibility, only a few tools are available for predicting RNA solvent accessibility, despite the fact that millions of RNA transcripts have unknown structures and functions, and these tools have limited accuracy. Here, we have developed RNAsnap2, which uses a dilated convolutional neural network with a new feature based on predicted base-pairing probabilities from LinearPartition. Results: Using the same training set as the recent predictor RNAsol, RNAsnap2 provides an 11% improvement in median Pearson correlation coefficient (PCC) and a 9% improvement in mean absolute error on the same test set of 45 RNA chains. A larger improvement (22% in median PCC) is observed for 31 newly deposited RNA chains that are non-redundant and independent of the training and test sets. A single-sequence version of RNAsnap2 (i.e. without using sequence profiles generated from homology search by Infernal) has achieved performance comparable to the profile-based RNAsol. In addition, RNAsnap2 has achieved comparable performance for protein-bound and protein-free RNAs. Both RNAsnap2 and RNAsnap2 (SingleSeq) are expected to be useful for searching structural signatures and locating functional regions of non-coding RNAs. Availability and implementation: Standalone versions of RNAsnap2 and RNAsnap2 (SingleSeq) are available at https://github.com/jaswindersingh2/RNAsnap2. Direct predictions can also be made at https://sparks-lab.org/server/rnasnap2. The datasets used in this research can also be downloaded from the GitHub repository and the web server mentioned above. Supplementary information: Supplementary data are available at Bioinformatics online.
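
As a rough illustration of the dilated convolution idea named in the abstract above, the following sketch implements a plain 1-D dilated convolution in pure Python and shows how stacking layers widens the receptive field. It is not the RNAsnap2 architecture: the function names, the kernel, the toy signal and the dilation schedule are all assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution, cross-correlation form."""
    # Number of input positions spanned by a single output value.
    span = (len(kernel) - 1) * dilation + 1
    if len(signal) < span:
        return []
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` positions apart.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Toy input standing in for a per-nucleotide feature track (hypothetical).
signal = [float(i) for i in range(10)]
kernel = [1.0, 1.0, 1.0]

plain = dilated_conv1d(signal, kernel, dilation=1)    # adjacent positions
dilated = dilated_conv1d(signal, kernel, dilation=2)  # every second position

# Four layers with kernel size 3 and dilations 1, 2, 4, 8 (assumed schedule).
field = receptive_field(3, [1, 2, 4, 8])
```

With the same three weights, the dilated layer covers five input positions instead of three, which is why dilation is a common way to capture longer-range context along a sequence without adding parameters.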

2021
Author(s): James Chung Wai Cheung, Yiu Chow TAM, Lok Chun CHAN, Ping Keung CHAN, Chunyi WEN

Objectives: To develop a deep convolutional neural network (CNN) for the segmentation of the femur and tibia on plain X-ray radiographs, enabling automated measurement of joint space width (JSW) to predict the severity and progression of knee osteoarthritis (KOA). Methods: A CNN with a ResU-Net architecture was developed for knee X-ray image segmentation. Segmentation performance was evaluated by the Intersection over Union (IoU) score, comparing the outputs with the annotated contours of the distal femur and proximal tibia. Using the segmentation, the minimal and multiple JSWs in the tibiofemoral joint were estimated and then validated against radiologists’ measurements in the Osteoarthritis Initiative (OAI) dataset using Pearson correlation and Bland–Altman plots. The estimated JSWs were used to predict the radiographic severity and progression of KOA, defined by Kellgren-Lawrence (KL) grades, using an XGBoost model. Classification performance was assessed using the F1 score and the area under the receiver operating characteristic curve (AUC). Results: The network attained a segmentation IoU of 98.9%. The agreement between the CNN-based estimation and the radiologist’s measurement of minimal JSW reached 0.7801 (p < 0.0001). The 32-point multiple JSWs obtained the highest AUC of 0.656 for classifying the KL grade of KOA, whereas the 64-point multiple JSWs achieved the best performance in predicting KOA progression, defined by KL grade change within 48 months, with an AUC of 0.621. The multiple JSWs outperformed the commonly used minimum JSW, which reached an AUC of 0.587 in KL-grade classification and 0.554 in disease progression prediction. Conclusion: Fine-grained characterization of joint space width in KOA yields performance comparable to the radiologist in assessing disease severity and progression. We provide a fully automated and efficient radiographic assessment tool for KOA.
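
The two evaluation tools named in the Methods above, IoU for segmentation overlap and Bland–Altman analysis for measurement agreement, can be sketched as follows. This is a minimal illustration on made-up toy data, not the study's pipeline; the masks, the JSW values and the function names are hypothetical, and the 1.96 multiplier is the conventional choice for 95% limits of agreement.

```python
from statistics import mean, stdev


def iou(pred_mask, true_mask):
    """Intersection over Union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return intersection / union if union else 1.0


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Toy 2x3 masks (hypothetical): 1 marks a bone pixel.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
overlap = iou(pred_mask, true_mask)

# Toy minimal-JSW values in millimetres (made up for illustration only).
cnn_jsw = [4.0, 5.0, 6.0, 5.0]
radiologist_jsw = [4.5, 5.0, 5.5, 5.0]
bias, lower, upper = bland_altman(cnn_jsw, radiologist_jsw)
```

Pearson correlation alone can be high even when one method is systematically offset from the other, which is why a Bland–Altman bias and limits of agreement are usually reported alongside it.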


Author(s): Qingzhen Hou, Jean Marc Kwasigroch, Marianne Rooman, Fabrizio Pucci

Motivation: The solubility of a protein is often decisive for its proper functioning. Lack of solubility is a major bottleneck in high-throughput structural genomic studies and in high-concentration protein production, and the formation of protein aggregates causes a wide variety of diseases. Since solubility measurements are time-consuming and expensive, there is a strong need for solubility prediction tools. Results: We have recently introduced solubility-dependent distance potentials that are able to unravel the role of residue–residue interactions in promoting or decreasing protein solubility. Here, we extended their construction by defining solubility-dependent potentials based on backbone torsion angles and solvent accessibility, and integrated them, together with other structure- and sequence-based features, into a random forest model trained on a set of Escherichia coli proteins with experimental structures and solubility values. We thus obtained the SOLart protein solubility predictor, whose most informative features turned out to be folding free energy differences computed from our solubility-dependent statistical potentials. SOLart performs well, with a Pearson correlation coefficient between experimental and predicted solubility values of almost 0.7, both in cross-validation on the training dataset and on an independent set of Saccharomyces cerevisiae proteins. On test sets of modeled structures, only a limited drop in performance is observed. SOLart can thus be used with both high-resolution and low-resolution structures, and clearly outperforms state-of-the-art solubility predictors. It is available through a user-friendly webserver that is easy for non-expert scientists to use. Availability and implementation: The SOLart webserver is freely available at http://babylone.ulb.ac.be/SOLART/. Supplementary information: Supplementary data are available at Bioinformatics online.
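
The Pearson correlation coefficient used above to compare experimental and predicted solubility can be computed as in the sketch below. The solubility values are invented toy numbers and the function name is hypothetical; nothing here reproduces SOLart's data or code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Toy solubility values in percent (made up for illustration only).
experimental = [10.0, 40.0, 70.0, 90.0]
predicted = [20.0, 35.0, 60.0, 95.0]
r = pearson(experimental, predicted)
```

Because the coefficient is invariant to shifting and positive rescaling of either variable, it measures how well predictions track the experimental trend rather than how close the absolute values are.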


Symmetry, 2021, Vol 14 (1), pp. 33
Author(s): Yin-Xin Bao, Quan Shi, Qin-Qin Shen, Yang Cao

Accurate traffic status prediction is important for improving the security and reliability of intelligent transportation systems. However, urban traffic status prediction is a very challenging task because of the tight symmetry among the Human–Vehicle–Environment (HVE). The recently proposed spatial–temporal 3D convolutional neural network (ST-3DNet) effectively extracts both spatial and temporal characteristics in HVE, but ignores the essential long-term temporal characteristics and the symmetry of historical data. Therefore, this paper proposes a novel spatial–temporal 3D residual correlation network (ST-3DRCN) for urban traffic status prediction. ST-3DRCN first uses the Pearson correlation coefficient to extract highly correlated traffic data. Then, a dynamic spatial feature extraction component is constructed using 3D convolution combined with residual units to capture dynamic spatial features. After that, based on the idea of long short-term memory (LSTM), a novel architectural unit is proposed to extract dynamic temporal features. Finally, the spatial and temporal features are fused to obtain the final prediction results. Experiments were performed using two datasets, from Chengdu, China (TaxiCD) and California, USA (PEMS-BAY). Taking the root mean square error (RMSE) as the evaluation index, the prediction accuracy of ST-3DRCN on the TaxiCD dataset is 21.4%, 21.3%, 11.7%, 10.8%, 4.7%, 3.6% and 2.3% higher than that of LSTM, convolutional neural network (CNN), 3D-CNN, spatial–temporal residual network (ST-ResNet), spatial–temporal graph convolutional network (ST-GCN), dynamic global-local spatial–temporal network (DGLSTNet), and ST-3DNet, respectively.
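
The RMSE index mentioned above, and one plausible way of turning two RMSE values into a percentage improvement, can be sketched as follows. The abstract does not state how its percentages were derived, so reading them as a relative RMSE reduction is an assumption; the numbers and function names below are hypothetical.

```python
from math import sqrt


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, model_rmse):
    """Percentage reduction in RMSE relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Toy traffic-flow values (made up for illustration only).
observed = [100.0, 120.0, 90.0, 110.0]
baseline_pred = [90.0, 130.0, 100.0, 100.0]
model_pred = [96.0, 124.0, 94.0, 106.0]

baseline_error = rmse(baseline_pred, observed)
model_error = rmse(model_pred, observed)
gain = relative_improvement(baseline_error, model_error)
```

Since a lower RMSE is better, a positive value from `relative_improvement` means the model's error is smaller than the baseline's.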


2021, Vol 2021, pp. 1-13
Author(s): Jinling Zhang, Jun Yang, Min Zhao

To study the influence of different magnetic resonance imaging (MRI) sequences on the segmentation of hepatocellular carcinoma (HCC) lesions, the U-Net was improved. In addition, a deep fusion network (DFN), a data enhancement strategy, and a random data (RD) strategy were introduced, and a multisequence MRI image segmentation algorithm based on the DFN was proposed. Segmentation experiments on single-sequence and multisequence MRI images were designed, and the single-sequence segmentation results were compared with those of the fully convolutional network (FCN) algorithm. An RD experiment and a single-input experiment were also designed. The sensitivity (0.595 ± 0.145) and Dice similarity coefficient (DSC; 0.587 ± 0.113) obtained by the improved U-Net were significantly higher than the sensitivity (0.405 ± 0.098) and DSC (0.468 ± 0.115, P < 0.05) obtained by the U-Net. The sensitivity of the multisequence MRI image segmentation algorithm based on the DFN (0.779 ± 0.015) was significantly higher than that of the FCN algorithm (0.604 ± 0.056, P < 0.05). The multisequence MRI image segmentation algorithm based on the DFN had higher indicators for liver cancer lesions than the improved U-Net. Adding RD increased the DSC of the single-sequence network enhanced by the hepatocyte-specific magnetic resonance contrast agent (Gd-EOB-DTPA) by 1% and increased the DSC of the multisequence MRI image segmentation algorithm based on the DFN by 7.6%. In short, the improved U-Net can significantly improve the recognition rate of small lesions in liver cancer patients. Adding the RD strategy improved the segmentation indicators of the DFN for liver cancer lesions, and the DFN can fuse image features of multiple sequences, thereby improving the accuracy of lesion segmentation.
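
The two segmentation indicators reported above, sensitivity and DSC (taken here to be the Dice similarity coefficient), can be computed from a predicted and a reference mask as in this sketch. The flattened toy masks and the function names are hypothetical and carry no relation to the study's data.

```python
def confusion_counts(pred, true):
    """Count true positives, false positives and false negatives for binary masks."""
    tp = fp = fn = 0
    for p, t in zip(pred, true):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t and not p:
            fn += 1
    return tp, fp, fn


def dice(pred, true):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, true)
    denom = 2 * tp + fp + fn
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2 * tp / denom if denom else 1.0


def sensitivity(pred, true):
    """Sensitivity (recall): TP / (TP + FN)."""
    tp, _, fn = confusion_counts(pred, true)
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Flattened toy masks (hypothetical): 1 marks a lesion pixel.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]

dsc = dice(pred_mask, true_mask)
sens = sensitivity(pred_mask, true_mask)
```

Sensitivity ignores false positives, so a model that over-segments can score well on it; DSC penalizes both missed and spurious lesion pixels, which is why the two are usually reported together.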


2020, Vol 2020 (9), pp. 168-1-168-7
Author(s): Roger Gomez Nieto, Hernan Dario Benitez Restrepo, Roger Figueroa Quintero, Alan Bovik

Video Quality Assessment (VQA) is an essential topic in several industries, ranging from video streaming to camera manufacturing. In this paper, we present a novel method for No-Reference VQA. The framework is fast and does not require the extraction of hand-crafted features. We extract convolutional features from the C3D 3-D convolutional neural network and feed them to a trained Support Vector Regressor to obtain a VQA score. We apply transformations to different color spaces to generate more discriminative deep features. We extract features from several layers, with and without overlap, to find the best configuration for improving the VQA score. We tested the proposed approach on the LIVE-Qualcomm dataset. We extensively evaluated the perceptual quality prediction model, obtaining a final Pearson correlation of 0.7749 ± 0.0884 with Mean Opinion Scores, and showed that it achieves good video quality prediction, outperforming other leading state-of-the-art VQA models.


Circulation, 2020, Vol 142 (Suppl_3)
Author(s): Priyanka Agarwal, Anna Shcherbina, Sharlene Day, Sara Saberi, Matthew E Mealiffe, ...

Introduction: Overall activity characteristics for patients with hypertrophic cardiomyopathy (HCM) have not been quantified previously. The relationship between physical activity quantified by accelerometry and biomarkers, exercise capacity, and quality of life in patients with HCM is also unknown. Methods: MAVERICK-HCM was a double-blind, placebo-controlled, 16-week study of mavacamten in 59 patients with symptomatic non-obstructive HCM. Patients were asked to wear ActiGraph GT9X Link wrist-worn monitors for ≥11 days between screening and day 1, and between weeks 12 and 16. Features derived from raw accelerometry data included average daily accelerometer units (ADAU) and step count. Univariate Pearson correlation coefficients were calculated between accelerometry data and clinical parameters among all patients. A multi-task convolutional neural network (CNN) was trained on raw accelerometry datapoints to jointly predict clinical markers of HCM severity. Test and training sets were derived by randomly segmenting each patient’s triaxial accelerometry data into non-overlapping minute intervals. Results: Fifty patients wore the accelerometer for ≥1 compliant day. Mean wear time was 12 days during screening and 10 days during treatment. Activity measures are summarized and average step count was 3,076 steps at baseline (Table). Activity features correlated with peak oxygen uptake (pVO2), log NT-proBNP, and KCCQ score (Table). CNN predictions of clinical measures from activity data found Spearman R correlations of 0.82 for pVO2, 0.92 for log NT-proBNP, 0.82 for KCCQ, and 0.79 for E/e’. Conclusions: HCM patients in the MAVERICK study averaged only 3,000 steps/day. Markers of physical activity drawn from accelerometry are associated with standard clinical markers of HCM severity. Deep learning models can be constructed to predict markers of HCM severity from patients’ raw accelerometry data.
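
The windowing step described in the Methods above, cutting each patient's triaxial recording into non-overlapping intervals and assigning them at random to training and test sets, might look roughly like the sketch below. The sampling rate, the test fraction, the seed and the function names are all assumptions; the abstract does not give them.

```python
import random


def fixed_windows(samples, samples_per_window):
    """Cut a recording into non-overlapping windows, dropping any trailing partial window."""
    count = len(samples) // samples_per_window
    return [
        samples[i * samples_per_window:(i + 1) * samples_per_window]
        for i in range(count)
    ]


def split_windows(windows, test_fraction=0.2, seed=0):
    """Randomly assign whole windows to a training set and a test set."""
    rng = random.Random(seed)  # seeded for reproducibility
    indices = list(range(len(windows)))
    rng.shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))
    test_ids = set(indices[:n_test])
    train = [w for i, w in enumerate(windows) if i not in test_ids]
    test = [w for i, w in enumerate(windows) if i in test_ids]
    return train, test


# Toy triaxial recording: 27 (x, y, z) samples, 5 samples per "minute" (hypothetical rate).
recording = [(float(i), float(i) + 0.1, float(i) + 0.2) for i in range(27)]
windows = fixed_windows(recording, samples_per_window=5)
train_set, test_set = split_windows(windows, test_fraction=0.2, seed=0)
```

Note that under this scheme windows from the same patient can land in both sets, so the resulting evaluation is within-patient rather than across patients.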


2021, Vol 2021, pp. 1-11
Author(s): Jing Zhang, Aiping Liu, Deng Liang, Xun Chen, Min Gao

Discovering shared, invariant feature representations across subjects in electrocardiogram (ECG) classification tasks is crucial for improving the generalization of models to unseen patients. Although deep neural networks have recently been used to extract generalizable ECG features, they usually rely on labeled samples from a large number of subjects to guarantee generalization. Extracting representations that are invariant to intersubject variability from a small number of subjects remains a challenge because of individual physical differences. To address this problem, we propose an adversarial deep neural network framework for interpatient heartbeat classification that integrates adversarial learning into a convolutional neural network to learn subject-invariant, class-discriminative features. The proposed method was evaluated on the MIT-BIH arrhythmia database, a publicly available ECG dataset collected from 47 patients. Compared with state-of-the-art methods, the proposed method achieves the highest performance for detecting supraventricular ectopic beats (SVEBs), which are very challenging to identify, and comparable performance for detecting ventricular ectopic beats (VEBs). The sensitivities for SVEBs and VEBs are 78.8% and 92.5%, respectively, and the precisions are 90.8% and 94.3%, respectively. With high performance in the detection of pathological classes (i.e., SVEBs and VEBs), this work provides a promising method for ECG classification tasks when the number of patients is limited.
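
Per-class sensitivity and precision, the two figures reported above for SVEBs and VEBs, are computed in a one-versus-rest fashion from beat labels. The sketch below shows this on invented labels; the single-letter class codes ("N", "S", "V") and the function name are hypothetical shorthand, not taken from the paper.

```python
def sensitivity_and_precision(true_labels, pred_labels, positive):
    """One-vs-rest sensitivity and precision for the class `positive`."""
    tp = fn = fp = 0
    for truth, guess in zip(true_labels, pred_labels):
        if truth == positive and guess == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # a beat of this class that was missed
        elif guess == positive:
            fp += 1  # a beat of another class flagged as this class
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    prec = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, prec


# Toy beat annotations (made up): N = normal, S = SVEB, V = VEB.
true_beats = ["N", "S", "S", "V", "N", "V", "S", "N"]
pred_beats = ["N", "S", "N", "V", "N", "V", "S", "S"]

sveb_sens, sveb_prec = sensitivity_and_precision(true_beats, pred_beats, "S")
veb_sens, veb_prec = sensitivity_and_precision(true_beats, pred_beats, "V")
```

Reporting both values per class matters for imbalanced beat data, where a rare class can reach high sensitivity at the cost of many false alarms, or the reverse.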


2020, Vol 36 (20), pp. 5107-5108
Author(s): Francesco Ambrosetti, Tobias Hegelund Olsen, Pier Paolo Olimpieri, Brian Jiménez-García, Edoardo Milanetti, ...

Motivation: Monoclonal antibodies are essential tools in the contemporary therapeutic armory. Understanding how these recognize their antigen is a fundamental step in their rational design and engineering. The rising amount of publicly available data is catalyzing the development of computational approaches able to offer valuable, faster and cheaper alternatives to classical experimental methodologies used for the study of antibody–antigen complexes. Results: Here, we present proABC-2, an update of the original random-forest antibody paratope predictor, based on a convolutional neural network algorithm. We also demonstrate how the predictions can be fruitfully used to drive the docking in HADDOCK. Availability and implementation: The proABC-2 server is freely available at: https://wenmr.science.uu.nl/proabc2/. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s): Wanwen Zeng, Yong Wang, Rui Jiang

Motivation: Interactions among cis-regulatory elements such as enhancers and promoters are the main driving forces shaping context-specific chromatin structure and gene expression. Although computational methods exist for predicting gene expression from genomic and epigenomic information, most of them neglect long-range enhancer–promoter interactions because of the difficulty of precisely linking regulatory enhancers to target genes. Recently, HiChIP, a novel high-throughput experimental approach, has generated comprehensive data on high-resolution interactions between promoters and distal enhancers. Moreover, many studies suggest that deep learning achieves state-of-the-art performance in epigenomic signal prediction, thus promoting the understanding of regulatory elements. In consideration of these two factors, we integrate proximal promoter sequences and HiChIP distal enhancer–promoter interactions to accurately predict gene expression. Results: We propose DeepExpression, a densely connected convolutional neural network, to predict gene expression using both promoter sequences and enhancer–promoter interactions. We demonstrate that our model consistently outperforms baseline methods, not only in the classification of binary gene expression status but also in the regression of continuous gene expression levels, in both cross-validation experiments and cross-cell line predictions. We show that the sequential promoter information is more informative than the experimental enhancer information; meanwhile, the enhancer–promoter interactions within ±100 kbp around the TSS of a gene are most beneficial. Finally, we visualize motifs in both promoter and enhancer regions and show that the identified sequence signatures match known motifs. We expect to see a wide spectrum of applications using HiChIP data in deciphering the mechanism of gene regulation.
Availability and implementation: DeepExpression is freely available at https://github.com/wanwenzeng/DeepExpression. Supplementary information: Supplementary data are available at Bioinformatics online.
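
The ±100 kbp window around the transcription start site (TSS) mentioned in the Results can be illustrated with a simple distance filter. This is only a sketch of the windowing idea, not DeepExpression's preprocessing: it assumes all positions lie on the same chromosome, that each enhancer is summarized by a single coordinate, and the positions and function name are hypothetical.

```python
def enhancers_near_tss(tss, enhancer_positions, window_bp=100_000):
    """Keep enhancer coordinates within +/- window_bp of the TSS (same chromosome assumed)."""
    return [pos for pos in enhancer_positions if abs(pos - tss) <= window_bp]


# Toy coordinates in base pairs (made up for illustration only).
tss_position = 1_000_000
enhancers = [850_000, 920_000, 1_000_500, 1_100_000, 1_250_000]

nearby = enhancers_near_tss(tss_position, enhancers)
```

In this toy case the enhancers at 850,000 and 1,250,000 fall outside the window and are dropped, while the one exactly 100 kbp downstream is kept because the bound is inclusive.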

