In-Pero: Exploiting Deep Learning Embeddings of Protein Sequences to Predict the Localisation of Peroxisomal Proteins

2021 ◽  
Vol 22 (12) ◽  
pp. 6409
Author(s):  
Marco Anteghini ◽  
Vitor Martins dos Santos ◽  
Edoardo Saccenti

Peroxisomes are ubiquitous membrane-bound organelles, and aberrant localisation of peroxisomal proteins contributes to the pathogenesis of several disorders. Many computational methods focus on assigning protein sequences to subcellular compartments, but there are no specific tools tailored for the sub-localisation (matrix vs. membrane) of peroxisome proteins. We present here In-Pero, a new method for predicting protein sub-peroxisomal cellular localisation. In-Pero combines standard machine learning approaches with recently proposed multi-dimensional deep-learning representations of the protein amino-acid sequence. It showed a classification accuracy above 0.9 in predicting peroxisomal matrix and membrane proteins. The method is trained and tested using a double cross-validation approach on a curated data set comprising 160 peroxisomal proteins with experimental evidence for sub-peroxisomal localisation. We further show that the proposed approach can be easily adapted (In-Mito) to the prediction of mitochondrial protein localisation, obtaining performance superior to existing tools for certain classes of proteins (matrix and inner-membrane).
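The double cross-validation scheme described above can be sketched with nested cross-validation in scikit-learn. This is a minimal illustration, not the authors' code: the random embeddings, the SVM classifier and the hyperparameter grid are stand-ins for the deep-learning sequence representations and models actually used in In-Pero.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for deep-learning sequence embeddings of 160 proteins,
# with binary labels: matrix (0) vs. membrane (1).
X = rng.normal(size=(160, 64))
y = rng.integers(0, 2, size=160)

# Inner loop tunes hyperparameters; the outer loop, which never sees
# the inner selection, estimates generalisation performance.
inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10]},
                     cv=KFold(n_splits=5, shuffle=True, random_state=1))
scores = cross_val_score(inner, X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=2))
print(scores.mean())
```

Because the outer folds are held out from hyperparameter selection, the averaged score is an unbiased estimate of how the tuned model generalises.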

2021 ◽  
Author(s):  
Marco Anteghini ◽  
Vitor AP Martins dos Santos ◽  
Edoardo Saccenti

Abstract Peroxisomes are ubiquitous membrane-bound organelles, and aberrant localisation of peroxisomal proteins contributes to the pathogenesis of several disorders. Many computational methods focus on assigning protein sequences to subcellular compartments, but there are no specific tools tailored for the sub-localisation (matrix vs. membrane) of peroxisome proteins. We present here In-Pero, a new method for predicting protein sub-peroxisomal cellular localisation. In-Pero combines standard machine learning approaches with recently proposed multi-dimensional deep-learning representations of the protein amino-acid sequence. It showed a classification accuracy above 0.9 in predicting peroxisomal matrix and membrane proteins. The method is trained and tested using a double cross-validation approach on a curated data set comprising 160 peroxisomal proteins with experimental evidence for sub-peroxisomal localisation. We further show that the proposed approach can be easily adapted (In-Mito) to the prediction of mitochondrial protein localisation, obtaining performance superior to existing tools for certain classes of proteins (matrix and inner-membrane). All data sets and code are available at https://github.com/MarcoAnteghini and at www.systemsbiology.nl.


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Various whiteboard image degradations severely reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem through different image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors have proposed a deep-learning-based solution. They have contributed a new whiteboard image data set and adopted two deep convolutional neural network architectures for whiteboard image quality enhancement applications. Their evaluations of the trained models demonstrated superior performance over the conventional methods.


2021 ◽  
Author(s):  
Antoine Bouziat ◽  
Sylvain Desroziers ◽  
Abdoulaye Koroko ◽  
Antoine Lechevallier ◽  
Mathieu Feraille ◽  
...  

<p>Automation and robotics raise growing interest in the mining industry. If not already a reality, it is no longer science fiction to imagine autonomous robots routinely participating in the exploration and extraction of mineral raw materials in the near future. Among the various scientific and technical issues to be addressed towards this objective, this study focuses on automating the real-time characterisation of rock images captured in the field, either to discriminate rock types and mineral species or to detect small elements such as mineral grains or metallic nuggets. To do so, we investigate the potential of methods from the Computer Vision community, a subfield of Artificial Intelligence dedicated to image processing. In particular, we aim to assess the potential of Deep Learning approaches and convolutional neural networks (CNNs) for the analysis of field sample pictures, highlighting key challenges before industrial use in operational contexts.</p><p>In a first initiative, we appraise Deep Learning methods to classify photographs of macroscopic rock samples into 12 lithological families. Using reference CNN architectures and a collection of 2700 images, we achieve a prediction accuracy above 90% for new pictures of good photographic quality. Nonetheless, we then seek to improve the robustness of the method for on-the-fly field photographs. To do so, we train an additional CNN to automatically separate the rock sample from the background with a detection algorithm. We also introduce a more sophisticated classification method combining a set of CNNs with a decision tree. The CNNs are specifically trained to recognise petrological features such as textures, structures or mineral species, while the decision tree mimics the naturalist methodology for lithological identification.</p><p>In a second initiative, we evaluate Deep Learning techniques to spot and delineate specific elements in finer-scale images.
We use a data set of carbonate thin sections with various species of microfossils. The data come from a sedimentology study, but analogies can be drawn with igneous geology use cases. We train four state-of-the-art Deep Learning methods for object detection with a limited data set of 15 annotated images. The results on 130 other thin-section images are then qualitatively assessed by expert geologists, and precisions and inference times are quantitatively measured. The four models show good capabilities in detecting and categorising the microfossils. However, differences in accuracy and performance are underlined, leading to recommendations for comparable projects in a mining context.</p><p>Altogether, this study illustrates the power of Computer Vision and Deep Learning approaches to automate rock image analysis. However, to make the most of these technologies in mining activities, stimulating research opportunities lie in adapting the algorithms to the geological use cases, embedding as much geological knowledge as possible in the statistical models, and reducing the amount of training data that must be manually interpreted beforehand.</p>


Author(s):  
Shaila S. G. ◽  
Sunanda Rajkumari ◽  
Vadivel Ayyasamy

Deep learning is playing a vital role, with great success, in various applications such as digital image processing, human-computer interaction, computer vision, natural language processing, robotics, and biological applications. Unlike traditional machine learning approaches, deep learning learns effectively from data and makes better use of the data set for feature extraction. Because of this learning ability, deep learning has become very popular in present-day research.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Dominik Müller ◽  
Frank Kramer

Abstract Background The increased availability and usage of modern medical imaging induced a strong need for automatic medical image segmentation. Still, current image segmentation platforms do not provide the required functionalities for the straightforward setup of medical image segmentation pipelines. Already implemented pipelines are commonly standalone software, optimized on a specific public data set. Therefore, this paper introduces the open-source Python library MIScnn. Implementation The aim of MIScnn is to provide an intuitive API allowing fast building of medical image segmentation pipelines, including data I/O, preprocessing, data augmentation, patch-wise analysis, metrics, a library of state-of-the-art deep learning models, and model utilization such as training, prediction, and fully automatic evaluation (e.g. cross-validation). Similarly, high configurability and multiple open interfaces allow full pipeline customization. Results Running a cross-validation with MIScnn on the Kidney Tumor Segmentation Challenge 2019 data set (multi-class semantic segmentation with 300 CT scans) resulted in a powerful predictor based on the standard 3D U-Net model. Conclusions With this experiment, we could show that the MIScnn framework enables researchers to rapidly set up a complete medical image segmentation pipeline using just a few lines of code. The source code for MIScnn is available in the Git repository: https://github.com/frankkramer-lab/MIScnn.


Mathematics ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1606
Author(s):  
Daniela Onita ◽  
Adriana Birlutiu ◽  
Liviu P. Dinu

Images and text are types of content that are used together to convey a message. Mapping images to text can provide very useful information and can be included in many applications, from the medical domain and applications for blind people to social networking. In this paper, we investigate an approach for mapping images to text using a Kernel Ridge Regression model. We considered two types of features: simple RGB pixel-value features and image features extracted with deep-learning approaches. We investigated several neural network architectures for image feature extraction: VGG16, Inception V3, ResNet50 and Xception. The experimental evaluation was performed on three data sets from different domains. The texts associated with the images are objective descriptions for two of the three data sets and subjective descriptions for the other. The experimental results show that the more complex deep-learning approaches used for feature extraction perform better than simple RGB pixel-value approaches. Moreover, the ResNet50 architecture performs best in comparison to the other three deep network architectures considered for extracting image features; the model error obtained with ResNet50 is approximately 0.30 lower than with the other architectures. We extracted natural language descriptors of images and compared the original and generated descriptive words. Furthermore, we investigated whether there is a difference in performance between the types of text associated with the images: subjective or objective. The proposed model generated descriptions more similar to the original ones for the data set containing objective descriptions, whose vocabulary is simpler, bigger and clearer.
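The mapping step can be sketched with scikit-learn's KernelRidge, which supports multi-output targets. This is a minimal illustration under stated assumptions: the random features and 50-dimensional text-descriptor vectors are stand-ins for the actual deep image features and text representations used in the paper.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
# Stand-ins: 200 images with 128-dim feature vectors (in the paper,
# ResNet50-style features), mapped to 50-dim text-descriptor vectors.
X = rng.normal(size=(200, 128))
W = rng.normal(size=(128, 50))
Y = X @ W + 0.1 * rng.normal(size=(200, 50))  # noisy linear ground truth

# Kernel Ridge Regression fits all 50 output dimensions jointly.
model = KernelRidge(kernel="rbf", alpha=1.0, gamma=1e-3)
model.fit(X[:150], Y[:150])
pred = model.predict(X[150:])
print(pred.shape)
```

Each test image's predicted vector can then be decoded back to descriptive words, e.g. by nearest-neighbour lookup in the text-embedding space.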


2017 ◽  
Author(s):  
Ariel Rokem ◽  
Yue Wu ◽  
Aaron Lee

Abstract Deep learning algorithms have tremendous potential utility in the classification of biomedical images. For example, images acquired with retinal optical coherence tomography (OCT) can be used to accurately classify patients with age-related macular degeneration (AMD), and distinguish them from healthy control patients. However, previous research has suggested that large amounts of data are required in order to train deep learning algorithms, because of the large number of parameters that need to be fit. Here, we show that a moderate amount of data (from approximately 1,800 patients) may be enough to reach close-to-maximal performance in the classification of AMD patients from OCT images. These results suggest that deep learning algorithms can be trained on moderate amounts of data, provided that images are relatively homogeneous and the effective number of parameters is sufficiently small. Furthermore, we demonstrate that in this application, cross-validation with a separate test set that is not used in any part of the training does not differ substantially from cross-validation with a validation data set used to determine the optimal stopping point for training.
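The evaluation scheme in the last sentence, a test set untouched by training alongside a validation split used only to decide when to stop, can be sketched as follows. The MLP and synthetic data are illustrative stand-ins, not the authors' OCT model.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Synthetic two-class data standing in for OCT-derived features.
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# The test set plays no role in fitting or stopping; the validation
# split used for early stopping is carved from the training portion only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(16,), early_stopping=True,
                    validation_fraction=0.1, max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(acc)
```

If the test-set accuracy tracks the validation accuracy closely, as the abstract reports for OCT, the cheaper validation-only protocol is a reasonable proxy.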


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Xiu Kan ◽  
Dan Yang ◽  
Le Cao ◽  
Huisheng Shu ◽  
Yuanyuan Li ◽  
...  

In human-computer interaction, it is crucial to interpret the motion information of surface electromyography (sEMG) signals correctly and quickly. Deep learning can recognize a variety of sEMG actions through end-to-end training. However, most existing deep learning approaches have complex structures and numerous parameters, which make network optimization difficult. In this paper, a novel PSO-based optimized lightweight convolution neural network (PLCNN) is designed to improve accuracy and optimize the model, with applications in sEMG signal movement recognition. To reduce the structural complexity of the deep neural network, the designed convolution neural network model is mainly composed of three convolution layers and two fully connected layers. Meanwhile, particle swarm optimization (PSO) is used to optimize hyperparameters and improve the auto-adaptive ability of the designed sEMG pattern recognition model. To further demonstrate its potential applications, three experiments are designed according to the progressive process of body movements on the standard Ninapro data set. Experimental results demonstrate that the proposed PLCNN recognition method is superior to four other popular classification methods.
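The PSO step can be illustrated with a minimal particle swarm optimiser over a box-bounded hyperparameter space. The quadratic `val_loss` surrogate is an assumption for the sake of a runnable example: in the paper's setting, the objective would train the convolutional network with the candidate hyperparameters and return its validation error.

```python
import numpy as np

def pso(objective, bounds, n_particles=20, n_iters=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimiser minimising `objective` over a box."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity blends inertia, pull toward personal best, pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Surrogate "validation loss" over two hyperparameters (log learning rate,
# dropout rate); minimum at (-3.0, 0.5).
def val_loss(h):
    return (h[0] + 3.0) ** 2 + (h[1] - 0.5) ** 2

best, loss = pso(val_loss, (np.array([-6.0, 0.0]), np.array([0.0, 1.0])))
print(best, loss)
```

The swarm converges to the surrogate's minimum; with a real training objective, each `objective` call would be one full train-and-validate cycle, which is why lightweight networks make PSO tuning tractable.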


2021 ◽  
Vol 23 (06) ◽  
pp. 10-22
Author(s):  
Ms. Anshika Shukla ◽  
Mr. Sanjeev Kumar Shukla

In recent years, various methods for source code classification using deep learning approaches have been proposed. The classification accuracy of such methods is greatly influenced by the training data set; a model with higher accuracy can therefore be created by improving how the training data set is constructed. In this study, we propose a dynamic training data set improvement method for source code classification using deep learning. In the proposed method, we first train and verify the source code classification model using the training data set. Next, we reconstruct the training data set based on the verification results. We create a high-accuracy model by repeating this training and reconstruction, progressively improving the training data set. In the evaluation experiment, a source code classification model was trained using the proposed method, and its classification accuracy was compared with three baseline methods. The model trained using the proposed method achieved the highest classification accuracy. We also confirmed that the proposed method improves the classification accuracy of the model from 0.64 to 0.96.
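The train-verify-reconstruct loop can be sketched generically. The uncertainty-based selection rule, the logistic-regression classifier and the synthetic data below are illustrative assumptions, not the paper's actual reconstruction criterion or model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-ins for vectorised source-code samples with class labels.
X = rng.normal(size=(600, 20))
y = (X[:, :2].sum(axis=1) > 0).astype(int)
X_pool, X_val, y_pool, y_val = train_test_split(X, y, test_size=0.3,
                                                random_state=0)

# Start from a subset, then repeatedly train, verify, and pull in the
# pool samples the current model is least confident about -- a simple
# stand-in for the paper's data-set reconstruction step.
chosen = set(rng.choice(len(X_pool), 50, replace=False).tolist())
for _ in range(5):
    clf = LogisticRegression().fit(X_pool[list(chosen)], y_pool[list(chosen)])
    conf = clf.predict_proba(X_pool).max(axis=1)
    uncertain = np.argsort(conf)          # least confident first
    new = [i for i in uncertain if i not in chosen][:25]
    chosen.update(new)

acc = clf.score(X_val, y_val)
print(acc)
```

Each round grows the training set with the samples the previous model handled worst, so later models see a data set reshaped by the verification results.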


2021 ◽  
Author(s):  
Mohamed A. Naser ◽  
Kareem A. Wahid ◽  
Abdallah Sherif Radwan Mohamed ◽  
Moamen Abobakr Abdelaal ◽  
Renjie He ◽  
...  

Determining progression-free survival (PFS) for head and neck squamous cell carcinoma (HNSCC) patients is a challenging but pertinent task that could help stratify patients for improved overall outcomes. PET/CT images provide a rich source of anatomical and metabolic data for potential clinical biomarkers that would inform treatment decisions and could help improve PFS. In this study, we participate in the 2021 HECKTOR Challenge to predict PFS in a large dataset of HNSCC PET/CT images using deep learning approaches. We develop a series of deep learning models based on the DenseNet architecture using a negative log-likelihood loss function that utilizes PET/CT images and clinical data as separate input channels to predict PFS in days. Internal model validation based on 10-fold cross-validation using the training data (N=224) yielded C-index values of up to 0.622 without and 0.842 with censoring status considered in the C-index computation. We then implemented model ensembling approaches based on the training data cross-validation folds to predict the PFS of the test set patients (N=101). External validation on the test set for the best ensembling method yielded a C-index value of 0.694. Our results are a promising example of how deep learning approaches can effectively utilize imaging and clinical data for medical outcome prediction in HNSCC, but further work is needed to optimize these processes.
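The evaluation metric here, a concordance index (C-index) that accounts for censoring, can be computed directly. The sketch below follows Harrell's definition; the four-patient toy data are purely illustrative.

```python
import numpy as np

def concordance_index(times, preds, events):
    """Harrell's C-index: the fraction of comparable patient pairs in which
    the model assigns higher risk to the patient who progresses earlier.
    A pair is comparable only when the shorter observed time ends in an
    event (a censored follow-up tells us nothing about ordering)."""
    n_conc, n_comp = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if times[i] < times[j] and events[i] == 1:  # i progressed first
                n_comp += 1
                if preds[i] > preds[j]:      # higher predicted risk: concordant
                    n_conc += 1.0
                elif preds[i] == preds[j]:   # ties count half
                    n_conc += 0.5
    return n_conc / n_comp

times  = np.array([5.0, 10.0, 12.0, 20.0])   # observed PFS in arbitrary units
events = np.array([1, 1, 0, 1])              # 0 = censored
risk   = np.array([0.9, 0.6, 0.7, 0.1])      # higher = predicted earlier progression
ci = concordance_index(times, risk, events)
print(ci)
```

Ignoring `events` (treating every pair as comparable) versus respecting it changes which pairs enter the denominator, which is why the abstract reports the C-index both without and with censoring status considered.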

