SnapKin: a snapshot deep learning ensemble for kinase-substrate prediction from phosphoproteomics data

2021
Author(s):
Michael Lin
Di Xiao
Thomas A. Geddes
James G. Burchfield
Benjamin L. Parker
...  

Mass spectrometry (MS)-based phosphoproteomics enables the quantification of proteome-wide phosphorylation in cells and tissues. A major challenge in MS-based phosphoproteomics lies in identifying the substrates of kinases, as currently only a small fraction of identified substrates can be confidently linked with a known kinase. By leveraging large-scale phosphoproteomics data, machine learning has become an increasingly popular approach for computationally predicting substrates of kinases. However, the small number of high-quality experimentally validated kinase substrates (true positives) and the high data noise in many phosphoproteomics datasets together impact the performance of existing approaches. Here, we aim to develop advanced kinase-substrate prediction methods to address these challenges. Using a collection of seven large phosphoproteomics datasets, including six published datasets and a new muscle differentiation dataset, and both traditional and deep learning models, we first demonstrate that a ‘pseudo-positive’ learning strategy for alleviating the small sample size is effective at improving model predictive performance. We next show that a data re-sampling-based ensemble learning strategy is useful for improving model stability while further enhancing prediction. Lastly, we introduce an ensemble deep learning model (‘SnapKin’) incorporating the above two learning strategies into a ‘snapshot’ ensemble learning algorithm. We demonstrate that SnapKin achieves the best overall performance in kinase-substrate prediction. Together, we propose SnapKin as a promising approach for predicting substrates of kinases from large-scale phosphoproteomics data. SnapKin is freely available at https://github.com/PYangLab/SnapKin.
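To make the ‘snapshot’ ensembling idea concrete, here is a minimal, hedged sketch in PyTorch: one network is trained with a cyclically annealed learning rate, a copy of the weights is saved at the end of each cycle, and predictions are averaged over the saved snapshots. The optimiser, loss and schedule below are generic illustrations and are not SnapKin's actual configuration.

```python
# Illustrative snapshot ensembling: train one network with a cyclic
# (cosine-annealed) learning rate and save a "snapshot" of the weights
# at the end of each cycle; predictions are averaged over snapshots.
# This is a generic sketch, not the SnapKin implementation.
import copy
import torch
import torch.nn as nn

def train_snapshot_ensemble(model, loader, n_cycles=5, epochs_per_cycle=20, lr_max=0.01):
    snapshots = []
    criterion = nn.BCEWithLogitsLoss()
    for cycle in range(n_cycles):
        # restart the optimizer so each cycle begins at the maximum learning rate
        optimizer = torch.optim.SGD(model.parameters(), lr=lr_max, momentum=0.9)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=epochs_per_cycle)      # LR anneals to ~0 within the cycle
        for epoch in range(epochs_per_cycle):
            for x, y in loader:
                optimizer.zero_grad()
                loss = criterion(model(x).squeeze(-1), y.float())
                loss.backward()
                optimizer.step()
            scheduler.step()
        # weights at the end of a cycle sit near a local minimum: keep a copy
        snapshots.append(copy.deepcopy(model.state_dict()))
    return snapshots

def predict_ensemble(model, snapshots, x):
    # average the sigmoid outputs of all saved snapshots
    preds = []
    with torch.no_grad():
        for state in snapshots:
            model.load_state_dict(state)
            preds.append(torch.sigmoid(model(x).squeeze(-1)))
    return torch.stack(preds).mean(dim=0)
```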

2021
Author(s):
Jacob Johnson
Kaneel Senevirathne
Lawrence Ngo

In this work, we report the results of a deep-learning-based liver lesion detection algorithm. While several liver lesion segmentation and classification algorithms have been developed, none of the previous work has focused on detecting suspicious liver lesions. Furthermore, the generalizability of these algorithms remains limited by small sample sizes and sample homogeneity. Here, we developed and validated a highly generalizable deep-learning algorithm for the detection of suspicious liver lesions. The algorithm was trained and tested on a diverse dataset containing CT exams from over 2,000 hospital sites in the United States. Our final model achieved an AUROC of 0.84 with a specificity of 0.99 while maintaining a sensitivity of 0.33.
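For readers unfamiliar with how such an operating point is read off a ROC curve, a short, hedged sketch follows: it computes AUROC with scikit-learn and finds the sensitivity available at a threshold giving 0.99 specificity. The labels and scores are synthetic placeholders, not the paper's data.

```python
# Illustrative only: compute AUROC and the sensitivity available at a
# fixed 0.99-specificity operating point from scores and labels.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)              # hypothetical labels
y_score = y_true * 0.3 + rng.random(1000) * 0.7     # hypothetical scores

auroc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# specificity = 1 - FPR; keep only thresholds whose FPR stays <= 0.01
mask = fpr <= 0.01
sensitivity_at_spec99 = tpr[mask].max() if mask.any() else 0.0
print(f"AUROC={auroc:.2f}, sensitivity at 0.99 specificity={sensitivity_at_spec99:.2f}")
```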


Author(s):  
Sujata Dash
Bichitrananda Patra

High dimensionality and small sample size are inherent problems of gene expression datasets, which make the analysis process more complex. The present study develops a novel learning scheme that encapsulates a hybrid evolutionary fuzzy-rough feature selection model with an adaptive neural net ensemble. The fuzzy-rough method deals with the uncertainty and imprecision of real-valued gene expression data, while the evolutionary search concept optimizes the subset selection process. The efficiency of the hybrid-FRGSNN model is evaluated by the proposed neural net ensemble learning algorithm. To further demonstrate the learning capability of the ensemble algorithm, the performance of the component classifiers paired with FR, GSNN and FRGSNN is compared with that of the proposed hybrid-FRGSNN-based ensemble model. In addition, the efficiency of the neural net ensemble is compared with two classical and one advanced ensemble learning algorithm.
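The general pattern (aggressive feature selection followed by an ensemble of small neural networks) can be sketched as below. A simple univariate filter and a bagged MLP stand in for the authors' fuzzy-rough/evolutionary selection and adaptive ensemble, so this illustrates only the shape of the pipeline, not the FRGSNN method itself.

```python
# Hedged sketch of "select features, then ensemble small neural nets" on a
# high-dimensional, small-sample dataset; not the authors' FRGSNN pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# hypothetical data: 100 samples x 2000 "genes", only a few informative
X, y = make_classification(n_samples=100, n_features=2000,
                           n_informative=20, random_state=0)

pipeline = make_pipeline(
    SelectKBest(f_classif, k=50),                 # shrink the feature space first
    BaggingClassifier(                            # ensemble of small neural nets
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
        n_estimators=10, random_state=0),
)
print(cross_val_score(pipeline, X, y, cv=5).mean())
```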


Author(s):  
Ryan Schmid
Jacob Johnson
Jennifer Ngo
Christine Lamoureux
Brian Baker
...  

Several algorithms have been developed for the detection of pulmonary embolism (PE), though generalizability and bias remain potential weaknesses due to small sample size and sample homogeneity. We developed and validated a highly generalizable deep-learning algorithm, Emboleye, for the detection of PE by using a large and diverse dataset, which included 30,574 computed tomography (CT) exams sourced from over 2,000 hospital sites. On angiography exams, Emboleye demonstrates an AUROC of 0.79 with a specificity of 0.99 while maintaining a sensitivity of 0.37 and a PPV of 0.77. On non-angiography CT exams, Emboleye demonstrates an AUROC of 0.77 with a specificity of 0.99 while maintaining a sensitivity of 0.18 and a PPV of 0.35.
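How sensitivity, specificity and PPV fit together depends on disease prevalence; the small sketch below shows the arithmetic via Bayes' rule. The prevalence value used is an assumed, hypothetical figure and is not reported in the abstract.

```python
# Illustrative arithmetic: positive predictive value from sensitivity,
# specificity and an assumed prevalence (Bayes' rule). The prevalence
# here is hypothetical and not taken from the paper.
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# e.g., sensitivity 0.37 and specificity 0.99 at an assumed 8% prevalence
print(f"PPV = {ppv(0.37, 0.99, 0.08):.2f}")
```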


2008
Vol 31 (4)
pp. 19
Author(s):
I Pasic
A Shlien
A Novokmet
C Zhang
U Tabori
...  

Introduction: Osteosarcoma (OS), a common Li-Fraumeni syndrome (LFS)-associated neoplasm, is a common bone malignancy of children and adolescents. Sporadic OS is also characterized by a young age of onset and high genomic instability, suggesting a genetic contribution to disease. This study examined the contribution of novel DNA structural variation elements, copy number variations (CNVs), to OS susceptibility. Given our finding of excessive constitutional DNA CNVs in LFS patients, which often coincide with cancer-related genes, we hypothesized that constitutional CNVs may also provide clues about the aetiology of LFS-related sporadic neoplasms like OS. Methods: CNVs in blood DNA of 26 patients with sporadic OS were compared to those of 263 normal control samples from the International HapMap project, as well as 62 local controls. Analysis was performed on DNA hybridized to the Affymetrix genome-wide human SNP array 6.0 using Partek Genomics Suite. Results: There was no detectable difference in the average number of CNVs, CNV length, or total structural variation (the product of average CNV number and length) between individuals with OS and controls. While these data are preliminary (small sample size), they argue against the presence of constitutional genomic instability in individuals with sporadic OS. Conclusion: We found that the majority of tumours from patients with sporadic OS show copy number (CN) loss at chr3q13.31, raising the possibility that chr3q13.31 may represent a “driver” region in OS aetiology. In at least one OS tumour displaying CN loss at chr3q13.31, we demonstrate decreased expression of a known tumour suppressor gene located at chr3q13.31. We are investigating the role of chr3q13.31 in the development of OS.
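A hedged sketch of this kind of burden comparison, run on synthetic per-individual summaries rather than the study's Affymetrix/Partek data: it contrasts CNV count, mean CNV length and total structural variation (count × mean length) between cases and controls with a nonparametric test.

```python
# Hedged sketch of the comparison described (not the authors' Partek workflow):
# compare per-individual CNV count, mean CNV length and total structural
# variation between cases and controls using a Mann-Whitney U test.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# hypothetical per-individual summaries: columns = [cnv_count, mean_length_kb]
cases = np.column_stack([rng.poisson(30, 26), rng.gamma(2.0, 50.0, 26)])
controls = np.column_stack([rng.poisson(30, 325), rng.gamma(2.0, 50.0, 325)])

for name, idx in [("CNV count", 0), ("mean CNV length (kb)", 1)]:
    stat, p = mannwhitneyu(cases[:, idx], controls[:, idx])
    print(f"{name}: p = {p:.3f}")

# total structural variation = CNV count * mean CNV length
tsv_cases = cases[:, 0] * cases[:, 1]
tsv_controls = controls[:, 0] * controls[:, 1]
print("total structural variation: p =", mannwhitneyu(tsv_cases, tsv_controls).pvalue)
```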


2021
Vol 2021
pp. 1-11
Author(s):
Bangtong Huang
Hongquan Zhang
Zihong Chen
Lingling Li
Lihua Shi

Deep learning algorithms face limitations in virtual reality applications due to the costs of memory, computation, and real-time processing. Models with strong performance often carry enormous numbers of parameters and large-scale structures, making them hard to port onto embedded devices. In this paper, inspired by GhostNet, we propose an efficient structure, ShuffleGhost, that exploits the redundancy in feature maps to reduce the cost of computation while tackling some drawbacks of GhostNet. GhostNet suffers from the high computational cost of the convolutions in the Ghost module and shortcut, and its restrictions on downsampling make it difficult to apply the Ghost module and Ghost bottleneck to other backbones. This paper proposes three new kinds of ShuffleGhost structure to tackle these drawbacks. The ShuffleGhost module and ShuffleGhost bottlenecks use the shuffle layer and group convolution from ShuffleNet; they are designed to redistribute the feature maps concatenated from the Ghost feature maps and the primary feature maps, bridging the gap between them while extracting features. An SENet layer is then adopted to reduce the computational cost of group convolution and to evaluate the importance of the concatenated Ghost and primary feature maps, assigning them appropriate weights. Experiments show that ShuffleGhostV3 has fewer trainable parameters and lower FLOPs while maintaining accuracy, and that with proper design it can be more efficient on both the GPU and CPU side.
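To illustrate the building blocks being combined (cheap "ghost" feature maps plus a ShuffleNet-style channel shuffle over grouped convolutions), here is a minimal PyTorch sketch; the layer sizes and structure are illustrative assumptions and this is not the authors' ShuffleGhost implementation.

```python
# Minimal sketch of the general idea (Ghost-style cheap feature maps plus a
# ShuffleNet-style channel shuffle), assuming PyTorch; NOT the authors'
# ShuffleGhost implementation, just an illustration of the pattern.
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    n, c, h, w = x.shape
    # reshape to (n, groups, c//groups, h, w), swap group/channel axes, flatten back
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

class GhostShuffleBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, groups: int = 2):
        super().__init__()
        primary_ch = out_ch // 2
        # "primary" features from a grouped 1x1 convolution (cheaper than dense 1x1)
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, 1, groups=groups, bias=False),
            nn.BatchNorm2d(primary_ch), nn.ReLU(inplace=True))
        # "ghost" features generated cheaply from the primary ones (depthwise 3x3)
        self.ghost = nn.Sequential(
            nn.Conv2d(primary_ch, out_ch - primary_ch, 3, padding=1,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch), nn.ReLU(inplace=True))
        self.groups = groups

    def forward(self, x):
        p = self.primary(x)
        g = self.ghost(p)
        # concatenate primary and ghost maps, then shuffle so later grouped
        # convolutions see information from both branches
        return channel_shuffle(torch.cat([p, g], dim=1), self.groups)

block = GhostShuffleBlock(16, 32)
print(block(torch.randn(1, 16, 8, 8)).shape)   # torch.Size([1, 32, 8, 8])
```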


Doctor Ru
2021
Vol 20 (9)
pp. 43-47
Author(s):
E.Yu. Mozheyko
O.V. Petryaeva
...  

Objective of the Review: To collect, analyse and evaluate previous studies on the use of biofeedback in neurological patients. Key Points. Despite wide practical application and many available publications, the level of evidence for this method is low because of small sample sizes and the difficulty of describing the biofeedback mechanism. A review of the various types of biocontrol, their mechanisms and development shows that drug-free therapy using only the patient’s own resources (organic, psychological, emotional and volitional) can activate mechanisms of neuroplasticity that remain poorly studied. Still, this does not prevent biocontrol from being used to treat patients and/or to prevent various diseases in the healthy population. Conclusion. Biofeedback therapy has proven to be a safe, relatively efficient and easy-to-use method. However, organising a large-scale, double-blind randomized trial remains one of the main directions for future work. Keywords: biofeedback, biocontrol, neurofeedback, biofeedback therapy.


2017
Author(s):
Stefano Beretta
Mauro Castelli
Ivo Gonçalves
Ivan Merelli
Daniele Ramazzotti

Gene and protein networks are very important for modelling complex large-scale systems in molecular biology. Inferring or reverse-engineering such networks can be defined as the process of identifying gene/protein interactions from experimental data through computational analysis. However, this task is typically complicated by the enormously large number of unknowns relative to a rather small sample size. Furthermore, when the goal is to study causal relationships within the network, tools capable of overcoming the limitations of correlation networks are required. In this work, we make use of Bayesian Graphical Models to attack this problem and, specifically, we perform a comparative study of different state-of-the-art heuristics, analyzing their performance in inferring the structure of the Bayesian Network from breast cancer data.
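For orientation, the sketch below shows one classic score-based heuristic of the kind such comparisons typically include: greedy hill-climbing over edge additions with a linear-Gaussian BIC score. It is a generic illustration on synthetic data, not the authors' pipeline or their chosen heuristics.

```python
# Hedged sketch of score-based Bayesian network structure learning: greedy
# hill-climbing over edge additions with a linear-Gaussian BIC score.
import numpy as np
import networkx as nx

def node_bic(data, child, parents):
    """BIC contribution of one node under a linear-Gaussian model."""
    n = data.shape[0]
    y = data[:, child]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = max(np.mean((y - X @ beta) ** 2), 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return loglik - 0.5 * np.log(n) * (len(parents) + 2)   # penalise parameters

def hill_climb(data, max_iters=200):
    n_vars = data.shape[1]
    dag = nx.DiGraph()
    dag.add_nodes_from(range(n_vars))
    for _ in range(max_iters):
        best_gain, best_edge = 0.0, None
        for u in range(n_vars):
            for v in range(n_vars):
                if u == v or dag.has_edge(u, v):
                    continue
                dag.add_edge(u, v)                      # try the edge...
                acyclic = nx.is_directed_acyclic_graph(dag)
                dag.remove_edge(u, v)                   # ...then undo it
                if not acyclic:
                    continue
                old = node_bic(data, v, list(dag.predecessors(v)))
                new = node_bic(data, v, list(dag.predecessors(v)) + [u])
                if new - old > best_gain:
                    best_gain, best_edge = new - old, (u, v)
        if best_edge is None:        # no single edge addition improves the score
            break
        dag.add_edge(*best_edge)
    return dag

# toy run on hypothetical expression data (e.g., 50 samples x 8 genes)
data = np.random.default_rng(0).normal(size=(50, 8))
print(sorted(hill_climb(data).edges()))
```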


2021
Author(s):
Benjamin Schwarz
Korbinian Sager
Philippe Jousset
Gilda Currenti
Charlotte Krawczyk
...  

Fiber-optic cables form an integral part of modern telecommunications infrastructure and are ubiquitous in particular in regions where dedicated seismic instrumentation is traditionally sparse or lacking entirely. Fiber-optic seismology promises to enable affordable and time-extended observations of earth and environmental processes at an unprecedented temporal and spatial resolution. The method’s unique potential for combined large-N and large-T observations implies intriguing opportunities but also significant challenges in terms of data storage, data handling and computation.

Our goal is to enable real-time data enhancement, rapid signal detection and wave field characterization without the need for time-demanding user interaction. We therefore combine coherent wave field analysis, an optics-inspired processing framework developed in controlled-source seismology, with state-of-the-art deep convolutional neural network (CNN) architectures commonly used in visual perception. While conventional deep learning strategies have to rely on manually labeled or purely synthetic training datasets, coherent wave field analysis labels field data based on physical principles and enables large-scale and purely data-driven training of the CNN models. The sheer amount of data already recorded in various settings makes artificial data generation by numerical modeling superfluous, a task that is often constrained by incomplete knowledge of the embedding medium and an insufficient description of processes at or close to the surface, which are challenging to capture in integrated simulations.

Applications to extensive field datasets acquired with dark-fiber infrastructure at a geothermal field in SW Iceland and in a town at the flank of Mt Etna, Italy, reveal that the suggested framework generalizes well across different observational scales and environments, and sheds new light on the origin of a broad range of physically distinct wave fields that can be sensed with fiber-optic technology. Owing to the real-time applicability with affordable computing infrastructure, our analysis lends itself well to rapid on-the-fly data enhancement, wave field separation and compression strategies, thereby promising to have a positive impact on the full processing chain currently in use in fiber-optic seismology.
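The overall pattern described here (labels derived from a physics-based coherence measure, used to train a CNN) can be sketched as follows; the semblance function, threshold, window sizes and network are illustrative assumptions and do not reproduce the authors' coherent-wave-field-analysis workflow.

```python
# Hedged sketch of physics-based labelling feeding a CNN, assuming PyTorch:
# windows of a fiber-optic (DAS) record are labelled by a simple coherence
# (semblance) measure and a small CNN is trained on those labels.
import torch
import torch.nn as nn

def semblance(window: torch.Tensor) -> torch.Tensor:
    # window: (channels, time); coherent arrivals push the value towards 1
    num = window.sum(dim=0).pow(2).sum()
    den = window.pow(2).sum() * window.shape[0]
    return num / (den + 1e-12)

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    def forward(self, x):
        return self.net(x)

# hypothetical DAS windows: (batch, channels, time)
windows = torch.randn(64, 32, 256)
# physics-based labels: threshold on semblance (threshold is illustrative)
labels = torch.stack([(semblance(w) > 0.05).float() for w in windows])

model = SmallCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(5):                                   # tiny training loop
    logits = model(windows.unsqueeze(1)).squeeze(-1)
    loss = loss_fn(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```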


Proceedings
2019
Vol 42 (1)
pp. 8
Author(s):
José Pablo Quesada Molina
Luca Rosafalco
Stefano Mariani

Deep learning strategies recently emerged as powerful tools for the characterization of heterogeneous materials. In this work, we discuss an approach for the characterization of the mechanical response of the polysilicon films that typically constitute the movable structures of micro-electro-mechanical systems (MEMS). A dataset of microstructures is digitally generated and a neural network is trained to reproduce the scatter in the values of the overall stiffness (in terms of Young’s modulus) of the grain aggregate. Since results are framed within a stochastic procedure, the aim of the learning strategy is not to accurately reproduce the microstructure-informed response of the polysilicon film, but instead to provide a fast tool to be used at the device level for Monte Carlo analyses of the relevant performance indices. The accuracy of the proposed approach is assessed for very small samples of the polycrystalline aggregate to check whether size effects are correctly captured.
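A hedged sketch of the surrogate-plus-Monte-Carlo pattern, assuming PyTorch: a small CNN maps a digitally generated microstructure image to an effective Young's modulus, and the trained surrogate is then queried inside a Monte Carlo loop. Network size, image resolution and the training targets below are illustrative placeholders, not the authors' model or data.

```python
# Hedged sketch of a CNN surrogate for effective stiffness, sampled in a
# Monte Carlo loop; shapes, units and training data are illustrative.
import torch
import torch.nn as nn

class StiffnessSurrogate(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(32, 1)     # predicted Young's modulus (GPa)
    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# hypothetical training set: 64x64 grain maps with reference moduli near 160 GPa
images = torch.rand(256, 1, 64, 64)
moduli = 160.0 + 5.0 * torch.randn(256, 1)

model = StiffnessSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(20):                                  # tiny full-batch training loop
    loss = nn.functional.mse_loss(model(images), moduli)
    opt.zero_grad(); loss.backward(); opt.step()

# Monte Carlo at device level: draw random microstructures, query the surrogate
with torch.no_grad():
    samples = model(torch.rand(1000, 1, 64, 64))
print(f"E ~ {samples.mean().item():.1f} +/- {samples.std().item():.1f} GPa (surrogate MC)")
```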

