scholarly journals Modern machine learning outperforms GLMs at predicting spikes

2017 ◽  
Author(s):  
Ari S. Benjamin ◽  
Hugo L. Fernandes ◽  
Tucker Tomlinson ◽  
Pavan Ramkumar ◽  
Chris VerSteeg ◽  
...  

AbstractNeuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.

2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed to identify new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cellcycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand- binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine- learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and Autodock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tomoaki Mameno ◽  
Masahiro Wada ◽  
Kazunori Nozaki ◽  
Toshihito Takahashi ◽  
Yoshitaka Tsujioka ◽  
...  

AbstractThe purpose of this retrospective cohort study was to create a model for predicting the onset of peri-implantitis by using machine learning methods and to clarify interactions between risk indicators. This study evaluated 254 implants, 127 with and 127 without peri-implantitis, from among 1408 implants with at least 4 years in function. Demographic data and parameters known to be risk factors for the development of peri-implantitis were analyzed with three models: logistic regression, support vector machines, and random forests (RF). As the results, RF had the highest performance in predicting the onset of peri-implantitis (AUC: 0.71, accuracy: 0.70, precision: 0.72, recall: 0.66, and f1-score: 0.69). The factor that had the most influence on prediction was implant functional time, followed by oral hygiene. In addition, PCR of more than 50% to 60%, smoking more than 3 cigarettes/day, KMW less than 2 mm, and the presence of less than two occlusal supports tended to be associated with an increased risk of peri-implantitis. Moreover, these risk indicators were not independent and had complex effects on each other. The results of this study suggest that peri-implantitis onset was predicted in 70% of cases, by RF which allows consideration of nonlinear relational data with complex interactions.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2012 ◽  
Author(s):  
Hashem Koohy

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.


Metagenomics ◽  
2017 ◽  
Vol 1 (1) ◽  
Author(s):  
Hayssam Soueidan ◽  
Macha Nikolski

AbstractOwing to the complexity and variability of metagenomic studies, modern machine learning approaches have seen increased usage to answer a variety of question encompassing the full range of metagenomic NGS data analysis.We review here the contribution of machine learning techniques for the field of metagenomics, by presenting known successful approaches in a unified framework. This review focuses on five important metagenomic problems:OTU-clustering, binning, taxonomic proffiing and assignment, comparative metagenomics and gene prediction. For each of these problems, we identify the most prominent methods, summarize the machine learning approaches used and put them into perspective of similar methods.We conclude our review looking further ahead at the challenge posed by the analysis of interactions within microbial communities and different environments, in a field one could call “integrative metagenomics”.


Author(s):  
Juan Gómez-Sanchis ◽  
Emilio Soria-Olivas ◽  
Marcelino Martinez-Sober ◽  
Jose Blasco ◽  
Juan Guerrero ◽  
...  

This work presents a new approach for one of the main problems in the analysis of atmospheric phenomena, the prediction of atmospheric concentrations of different elements. The proposed methodology is more efficient than other classical approaches and is used in this work to predict tropospheric ozone concentration. The relevance of this problem stems from the fact that excessive ozone concentrations may cause several problems related to public health. Previous research by the authors of this work has shown that the classical approach to this problem (linear models) does not achieve satisfactory results in tropospheric ozone concentration prediction. The authors’ approach is based on Machine Learning (ML) techniques, which include algorithms related to neural networks, fuzzy systems and advanced statistical techniques for data processing. In this work, the authors focus on one of the main ML techniques, namely, neural networks. These models demonstrate their suitability for this problem both in terms of prediction accuracy and information extraction.


Author(s):  
Vidyullatha P ◽  
D. Rajeswara Rao

<p>Curve fitting is one of the procedures in data analysis and is helpful for prediction analysis showing graphically how the data points are related to one another whether it is in linear or non-linear model. Usually, the curve fit will find the concentrates along the curve or it will just use to smooth the data and upgrade the presence of the plot. Curve fitting checks the relationship between independent variables and dependent variables with the objective of characterizing a good fit model. Curve fitting finds mathematical equation that best fits given information. In this paper, 150 unorganized data points of environmental variables are used to develop Linear and non-linear data modelling which are evaluated by utilizing 3 dimensional ‘Sftool’ and ‘Labfit’ machine learning techniques. In Linear model, the best estimations of the coefficients are realized by the estimation of R- square turns in to one and in Non-Linear models with least Chi-square are the criteria. </p>


2020 ◽  
Vol 11 (2) ◽  
pp. 71-85
Author(s):  
Nhat-Vinh Lu ◽  
Trong-Nhan Vuong ◽  
Duy-Tai Dinh

Sensory evaluation plays an important role in the food and consumer goods industry. In recent years, the application of machine learning techniques to support food sensory evaluation has become popular. Many different machine learning methods have been applied and produced positive results in this field. In this article, the authors propose a new method to support sensory evaluation on multiple criteria based on the use of a correlation-based feature selection technique, combined with machine learning methods such as linear regression, multilayer perceptron, support vector machine, and random forest. Experimental results are based on considering the correlation between physicochemical components and sensory factors on the Saigon beer dataset.


Author(s):  
G.M. Shafiullah ◽  
Adam Thompson ◽  
Peter J. Wolfs ◽  
A.B.M. Shawkat Ali

Emerging wireless sensor networking (WSN) and modern machine learning techniques have encouraged interest in the development of vehicle health monitoring (VHM) systems that ensure secure and reliable operation of the rail vehicle. The performance of rail vehicles running on railway tracks is governed by the dynamic behaviours of railway bogies especially in the cases of lateral instability and track irregularities. In order to ensure safety and reliability of railway in this chapter, a forecasting model has been developed to investigate vertical acceleration behaviour of railway wagons attached to a moving locomotive using modern machine learning techniques. Initially, an energy-efficient data acquisition model has been proposed for WSN applications using popular learning algorithms. Later, a prediction model has been developed to investigate both front and rear body vertical acceleration behaviour. Different types of models can be built using a uniform platform to evaluate their performances and estimate different attributes’ correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), root relative squared error (RRSE), relative absolute error (RAE) and computation complexity for each of the algorithm. Finally, spectral analysis of front and rear body vertical condition is produced from the predicted data using Fast Fourier Transform (FFT) and used to generate precautionary signals and system status which can be used by the locomotive driver for deciding upon necessary actions.


2021 ◽  
Author(s):  
Clémence Corminboeuf ◽  
Alberto Fabrizio ◽  
Ksenia Briling

Abstract Physics-based molecular representations are the cornerstone of all modern machine learning techniques applied to solve chemical problems. While several approaches exist to design ever more accurate fingerprints, the majority resolves in including more physics to construct larger and more complex representations. Here, we present an alternative approach to harness the complexity of chemical information into a lightweight numerical form, naturally invariant under real-space transformations, and seamlessly including the information about the charge state of a molecule. The Spectrum of Approximated Hamiltonian Matrices (SPAHM) leverages the information contained in widely-used and readily-evaluated ``guess'' Hamiltonians to form a robust fingerprint for quantum machine learning. Relying on the origin of the SPAHM fingerprints and a hierarchy of approximate Hamiltonians, we analyze the relative merits of adding physics into molecular representations and find that alternative strategies, focusing more on the machine learning task, represent a clear route towards direct improvements.


Sign in / Sign up

Export Citation Format

Share Document