Sparse Logistic Regression: Comparison of Regularization and Bayesian Implementations

Algorithms, 2020, Vol. 13(6), p. 137
Author(s): Mattia Zanon, Giuliano Zambonin, Gian Antonio Susto, Seán McLoone

In knowledge-based systems, besides obtaining good output prediction accuracy, it is crucial to understand the subset of input variables that have the most influence on the output, with the goal of gaining deeper insight into the underlying process. These requirements call for logistic model estimation techniques that provide a sparse solution, i.e., where coefficients associated with non-important variables are set to zero. In this work we compare the performance of two methods: the first is based on the well-known Least Absolute Shrinkage and Selection Operator (LASSO), which involves regularization with an ℓ1 norm; the second is the Relevance Vector Machine (RVM), which is based on a Bayesian implementation of the linear logistic model. The two methods are extensively compared in this paper on real and simulated datasets. Results show that, in general, the two approaches are comparable in terms of prediction performance. RVM outperforms the LASSO both in terms of structure recovery (estimation of the correct non-zero model coefficients) and prediction accuracy as the dimensionality of the data increases. However, LASSO shows performance comparable to RVM when the dimensionality of the data is much higher than the number of samples, i.e., p ≫ n.
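A minimal sketch of the LASSO side of this comparison, using scikit-learn on synthetic data with a known sparse support (an illustrative assumption, not the paper's datasets); the RVM counterpart is not part of scikit-learn and would require a third-party implementation:

```python
# Sketch: L1-penalized (LASSO-style) sparse logistic regression on synthetic
# data whose true support is known, illustrating the "structure recovery" and
# prediction-accuracy comparison described above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 50 features, only the first 5 informative (true non-zero coefficients)
X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso_logit.fit(X_tr, y_tr)

selected = np.flatnonzero(lasso_logit.coef_[0])
print("non-zero coefficients:", selected)                 # structure recovery
print("test accuracy:", lasso_logit.score(X_te, y_te))    # prediction accuracy
```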

Author(s): J. Jagan, Prabhakar Gundlapalli, Pijush Samui

The determination of liquefaction susceptibility of soil is a task of paramount importance in geotechnical earthquake engineering. This chapter adopts the Support Vector Machine (SVM), Relevance Vector Machine (RVM) and Least Square Support Vector Machine (LSSVM) for the determination of liquefaction susceptibility based on Cone Penetration Test (CPT) data from the Chi-Chi earthquake. The input variables of SVM, RVM and LSSVM are Cone Resistance (qc) and Peak Ground Acceleration (amax/g). SVM, RVM and LSSVM have been used as classification tools. The developed SVM, RVM and LSSVM models give equations for the determination of liquefaction susceptibility of soil. A comparison between the developed models has been carried out. The results show that SVM, RVM and LSSVM are robust models for the determination of liquefaction susceptibility of soil.
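A hedged illustration of the SVM branch of this setup with the two stated inputs, qc and amax/g; the CPT records below are invented placeholders rather than the Chi-Chi data, and RVM/LSSVM would need separate libraries:

```python
# Illustrative sketch only: an SVM classifier with the two inputs named in the
# chapter, cone resistance qc and peak ground acceleration amax/g.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# columns: [qc (MPa), amax/g]; labels: 1 = liquefied, 0 = not liquefied (made up)
X = np.array([[2.1, 0.40], [3.5, 0.35], [9.8, 0.20], [12.4, 0.25],
              [1.8, 0.45], [7.6, 0.18], [4.2, 0.38], [11.0, 0.30]])
y = np.array([1, 1, 0, 0, 1, 0, 1, 0])

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
model.fit(X, y)
print(model.predict([[5.0, 0.33]]))  # susceptibility of a new CPT record
```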


Author(s): Pascalis Kadaro Matthew, Abubakar Yahaya

A few decades ago, penalized regression techniques for linear regression were developed specifically to reduce the flaws inherent in the prediction accuracy of the classical ordinary least squares (OLS) regression technique. In this paper, we used a diabetes data set obtained from previous literature to compare three of these well-known techniques, namely: the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net and Correlation Adjusted Elastic Net (CAEN). After thorough analysis, it was observed that CAEN generated a less complex model.
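A rough sketch of how such a LASSO vs. Elastic Net comparison might be run in scikit-learn, using its bundled diabetes data as a stand-in for the paper's dataset (CAEN has no scikit-learn implementation and is omitted):

```python
# Sketch: compare model complexity (number of retained predictors) of LASSO
# and Elastic Net fitted with cross-validated regularization strengths.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV, ElasticNetCV

X, y = load_diabetes(return_X_y=True)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
enet = ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8], cv=5, random_state=0).fit(X, y)

print("LASSO non-zero coefficients:      ", np.count_nonzero(lasso.coef_))
print("Elastic Net non-zero coefficients:", np.count_nonzero(enet.coef_))
```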


Water, 2019, Vol. 11(6), p. 1226
Author(s): Mohammed Falah Allawi, Faridah Binti Othman, Haitham Abdulmohsin Afan, Ali Najah Ahmed, Md. Shabbir Hossain, ...

The current study explored the impact of climatic conditions on predicting evaporation from a reservoir. Several models have been developed for evaporation prediction under different scenarios, with artificial intelligence (AI) methods being the most popular. However, the existing models rely on several climatic parameters as inputs to achieve an acceptable accuracy level, some of which have been unavailable in certain case studies. In addition, the existing AI-based models for evaporation prediction have paid little attention to the influence of the time-increment rate on the prediction accuracy level. This study investigated the ability of the radial basis function neural network (RBF-NN) and support vector regression (SVR) methods to develop an evaporation rate prediction model for a tropical area at the Layang Reservoir, Johor River, Malaysia. Two scenarios for the input architecture were explored in order to examine the effectiveness of different input variable patterns on the model prediction accuracy. In the first scenario, the input architecture considered only the historical evaporation rate time series, while the mean temperature and evaporation rate were used as input variables in the second scenario. For both scenarios, three time-increment series (daily, weekly, and monthly) were considered.
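A hedged sketch of the first input scenario, predicting evaporation from lagged values of its own series with SVR; the series, lag count, and hyperparameters are illustrative assumptions rather than the paper's configuration:

```python
# Sketch: build lagged inputs from an evaporation time series and fit an SVR
# model for one-step-ahead prediction.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def make_lagged(series, n_lags=3):
    """Build (X, y) pairs where X holds the previous n_lags observations."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

evaporation = np.sin(np.linspace(0, 20, 300)) + 3.0   # placeholder daily series
X, y = make_lagged(evaporation, n_lags=3)

svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
svr.fit(X[:-30], y[:-30])                # train on all but the last 30 steps
print(svr.predict(X[-30:])[:5])          # one-step-ahead predictions
```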


2020
Author(s): Xuan Liu, Sara J.C. Gosline, Lance T. Pflieger, Pierre Wallet, Archana Iyer, ...

Single-cell RNA sequencing (scRNA-Seq) is an emerging strategy for characterizing the immune cell population in diverse environments including blood, tumor or healthy tissues. While this has traditionally been done with flow or mass cytometry targeting protein expression, scRNA-Seq has several established and potential advantages: it can profile immune cells and non-immune cells (e.g. cancer cells) in the same sample, identify cell types that lack precise markers for flow cytometry, and identify a potentially larger number of immune cell types and activation states than is achievable in a single flow assay. However, scRNA-Seq is currently limited by the need to identify the type of each immune cell from its transcriptional profile, which is not only time-consuming but also requires significant knowledge of immunology. While recently developed algorithms accurately annotate coarse cell types (e.g. T cells vs. macrophages), making fine distinctions has proven to be a difficult challenge. To address this, we developed a machine learning classifier called ImmClassifier that leverages a hierarchical ontology of cell types. We demonstrate that ImmClassifier outperforms other tools (+20% recall, +14% precision) in distinguishing fine-grained cell types (e.g. CD8+ effector memory T cells) with comparable performance on coarse ones. Thus, ImmClassifier can be used to explore more deeply the heterogeneity of the immune system in scRNA-Seq experiments.
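A toy sketch, not ImmClassifier's actual algorithm, of the general idea of reconciling fine-grained predictions with a cell-type hierarchy: fine-label probabilities are rolled up to their coarse parents, so a confident coarse call survives an ambiguous fine call (the hierarchy and scores below are invented):

```python
# Roll per-cell probabilities over fine-grained labels up to coarse parents
# defined by a small cell-type ontology.
import numpy as np

hierarchy = {                       # fine label -> coarse parent (illustrative)
    "CD8+ effector memory T": "T cell",
    "CD8+ naive T": "T cell",
    "CD4+ naive T": "T cell",
    "classical monocyte": "macrophage/monocyte",
}
fine_labels = list(hierarchy)
fine_probs = np.array([0.35, 0.30, 0.20, 0.15])   # one cell's fine-grained scores

coarse_scores = {}
for label, p in zip(fine_labels, fine_probs):
    parent = hierarchy[label]
    coarse_scores[parent] = coarse_scores.get(parent, 0.0) + p

print(max(coarse_scores, key=coarse_scores.get), coarse_scores)
```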


Energies, 2019, Vol. 12(15), p. 2860
Author(s): Jee-Heon Kim, Nam-Chul Seong, Wonchang Choi

This study was conducted to develop an energy consumption model of a chiller in a heating, ventilation, and air conditioning system using a machine learning algorithm based on artificial neural networks. The proposed chiller energy consumption model was evaluated for accuracy with respect to the number of input variables, the amount (proportion) of training data, and the number of neurons. A standardized reference building was also modeled to generate operational data for the chiller system during extended cooling periods (warm weather months). The prediction accuracy of the chiller's energy consumption was improved by increasing the number of input variables and adjusting the proportion of training data. By contrast, the effect of the number of neurons on the prediction accuracy was insignificant. The developed chiller model was able to predict energy consumption with 99.07% accuracy based on eight input variables, 60% training data, and 12 neurons.
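A hedged sketch of the reported configuration (eight input variables, 60% of the data for training, 12 neurons) using scikit-learn's MLPRegressor on synthetic operational data in place of the reference-building simulation:

```python
# Sketch: small feed-forward neural network with one 12-neuron hidden layer,
# trained on a 60/40 split of synthetic stand-in data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                              # 8 input variables
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=2000)    # stand-in energy use

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.6, random_state=0)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(12,), max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
print("R^2 on held-out data:", model.score(X_te, y_te))
```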


2019, Vol. 44(4), pp. 473-503
Author(s): Peida Zhan, Hong Jiao, Kaiwen Man, Lijun Wang

In this article, we systematically introduce the Just Another Gibbs Sampler (JAGS) software program to fit common Bayesian cognitive diagnosis models (CDMs), including the deterministic inputs, noisy "and" gate (DINA) model; the deterministic inputs, noisy "or" gate (DINO) model; the linear logistic model; the reduced reparameterized unified model; and the log-linear CDM (LCDM). Further, we introduce the unstructured latent structural model and the higher-order latent structural model. We also show how to extend these models to consider polytomous attributes, the testlet effect, and longitudinal diagnosis. Finally, we present an empirical example as a tutorial to illustrate how to use JAGS code in R.


2016, Vol. 25(10), pp. 1825-1833
Author(s): Ji-Yong An, Fan-Rong Meng, Zhu-Hong You, Xing Chen, Gui-Ying Yan, ...

2019, Vol. 06(03), pp. 363-376
Author(s): Gharbi Alshammari, Stelios Kapetanakis, Abdullah Alshammari, Nikolaos Polatidis, Miltos Petridis

Recommender systems help users find relevant items efficiently based on their interests and historical interactions with other users. They are beneficial to businesses by promoting the sale of products and to users by reducing the search burden. Recommender systems can be developed using different approaches, including collaborative filtering (CF), demographic filtering (DF), content-based filtering (CBF) and knowledge-based filtering (KBF). However, large amounts of data can produce recommendations that are limited in accuracy because of diversity and sparsity issues. In this paper, we propose a novel hybrid method that combines user–user CF with the attributes of DF to identify the nearest users, and we compare four classifiers against each other. This method was developed through an investigation of ways to reduce the errors in rating predictions based on users' past interactions, which leads to improved prediction accuracy in all four classification algorithms. We applied a feature combination method that improves prediction accuracy. To test our approach, we ran an offline evaluation using the 1M MovieLens dataset, well-known evaluation metrics and comparisons between methods, with the results validating our proposed method.
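A simplified sketch of the hybrid idea, blending rating-based user–user similarity with a demographic similarity to select the nearest users; the ratings, demographic codes, and blending weight are illustrative assumptions, not the paper's method:

```python
# Blend CF (rating) similarity with demographic similarity, pick the nearest
# neighbours, and predict a missing rating from their ratings.
import numpy as np

ratings = np.array([               # rows = users, cols = items, 0 = unrated
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
demographics = np.array([          # e.g. [age bucket, gender code, occupation code]
    [2, 0, 1],
    [2, 0, 1],
    [5, 1, 3],
    [4, 1, 3],
], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def hybrid_similarity(u, v, alpha=0.7):
    """Weighted blend of rating-based CF similarity and demographic similarity."""
    return (alpha * cosine(ratings[u], ratings[v])
            + (1 - alpha) * cosine(demographics[u], demographics[v]))

target, item = 1, 1
neighbours = sorted((v for v in range(len(ratings)) if v != target),
                    key=lambda v: hybrid_similarity(target, v), reverse=True)[:2]
rated = [v for v in neighbours if ratings[v, item] > 0]
print(np.mean([ratings[v, item] for v in rated]) if rated else "no neighbour rated item")
```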

