Cluster’s Number Free Bayes Prediction of General Framework on Mixture of Regression Models

2021 ◽  
Vol 20 (3) ◽  
pp. 425-449
Author(s):  
Haruka Murayama ◽  
Shota Saito ◽  
Yuji Iikubo ◽  
Yuta Nakahara ◽  
Toshiyasu Matsushima

Abstract Prediction based on a single linear regression model is one of the most common approaches in various fields of study. It enables us to understand the structure of data, but may not be suitable for expressing data whose structure is complex. To express the structure of data more accurately, we assume that the data can be divided into clusters, each with its own linear regression model. In this case, each explanatory variable can play its own role: explaining the assignment to the clusters, explaining the regression to the target variable, or both. Introducing a probabilistic structure to the data generating process, we derive the optimal prediction under the Bayes criterion and an algorithm that computes it sub-optimally with a variational inference method. One of the advantages of our algorithm is that it automatically weights the probability of each possible number of clusters in the course of the algorithm, and therefore resolves the concern about selecting the number of clusters. Experiments are performed on both synthetic and real data to demonstrate these advantages and to explore the behavior and tendencies of the algorithm.
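The abstract's key point, weighting over the number of clusters, can be imitated with a much cruder stand-in for the paper's variational Bayes procedure: fit a mixture of linear regressions by EM for each candidate cluster count and weight the candidates by BIC. Everything below (the synthetic data, the EM details, the BIC weighting) is an invented illustration, not the authors' algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two clusters gated on x, each with its own regression line.
n = 200
x = rng.uniform(-3, 3, n)
y = np.where(x > 0, 1.0 + 2.0 * x, 5.0 - 1.5 * x) + rng.normal(0, 0.3, n)

def em_mixture_of_regressions(x, y, k, n_iter=100):
    """EM for y ~ sum_j pi_j N(a_j + b_j x, s_j^2); returns the BIC."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    # deterministic init: hard-assign points to x-quantile bins
    labels = np.digitize(x, np.quantile(x, np.linspace(0, 1, k + 1)[1:-1]))
    resp = np.eye(k)[labels]
    for _ in range(n_iter):
        logp = np.empty((n, k))
        for j in range(k):
            # M-step: weighted least squares and weighted variance per component
            w = resp[:, j] + 1e-9
            beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
            s2 = max((w * (y - X @ beta) ** 2).sum() / w.sum(), 1e-6)
            logp[:, j] = (np.log(w.mean())
                          - 0.5 * np.log(2 * np.pi * s2)
                          - 0.5 * (y - X @ beta) ** 2 / s2)
        # E-step: responsibilities and the log-likelihood
        m = logp.max(axis=1, keepdims=True)
        loglik = (m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))).sum()
        resp = np.exp(logp - m)
        resp /= resp.sum(axis=1, keepdims=True)
    n_params = 4 * k - 1  # slope, intercept, variance, weight per cluster
    return -2 * loglik + n_params * np.log(n)

# Weight each candidate number of clusters by its BIC, a rough proxy for the
# posterior over cluster counts that the paper computes exactly.
bics = np.array([em_mixture_of_regressions(x, y, k) for k in (1, 2, 3)])
weights = np.exp(-0.5 * (bics - bics.min()))
weights /= weights.sum()
for k, wk in zip((1, 2, 3), weights):
    print(f"k={k}: weight {wk:.3f}")
```

With the data generated from two lines, almost all the weight should land on k=2, mirroring how the paper's algorithm sidesteps a hard choice of cluster count.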

1993 ◽  
Vol 9 (4) ◽  
pp. 570-588 ◽  
Author(s):  
Keith Knight

This paper considers the asymptotic behavior of M-estimates in a dynamic linear regression model where the errors have infinite second moments but the exogenous regressors satisfy the standard assumptions. It is shown that under certain conditions, the estimates of the parameters corresponding to the exogenous regressors are asymptotically normal and converge to the true values at the standard n^(−1/2) rate.
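As a concrete illustration of M-estimation under errors with infinite variance (a static toy model, not the paper's dynamic setting), the sketch below computes a Huber M-estimate by iteratively reweighted least squares on data with Cauchy errors; the model, tuning constant, and data are all assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(5)

# Regression with heavy-tailed (Cauchy) errors: the error variance is
# infinite, but a bounded-influence M-estimate still behaves well.
n = 500
x = rng.normal(size=n)                         # well-behaved exogenous regressor
y = 1.0 + 2.0 * x + rng.standard_cauchy(n)     # errors with no second moment

def huber_m_estimate(x, y, c=1.345, n_iter=100):
    """Huber M-estimate via iteratively reweighted least squares."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        r = y - X @ beta
        # robust residual scale from the median absolute deviation
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        # Huber weights: 1 for small residuals, downweight large ones
        w = np.clip(c * s / np.maximum(np.abs(r), 1e-12), None, 1.0)
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

a_m, b_m = huber_m_estimate(x, y)
print("M-estimate intercept/slope:", round(a_m, 2), round(b_m, 2))
```

Despite the infinite error variance, the M-estimate recovers the coefficients at roughly the usual root-n accuracy, which is the phenomenon the paper establishes rigorously.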


2019 ◽  
Vol 30 (4) ◽  
pp. 307-316 ◽  
Author(s):  
Ana Paula R Gonçalves ◽  
Bruna L Porto ◽  
Bruna Rodolfo ◽  
Clovis M Faggion Jr ◽  
Bernardo A. Agostini ◽  
...  

Abstract This study investigated the presence of co-authorship from Brazil in articles published in top-tier dental journals and analyzed the influence of international collaboration, article type (original research or review), and funding on citation rates. Articles published between 2015 and 2017 in 38 selected journals from 14 dental subareas were screened in Scopus. Bibliographic information, citation counts, and funding details were recorded for all articles (N=15619). Collaboration with other top-10 publishing countries in dentistry was registered. Annual citation averages (ACA) were calculated. A linear regression model assessed differences in ACA between subareas. Multilevel linear regression models evaluated the influence of article type, funding, and presence of international collaboration on ACA. Brazil was a frequent co-author country in articles published in the period (top 3: USA=25.5%; Brazil=13.8%; Germany=9.2%) and the country with the most publications in two subareas. The subareas with the largest Brazilian share were Operative Dentistry/Cariology, Dental Materials, and Endodontics. Brazil was second in total citations, but fifth in citation averages per article. Of the 2155 articles co-authored by Brazil, 74.8% had no co-authorship from other top-10 publishing countries. The USA (17.8%), Italy (4.2%), and the UK (3.2%) were the main co-author countries, but the main collaborating country varied between subareas. Implantology and Dental Materials were the subareas with the most international co-authorship. Review articles and articles with international collaboration were associated with increased citation rates, whereas the presence of study funding did not influence citations.


2006 ◽  
Vol 59 (5) ◽  
pp. 448-456 ◽  
Author(s):  
Colleen M. Norris ◽  
William A. Ghali ◽  
L. Duncan Saunders ◽  
Rollin Brant ◽  
Diane Galbraith ◽  
...  

2021 ◽  
Vol 2 (1) ◽  
pp. 12-20
Author(s):  
Kayode Ayinde ◽  
Olusegun O. Alabi ◽  
Ugochinyere Ihuoma Nwosu

Multicollinearity has remained a major problem in regression analysis and should be addressed sustainably. The problems associated with multicollinearity are worse when it occurs at a high level among regressors. This review revealed that studies on the subject have focused on developing estimators regardless of the effect of differences in the levels of multicollinearity among regressors. Studies have considered single-estimator and combined-estimator approaches without providing a sustainable solution to multicollinearity problems. The possible influence of partitioning the regressors according to multicollinearity levels, and extracting from each group to develop estimators for the parameters of a linear regression model when multicollinearity occurs, is a new econometric idea and therefore requires attention. The results of new studies should be compared with existing methods, namely the principal components estimator, partial least squares estimator, ridge regression estimator, and ordinary least squares estimator, using a wide range of criteria, ranking their performances at each level of the multicollinearity parameter and sample size. Based on a recent clue in the literature, it is possible to develop an innovative estimator that sustainably solves the multicollinearity problem through partitioning and extraction of explanatory variables, and to identify situations where the innovative estimator produces the most efficient estimates of the model parameters. The new estimator should be applied to real data and popularized for use.
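For context, the damage multicollinearity does to ordinary least squares, and how one of the named remedies contains it, can be shown on a toy design; the sketch below contrasts OLS with ridge regression only, and the data, the degree of collinearity, and the penalty λ = 1 are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two nearly collinear regressors: x2 is x1 plus tiny noise.
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
beta_true = np.array([1.0, 1.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def ols(X, y):
    """Ordinary least squares via the numerically stable lstsq solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    """Ridge regression: add lam * I to X'X before solving."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=1.0)
# OLS coefficients explode along the ill-conditioned direction;
# ridge shrinks them back toward the truth.
print("OLS:  ", np.round(b_ols, 2))
print("ridge:", np.round(b_ridge, 2))
```

Comparisons of this kind, repeated over levels of collinearity and sample sizes, are exactly the benchmarking exercise the review calls for.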


2021 ◽  
Author(s):  
Shibajyoti Banerjee

Observing decline in machine performance using a Linear Regression model


2021 ◽  
Vol 48 (3) ◽  
Author(s):  
Shokrya Saleh Alshqaq

The least trimmed squares (LTS) estimator has been used successfully in robust linear regression models. This article extends LTS estimation to the Jammalamadaka and Sarma (JS) circular regression model. The robustness of the proposed estimator is studied, and the algorithm used for its computation is discussed. Simulation studies and real data show that the proposed robust circular estimator effectively fits JS circular models in the presence of vertical outliers and leverage points.
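The LTS idea the article extends is easy to sketch for an ordinary (non-circular) linear model: minimize the sum of the h smallest squared residuals, so gross outliers are simply trimmed away. The following is a minimal random-start version with concentration steps, assuming a 75% coverage fraction; it is not the article's JS circular estimator:

```python
import numpy as np

rng = np.random.default_rng(2)

# Linear data contaminated with a few gross vertical outliers.
n = 60
x = np.linspace(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.2, size=n)
y[:6] += 15.0                                # vertical outliers at small x

h = int(0.75 * n)                            # coverage: fit the best 75% of points

def lts_fit(x, y, h, n_starts=200):
    """Least trimmed squares via random two-point starts plus concentration steps."""
    best = (np.inf, None)
    for _ in range(n_starts):
        idx = rng.choice(len(x), size=2, replace=False)
        if x[idx[0]] == x[idx[1]]:
            continue
        b, a = np.polyfit(x[idx], y[idx], 1)   # exact line through two points
        for _ in range(10):                    # C-steps: refit on h smallest residuals
            r2 = (y - (a + b * x)) ** 2
            keep = np.argsort(r2)[:h]
            b, a = np.polyfit(x[keep], y[keep], 1)
        loss = np.sort((y - (a + b * x)) ** 2)[:h].sum()
        if loss < best[0]:
            best = (loss, (a, b))
    return best[1]

a_lts, b_lts = lts_fit(x, y, h)
b_ols, a_ols = np.polyfit(x, y, 1)
print("LTS intercept/slope:", round(a_lts, 2), round(b_lts, 2))
print("OLS intercept/slope:", round(a_ols, 2), round(b_ols, 2))
```

LTS recovers the underlying line while plain least squares is dragged toward the contaminated points, which is the behavior the circular extension aims to preserve.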


2009 ◽  
Vol 6 (1) ◽  
pp. 115-141 ◽  
Author(s):  
P. C. Stolk ◽  
C. M. J. Jacobs ◽  
E. J. Moors ◽  
A. Hensen ◽  
G. L. Velthof ◽  
...  

Abstract. Chambers are widely used to measure surface fluxes of nitrous oxide (N2O). Usually linear regression is used to calculate the fluxes from the chamber data. Non-linearity in the chamber data can result in an underestimation of the flux. Non-linear regression models are available for these data, but are not commonly used. In this study we compared the fit of linear and non-linear regression models to determine significant non-linearity in the chamber data, and assessed the influence of this significant non-linearity on the annual fluxes. For a two-year dataset from an automatic chamber we calculated the fluxes with linear and non-linear regression methods. Based on the fit of the two methods, 32% of the data were classified as significantly non-linear. Significant non-linearity was not recognized by the goodness of fit of the linear regression alone. Using non-linear regression for these data and linear regression for the rest increases the annual flux by 21% to 53% compared to the flux determined from linear regression alone. We suggest that differences this large are due to leakage through the soil. Macropores or a coarse-textured soil can contribute to fast leakage from the chamber. Yet even for chambers without leakage, non-linearity in the chamber data is unavoidable, due to feedback from the increasing concentration in the chamber. To prevent a possibly small but systematic underestimation of the flux, we recommend comparing the fit of a linear regression model with that of a non-linear regression model, and using the non-linear model if its fit is significantly better. Open questions are how macropores affect chamber measurements and how optimization of chamber design can prevent this.
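The comparison the abstract describes can be sketched on simulated chamber data: the concentration saturates because of the feedback effect, so a straight-line fit understates the initial slope, which is what the flux is computed from. All constants below are invented for the illustration and are not the study's values:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated closed-chamber concentration (arbitrary units, t in minutes):
# saturating rise due to feedback from the accumulating gas.
k_true, c_eq = 0.1, 10.0
t = np.linspace(0, 30, 31)
c = c_eq * (1 - np.exp(-k_true * t)) + rng.normal(scale=0.05, size=t.size)

# Linear regression over the whole deployment underestimates the initial slope.
slope_lin = np.polyfit(t, c, 1)[0]

# Non-linear model C(t) = A + B * exp(-k t): for each k on a grid, A and B
# enter linearly, so fit them by least squares and keep the best k by SSE.
best = (np.inf, None, None)
for k in np.linspace(0.01, 0.5, 200):
    Z = np.column_stack([np.ones_like(t), np.exp(-k * t)])
    coef, *_ = np.linalg.lstsq(Z, c, rcond=None)
    sse = ((c - Z @ coef) ** 2).sum()
    if sse < best[0]:
        best = (sse, coef, k)
_, (A, B), k_hat = best
slope_nl = -k_hat * B                         # dC/dt at t = 0: the flux estimate

print("linear flux estimate:    ", round(slope_lin, 3))
print("non-linear flux estimate:", round(slope_nl, 3))
```

Here the true initial slope is k_true * c_eq = 1.0; the linear fit lands far below it while the non-linear fit recovers it, matching the systematic underestimation the study warns about.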


Author(s):  
Taylor L Davis ◽  
Blake Dirks ◽  
Elvis A Carnero ◽  
Karen D Corbin ◽  
Jonathon Krakoff ◽  
...  

ABSTRACT Background Human and microbial metabolism are distinct disciplines. Terminology, metrics, and methodologies have been developed separately. Therefore, combining the 2 fields to study energetic processes simultaneously is difficult. Objectives When developing a mechanistic framework describing gut microbiome and human metabolism interactions, energy values of food and digestive materials that use consistent and compatible metrics are required. As an initial step toward this goal, we developed and validated a model to convert between chemical oxygen demand (COD) and gross energy (E_g) for >100 food items and ingredients. Methods We developed linear regression models to relate (and be able to convert between) theoretical gross energy (E_g′) and chemical oxygen demand (COD′); the latter is a measure of electron equivalents in the food's carbon. We developed an overall regression model for the food items as a whole and separate regression models for the carbohydrate, protein, and fat components. The models were validated using a sample set of computed E_g′ and COD′ values, an experimental sample set using measured E_g and COD values, and robust statistical methods. Results The overall linear regression model and the carbohydrate, protein, and fat regression models accurately converted between COD and E_g, and the component models had smaller error. Because the ratios of COD per gram dry weight were greatest for fats and smallest for carbohydrates, foods with a high fat content also had higher E_g values in terms of kcal per gram dry weight. Conclusion Our models make it possible to analyze human and microbial energetic processes in concert using a single unit of measure, which fills an important need in the food–nutrition–metabolism–microbiome field. In addition, measuring COD and using the regressions to calculate E_g can be used instead of measuring E_g directly using bomb calorimetry, which saves time and money.
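As a minimal sketch of the conversion idea (not the paper's fitted model or coefficients), one can regress gross energy on COD for a set of items and use the fitted line to predict E_g from a measured COD; the data and the slope below are fabricated for the example:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical (COD, gross energy) pairs for food items, in g COD per g dry
# weight and kcal per g dry weight. The roughly linear relation is an
# assumption of this illustration, not the paper's dataset.
cod = rng.uniform(1.0, 3.0, 50)              # fats would sit at the high-COD end
e_g = 3.2 * cod + rng.normal(scale=0.1, size=50)

# Fit E_g = b0 + b1 * COD by ordinary least squares.
b1, b0 = np.polyfit(cod, e_g, 1)

def cod_to_energy(cod_value):
    """Predict gross energy (kcal/g dw) from a measured COD (g COD/g dw)."""
    return b0 + b1 * cod_value

print("fitted slope (kcal per g COD):", round(b1, 2))
print("predicted E_g at COD = 2.0:   ", round(cod_to_energy(2.0), 2))
```

Once such a regression is calibrated, a routine COD assay can stand in for bomb calorimetry, which is the practical payoff the conclusion describes.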

