Improved estimation method for high dimension semimartingale regression models based on discrete data

Author(s):  
Evgeny Pchelintsev ◽  
Serguei Pergamenshchikov ◽  
Maria Leshchinskaya
2021 ◽  
Vol 13 (7) ◽  
pp. 168781402110277
Author(s):  
Yankai Hou ◽  
Zhaosheng Zhang ◽  
Peng Liu ◽  
Chunbao Song ◽  
Zhenpo Wang

Accurate estimation of the degree of battery aging is essential to ensure safe operation of electric vehicles. In this paper, using real-world vehicles and their operational data, a battery aging estimation method is proposed based on a dual-polarization equivalent circuit (DPEC) model and multiple data-driven models. The DPEC model and the forgetting factor recursive least-squares method are used to determine the battery system’s ohmic internal resistance, with outliers being filtered using boxplots. Furthermore, eight common data-driven models are used to describe the relationship between battery degradation and the factors influencing this degradation, and these models are analyzed and compared in terms of both estimation accuracy and computational requirements. The results show that the gradient descent tree regression, XGBoost regression, and LightGBM regression models are more accurate than the other methods, with root mean square errors of less than 6.9 mΩ. The AdaBoost and random forest regression models are regarded as alternative groups because of their relative instability. The linear regression, support vector machine regression, and k-nearest neighbor regression models are not recommended because of poor accuracy or excessively high computational requirements. This work can serve as a reference for subsequent battery degradation studies based on real-time operational data.
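As a rough illustration of the two preprocessing steps named in this abstract, the sketch below runs a forgetting factor recursive least-squares update and a boxplot (1.5 × IQR) outlier filter. It is not the authors' implementation: it assumes a deliberately simplified model V = OCV - R0·I with no polarization branches instead of the full DPEC model, and the function names, the forgetting factor of 0.98, and the initial covariance are all hypothetical choices.

```python
import statistics


def ffrls_ohmic_resistance(currents, voltages, forgetting=0.98):
    """Forgetting-factor RLS for the simplified model V = OCV - R0 * I.

    Assumption: the polarization (RC) branches of a DPEC model are ignored,
    so only theta = [OCV, R0] is identified. Returns (theta, R0 history).
    """
    theta = [0.0, 0.0]                    # initial guess for [OCV, R0]
    P = [[1e4, 0.0], [0.0, 1e4]]          # large initial covariance
    history = []
    for i_k, v_k in zip(currents, voltages):
        phi = [1.0, -i_k]                 # regressor, so that v = phi . theta
        p_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                 P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = forgetting + phi[0] * p_phi[0] + phi[1] * p_phi[1]
        gain = [p_phi[0] / denom, p_phi[1] / denom]
        error = v_k - (phi[0] * theta[0] + phi[1] * theta[1])
        theta = [theta[0] + gain[0] * error, theta[1] + gain[1] * error]
        # P stays symmetric, so the row vector phi^T P equals p_phi.
        P = [[(P[r][c] - gain[r] * p_phi[c]) / forgetting for c in range(2)]
             for r in range(2)]
        history.append(theta[1])
    return theta, history


def boxplot_filter(values):
    """Keep only values inside the boxplot whiskers (1.5 * IQR rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if low <= v <= high]
```

In use, the resistance history returned by `ffrls_ohmic_resistance` would be passed through `boxplot_filter` before being handed to a regression model; the forgetting factor controls how quickly older samples are discounted.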


2020 ◽  
Vol 11 (1) ◽  
pp. 21
Author(s):  
Zahrotul Aflakhah ◽  
Jajang Jajang ◽  
Agustini Tripena Br. Sb.

This research discusses the ordinary least squares (OLS) method and the robust M-estimation method, comparing Tukey bisquare and Huber weighting for simple linear regression models that contain outliers. Data are generated by simulation with varying percentages of outliers and sample sizes. Each data set is fitted with a simple linear regression model, and the percentage of outliers and the RSE and MAD values are then calculated. The results show that the RSE and MAD values produced by a simple linear regression model with the OLS method are influenced by the percentage of outliers. In contrast, the robust M-estimation models with sample sizes of 30, 60, 90, 120, and 150 yield RSE values that are unstable as the percentage of outliers changes, while their MAD values are not affected by the percentage of outliers or the sample size. The robust M-estimation method with Tukey bisquare weighting performs as well as the one with Huber weighting.
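The comparison in this abstract can be sketched with iteratively reweighted least squares (IRLS), a common way to compute M-estimates. The code below is a minimal sketch and not the authors' procedure: the tuning constants 1.345 (Huber) and 4.685 (Tukey bisquare) are the conventional defaults, the scale is re-estimated in every iteration as MAD / 0.6745, and all function names are hypothetical.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_weight(u, c=1.345):
    """Huber weight: 1 inside [-c, c], decaying as c / |u| outside."""
    return 1.0 if abs(u) <= c else c / abs(u)


def tukey_weight(u, c=4.685):
    """Tukey bisquare weight: smooth descent to exactly 0 at |u| >= c."""
    return (1.0 - (u / c) ** 2) ** 2 if abs(u) < c else 0.0


def m_estimate(x, y, weight_fn, max_iter=50):
    """IRLS M-estimate of a simple linear regression, started from OLS."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        center = statistics.median(residuals)
        mad = statistics.median(abs(r - center) for r in residuals)
        scale = mad / 0.6745              # robust estimate of the error scale
        if scale < 1e-12:                 # residuals are already negligible
            break
        weights = [weight_fn(r / scale) for r in residuals]
        a, b = weighted_line_fit(x, y, weights)
    return a, b
```

Calling `weighted_line_fit` with unit weights gives the OLS fit, so the three estimators in the abstract can be compared on the same simulated data by swapping `huber_weight` for `tukey_weight`.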


2021 ◽  
Author(s):  
Xiaolong Liu ◽  
Zhen Ma ◽  
Lei Zhang ◽  
Yang Yu ◽  
Maswikiti Ewetse Paul ◽  
...  

Abstract Background: Gastric cancer (GC) treated with fluorouracil and cisplatin can develop chemotherapy resistance, which is one of the most common postoperative clinical complications and leads to poor prognosis. Methods: The purpose of this study is to investigate the susceptibility of patients with GC to postoperative chemotherapy based on autophagy-related genes (ATGs). Gene expression data from the TCGA database were integrated and analyzed for patients with GC undergoing chemotherapy. Prognostic genes were screened using univariate and multivariate regression models. Subjects were divided into two groups: a high-risk group and a low-risk group. The median risk score was used for the analysis. OS and DFS were evaluated by the product-limit (Kaplan-Meier) estimation method. Receiver operating characteristic (ROC) curve analysis was used to determine the accuracy of the prediction. We also performed additional analyses and detailed assessments. The differential expression of ATGs was mainly associated with chemotherapy resistance. Results: After chemotherapy administration, 9 ATGs were screened in the subjects, and DFS and OS were accurately predicted by the model in the GEO and TCGA databases. Conclusions: Nine genes were established as prognostic markers to predict the relationship between ATGs and GC chemotherapy susceptibility, suggesting better individualized treatment in clinical practice.
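The product limit estimation method mentioned in this abstract is the Kaplan-Meier estimator. A minimal sketch of it, together with a median split of risk scores, might look as follows. The function names and the convention that scores strictly above the median count as high risk are assumptions for illustration, not details taken from the study.

```python
import statistics


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving                # events and censorings both leave
    return curve


def split_by_median(risk_scores):
    """Label each subject 'high' or 'low' relative to the median score."""
    cutoff = statistics.median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]
```

A survival curve per group is then obtained by calling `kaplan_meier` separately on the subjects labelled high and low.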


2016 ◽  
Author(s):  
Carl Schmertmann

High sampling variability in recorded vital events creates serious problems for small-area mortality estimation. Many existing approaches to fitting local mortality schedules, including those most often used in Brazil, estimate rates by making rigid mathematical assumptions about local age patterns. Such methods assume that all areas within a larger area (for example, microregions within a mesoregion) have identically-shaped log mortality schedules by age. We propose a more flexible statistical estimation method that combines Poisson regression with the TOPALS relational model (DE BEER, 2012). We use the new method to estimate age-specific mortality rates in Brazilian small areas (states, mesoregions, microregions, and municípios) in 2010. Results for Minas Gerais show notable differences in the age patterns of mortality between adjacent small areas, demonstrating the advantages of using a flexible functional form in regression models.
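One way to write down the combination of Poisson regression and a relational model described here is sketched below. The notation is assumed for illustration and is not taken from the abstract: $\lambda^{*}$ is a standard log mortality schedule, $B$ is a matrix of linear spline basis functions over age, $\alpha$ is the vector of local offsets, and $D_x$ and $N_x$ are deaths and exposure at age $x$. Any penalty term on $\alpha$ is omitted.

```latex
\begin{align}
  \lambda(\alpha) &= \lambda^{*} + B\alpha
    && \text{(local log mortality schedule)} \\
  D_x &\sim \mathrm{Poisson}\!\left(N_x \, e^{\lambda_x(\alpha)}\right)
    && \text{(deaths at age } x\text{)} \\
  \ell(\alpha) &= \sum_x \left[ D_x \, \lambda_x(\alpha)
      - N_x \, e^{\lambda_x(\alpha)} \right] + \text{const}
    && \text{(log-likelihood to maximize)}
\end{align}
```

Because $\alpha$ may differ between areas, the fitted schedules are not forced to share the shape of $\lambda^{*}$, which is the flexibility the abstract contrasts with methods that assume identically shaped schedules.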


2018 ◽  
Author(s):  
Paul-Christian Bürkner ◽  
Emmanuel Charpentier

Ordinal predictors are commonly used in regression models. They are often incorrectly treated as either nominal or metric, thus under- or overestimating the contained information. Such practices may lead to worse inference and predictions compared to methods which are specifically designed for this purpose. We propose a new method for modeling ordinal predictors that applies in situations in which it is reasonable to assume their effects to be monotonic. The parameterization of such monotonic effects is realized in terms of a scale parameter $b$ representing the direction and size of the effect and a simplex parameter $\zeta$ modeling the normalized differences between categories. This ensures that predictions increase or decrease monotonically, while changes between adjacent categories may vary across categories. This formulation generalizes to interaction terms as well as multilevel structures. Monotonic effects may not only be applied to ordinal predictors, but also to other discrete variables for which a monotonic relationship is plausible. In simulation studies, we show that the model is well calibrated and, in the case of monotonicity, has similar or even better predictive performance than other approaches designed to handle ordinal predictors. Using Stan, we developed a Bayesian estimation method for monotonic effects, which allows prior information to be incorporated and the assumption of monotonicity to be checked. We have implemented this method in the R package brms, so that fitting monotonic effects in a fully Bayesian framework is now straightforward.
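The parameterization with a scale parameter $b$ and a simplex parameter $\zeta$ can be illustrated with a few lines of Python. This is only a sketch of the deterministic part of such a model, under stated assumptions: categories are coded 0 to D, the transform is the cumulative sum of the first x simplex entries, and the effect is scaled by D so that $b$ reads as an average step between adjacent categories. The function names are hypothetical, and nothing here reproduces the brms or Stan code.

```python
def monotonic_transform(x, zeta):
    """Cumulative share of the total effect reached at category x.

    zeta must be a simplex (non-negative entries summing to 1) of length D,
    and x must be an integer category in 0..D.
    """
    if any(z < 0 for z in zeta) or abs(sum(zeta) - 1.0) > 1e-9:
        raise ValueError("zeta must be a simplex")
    if not 0 <= x <= len(zeta):
        raise ValueError("x must lie in 0..D")
    return sum(zeta[:x])


def linear_predictor(x, intercept, b, zeta):
    """Assumed form: intercept + b * D * mo(x, zeta), with D = len(zeta)."""
    return intercept + b * len(zeta) * monotonic_transform(x, zeta)
```

Since every entry of `zeta` is non-negative, the predictor can only move in the direction given by the sign of `b` as `x` increases, while the step sizes between adjacent categories are free to differ.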


Author(s):  
С.И. Носков

The article describes the properties of three methods for estimating the parameters of regression models (least squares, least absolute deviations, and anti-robust estimation) and their application to specific practical problems. The method of least absolute deviations does not respond to anomalous observations in the sample, the anti-robust estimation method strongly deflects the regression line toward them, and the method of least squares occupies an intermediate position. I show that if the model is built in order to carry out multi-scenario forecast calculations of the dependent variable, then the method for numerically identifying the model parameters should be chosen by analyzing the nature of the outliers. If there is reason to believe that similar situations may occur in the future, the anti-robust estimation method should be chosen; otherwise, the method of least absolute deviations. I built a regression model of the freight turnover of the Krasnoyarsk railway using all three parameter estimation methods. I analyzed the causes of the sharp drop in freight turnover in 2010, which may well be characterized as an anomalous observation in the data, and I give recommendations on the choice of the parameter estimation method in this case.
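The contrast between the three estimators is easiest to see in the simplest possible regression, a model with only a constant term. The sketch below rests on one loudly stated assumption: anti-robust estimation is taken to mean minimizing the largest absolute residual (a Chebyshev or minimax criterion), which the abstract itself does not spell out. Under that assumption the three fits reduce to the mean, the median, and the midrange. The function name is hypothetical.

```python
import statistics


def fit_constant(values):
    """Fit y = c under three criteria and return the three estimates."""
    return {
        # Minimizes the sum of squared residuals.
        "least_squares": statistics.mean(values),
        # Minimizes the sum of absolute residuals (least modulus).
        "least_absolute_deviations": statistics.median(values),
        # Assumption: 'anti-robust' = minimize the maximum absolute residual.
        "minimax": (min(values) + max(values)) / 2,
    }
```

For a sample such as 10, 11, 12, 13, 100, the median stays with the bulk of the data, the midrange is pulled halfway toward the anomalous value, and the mean lands in between, which mirrors the ordering described in the abstract.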

