Revisiting the Optimal Probability Estimator from Small Samples for Data Mining

2019 ◽  
Vol 29 (4) ◽  
pp. 783-796 ◽  
Author(s):  
Bojan Cestnik

Abstract Estimation of probabilities from empirical data samples has drawn close attention in the scientific community and has been identified as a crucial phase in many machine learning and knowledge discovery research projects and applications. In addition to trivial and straightforward estimation with relative frequency, more elaborate methods for estimating probabilities from small samples have been proposed and applied in practice (e.g., Laplace’s rule, the m-estimate). Piegat and Landowski (2012) proposed a novel method for probability estimation from small samples, Eph√2, that is optimal according to the mean absolute error of the estimation result. In this paper we show that, even though Piegat’s formula is articulated differently, it is in fact a special case of the m-estimate with pa = 1/2 and m = √2. In the context of an experimental framework, we present an in-depth analysis of several probability estimation methods with respect to their mean absolute errors and demonstrate their potential advantages and disadvantages. We extend the analysis from single instance samples to samples with a moderate number of instances. We define small samples for the purpose of estimating probabilities as samples containing either fewer than four successes or fewer than four failures and justify the definition by analysing probability estimation errors on various sample sizes.
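The relationship among the estimators named in this abstract can be sketched in a few lines of Python. The m-estimate is taken here in its usual form, p = (s + m·pa) / (n + m), for s successes in n trials. The function names and the `is_small_sample` helper are illustrative assumptions and do not come from the paper.

```python
import math


def relative_frequency(successes: int, n: int) -> float:
    """Plain relative frequency s / n (undefined for n == 0)."""
    return successes / n


def m_estimate(successes: int, n: int, prior: float = 0.5, m: float = 2.0) -> float:
    """m-estimate: (s + m * prior) / (n + m)."""
    return (successes + m * prior) / (n + m)


def laplace(successes: int, n: int) -> float:
    """Laplace's rule for two outcomes: (s + 1) / (n + 2)."""
    return (successes + 1) / (n + 2)


def piegat_sqrt2(successes: int, n: int) -> float:
    """Special case stated in the abstract: pa = 1/2 and m = sqrt(2)."""
    return m_estimate(successes, n, prior=0.5, m=math.sqrt(2))


def is_small_sample(successes: int, n: int, threshold: int = 4) -> bool:
    """Small sample per the abstract: fewer than four successes or failures."""
    failures = n - successes
    return successes < threshold or failures < threshold
```

For one success in one trial, relative frequency gives 1.0, Laplace's rule gives 2/3, and the √2 variant gives about 0.707. Laplace's rule is itself the m-estimate with pa = 1/2 and m = 2, which is why the two share one helper here.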

2021 ◽  
Vol 13 (15) ◽  
pp. 2862
Author(s):  
Yakun Xie ◽  
Dejun Feng ◽  
Sifan Xiong ◽  
Jun Zhu ◽  
Yangge Liu

Accurate building height estimation from remote sensing imagery is an important and challenging task. However, the existing shadow-based building height estimation methods have large errors due to the complex environment in remote sensing imagery. In this paper, we propose a multi-scene building height estimation method based on shadows in high-resolution imagery. First, building shadows are classified and described by analyzing their features in remote sensing imagery. Second, a variety of shadow-based building height estimation models are established for different scenes. In addition, a method of shadow regularization extraction is proposed, which can effectively solve the problem of mutually adhering shadows in dense building areas. Finally, we propose a method for shadow length calculation that combines the fish net with the pauta criterion, which avoids the large error caused by the complex shape of building shadows. Multi-scene areas are selected for experimental analysis to prove the validity of our method. The experimental results show that the accuracy rate of our method is as high as 96% within 2 m of absolute error. In addition, we compared our proposed approach with the existing methods, and the results show that the absolute error of our method is reduced by 1.24 m to 3.76 m, which achieves high-precision estimation of building height.
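As a rough illustration of the shadow length step, the sketch below applies the pauta criterion (the three-sigma rule) to a list of shadow length samples and converts the result to a height. It assumes the samples are lengths in metres measured along parallel fish-net lines, and it uses the simplest flat-ground geometry, height = length × tan(sun elevation). The paper's multi-scene models are not reproduced, and all function names are hypothetical.

```python
import math
import statistics


def pauta_filter(values: list[float]) -> list[float]:
    """Drop values farther than 3 standard deviations from the mean (one pass)."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mean) <= 3 * sigma]


def shadow_length(samples: list[float]) -> float:
    """Mean shadow length after removing three-sigma outliers."""
    return statistics.fmean(pauta_filter(samples))


def building_height(shadow_len_m: float, sun_elevation_deg: float) -> float:
    """Flat-ground assumption: height = shadow length * tan(sun elevation)."""
    return shadow_len_m * math.tan(math.radians(sun_elevation_deg))
```

The three-sigma rule only rejects a point when the sample is large enough for one value to sit that far from the mean, so it suits dense fish-net sampling rather than a handful of measurements.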


Author(s):  
Mingyue Zhang ◽  
Xiaobin Fan ◽  
Jing Gan ◽  
Zeng Song ◽  
Bin Zhao

Background: Battery technology has been one of the bottlenecks in electric cars. In both theory and practice, research on battery management is extremely important, especially for battery state-of-charge estimation. The battery is an extremely complex system with strongly time-varying and non-linear properties. Therefore, accurately estimating the state of charge is challenging. Objective: The study aims to report the latest progress in studies of state-of-charge estimation methods for electric vehicle batteries. Methods: This paper reviews various representative patents and papers related to state-of-charge estimation methods for electric vehicle batteries. According to their theoretical and experimental characteristics, the estimation methods were classified into three groups: traditional estimation algorithms based on battery experiments, estimation algorithms based on modern control theory, and other estimation algorithms based on innovative ideas, with a particular focus on the algorithms based on control theory. Results: The advantages and disadvantages, as well as current and future developments, of the state-of-charge estimation methods are provided and discussed. Conclusion: Each kind of state-of-charge estimation method has its own characteristics and is suitable for different occasions. At present, algorithms based on control theory, especially intelligent algorithms, are the focus of research in this field. The future development direction is to establish a rich database, improve hardware technology, put forward more refined battery models, and give full play to the advantages of each algorithm.


Author(s):  
A. S. Ogunsanya ◽  
E. E. E. Akarawak ◽  
W. B. Yahya

In this paper, we compare different parameter estimation methods for the two-parameter Weibull-Rayleigh Distribution (W-RD), namely Maximum Likelihood Estimation (MLE), the Least Square Estimation method (LSE), and three methods of Quartile Estimators. Two of the quartile methods have been applied in the literature, while the third method (Q1-M) is introduced in this work. The methods have been applied to simulated data. These methods of estimation were compared using Error, Mean Square Error (MSE), and Total Deviation (TD), which is also known as Sum Absolute Error Estimate (SAEE). The analytical results show that the performance of all the parameter estimation methods was satisfactory with data sets from the Weibull-Rayleigh distribution, while the degree of accuracy is determined by the sample size. The proposed quartile (Q1-M) method has the least Total Deviation and MSE. In addition, the quartile methods perform better than MLE for the simulated data. In particular, the proposed quartile method (Q1-M) has the added advantage of being simpler to use than MLE.


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Ernian Zhao ◽  
Qiang Zhou ◽  
Weilian Qu ◽  
Wenming Wang

In this study, several estimation methods of fatigue properties based on different monotonic mechanical parameters were first discussed. The advantages and disadvantages of the Hardness Method proposed by Roessle and Fatemi were investigated and improved through the analysis of a total of 92 fatigue test data. A new Segment Fitting Method from Brinell hardness was then proposed for the estimation of fatigue properties, and a total of 96 pieces of fatigue test data under axial, torsional, and multiaxial in-phase loading were collected to verify the applicability of the new proposal. Finally, the prediction accuracy of the new proposal and three existing estimation methods was compared with the predictions based on the experimental fatigue properties. Based on the results obtained, the newly proposed estimation method significantly improves the relation between the fatigue ductility coefficient and Brinell hardness, which consequently improves the fatigue life prediction accuracy within a scatter band of 2, particularly for materials with low Brinell hardness. The present study can provide a simplified analysis for the preliminary fatigue design of engineering structures.


2016 ◽  
Vol 16 (3) ◽  
pp. 705-717 ◽  
Author(s):  
Hiroshi Takagi ◽  
Wenjie Wu

Abstract. Even though the maximum wind radius (Rmax) is an important parameter in determining the intensity and size of tropical cyclones, it has been overlooked in previous storm surge studies. This study reviews the existing estimation methods for Rmax based on central pressure or maximum wind speed. These methods over- or underestimate Rmax because of substantial variations in the data, although an average radius can be estimated with moderate accuracy. As an alternative, we propose an Rmax estimation method based on the radius of the 50 kt wind (R50). Data obtained by a meteorological station network in the Japanese archipelago during the passage of strong typhoons, together with the JMA typhoon best track data for 1990–2013, enabled us to derive the following simple equation: Rmax = 0.23 R50. Application to a recent strong typhoon, the 2015 Typhoon Goni, confirms that the equation provides a good estimation of Rmax, particularly when the central pressure became considerably low. Although this new method substantially improves the estimation of Rmax compared to the existing models, estimation errors are unavoidable because of fundamental uncertainties regarding the typhoon's structure or the insufficient number of available typhoon data. In fact, a numerical simulation for the 2013 Typhoon Haiyan as well as the 2015 Typhoon Goni demonstrates a substantial difference in the storm surge height for different Rmax. Therefore, the variability of Rmax should be taken into account in storm surge simulations (e.g., Rmax = 0.15 R50 to 0.35 R50), independently of the model used, to minimize the risk of over- or underestimating storm surges. The proposed method is expected to increase the predictability of major storm surges and to contribute to disaster risk management, particularly in the western North Pacific, including countries such as Japan, China, Taiwan, the Philippines, and Vietnam.
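The empirical relation and the suggested sensitivity range from this abstract reduce to a pair of one-line helpers. The sketch below is only a convenience wrapper around the stated coefficients (0.23, and 0.15 to 0.35); the function names are hypothetical, and the result is in whatever length unit R50 is given in.

```python
def rmax_from_r50(r50: float, ratio: float = 0.23) -> float:
    """Best-estimate maximum wind radius from the 50 kt wind radius."""
    return ratio * r50


def rmax_range(r50: float, low: float = 0.15, high: float = 0.35) -> tuple[float, float]:
    """Lower and upper Rmax bounds to sweep in a storm surge sensitivity run."""
    return (low * r50, high * r50)
```

For example, an R50 of 100 km gives a best estimate of about 23 km, with a range of roughly 15 km to 35 km to cover in the surge simulations.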


2014 ◽  
Vol 484-485 ◽  
pp. 547-551
Author(s):  
Jiang Min ◽  
Dong Wei

Building on the authors' implementation of a skill evaluation system based on a real environment, this paper discusses several commonly used parameter estimation methods based on item response theory (IRT) and analyzes the advantages and disadvantages of each estimation method and their respective application fields. It also expounds the research theory and design process of a skill adaptive evaluation system based on a real environment, as well as the innovation of the system.


2021 ◽  
Author(s):  
Shuwei Dai ◽  
Martha D. Shulski ◽  
Haishun Yang ◽  
Roger W. Elmore

Abstract The concept of thermal time, measured in degree-days, is widely used among the agricultural community in Nebraska to make decisions in corn (Zea mays L.) production. Instead of the real-time temperatures that are experienced by corn plants, most of the widely available temperature data are limited to daily timescale observations from standard meteorological stations. In addition, a variety of equations are used by different agricultural groups (e.g., researchers, advisors, farmers, and seed companies) to estimate thermal time for corn. Two problems could arise: a) the estimation method is lacking in accuracy; and b) different estimation methods are used for the same purpose by different groups. Consequently, citing these inaccurate and possibly inherently different thermal time results could lead to biased decisions in corn production. The goal of this study is to evaluate six commonly used estimation methods by comparing the estimated thermal time with the hourly-temperature approximated thermal time. We analyzed the root mean square error and mean absolute error for six metrics of total growing season (from May through September) degree-days based on the temperature data from a total of 14 long-term observing locations in Nebraska. In particular, we selected four location-extreme year cases to demonstrate the six methods’ estimation performance on a daily timescale. We found that the most commonly used adjusted Tmax and Tmin rectangle method provided poor estimation in the study area. Instead, the single-sine, double-sine, or Tavg-based method was superior, depending on the metric of degree-days.
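A minimal sketch of the adjusted Tmax and Tmin rectangle method and of the two error metrics named in this abstract follows. The 10 °C base and 30 °C upper thresholds are the values commonly quoted for corn and are an assumption here, as the abstract does not state them; the function names are likewise hypothetical.

```python
import math


def gdd_rectangle(tmax: float, tmin: float,
                  t_base: float = 10.0, t_upper: float = 30.0) -> float:
    """Daily degree-days with Tmax and Tmin clipped to [t_base, t_upper]."""
    tmax_adj = min(max(tmax, t_base), t_upper)
    tmin_adj = min(max(tmin, t_base), t_upper)
    return (tmax_adj + tmin_adj) / 2.0 - t_base


def mean_absolute_error(estimates: list[float], reference: list[float]) -> float:
    """Average of |estimate - reference| over paired values."""
    return sum(abs(e - r) for e, r in zip(estimates, reference)) / len(estimates)


def root_mean_square_error(estimates: list[float], reference: list[float]) -> float:
    """Square root of the mean squared difference over paired values."""
    squared = sum((e - r) ** 2 for e, r in zip(estimates, reference))
    return math.sqrt(squared / len(estimates))
```

In a comparison like the one described, `reference` would hold the hourly-temperature approximated degree-days and `estimates` the output of one daily method, summed or compared day by day.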


2020 ◽  
Vol 2020 (66) ◽  
pp. 101-110
Author(s):  
Azhar Kadhim Jbarah ◽  
Prof Dr. Ahmed Shaker Mohammed

The research is concerned with estimating the effect of the cultivated area of the barley crop on the production of that crop by estimating the regression model representing the relationship between these two variables. The results of the tests indicated that the time series of the response variable is stationary, while the series of the explanatory variable is nonstationary and integrated of order one, I(1). These tests also indicate that the random error terms are autocorrelated and can be modeled with mixed autoregressive-moving average models, ARMA(p,q). Given these results, the classical estimation method cannot be used to estimate the regression model; therefore, a fully modified M method, which is a robust estimation method, was adopted. The estimated results indicate a significant positive relation between the production of the barley crop and the cultivated area.


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 26
Author(s):  
David González-Ortega ◽  
Francisco Javier Díaz-Pernas ◽  
Mario Martínez-Zarzuela ◽  
Míriam Antón-Rodríguez

Driver’s gaze information can be crucial in driving research because of its relation to driver attention. In particular, the inclusion of gaze data in driving simulators broadens the scope of research studies, as they can relate drivers’ gaze patterns to their features and performance. In this paper, we present two gaze region estimation modules integrated in a driving simulator. One uses the 3D Kinect device and the other uses the virtual reality Oculus Rift device. The modules are able to detect the region, out of the seven into which the driving scene was divided, where a driver is gazing in every processed frame of the route. Four methods that learn the relation between gaze displacement and head movement were implemented and compared for gaze estimation. Two are simpler and based on points that try to capture this relation, and two are based on classifiers such as MLP and SVM. Experiments were carried out with 12 users who drove the same scenario twice, each time with a different visualization display, first with a big screen and later with Oculus Rift. On the whole, Oculus Rift outperformed Kinect as the best hardware for gaze estimation. The Oculus-based gaze region estimation method with the highest performance achieved an accuracy of 97.94%. The information provided by the Oculus Rift module enriches the driving simulator data and makes possible a multimodal driving performance analysis, in addition to the immersion and realism obtained with the virtual reality experience provided by Oculus.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2952
Author(s):  
Latifa Nabila Harfiya ◽  
Ching-Chun Chang ◽  
Yung-Hui Li

Monitoring the continuous BP signal is an important issue, because blood pressure (BP) varies over days, minutes, or even seconds for short-term cases. Most photoplethysmography (PPG)-based BP estimation methods are susceptible to noise and provide only systolic blood pressure (SBP) and diastolic blood pressure (DBP) predictions. Here, instead of estimating a discrete value, we take a different perspective and estimate the whole BP waveform. We propose a novel deep learning model that learns to perform signal-to-signal translation from PPG to arterial blood pressure (ABP). Using only a raw PPG signal as the input, the proposed model outputs a continuous ABP signal. Based on the translated ABP signal, we extract the SBP and DBP values accordingly to ease the comparative evaluation. Our prediction results achieve an average absolute error under 5 mmHg, with 70% confidence for SBP and 95% confidence for DBP, without complex feature engineering. These results fulfill the standards of the Association for the Advancement of Medical Instrumentation (AAMI) and the British Hypertension Society (BHS), with grade A. From the results, we believe that our model is applicable and can potentially boost the accuracy of effective signal-to-signal continuous blood pressure estimation.
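The abstract does not say how SBP and DBP are read off the translated waveform. One simple possibility, offered only as an assumption, is to take the maximum and minimum of an ABP window covering a cardiac cycle, then compare them against the same quantities from the reference waveform. The function names below are hypothetical.

```python
def sbp_dbp(abp_window: list[float]) -> tuple[float, float]:
    """Return (systolic, diastolic) as the max and min of an ABP window in mmHg."""
    return max(abp_window), min(abp_window)


def abs_errors(predicted_abp: list[float], reference_abp: list[float]) -> tuple[float, float]:
    """Absolute SBP and DBP errors between a predicted and a reference window."""
    sbp_pred, dbp_pred = sbp_dbp(predicted_abp)
    sbp_ref, dbp_ref = sbp_dbp(reference_abp)
    return abs(sbp_pred - sbp_ref), abs(dbp_pred - dbp_ref)
```

Averaging these per-window errors over a test set gives the kind of mean absolute error, in mmHg, that the abstract reports against the 5 mmHg figure.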

