Reclassifying inferential statistics into diagnostic and predictive statistics with an application on gynecologic cancer

2020 ◽  
Vol 9 (4) ◽  
pp. 146-150

Statisticians usually classify Statistics into two main parts, namely Descriptive and Inferential Statistics. Here, we suggest reclassifying Inferential Statistics into two parts, namely Diagnostic Statistics and Predictive Statistics. On that basis, there are four levels of data analysis (Descriptive, Diagnostic, Predictive and Perspective Statistics). Descriptive statistics are mainly related to Graphs, Frequency tables, Measures of Central Tendency, Measures of Variation and Measures of Shape. Diagnostic statistics are mainly related to the effects of the Independent variables (inputs) on the Dependent (Target) variable, based on Tests of Correlation or Association, Tests for Mean differences and Tests for Classification. Predictive statistics are mainly related to Estimation, Regression techniques and Time series Analysis for the Dependent (Target) variable. Perspective statistics draw on the previous three levels and act as a prescription for how to solve or prevent the problem. In this paper, we clarify the statistical tests used at each level of statistical analysis and give an example using real data related to Gynecologic Cancer.

2021 ◽  
Author(s):  
Rosa F Ropero ◽  
M Julia Flores ◽  
Rafael Rumí

<p>Environmental data often contain missing values or lack information, which makes modelling difficult. Within the framework of the SAICMA Research Project, a flood risk management system is modelled for the Andalusian Mediterranean catchment using information from the Andalusian Hydrological System. Hourly data were collected from October 2011 to September 2020 and present two issues:</p><ul><li>In the Guadarranque River, there are no data for the dam level variable from May to August 2020, probably because of sensor damage.</li> <li>No information about river level is collected in the lower part of the Guadiaro River, which makes it difficult to estimate flood risk in the coastal area.</li> </ul><p>To avoid removing the dam variable from the entire model (or dropping the missing months), or even giving up on modelling one river system, this abstract aims to provide modelling solutions based on Bayesian networks (BNs) that overcome these limitations.</p><p><em>Guadarranque River. Missing values.</em></p><p>The dataset contains 75687 observations for 6 continuous variables. BN regression models based on fixed structures (Naïve Bayes, NB, and Tree Augmented Naïve, TAN) were learnt using the complete dataset (until September 2019) with the aim of predicting the dam level variable as accurately as possible. A scenario was run with data from October 2019 to March 2020, and the prediction for the target variable was compared with the real data. Results show that both NB (RMSE: 6.29) and TAN (RMSE: 5.74) are able to predict the behaviour of the target variable.</p><p>In addition, a BN whose structure was defined from expert knowledge was learnt with the real data and with both datasets containing values imputed by NB and TAN. Results show that models learnt with imputed data (NB: 3.33; TAN: 3.07) have a lower error than the model learnt with real data (4.26).</p><p><em>Guadiaro River. Lack of information.</em></p><p>The dataset contains 73636 observations with 14 continuous variables. 
Since rainfall variables present a high percentage of zero values (over 94%), they were discretised by the Equal Frequency method with 4 intervals. The aim is to predict flood risk in the coastal area, but no data are collected there. Thus, an unsupervised classification based on hybrid BNs was performed. Here, the target variable classifies all observations into a set of homogeneous groups and gives, for each observation, the probability of belonging to each group. Results show a total of 3 groups:</p><ul><li>Group 0, “Normal situation”: rainfall values equal to 0 and a very low mean river level.</li> <li>Group 1, “Storm situation”: mean rainfall values are over 0.3 mm and the means of all river level variables are double those of group 0.</li> <li>Group 2, “Extreme situation”: both rainfall and river level means take their highest values, far above those of the two previous groups.</li> </ul><p>Although validation shows that this methodology is able to identify extreme events, further work is needed. To this end, data from the autumn-winter season (October 2020 to March 2021) will be used. With this new information, it will be possible to check whether the latest extreme events (the flooding event in December and Storm Filomena in January) are identified.</p>
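The NB and TAN models above are compared by root-mean-square error (RMSE). A minimal sketch of that metric follows, assuming plain Python lists of observed and predicted dam levels; the function name `rmse` and the sample values are illustrative and do not come from the study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical dam levels (not data from the study).
observed = [10.0, 12.0, 11.0, 13.0]
predicted = [11.0, 12.0, 10.0, 15.0]
print(rmse(observed, predicted))
```

A lower value means predictions sit closer to the observations, which is the sense in which TAN (5.74) outperforms NB (6.29) in the abstract.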


2020 ◽  
Vol 5 (2) ◽  
pp. 209
Author(s):  
Fadil Iskandar

This research aims to find out how financial compensation and job performance are implemented at PT Penggadaian (Persero), Jambi Branch, and to analyze the influence of financial compensation on job performance there. The research uses a descriptive quantitative method with a correlational design that describes the relationship between the independent and dependent variables. The analysis uses simple regression, with the hypothesis tested by a t-test. The results show that financial compensation has a significant effect on job performance, as indicated by tcount > ttable and Prob. sig < α (0.05). The correlation value is 64%, which indicates a very close relationship between financial compensation and job performance.
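The decision rule tcount > ttable can be illustrated with a small sketch of simple linear regression and the t-statistic for its slope. The function name, variable names and data below are hypothetical; only the critical value 3.182 (two-sided, α = 0.05, 3 degrees of freedom) is a standard table entry.

```python
import math

def slope_t_statistic(x, y):
    """Fit y = a + b*x by least squares; return (b, a, t) where t tests b = 0."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope / std_err

# Hypothetical scores (not data from the study).
compensation = [1, 2, 3, 4, 5]
performance = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, t_count = slope_t_statistic(compensation, performance)
T_TABLE = 3.182  # two-sided critical value for alpha = 0.05, df = n - 2 = 3
print(slope, t_count, abs(t_count) > T_TABLE)
```

When the absolute t-statistic exceeds the tabulated value, the slope is judged significantly different from zero at the chosen α.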


2020 ◽  
Vol 7 (11) ◽  
pp. 1626
Author(s):  
Ivany Lestari Goutama ◽  
Hendsun . ◽  
Yohanes Firmansyah ◽  
Ernawati Su

Background: Cardiovascular relative risk (CVRISK) is the latest cardiovascular relative risk score for evaluating the magnitude of cardiovascular risk in healthy people regardless of age and cardiovascular risk severity. The aim of the study is to determine the correlations among the independent variables of the CVRISK score in individuals with and without a history of cardiovascular diseases (CVD). Methods: The study design is cross-sectional. We conducted it online through social media using Google Forms from June to August 2020. Participants include all productive age groups from 16 to 60 years. The data were processed using Excel and statistically tested. Descriptive data analysis uses tabulated data displayed as numbers or proportions (categorical) and single data distributions (numeric). Statistical association analysis uses two categorical correlation tests: eta for nominal-ordinal variables and the contingency coefficient for nominal-nominal variables. Results: There is a strong correlation between hypertension and high triglyceride levels (p value 0.001; correlation 0.549; risks 30.14%), and between nutritional status and low-density lipoprotein cholesterol (LDL-C) levels in the CVD group (p value 0.002; correlation 0.774; risks 59.90%) and the non-CVD group (p value 0.000; correlation 0.757; risks 57.3%). Hypertension and risky LDL-C levels show a very strong and significant correlation in the CVD group (p value 0.014; correlation 0.947; risks 89.68%). Conclusions: There is a correlation that varies from weak to very strong among the independent variables in the CVRISK scoring of the participants. Further research is needed to determine the potential of CVRISK as an early prevention tool for determining the cardiovascular risk of individuals with and without a history of CVD.
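For the nominal-nominal case, the contingency coefficient mentioned in the Methods can be sketched as Pearson's C = sqrt(χ² / (χ² + n)). The function name and the 2x2 counts below are hypothetical, and the sketch assumes every row and column total is non-zero.

```python
import math

def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n)).

    `table` is a list of rows of observed counts; all row and column
    totals are assumed to be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical 2x2 counts (not data from the study).
table = [[10, 20], [20, 10]]
print(contingency_coefficient(table))
```

The "risks" percentages reported in the Results appear to coincide with the squared correlation values (for example, 0.549² ≈ 0.3014), although the abstract does not state how they were derived.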


2019 ◽  
Vol 6 (3) ◽  
pp. 16-19
Author(s):  
Abdul Ghafoor Kazi ◽  
Rafia Almani Baloach ◽  
Bilal Ahmed Khan ◽  
Mehwish Syeda

The purpose of this research was to analyze the impact of non-monetary rewards on employee job performance in the private banking sector of Hyderabad, Pakistan. The study focused on recognition, career development and flexible working schedule (independent variables) and employee performance (dependent variable). The relationship between the dependent and independent variables was empirically verified through statistical methods. A reliability test and multiple regression were used for data analysis. Primary data were collected through a questionnaire. There were 50 respondents, all contacted in person. In the reliability test, all variables (three independent and one dependent) were found reliable, with good to excellent ratings. The Beta values indicated a positive relationship with the dependent variable, i.e. employee performance. In the multiple regression analysis, the independent variables recognition, career development and flexible working schedule were found to be statistically insignificant.


2016 ◽  
Vol 13 (2) ◽  
pp. 191
Author(s):  
Dea Nurfika Sari ◽  
Haryanto Haryanto

The aim of this study is to examine the factors that affect (determine) internal audit effectiveness in the public sector, specifically the Inspectorate office of the Special Region of Yogyakarta Province. This study replicates research carried out by Alzeban and Gwilliam in Saudi Arabia. Four independent variables are considered as determinants of internal audit effectiveness, the dependent variable: the competence of the internal auditor, the relationship between internal and external auditors, auditee support for internal audit activity, and the independence of the internal auditor. The population in this research is the 51 internal auditors working in the Inspectorate office of the Special Region of Yogyakarta Province. This study uses primary data collected by questionnaire. All questionnaires could be processed. The data collected were processed using PLS analysis with the SmartPLS 3 program. Statistical tests showed that three of the four independent variables, namely the competence of the internal auditor, auditee support and the independence of the internal auditor, affect the effectiveness of the internal audit, while the relationship between the internal auditor and the external auditor does not.
Keywords: Internal auditor effectiveness, competence of internal auditor, relationship between internal and external auditor, auditee support to internal audit activity, independence of internal auditor.


Mathematics ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 604 ◽  
Author(s):  
Victor Korolev ◽  
Andrey Gorshenin

Mathematical models are proposed for statistical regularities of maximum daily precipitation within a wet period and total precipitation volume per wet period. The proposed models are based on the generalized negative binomial (GNB) distribution of the duration of a wet period. The GNB distribution is a mixed Poisson distribution, the mixing distribution being generalized gamma (GG). The GNB distribution demonstrates excellent fit with real data of durations of wet periods measured in days. By means of limit theorems for statistics constructed from samples with random sizes having the GNB distribution, asymptotic approximations are proposed for the distributions of maximum daily precipitation volume within a wet period and total precipitation volume for a wet period. It is shown that the exponent power parameter in the mixing GG distribution matches slow global climate trends. The bounds for the accuracy of the proposed approximations are presented. Several tests for daily precipitation, total precipitation volume and precipitation intensities to be abnormally extremal are proposed and compared to the traditional PoT-method. The results of the application of this test to real data are presented.
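The mixed Poisson construction described above can be written out explicitly. The following is a sketch under an assumed parameterization of the generalized gamma density (shape r, exponent power γ, scale μ); the symbols are illustrative and may differ from the paper's notation.

```latex
\mathsf{P}(N = k) = \frac{1}{k!} \int_0^{\infty} e^{-\lambda} \lambda^{k} \, g(\lambda; r, \gamma, \mu) \, d\lambda,
\quad k = 0, 1, 2, \ldots,
\qquad
g(\lambda; r, \gamma, \mu) = \frac{|\gamma| \, \mu^{r}}{\Gamma(r)} \, \lambda^{\gamma r - 1} e^{-\mu \lambda^{\gamma}},
\quad \lambda > 0 .
```

With γ = 1 the mixing law reduces to an ordinary gamma distribution, and N then follows the classical negative binomial distribution.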


Author(s):  
Pakize Taylan

The aim of parametric regression models, such as linear and nonlinear regression, is to describe a reasonable relationship between the response and the independent variables with a finite set of parameters, under the assumption of linearity or of a predetermined nonlinearity in the regression parameters. Nonparametric regression techniques are widely used statistical techniques; they not only relax the assumption of linearity in the regression parameters, but also do not need a predetermined nonlinear functional form for the relationship between the response and the independent variables. They can handle higher-dimensional problems and larger sample sizes than parametric regression, because the data must supply both the model building and the model estimates. For this purpose, firstly, PRSS problems for MARS, ADMs, and CR will be constructed. Secondly, the solution of the generated problems will be obtained with CQP, one of the well-known methods of convex optimization, and these solutions will be called CMARS, CADMs, and CKR, respectively.


1989 ◽  
Vol 69 (3) ◽  
pp. 597-610 ◽  
Author(s):  
JOHN E. RICHARDS ◽  
THOMAS E. BATES

This study was conducted to determine which of eight different extractants best described the K-supplying capacities of nine southern Ontario soils. The total amount of K extracted by eight crops of alfalfa grown in the greenhouse was related to soil K by regression techniques. The following procedures extracted increasingly higher average amounts of soil K: water; electro-ultrafiltration (EUF) at 20 °C (EUF-K-20) (37 μg K g−1); EUF at 80 °C (EUF-K-80) (83 μg K g−1); 2M NaCl (89 μg K g−1); 1M NH4OAc (131 μg K g−1); 0.1 M HNO3 (163 μg K g−1); seven sequential 7-min extractions with boiling 1 M HNO3 (Mactotal) (940 μg K g−1); and 0.2 M sodium tetra-phenyl boron (NaTPhB) (3248 μg K g−1). Of the eight extractants tested, the amount of K removed by EUF was the most closely associated to total K uptake; a multiple regression model with the logarithm of EUF-K-20 and the logarithm of EUF-K-80 as independent variables explained 97% of the total variation in K uptake. The K extracted by 2 M NaCl and by 0.1 M HNO3 also explained more variation in total K uptake (r2 = 0.86, and 0.92, respectively) than did 1 M NH4OAc (r2 = 0.61), which is currently used in the Ontario soil test program. The other extractants did not offer an improvement over NH4OAc, with the exception of the most soluble fraction of Mactotal (StepK). Extraction of soil K with electro-ultrafiltration may offer a more precise estimation of the K supplying capacities of southern Ontario soils than the currently used extractant. Key words: Electro-ultrafiltration, nonexchangeable K, ammonium acetate, soil test


2005 ◽  
Vol 53 (2) ◽  
pp. 177-190 ◽  
Author(s):  
Martin J. Bergee ◽  
Jamila L. McWhirter

With this study, we replicated and extended the work of Bergee and Platt (2003). Analyzing ratings outcomes of 7,355 small-ensemble and solo events from two consecutive midwestern state festivals (2001 and 2002), Bergee and Platt found statistically significant differences in the main effects of time of day, type of event (solo/ensemble), and school size. In the replication phase of the present study, we used their procedures to analyze data from the 2003 festival (N = 3,853), finding significant differences in the same three main effects and also in performing medium (vocal/instrumental). In both studies, the type of event by performing medium interaction was significant. The extension phase consisted of applying logistic regression techniques to the fitting of a theoretical model of prediction. Two variables were added to the original four: geographical location and district level of expenditure per average daily attendance. All main effects except geographical location (eliminated owing to high collinearity), plus the type of event by performing medium interaction, emerged as strong predictors of ratings outcomes. Afternoon scheduling, entering from a large, relatively high-expenditure school, and performing as a vocalist and a soloist significantly predicted the highest rating. January 18, 2005 March 15, 2005.


2020 ◽  
Vol 223 (3) ◽  
pp. 2009-2026
Author(s):  
Frederik Link ◽  
Georg Rümpker ◽  
Ayoub Kaviani

SUMMARY We present a technique to derive robust estimates for the crustal thickness and elastic properties, including anisotropy, from shear wave splitting of converted phases in receiver functions. We combine stacking procedures with a correction scheme for the splitting effect of the crustal converted Ps-phase and its first reverberation, the PpPs-phase, where we also allow for a predefined dipping Moho. The incorporation of two phases stabilizes the analysis procedure and allows us to solve simultaneously for the crustal thickness, the ratio of average P- to S-wave velocities, the percentage of anisotropy and the fast-axis direction. The stacking is based on arrival times and polarizations computed using a ray-based algorithm. Synthetic tests show the robustness of the technique and its applicability to tectonic settings where the dip of the Moho is significant. These tests also demonstrate that the effects of a dipping layer boundary may overprint a possible anisotropic signature. To constrain the uncertainty of our results, we perform statistical tests based on a bootstrapping approach. We distinguish between different model classes by comparing the coherency of the stacked amplitudes after moveout correction. We apply the new technique to real-data examples from different tectonic regimes and show that the coherency of the stacked receiver functions can be improved when anisotropy and a dipping Moho are included in the analysis. The examples underline the advantages of statistical analyses when dealing with stacking procedures and potentially ambiguous solutions.
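The bootstrapping idea used above to constrain uncertainty can be sketched generically. This is a plain percentile bootstrap on a list of scalar estimates, not the authors' procedure for stacked receiver functions; the function name, its parameters and the sample values are assumptions made for illustration.

```python
import random
import statistics

def bootstrap_interval(values, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return lower, upper

# Hypothetical crustal-thickness estimates in km (not data from the study).
estimates = [34.8, 35.6, 36.1, 35.2, 34.9, 36.4, 35.7, 35.0]
low, high = bootstrap_interval(estimates)
print(low, high)
```

The width of the resulting interval gives a rough, distribution-free indication of how stable the estimate is under resampling.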

