Measuring Interactions in Categorical Datasets Using Multivariate Symmetrical Uncertainty

Entropy ◽  
2021 ◽  
Vol 24 (1) ◽  
pp. 64
Author(s):  
Santiago Gómez-Guerrero ◽  
Inocencio Ortiz ◽  
Gustavo Sosa-Cabrera ◽  
Miguel García-Torres ◽  
Christian E. Schaerer

Interaction between variables is often found in statistical models, and when the variables are numeric it is usually expressed in the model as an additional term. However, when the variables are categorical (also known as nominal or qualitative) or mixed numerical-categorical, defining, detecting, and measuring interactions is not a simple task. In this work, building on an entropy-based correlation measure for n nominal variables, the Multivariate Symmetrical Uncertainty (MSU), we propose a formal and broader definition of interaction among variables. Two series of experiments are presented. In the first series, we observe that datasets in which some record types or combinations of categories are absent, forming patterns of records, often display interactions among their attributes. In the second series, the interaction/non-interaction behavior of a regression model (built entirely on continuous variables) is successfully replicated under a discretized version of the dataset. It is shown that there is an interaction-wise correspondence between the continuous and the discretized versions of the dataset. Hence, we demonstrate that the proposed definition of interaction enabled by the MSU is a valuable tool for detecting and measuring interactions within linear and non-linear models.
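As a rough illustration of how an entropy-based measure can expose an interaction, the sketch below computes an MSU value for a toy XOR dataset. It assumes the form MSU(X_1, ..., X_n) = n/(n-1) · [1 − H(X_1, ..., X_n) / Σ H(X_i)], which is not stated in this abstract; the helper names `entropy` and `msu` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy (in bits) of the empirical distribution of `rows`."""
    counts = Counter(rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def msu(columns):
    """Multivariate symmetrical uncertainty of n >= 2 categorical columns.

    Assumed form: n/(n-1) * (1 - H(joint) / sum of marginal entropies).
    """
    n = len(columns)
    joint = entropy(list(zip(*columns)))
    marginals = sum(entropy(list(col)) for col in columns)
    if marginals == 0:  # every column is constant: nothing to share
        return 0.0
    return n / (n - 1) * (1 - joint / marginals)


# Toy dataset: y is the XOR of x1 and x2, each combination observed once.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

pairwise = msu([x1, y])       # x1 alone carries no information about y
three_way = msu([x1, x2, y])  # taken together, the three variables are tied
```

Under this assumed formula, `pairwise` is 0 and `three_way` is 0.5 (up to floating-point rounding): the association only appears when all three variables are considered jointly, which is the kind of behavior the abstract describes as interaction.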

1983 ◽  
Vol 15 (6) ◽  
pp. 801-813 ◽  
Author(s):  
B Fingleton

Log-linear models, which in common with other statistical models assume independent observations, are an appropriate means of determining the magnitude and direction of interactions between categorical variables. Spatial data are often dependent rather than independent, and thus the analysis of spatial data by log-linear models may erroneously detect interactions between variables that are spurious and are the consequence of pairwise correlations between observations. A procedure is described in this paper to accommodate these effects; it requires only minimal assumptions about the nature of the autocorrelation process, given systematic sampling at intersection points on a square lattice.
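To make the independence assumption concrete, the following sketch fits the simplest log-linear model (no interaction term) to a two-way table and computes the likelihood-ratio statistic G² = 2 Σ O ln(O/E). It is a generic textbook calculation, not the correction procedure of the paper; the function name `g_squared` and the example counts are hypothetical.

```python
from math import log


def g_squared(table):
    """Likelihood-ratio statistic for the independence log-linear model.

    `table` is a list of rows of observed counts. Expected counts under
    independence are (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:  # 0 * ln(0) is taken as 0
                continue
            expected = row_totals[i] * col_totals[j] / grand
            stat += observed * log(observed / expected)
    return 2.0 * stat


# Hypothetical counts: the first table is exactly independent,
# the second shows a strong association between the two variables.
independent = [[10, 20], [20, 40]]
associated = [[30, 10], [10, 30]]

g2_independent = g_squared(independent)
g2_associated = g_squared(associated)
```

Comparing G² with a chi-squared reference distribution is only valid when the observations are independent. With positively autocorrelated spatial data the statistic tends to be inflated, which is the source of the spurious interactions the abstract warns about.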


2000 ◽  
Vol 23 (4) ◽  
pp. 869-875 ◽  
Author(s):  
José Marcelo Soriano Viana

The parametric restrictions of the diallel analysis model of Griffing, method 2 (parents and F1 generations) and model 1 (fixed), were studied in order to address the following questions: i) does the statistical model need to be restricted? ii) do the restrictions satisfy the genetic parameter values? and iii) do they make the analysis and interpretation easier? Objectively, these questions can be answered as: i) yes, ii) not all of them, and iii) the analysis is easier, but the interpretation is the same as in the model with restrictions that satisfy the parameter values. The main conclusions were that the statistical models for combining ability analysis are necessarily restricted; that in the Griffing model (method 2, model 1), the restrictions relative to the specific combining ability (SCA) effects, [equation image] and [equation image] for all j, do not satisfy the parametric values; and that the same inferences should be established from the analyses using the model with restrictions that satisfy the parametric values of SCA effects and the model suggested by Griffing. A consequence of the restrictions of the Griffing model is that they allow the definition of formulas for estimating the effects, their variances and the variances of contrasts of effects, as well as for calculating orthogonal sums of squares.


2015 ◽  
Vol 8 (3) ◽  
pp. 80 ◽  
Author(s):  
Carlos M. Ardila ◽  
Isabel C. Guzmán

BACKGROUND: It has been reported that clinical results of mechanical periodontal treatment could differ between subjects and among different sites of the tooth in the patient. The objective of this multilevel analysis is to investigate clinical factors at the subject level and at sites of the tooth that influence variations in clinical attachment (CAL) increase and probing depth (PD) diminution of adjunctive moxifloxacin (MOX) at six months post-treatment in generalized aggressive periodontitis. METHODS: This clinical trial included 40 patients randomly distributed to two therapy protocols: scaling and root planing alone or combined with MOX. Multilevel linear models for continuous variables were formulated to evaluate the clinical impact of the hierarchical configuration of periodontal data. RESULTS: Six months following therapy, the differences between the two protocols were statistically significant in PD diminution and CAL increase, favouring the MOX therapy (p<0.001). In addition, the multilevel analysis revealed that adjunctive MOX at the subject level, non-molar and the interaction non-molar x MOX at the tooth level, and interproximal sites and the interaction interproximal sites x MOX at the site level were statistically significant factors in determining CAL increase and PD diminution. CONCLUSIONS: The main cause of variability in CAL gain and PD reduction following adjunctive MOX was attributable to the tooth level. Adjunctive MOX and its interactions with non-molar and interproximal sites showed higher clinical benefits at the tooth and site levels, which could be essential for PD reduction and CAL gain in generalized aggressive periodontitis subjects.


2021 ◽  
Vol 18 ◽  
pp. 163-170
Author(s):  
Lorenc Koçiu ◽  
Kledian Kodra

Using econometric models, this paper addresses the ability of Albanian Small and Medium-sized Enterprises (SMEs) to identify the risks they face. To write this paper, we studied SMEs operating in the Gjirokastra region. First, qualitative data gathered through a questionnaire was used. Next, the 5-level Likert scale was used to measure it. Finally, the data was processed through statistical software SPSS version 21, using the binary logistic regression model, which reveals the probability of occurrence of an event when all independent variables are included. Logistic regression is an integral part of a category of statistical models called Generalized Linear Models. Logistic regression is used to analyze problems in which one or more independent variables influence a dichotomous dependent variable. In such cases, the latter is seen as the random variable and is dependent on them. To evaluate whether Albanian SMEs can identify risks, we analyzed the factors that SMEs perceive as directly affecting the risks they face. At the end of the paper, we conclude that Albanian SMEs can identify risk
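The way a binary logistic model turns predictor values into an event probability can be sketched as follows. This is the standard logistic function only; the function name `logistic_probability`, the two Likert-scale predictors, and every coefficient value are hypothetical and are not estimates from the study.

```python
from math import exp


def logistic_probability(intercept, coefficients, features):
    """P(event) under a binary logistic model: 1 / (1 + exp(-z)),
    where z = intercept + sum of coefficient * feature."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + exp(-z))


# Hypothetical coefficients for two Likert-scale predictors (scored 1 to 5).
# These numbers are illustrative only, not results from the paper.
intercept = -3.0
coefficients = [0.5, 0.4]

low = logistic_probability(intercept, coefficients, [1, 1])
high = logistic_probability(intercept, coefficients, [5, 5])
```

With positive coefficients, higher Likert scores push the linear predictor z upward, so `high` exceeds `low`, and both stay strictly between 0 and 1.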


Author(s):  
Alberto Gianinetti

Entropy can be quantified under the assumption that both the position of a particle in space and its energy level correspond to one among many enumerable states, even if the number of states is extremely large. This means that, if absolute values of entropy are to be computed, neither energy nor space should be a continuous variable, even though entropy changes can be calculated in any case. Remarkably, quantum theory says just that: at very short scales, both energy and space seem to behave like discrete quantities rather than continuous ones. So, a general string theory, which represents the evolution of quantum theory, appears to be the natural, preferable theoretical framework for the definition of entropy.


Mathematics ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. 953
Author(s):  
Rashad A. R. Bantan ◽  
Christophe Chesneau ◽  
Farrukh Jamal ◽  
Mohammed Elgarhy

This paper develops the exponentiated M family of continuous distributions, aiming to provide new statistical models for data fitting purposes. It stands out from the other families, as it depends on two baseline distributions, with the use of ratio and power transforms in the definition of the main cumulative distribution function. Thanks to the joint action of the possibly different baseline distributions, flexible statistical models can be created, motivating a complete study in this regard. Thus, we discuss the theoretical properties of the new family, with emphasis on those of potential interest to probability and statistics in general. Then, a new three-parameter lifetime distribution is derived, with the choices of the inverse exponential and exponential distributions as baselines. After pointing out the great flexibility of the related model, we apply it to analyze an actual dataset of current interest: the daily COVID-19 cases observed in Pakistan from 21 March to 29 May 2020 (inclusive). As notable results, we demonstrate that the proposed model is the best among the 15 top-ranked models in the literature, including the inverse exponential and exponential models, several modern extensions of them depending on more parameters, and the “unexponentiated” version of the proposed model as well. As future perspectives, the proposed model can be of interest to analyze data on COVID-19 cases in other countries, for possible comparison studies.


Metals ◽  
2019 ◽  
Vol 9 (9) ◽  
pp. 959 ◽  
Author(s):  
Leo S. Carlsson ◽  
Peter B. Samuelsson ◽  
Pär G. Jönsson

Statistical modeling, also known as machine learning, has gained increased attention in part due to the Industry 4.0 development. However, a review of the statistical models within the scope of steel processes has not previously been conducted. This paper reviews available statistical models in the literature predicting the Electrical Energy (EE) consumption of the Electric Arc Furnace (EAF). The aim was to structure published data and to bring clarity to the subject in light of challenges and considerations that are imposed by statistical models. These include data complexity and data treatment, model validation and error reporting, choice of input variables, and model transparency with respect to process metallurgy. A majority of the models are never tested on future heats, which essentially renders the models useless in a practical industrial setting. In addition, nonlinear models outperform linear models but lack transparency with regard to which input variables influence the EE consumption prediction. Some input variables that heavily influence the EE consumption are rarely used in the models. The scrap composition and additive materials are two such examples. These observed shortcomings have to be correctly addressed in future research applying statistical modeling to steel processes. Lastly, the paper provides three key recommendations for future research applying statistical modeling to steel processes.


1996 ◽  
Vol 19 (4) ◽  
pp. 221-231
Author(s):  
G. Amici

Four non-linear and five linear models for predicting the creatinine dialysate/plasma ratio (CRD/P) and the glucose dialysate/initial concentration ratio (GLD/Do) were evaluated in a group of 31 patients on peritoneal dialysis and subjected to the peritoneal equilibration test (PET 3.86%, 240'). PET results and classification were compared to obtain a definition of patient peritoneal transport characteristics. The monomolecular and rectangular hyperbola non-linear models, and the Lineweaver-Burk, Hanes-Woolf and Dadone linear transformations, were considered for the CRD/P fitting. A monoexponential and a two-exponential decay, plus the semilogarithmic transformations, were considered for the GLD/Do. These models are simple, accurate and functionally homogeneous. Further studies are advisable, however, on the individual peritoneal transport classification, since ∼30% of the patients were in different categories for CRD/P and GLD/Do and the fittings do not give better classification results.
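The relationship between the rectangular hyperbola and two of its linear transformations can be sketched as follows. For y = ymax·t / (k + t), the Lineweaver-Burk form is 1/y = (k/ymax)·(1/t) + 1/ymax and the Hanes-Woolf form is t/y = t/ymax + k/ymax, so a straight-line fit recovers both parameters. The parameter names `ymax` and `k`, the sampling times, and the parameter values are hypothetical and are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hyperbola(t, ymax, k):
    """Rectangular hyperbola: ratio approaches ymax as t grows."""
    return ymax * t / (k + t)


def lineweaver_burk(ts, ys):
    """Fit 1/y against 1/t; intercept = 1/ymax, slope = k/ymax."""
    slope, intercept = fit_line([1 / t for t in ts], [1 / y for y in ys])
    ymax = 1 / intercept
    return ymax, slope * ymax


def hanes_woolf(ts, ys):
    """Fit t/y against t; slope = 1/ymax, intercept = k/ymax."""
    slope, intercept = fit_line(ts, [t / y for t, y in zip(ts, ys)])
    ymax = 1 / slope
    return ymax, intercept * ymax


# Hypothetical, noise-free ratio curve sampled at five dwell times (minutes).
times = [30, 60, 120, 180, 240]
true_ymax, true_k = 0.9, 100.0
ratios = [hyperbola(t, true_ymax, true_k) for t in times]

lb_ymax, lb_k = lineweaver_burk(times, ratios)
hw_ymax, hw_k = hanes_woolf(times, ratios)
```

On noise-free data both transformations return the generating parameters. With measurement noise they weight the observations differently (the double-reciprocal form amplifies errors at small t), which is one reason several transformations might be compared.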


1980 ◽  
Vol 48 (2) ◽  
pp. 382-385 ◽  
Author(s):  
R. W. Mazzone ◽  
S. Kornblau ◽  
C. M. Durand

Glutaraldehyde is widely used to chemically fix lungs for analysis of pulmonary structure-function relations. Accurate interpretation of observations on fixed tissue requires a clear definition of any artifacts, such as tissue shrinkage, resulting from fixation with glutaraldehyde. Two experimental procedures were used in this study to examine possible shrinkage artifacts resulting from fixation of lung by glutaraldehyde. In the first, isolated perfused dog lungs were rapidly frozen at different transpulmonary pressures. Samples were then freeze substituted at -50 °C using 70% ethylene glycol with and without fixatives present. In the second series of experiments, the left lungs of mongrel dogs were fixed by vascular perfusion with glutaraldehyde at different transpulmonary pressures. In both series of experiments any changes in linear dimensions resulting from the fixation procedure were measured. Also, the presence of aldehyde was demonstrated by a positive reaction with Schiff reagent. The results demonstrate that lung tissue fixed either by vascular perfusion or freeze substitution tends to shrink to about the same extent. This shrinkage is reasonably constant at about 9% for transpulmonary pressures of 5 and 15 cmH2O and increases to about 15% when the transpulmonary pressure reaches 25 cmH2O.


2020 ◽  
Vol 27 (4) ◽  
pp. 353-372
Author(s):  
Alejandro Romero ◽  
Francisco Bellas ◽  
José A. Becerra ◽  
Richard J. Duro

Designing robots has usually implied knowing beforehand the tasks to be carried out and in what domains. However, in the case of fully autonomous robots this is not possible. Autonomous robots need to operate in an open-ended manner, that is, deciding on the most interesting goals to achieve in domains that are not known at design time. This poses a challenge from the point of view of designing the robot control structure. In particular, the main question that arises is how to endow the robot with a designer-defined purpose and with the means to translate that purpose into operational decisions without any knowledge of what situations the robot will find itself in. In this paper, we provide a formalization of motivation from an engineering perspective that allows for the structured design of purposeful robots. This formalization is based on a definition of the concepts of robot needs and drives, which are related through experience to the appropriate goals in specific domains. To illustrate the process, a motivational system to guide the operation of a real robot is constructed using this approach. A series of experiments carried out on it is discussed, providing some insights into the design of purposeful motivated operation.

