Repeated responses in misclassification binary regression: A Bayesian approach

Binary regression models generally assume that the response variable is measured perfectly. However, in some situations, the outcome is subject to misclassification: a success may be erroneously classified as a failure or vice versa. Many methods described in the existing literature have been developed to deal with misclassification, but we demonstrate that these methods may lead to serious inferential problems when only a single evaluation of the individual is taken. Thus, this study proposes to incorporate repeated and independent responses in misclassification binary regression models, considering either the total number of successes obtained or the simple majority classification. We use subjective prior distributions, in the form of conditional means priors, to evaluate and compare models. A data augmentation approach, Gibbs sampling, and Adaptive Rejection Metropolis Sampling are used for posterior inferences. Simulation studies suggest that repeated measures significantly improve the posterior estimates, in that these estimates are closer to those obtained in a case with no misclassifications and have a lower standard deviation. Finally, we illustrate the usefulness of the new methodology with an analysis of defects in eyeglass lenses.
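The intuition behind the simple majority classification can be illustrated with a small calculation. The sketch below is not the authors' Bayesian model; it only assumes that each of k independent evaluations is wrong with the same probability, and the function name `majority_error` and the 10% error rate are hypothetical choices made for illustration.

```python
from math import comb


def majority_error(p_wrong: float, k: int) -> float:
    """Probability that the simple majority of k independent evaluations
    is wrong, when each evaluation is wrong with probability p_wrong."""
    if k % 2 == 0:
        raise ValueError("use an odd k so that a majority always exists")
    need = k // 2 + 1  # smallest number of wrong votes that flips the majority
    return sum(
        comb(k, j) * p_wrong**j * (1 - p_wrong) ** (k - j)
        for j in range(need, k + 1)
    )


# Hypothetical 10% misclassification rate per single evaluation.
for k in (1, 3, 5):
    print(k, round(majority_error(0.1, k), 5))
```

Under these assumptions the misclassification probability of the majority label drops from 0.1 with one evaluation to 0.028 with three and 0.00856 with five, which is the kind of gain that repeated responses can offer before any regression modelling is done.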

An application of two sided power distribution in Bayesian analysis of paired comparison of relative importance of predictors in linear regression models

International Journal of Advanced Statistics and Probability ◽

10.14419/ijasp.v3i2.4573 ◽

2015 ◽

Vol 3 (2) ◽

pp. 169 ◽

Cited By ~ 4

Author(s):

Xiaoyin Wang

Keyword(s):

Power Distribution ◽

Regression Models ◽

Paired Comparison ◽

Multiple Regression Model ◽

Link Function ◽

Relative Importance ◽

Simulation Studies ◽

Research Practice ◽

Linear Regression Models ◽

The Individual

The purpose of determining the relative importance of predictors is to expose the extent of the individual contribution of a predictor in the presence of other predictors within a selected model. The goal of this article is to expand the current research practice by developing a statistical paired comparison model with a Two Sided Power (TSP) link function in the Bayesian framework to evaluate the relative importance of each predictor in a multiple regression model. Results from simulation studies and an empirical example reveal that the proposed Two Sided Power link function leads to conclusions similar to those of the commonly used logit link function, but has more advantages from both practical and computational perspectives.

Automatic Labeled Dialogue Generation for Nursing Record Systems

Journal of Personalized Medicine ◽

10.3390/jpm10030062 ◽

2020 ◽

Vol 10 (3) ◽

pp. 62

Author(s):

Tittaya Mairittha ◽

Nattaya Mairittha ◽

Sozo Inoue

Keyword(s):

Data Augmentation ◽

Short Term Memory ◽

Generative Models ◽

Abstract Knowledge ◽

Augmentation Techniques ◽

Nursing Record ◽

Long Short Term Memory ◽

The Individual ◽

High Level ◽

Embedding Methods

The integration of digital voice assistants in nursing residences is becoming increasingly important to facilitate nursing productivity with documentation. A key idea behind this system is training natural language understanding (NLU) modules that enable the machine to classify the purpose of the user utterance (intent) and extract pieces of valuable information present in the utterance (entity). One of the main obstacles when creating robust NLU is the lack of sufficient labeled data, which generally relies on human labeling. This process is cost-intensive and time-consuming, particularly in the high-level nursing care domain, which requires abstract knowledge. In this paper, we propose an automatic dialogue labeling framework of NLU tasks, specifically for nursing record systems. First, we apply data augmentation techniques to create a collection of variant sample utterances. The individual evaluation result strongly shows a stratification rate, with regard to both fluency and accuracy in utterances. We also investigate the possibility of applying deep generative models for our augmented dataset. The preliminary character-based model based on long short-term memory (LSTM) obtains an accuracy of 90% and generates various reasonable texts with BLEU scores of 0.76. Secondly, we introduce an idea for intent and entity labeling by using feature embeddings and semantic similarity-based clustering. We also empirically evaluate different embedding methods for learning good representations that are most suitable to use with our data and clustering tasks. Experimental results show that fastText embeddings produce strong performances both for intent labeling and on entity labeling, which achieves an accuracy level of 0.79 and 0.78 f1-scores and 0.67 and 0.61 silhouette scores, respectively.

An Evaluation of the Effectiveness of Image-based Texture Features Extracted from Static B-mode Ultrasound Images in Distinguishing between Benign and Malignant Ovarian Masses

Ultrasonic Imaging ◽

10.1177/0161734621998091 ◽

2021 ◽

pp. 016173462199809

Author(s):

Dhurgham Al-karawi ◽

Hisham Al-Assam ◽

Hongbo Du ◽

Ahmad Sayasneh ◽

Chiara Landolfo ◽

...

Keyword(s):

Gabor Filter ◽

Empirical Evaluation ◽

Texture Features ◽

Image Texture ◽

Support Vector ◽

Simple Majority ◽

Ultrasound Scan ◽

Histograms Of Oriented Gradients ◽

The Individual ◽

Ovarian Masses

Significant successes in machine learning approaches to image analysis for various applications have energized strong interest in automated diagnostic support systems for medical images. The evolving in-depth understanding of the way carcinogenesis changes the texture of cellular networks of a mass/tumor has been informing such diagnostic systems with the use of more suitable image texture features and their extraction methods. Several texture features have recently been applied, with different levels of performance, in discriminating malignant and benign ovarian masses by analysing B-mode images from ultrasound scans of the ovary. However, comparative performance evaluation of these reported features using common sets of clinically approved images is lacking. This paper presents an empirical evaluation of seven commonly used texture features (histograms, moments of histogram, local binary patterns [256-bin and 59-bin], histograms of oriented gradients, fractal dimensions, and Gabor filter), using a collection of 242 ultrasound scan images of ovarian masses of various pathological characteristics. The evaluation examines not only the effectiveness of classification schemes based on the individual texture features but also the effectiveness of various combinations of these schemes using simple majority-rule decision-level fusion. Support vector machine classifiers trained on the individual texture features without any specific pre-processing achieve levels of accuracy between 75% and 85%, where the seven moments and the 256-bin LBP are at the lower end while the Gabor filter is at the upper end. Combining the classification results of the top k (k = 3, 5, 7) best performing features further improves the overall accuracy to a level between 86% and 90%. These evaluation results demonstrate that each of the investigated image-based texture features provides informative support in distinguishing benign or malignant ovarian masses.
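Simple majority-rule decision-level fusion, as mentioned in the abstract above, can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function names `majority_vote` and `fuse` and the toy predictions are hypothetical, and the sketch assumes an odd number of classifiers voting on two classes so that ties cannot occur.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most classifiers for one image."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def fuse(predictions_per_classifier):
    """Fuse per-classifier prediction lists (one label per image each)
    into a single list of majority labels, one per image."""
    return [majority_vote(votes) for votes in zip(*predictions_per_classifier)]


# Hypothetical outputs of three classifiers on four images.
predictions = [
    ["benign", "malignant", "benign", "malignant"],
    ["benign", "benign", "benign", "malignant"],
    ["malignant", "malignant", "benign", "benign"],
]
fused = fuse(predictions)
print(fused)
```

Each image receives the label that at least two of the three classifiers agree on, so an error made by a single classifier is outvoted by the other two.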

Effects of the Carrier Phrase on Word Recognition Performances by Younger and Older Listeners Using Two Stimulus Paradigms

Journal of the American Academy of Audiology ◽

10.3766/jaaa.19061 ◽

2020 ◽

Vol 31 (06) ◽

pp. 412-441 ◽

Cited By ~ 1

Author(s):

Richard H. Wilson ◽

Victoria A. Sanchez

Keyword(s):

Word Recognition ◽

Target Word ◽

Speaker Recognition ◽

Repeated Measures ◽

Recognition Performance ◽

Test Word ◽

Root Mean Square Amplitude ◽

Repeated Measures Design ◽

The Mean ◽

The Individual

Background: In the 1950s, with monitored live voice testing, the vu meter time constant and the short durations and amplitude modulation characteristics of monosyllabic words necessitated the use of the carrier phrase amplitude to monitor (indirectly) the presentation level of the words. This practice continues with recorded materials. To relieve the carrier phrase of this function, first the influence that the carrier phrase has on word recognition performance needs clarification, which is the topic of this study.
Purpose: Recordings of Northwestern University Auditory Test No. 6 by two female speakers were used to compare word recognition performances with and without the carrier phrases when the carrier phrase and test word were (1) in the same utterance stream with the words excised digitally from the carrier (VA-1 speaker) and (2) independent of one another (VA-2 speaker). The 50-msec segment of the vowel in the target word with the largest root mean square amplitude was used to equate the target word amplitudes.
Research Design: A quasi-experimental, repeated measures design was used.
Study Sample: Twenty-four young normal-hearing adults (YNH; M = 23.5 years; pure-tone average [PTA] = 1.3-dB HL) and 48 older hearing loss listeners (OHL; M = 71.4 years; PTA = 21.8-dB HL) participated in two, one-hour sessions.
Data Collection and Analyses: Each listener had 16 listening conditions (2 speakers × 2 carrier phrase conditions × 4 presentation levels) with 100 randomized words, 50 different words by each speaker. Each word was presented 8 times (2 carrier phrase conditions × 4 presentation levels [YNH, 0- to 24-dB SL; OHL, 6- to 30-dB SL]). The 200 recorded words for each condition were randomized as 8, 25-word tracks. In both test sessions, one practice track was followed by 16 tracks alternated between speakers and randomized by blocks of the four conditions. Central tendency and repeated measures analyses of variance statistics were used.
Results: With the VA-1 speaker, the overall mean recognition performances were 6.0% (YNH) and 8.3% (OHL) significantly better with the carrier phrase than without the carrier phrase. These differences were in part attributed to the distortion of some words caused by the excision of the words from the carrier phrases. With the VA-2 speaker, recognition performances on the with and without carrier phrase conditions by both listener groups were not significantly different, except for one condition (YNH listeners at 8-dB SL). The slopes of the mean functions were steeper for the YNH listeners (3.9%/dB to 4.8%/dB) than for the OHL listeners (2.4%/dB to 3.4%/dB) and were <1%/dB steeper for the VA-1 speaker than for the VA-2 speaker. Although the mean results were clear, the variability in performance differences between the two carrier phrase conditions for the individual participants and for the individual words was striking and was considered in detail.
Conclusion: The current data indicate that word recognition performances with and without the carrier phrase (1) were different when the carrier phrase and target word were produced in the same utterance with poorer performances when the target words were excised from their respective carrier phrases (VA-1 speaker), and (2) were the same when the carrier phrase and target word were produced as independent utterances (VA-2 speaker).

Design effects for binary regression models fitted to dependent data

Statistics in Medicine ◽

10.1002/sim.4780121307 ◽

1993 ◽

Vol 12 (13) ◽

pp. 1259-1268 ◽

Cited By ~ 26

Author(s):

John M. Neuhaus ◽

Mark R. Segal

Keyword(s):

Regression Models ◽

Dependent Data ◽

Binary Regression ◽

Design Effects

Modeling of Tractor Fuel Consumption

Energies ◽

10.3390/en14082300 ◽

2021 ◽

Vol 14 (8) ◽

pp. 2300

Author(s):

Bronisław Andrzej Kolator

Keyword(s):

Fuel Consumption ◽

Soil Conditions ◽

Simulation Studies ◽

Working Capacity ◽

Operational Parameters ◽

System A ◽

The Individual ◽

Adjustment System ◽

Unit Performance ◽

Efficiency Indicators

In this paper, the energy diagnostic of tractor performance consists in evaluating the energy (fuel consumption per hectare, dm³ ha⁻¹) for a given agricultural operation and in combining it with working capacity, also called productivity (area productivity, ha h⁻¹). One of the methods of solving this problem is the identification of the functioning process of the machine unit. A model of the process of the machine unit performance was developed, considering the operation of the rear linkage system of the implement with the force control adjustment system. In order to analyze the system, a mathematical model of the system function was built: tractor-implement-soil, defining the physical connections and interdependencies between the individual subsystems of the system. Based on this model, a simulation model was developed and implemented in the Matlab/Simulink environment. The Simulink package was used to test the performance of the machine set. The efficiency indicators according to the adopted criteria were calculated in the evaluation block. To evaluate the process, the technical and operational parameters of the tractor, the type and parameters of the tool, and soil properties were taken into account. The results of simulation studies obtained on a validated model are consistent with experimental data from appropriate soil conditions.

PERBANDINGAN REGRESI ZERO INFLATED POISSON (ZIP) DAN REGRESI ZERO INFLATED NEGATIVE BINOMIAL (ZINB) PADA DATA OVERDISPERSION (Studi Kasus: Angka Kematian Ibu di Provinsi Bali)

E-Jurnal Matematika ◽

10.24843/mtk.2016.v05.i04.p132 ◽

2016 ◽

Vol 5 (4) ◽

pp. 133

Author(s):

NI PUTU PREMA DEWANTI ◽

MADE SUSILAWATI ◽

I GUSTI AYU MADE SRINADI

Keyword(s):

Mortality Rate ◽

Maternal Mortality ◽

Regression Models ◽

Negative Binomial ◽

Mean Value ◽

Response Variable ◽

Maternal Mortality Rate ◽

Excess Zeros ◽

Zero Values ◽

Independent Variable

Poisson regression is a nonlinear regression model that is often used for count data and assumes equidispersion (variance equal to the mean). In practice, however, the equidispersion assumption is often violated. One such violation is overdispersion (variance greater than the mean). One of the causes of overdispersion is an excessive number of zero values in the response variable (excess zeros). There are many methods for handling overdispersion caused by excess zeros. Two of them are Zero Inflated Poisson (ZIP) regression and Zero Inflated Negative Binomial (ZINB) regression. The purpose of this research is to determine which regression model is better at handling overdispersed data. The data analyzed using ZIP and ZINB regression are maternal mortality rates in the Province of Bali. In the maternal mortality rate data, more than 50% of the values of the response variable are zeros. In this research, ZINB regression is better than ZIP regression for modeling the maternal mortality rate. The independent variable that affects the maternal mortality rate in the Province of Bali is the percentage of mothers who attend pregnancy visits, with ZINB regression models and .
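To make the link between excess zeros and overdispersion concrete, the sketch below evaluates the standard zero-inflated Poisson distribution, in which a count is a structural zero with probability pi and otherwise follows a Poisson distribution with rate lam. It is an illustration only, not the paper's fitted model; the function names and the parameter values (lam = 2, pi = 0.5) are hypothetical.

```python
from math import exp, factorial


def zip_pmf(y: int, lam: float, pi: float) -> float:
    """P(Y = y) under a zero-inflated Poisson with rate lam and
    structural-zero probability pi."""
    poisson = exp(-lam) * lam**y / factorial(y)
    if y == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


def zip_mean_var(lam: float, pi: float):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1 - pi) * lam
    var = (1 - pi) * lam * (1 + pi * lam)
    return mean, var


# Hypothetical parameters: half of the observations are structural zeros.
mean, var = zip_mean_var(lam=2.0, pi=0.5)
print(mean, var)                       # variance exceeds the mean
print(round(zip_pmf(0, 2.0, 0.5), 4))  # share of zeros, above 50%
```

With these values the mean is 1.0 and the variance is 2.0, so the equidispersion assumption of ordinary Poisson regression fails whenever pi is greater than zero. The ZINB model replaces the Poisson component with a negative binomial one, which allows for additional overdispersion among the non-structural counts.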

Revisiting Gaussian copulas to handle endogenous regressors

Journal of the Academy of Marketing Science ◽

10.1007/s11747-021-00805-y ◽

2021 ◽

Author(s):

Jan-Michael Becker ◽

Dorian Proksch ◽

Christian M. Ringle

Keyword(s):

Regression Models ◽

Statistical Power ◽

Multilevel Models ◽

Gaussian Copula ◽

Simulation Studies ◽

Validity Of Results ◽

Endogenous Regressors ◽

Copula Approach ◽

Original Presentation ◽

And Performance

Marketing researchers are increasingly taking advantage of the instrumental variable (IV)-free Gaussian copula approach. They use this method to identify and correct endogeneity when estimating regression models with non-experimental data. The Gaussian copula approach’s original presentation and performance demonstration via a series of simulation studies focused primarily on regression models without intercept. However, marketing and other disciplines’ researchers mainly use regression models with intercept. This research expands our knowledge of the Gaussian copula approach to regression models with intercept and to multilevel models. The results of our simulation studies reveal a fundamental bias and concerns about statistical power at smaller sample sizes and when the approach’s primary assumptions are not fully met. This key finding opposes the method’s potential advantages and raises concerns about its appropriate use in prior studies. As a remedy, we derive boundary conditions and guidelines that contribute to the Gaussian copula approach’s proper use. Thereby, this research contributes to ensuring the validity of results and conclusions of empirical research applying the Gaussian copula approach.

Effects of Pre-Activation with Variable Intra-Repetition Resistance on Throwing Velocity in Female Handball Players: A Methodological Proposal

Journal of Human Kinetics ◽

10.2478/hukin-2021-0022 ◽

2021 ◽

Vol 77 (1) ◽

pp. 235-244

Author(s):

Darío Martínez-García ◽

Ángela Rodríguez-Perea ◽

Álvaro Huerta-Ojeda ◽

Daniel Jerez-Mayorga ◽

Daniel Aguilar-Martínez ◽

...

Keyword(s):

Statistical Analysis ◽

Repeated Measures ◽

Initial Velocity ◽

Acute Effect ◽

Rest Interval ◽

Maximum Voluntary Isometric Contraction ◽

Variable Resistance ◽

Ball Velocity ◽

The Mean ◽

The Individual

The purpose of this study was to investigate the acute effect of pre-activation with Variable Intra-Repetition Resistance and isometry on the overhead throwing velocity in handball players. Fourteen female handball players took part in the study (age: 21.2 ± 2.7 years, experience: 10.9 ± 3.5 years). For Post-Activation Potentiation, two pre-activation methods were used: (I) Variable Intra-Repetition Resistance: 1 × 5 maximum repetitions at an initial velocity of 0.6 m·s⁻¹ and a final velocity of 0.9 m·s⁻¹; (II) Isometry: 1 × 5 s of maximum voluntary isometric contraction. Both methods were "standing unilateral bench presses" with the dominant arm, using a functional electromechanical dynamometer. The variable analysed was the mean of the three overhead throws. Ball velocity was measured with a radar (Stalker ATS). The statistical analysis was performed using ANOVA with repeated measures. No significant differences were found for either method (variable resistance intra-repetition: p = 0.194, isometry: p = 0.596). Regarding the individual responses, the analysis showed that 86% of the sample increased throwing velocity with the variable resistance intra-repetition method, while 93% of the sample increased throwing velocity with the isometric method. Both the variable intra-repetition resistance and isometric methods show improvements in ball velocity in female handball players. However, the authors recommend checking individual responses, since the results obtained were influenced by the short rest interval between the pre-activation and the experimental sets.

Differences in Sickness Allowance Receipt between Swedish Speakers and Finnish Speakers in Finland

Finnish Yearbook of Population Research ◽

10.23979/fypr.66598 ◽

2017 ◽

Vol 52 ◽

pp. 43-58 ◽

Cited By ~ 3

Author(s):

Kaarina S. Reini ◽

Jan Saarela

Keyword(s):

Regression Models ◽

Health Condition ◽

Logistic Regression Models ◽

Individual Level ◽

Objective Health ◽

Language Groups ◽

Health Measure ◽

All Cause Mortality ◽

The Difference ◽

The Individual

Previous research has documented lower disability retirement and mortality rates of Swedish speakers as compared with Finnish speakers in Finland. This paper is the first to compare the two language groups with regard to the receipt of sickness allowance, which is an objective health measure that reflects a less severe poor health condition. Register-based data covering the years 1988-2011 are used. We estimate logistic regression models with generalized estimating equations to account for repeated observations at the individual level. We find that Swedish-speaking men have approximately 30 percent lower odds of receiving sickness allowance than Finnish-speaking men, whereas the difference in women is about 15 percent. In correspondence with previous research on all-cause mortality at working ages, we find no language-group difference in sickness allowance receipt in the socially most successful subgroup of the population.
