A Robustness Test Protocol for Applied QCA: Theory and R Software Application

2021 ◽  
pp. 004912412110361
Author(s):  
Ioana-Elena Oana ◽  
Carsten Q. Schneider

The robustness of qualitative comparative analysis (QCA) results features high on the agenda of methodologists and practitioners. This article aims at advancing this debate on several fronts. First, in line with the extant literature, we take a comprehensive view on robustness arguing that decisions on calibration, consistency, and frequency thresholds should all be tested. Second, we introduce the notion of “sensitivity range” as the range of values for any of these parameters within which the solution formula remains unchanged. Third, we argue that interpreting robustness is more intricate than simply checking if solutions remain unchanged. Beyond sensitivity ranges, researchers should assess robustness by evaluating changes in parameters of fit and the classification of cases as robust, shaky, or possible. Fourth, we enable researchers to perform more than one robustness test at a time by proposing the notions of a “test set”: the overlap between conceptually plausible alternative solutions that can be generated; and of a “robust core”: that part of a QCA solution that withstands the robustness checks. Fifth, we present functionalities implemented in the R package SetMethods that enable researchers to put in practice our proposals. These advancements are integrated into a comprehensive QCA Robustness Test Protocol consisting of three main tests: sensitivity ranges, fit-oriented robustness, and case-oriented robustness. We illustrate the protocol’s implementation with an example on high life expectancy across the globe.
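The notion of a "sensitivity range" described above can be illustrated with a minimal sketch. This is not the SetMethods implementation: `sensitivity_range`, `toy_solver`, and the grid of threshold values are hypothetical names and data introduced only for illustration, and `solve` stands in for whatever procedure turns a parameter value into a solution formula.

```python
def sensitivity_range(solve, baseline, grid):
    """Return the (lowest, highest) grid value around `baseline`
    for which `solve` yields the same formula as at `baseline`.

    `solve` is a hypothetical callable mapping a parameter value
    (e.g. a consistency threshold) to a solution formula string.
    """
    grid = sorted(grid)
    reference = solve(baseline)
    idx = grid.index(baseline)  # the baseline must be one of the grid points

    # Walk downwards while the solution formula stays unchanged.
    lo = idx
    while lo > 0 and solve(grid[lo - 1]) == reference:
        lo -= 1

    # Walk upwards while the solution formula stays unchanged.
    hi = idx
    while hi < len(grid) - 1 and solve(grid[hi + 1]) == reference:
        hi += 1

    return grid[lo], grid[hi]


def toy_solver(threshold):
    """Invented stand-in for a QCA minimization (assumption, not real output)."""
    return "A*B" if 0.75 <= threshold <= 0.85 else "A"


grid = [0.70, 0.75, 0.80, 0.85, 0.90]
print(sensitivity_range(toy_solver, 0.80, grid))
```

The search only reports the contiguous range around the baseline, which matches the idea of a range "within which the solution formula remains unchanged"; a real analysis would use a finer grid and the actual minimization routine.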

2021 ◽  
Author(s):  
Ioana-Elena Oana ◽  
Carsten Q. Schneider ◽  
Eva Thomann

A comprehensive introduction and teaching resource for state-of-the-art Qualitative Comparative Analysis (QCA) using R software. This guide facilitates the efficient teaching, independent learning, and use of QCA with the best available software, reducing the time and effort required when encountering not just the logic of a new method, but also new software. With its applied and practical focus, the book offers a genuinely simple and intuitive resource for implementing the most complete protocol of QCA. To make the lives of students, teachers, researchers, and practitioners as easy as possible, the book includes learning goals, core points, empirical examples, and tips for good practices. The freely available online material provides a rich body of additional resources to aid users in their learning process. Beyond performing core analyses with the R package QCA, the book also facilitates a close integration with the R package SetMethods allowing for a host of additional protocols for building a more solid and well-rounded QCA.


Cancers ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1615
Author(s):  
Ines P. Nearchou ◽  
Hideki Ueno ◽  
Yoshiki Kajiwara ◽  
Kate Lillard ◽  
Satsuki Mochizuki ◽  
...  

The categorisation of desmoplastic reaction (DR) present at the colorectal cancer (CRC) invasive front into mature, intermediate or immature type has previously been shown to have high prognostic significance. However, the lack of an objective and reproducible methodology for assessing DR has been a major hurdle to its clinical translation. In this study, a deep learning algorithm was trained to automatically classify immature DR on haematoxylin and eosin digitised slides of stage II and III CRC cases (n = 41). When assessing the classifier’s performance on a test set of patient samples (n = 40), a Dice score of 0.87 for the segmentation of myxoid stroma was reported. The classifier was then applied to the full cohort of 528 stage II and III CRC cases, which was then divided into a training set (n = 396) and a test set (n = 132). Automatically classed DR was shown to have superior prognostic significance over manually classed DR in both the training and test cohorts. The findings demonstrated that deep learning algorithms could be applied to assist pathologists in the detection and classification of DR in CRC in an objective, standardised and reproducible manner.
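For readers unfamiliar with the Dice score mentioned above, the following minimal sketch computes it for two binary segmentation masks given as flat lists of 0/1 values. The function name `dice_score` and the toy masks are illustrative assumptions, and treating two empty masks as a perfect match is a convention chosen here, not something stated in the abstract.

```python
def dice_score(predicted, truth):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two binary masks."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(predicted, truth) if p and t)
    total = sum(1 for p in predicted if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # convention (assumption): two empty masks agree perfectly
    return 2.0 * overlap / total


# Toy masks: one overlapping pixel, two predicted pixels, one true pixel.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1 means the predicted and reference regions coincide exactly, and 0 means they do not overlap at all.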


2021 ◽  
Vol 09 (06) ◽  
pp. E955-E964
Author(s):  
Ganggang Mu ◽  
Yijie Zhu ◽  
Zhanyue Niu ◽  
Shigang Ding ◽  
Honggang Yu ◽  
...  

Background and study aims: Endoscopy plays a crucial role in the diagnosis of gastritis. Endoscopists have low accuracy in diagnosing atrophic gastritis (AG) with white-light endoscopy (WLE). High-risk factors for carcinogenesis, such as AG, demand early detection. Deep learning (DL)-based gastritis classification with WLE has rarely been reported. We built a system for improving the accuracy of diagnosis of AG with WLE to assist with this common gastritis diagnosis and help lessen endoscopist fatigue. Methods: We collected a total of 8141 endoscopic images of common gastritis, other gastritis, and non-gastritis in 4587 cases and built a DL-based system constructed with UNet++ and ResNet-50. A system was developed to sort common gastritis images layer by layer: the first layer included non-gastritis/common gastritis/other gastritis, the second layer contained AG/non-atrophic gastritis, and the third layer included atrophy/intestinal metaplasia and erosion/hemorrhage. The convolutional neural networks were tested with three separate test sets. Results: Rates of accuracy for classifying non-atrophic gastritis/AG, atrophy/intestinal metaplasia, and erosion/hemorrhage were 88.78%, 87.40%, and 93.67% in the internal test set, 91.23%, 85.81%, and 92.70% in the external test set, and 95.00%, 92.86%, and 94.74% in the video set, respectively. The hit ratio with the segmentation model was 99.29%. The accuracy for detection of non-gastritis/common gastritis/other gastritis was 93.6%. Conclusions: The system had decent specificity and accuracy in classification of gastritis lesions. DL has great potential in WLE gastritis classification for assisting with achieving accurate diagnoses after endoscopic procedures.


2021 ◽  
Author(s):  
Qingqing Chen ◽  
Ate Poorthuis

Identifying meaningful locations, such as home or work, from human mobility data has become an increasingly common prerequisite for geographic research. Although location-based services (LBS) and other mobile technology have rapidly grown in recent years, it can be challenging to infer meaningful places from such data, which, compared to conventional datasets, can be devoid of context. Existing approaches are often developed ad hoc and can lack transparency and reproducibility. To address this, we introduce an R software package for inferring home locations from LBS data. The package implements pre-existing algorithms and provides building blocks to make writing algorithmic ‘recipes’ more convenient. We evaluate this approach by analyzing a de-identified LBS dataset from Singapore that aims to balance ethics and privacy with the research goal of identifying meaningful locations. We show that ensemble approaches, combining multiple algorithms, can be especially valuable in this regard as the resulting patterns of inferred home locations closely correlate with the distribution of residential population. We hope this package, and others like it, will contribute to an increase in use and sharing of comparable algorithms, research code and data. This will increase transparency and reproducibility in mobility analyses and further the ongoing discourse around ethical big data research.
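The idea of combining simple "recipes" into an ensemble, as described above, can be sketched as follows. This is not the package's API or any of its algorithms: the two recipes, the 22:00 to 06:00 night window, the function names, and the toy data are all assumptions made for illustration. Each data point is a `(location_id, hour_of_day)` pair.

```python
from collections import Counter


def most_frequent(points):
    """Recipe 1 (assumed): the location with the most data points."""
    return Counter(loc for loc, _hour in points).most_common(1)[0][0]


def most_frequent_at_night(points, start=22, end=6):
    """Recipe 2 (assumed): the most frequent location between 22:00 and 06:00."""
    night = [(loc, hour) for loc, hour in points if hour >= start or hour < end]
    # Fall back to all points if the user has no night-time activity.
    return most_frequent(night or points)


def ensemble_home(points, recipes):
    """Majority vote over the home locations proposed by each recipe."""
    votes = Counter(recipe(points) for recipe in recipes)
    return votes.most_common(1)[0][0]


# Toy data for one user (invented for illustration).
points = [("office", 10), ("office", 14), ("office", 15),
          ("flat", 23), ("flat", 2), ("flat", 3), ("flat", 20)]
print(ensemble_home(points, [most_frequent, most_frequent_at_night]))
```

With an even number of recipes a tie is possible, so a real ensemble would need an explicit tie-breaking rule; the sketch simply returns the first location that reached the top vote count.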


2019 ◽  
Author(s):  
Cheynna Crowley ◽  
Yuchen Yang ◽  
Yunjiang Qiu ◽  
Benxia Hu ◽  
Armen Abnousi ◽  
...  

Hi-C experiments have been widely adopted to study chromatin spatial organization, which plays an essential role in genome function. We have recently identified frequently interacting regions (FIREs) and found that they are closely associated with cell-type-specific gene regulation. However, computational tools for detecting FIREs from Hi-C data are still lacking. In this work, we present FIREcaller, a stand-alone, user-friendly R package for detecting FIREs from Hi-C data. FIREcaller takes raw Hi-C contact matrices as input, performs within-sample and cross-sample normalization, and outputs continuous FIRE scores, dichotomous FIREs, and super-FIREs. Applying FIREcaller to Hi-C data from various human tissues, we demonstrate that FIREs and super-FIREs identified, in a tissue-specific manner, are closely related to gene regulation, are enriched for enhancer-promoter (E-P) interactions, tend to overlap with regions exhibiting epigenomic signatures of cis-regulatory roles, and aid the interpretation of GWAS variants. The FIREcaller package is implemented in R and freely available at https://yunliweb.its.unc.edu/FIREcaller.
Highlights:
- Frequently Interacting Regions (FIREs) can be used to identify tissue and cell-type-specific cis-regulatory regions.
- An R package, FIREcaller, has been developed to identify FIREs and cluster FIREs into super-FIREs.


2022 ◽  
pp. 096703352110618
Author(s):  
Orlando CH Tavares ◽  
Tiago R Tavares ◽  
Carlos R Pinheiro Junior ◽  
Luciélio M da Silva ◽  
Paulo GS Wadt ◽  
...  

The southwestern region of the Amazon has great environmental variability and a great complexity of pedoenvironments, owing to its rich variety of geological and geomorphological environments and to its being a transition region with two other Brazilian biomes. In this study, the use of pedometric tools (the Algorithms for Quantitative Pedology (AQP) R package and diffuse reflectance spectroscopy) was evaluated for the characterization of 15 soil profiles in southwestern Amazon. The AQP statistical package, which evaluates the soil in depth based on slicing functions, indicated a wide range of variation in soil attributes, especially in the superficial horizons. In addition, the results obtained in the similarity analysis corroborated the description of physical and chemical components and oxide contents in depth, aiding the classification of soil profiles. The in-depth characterization of visible-near infrared spectra allowed inference of the pedogenetic processes of some profiles, setting precedents for future work aiming to establish analytical strategies for soil classification in southwestern Amazon based on spectral data.


2004 ◽  
Vol 22 (3) ◽  
pp. 534-537 ◽  
Author(s):  
Andrés F. López Camelo ◽  
Perla A. Gómez

Color in tomato is the most important external characteristic to assess ripeness and postharvest life, and is a major factor in the consumer's purchase decision. Degree of ripening is usually estimated by color charts. Colorimeters, on the other hand, express colors in numerical terms along the L*, a* and b* axes (from white to black, green to red and blue to yellow, respectively) within the CIELAB color sphere, which are usually mathematically combined to calculate the color indexes. Color indexes and their relationship to the visual color classification of vine-ripened tomato fruits were compared. L*, a* and b* data (175 observations from eleven cultivars) from fruits visually classified at harvest in six ripening stages according to the USDA were used to calculate hue, chroma, color index, color difference with pure red, a*/b* and (a*/b*)². Analyses of variance were performed and means compared by Duncan's MRT. Color changes throughout tomato ripening were the result of significant changes in the values of L*, a* and b*. Under the conditions of this study, hue, color index, color difference and a*/b* expressed essentially the same information, and the color categories were significantly different in terms of human perception, with hue showing a higher range of values. Chroma was not a good parameter to express tomato ripeness, but could be used as a good indicator of consumer acceptance when tomatoes are fully ripened. The (a*/b*)² relationship had the same limitations as chroma. For vine-ripened fruits, hue, color index, color difference and a*/b* could be used as objective ripening indexes. It would be interesting to find out what the best index would be if ripening took place under inadequate conditions of temperature and illumination.
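Some of the indexes named above follow directly from the a* and b* coordinates. The sketch below uses the standard CIELAB definitions of hue angle and chroma, which are assumed here and may differ in detail from the conventions used in the study; the color index and the color difference with pure red are left out because their formulas are not given in the abstract. The function name `ab_indexes` and the sample values are illustrative.

```python
import math


def ab_indexes(a_star, b_star):
    """Hue angle (degrees), chroma, a*/b* and (a*/b*)^2 from CIELAB a*, b*.

    Standard CIELAB definitions are assumed:
      hue    = atan2(b*, a*)
      chroma = sqrt(a*^2 + b*^2)
    """
    hue = math.degrees(math.atan2(b_star, a_star))
    chroma = math.hypot(a_star, b_star)
    # a*/b* is undefined when b* is zero; report NaN instead of raising.
    ratio = a_star / b_star if b_star != 0 else math.nan
    return {
        "hue": hue,
        "chroma": chroma,
        "a/b": ratio,
        "(a/b)^2": ratio ** 2,
    }


# Invented readings: a greenish fruit (negative a*) and a reddish one.
print(ab_indexes(-10.0, 10.0))
print(ab_indexes(3.0, 4.0))
```

Because a* moves from negative (green) to positive (red) during ripening, the hue angle falls as the fruit reddens, whereas squaring a*/b* discards the sign of a*, which is one way to see why (a*/b*)² separates ripening stages less well.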


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2381
Author(s):  
Dan Li ◽  
Kaifeng Zhang ◽  
Zhenbo Li ◽  
Yifei Chen

The statistical data of different kinds of behaviors of pigs can reflect their health status. However, pig behavior statistics have traditionally been obtained by having people watch videos and record what they observe. In order to reduce labor and time consumption, this paper proposed a pig behavior recognition network with a spatiotemporal convolutional network based on the SlowFast network architecture for behavior classification of five categories. Firstly, a pig behavior recognition video dataset (PBVD-5) was built by cutting short clips from 3-month non-stop shooting videos, which was composed of five categories of pig’s behavior: feeding, lying, motoring, scratching and mounting. Subsequently, a SlowFast network based spatiotemporal convolutional network for the pig’s multi-behavior recognition (PMB-SCN) was proposed. Networks with variant architectures of the PMB-SCN were implemented, and the optimal architecture was compared with the state-of-the-art single stream 3D convolutional network on our dataset. Our 3D pig behavior recognition network showed a top-1 accuracy of 97.63% and a views accuracy of 96.35% on the test set of PBVD and a top-1 accuracy of 91.87% and a views accuracy of 84.47% on a new test set collected from a completely different pigsty. The experimental results showed that this network provided a remarkable ability to generalize and opened the possibility of performing pig detection and behavior recognition simultaneously in subsequent work.


2017 ◽  
Vol 41 (4) ◽  
pp. 378-389
Author(s):  
Ben Dêivide de Oliveira Batista ◽  
Daniel Furtado Ferreira ◽  
Lucas Monteiro Chaves

The distribution of the externally studentized midrange was created based on the original studentization procedures of Student and was inspired by the distribution of the externally studentized range. The wide use of the externally studentized range in multiple comparisons was also a motivation for developing this new distribution. This work aimed to derive analytic equations for the distribution of the externally studentized midrange, obtaining the cumulative distribution, probability density and quantile functions and generating random values. This is a new distribution for which the authors could not find any report in the literature. A second objective was to build an R package for numerically obtaining the probability density, cumulative distribution and quantile functions and to make it available to the scientific community. The algorithms were proposed and implemented using Gauss-Legendre quadrature and the Newton-Raphson method in R software, resulting in the SMR package, available for download from the CRAN site. The implemented routines showed high accuracy, as verified by Monte Carlo simulations and by comparing results obtained with different numbers of quadrature points. Regarding the precision of the quantiles in cases where the degrees of freedom are close to 1 and the percentiles are close to 100%, it is recommended to use more than 64 quadrature points.
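The numerical strategy named above (Gauss-Legendre quadrature for the cumulative distribution, Newton-Raphson for the quantile) can be sketched on a simpler target. The example below applies it to the standard normal distribution, not to the externally studentized midrange, whose density is not given in the abstract; it is not the SMR package's code, and all function names and the default of 32 quadrature points are assumptions.

```python
import math


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):
            # Evaluate the Legendre polynomial P_n(x) by recurrence.
            p0, p1 = 1.0, x
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)  # derivative P_n'(x)
            dx = p1 / dp
            x -= dx  # Newton step towards a root of P_n
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x, n=32):
    """Phi(x) = 0.5 + integral of the density from 0 to x, by quadrature."""
    nodes, weights = gauss_legendre(n)
    half = 0.5 * x  # maps [-1, 1] onto [0, x]
    integral = half * sum(w * normal_pdf(half * (u + 1.0))
                          for u, w in zip(nodes, weights))
    return 0.5 + integral


def normal_quantile(p, n=32, tol=1e-12):
    """Solve Phi(q) = p for q with the Newton-Raphson method."""
    q = 0.0
    for _ in range(100):
        step = (normal_cdf(q, n) - p) / normal_pdf(q)
        q -= step
        if abs(step) < tol:
            break
    return q


print(normal_quantile(0.975))
```

The same two-layer structure (quadrature inside, root finding outside) explains the abstract's closing recommendation: when the integrand is hard to resolve, the accuracy of the quantile is limited by the number of quadrature points used for the cumulative distribution.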


2011 ◽  
Vol 460-461 ◽  
pp. 667-672
Author(s):  
Yun Zhao ◽  
Xing Xu ◽  
Yong He

The main objective of this paper is to classify four kinds of automobile lubricant by near-infrared (NIR) spectral technology and to observe whether NIR spectroscopy could be used for predicting water content. Principal component analysis (PCA) was applied to reduce the information from the spectral data, and the first two PCs were used to cluster the samples. Partial least squares (PLS), least squares support vector machine (LS-SVM), and Gaussian processes classification (GPC) were employed to develop prediction models. There were 120 samples for the training set and test set. Two LS-SVM models, with the first five PCs and the first six PCs, respectively, were built, and the accuracy of the model with five PCs is adequate with less calculation. The results from the experiment indicate that the LS-SVM model outperforms the PLS model and the GPC model outperforms the LS-SVM model.

