Using demographics toward efficient data classification in citizen science: a Bayesian approach

2019 ◽  
Vol 5 ◽  
pp. e239
Author(s):  
Pietro De Lellis ◽  
Shinnosuke Nakayama ◽  
Maurizio Porfiri

Public participation in scientific activities, often called citizen science, makes it possible to collect and analyze an unprecedentedly large amount of data. However, the diversity of volunteers makes it challenging to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses the diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is assigned to a distinct class based on a survey regarding either their level of education or their motivation for participating in citizen science. We obtained the behavior of each class from a training set, which was then used as prior information to estimate the performance of new volunteers. By applying this approach to an existing citizen science dataset in which images are classified into categories, we demonstrate an improvement in data accuracy compared to traditional majority voting. Our algorithm offers a simple yet powerful way to improve data accuracy under limited volunteer effort by predicting the behavior of a class of individuals, rather than attempting a granular description of each of them.
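The contrast between majority voting and a class-aware Bayesian aggregation can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each volunteer class has a single accuracy estimated from a training set, that errors are spread uniformly over the remaining categories, and that votes are independent. The names `majority_vote`, `bayesian_vote`, `class_accuracy`, and the example classes and labels are all hypothetical.

```python
import math
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among (volunteer_class, label) votes."""
    counts = Counter(label for _, label in votes)
    return counts.most_common(1)[0][0]


def bayesian_vote(votes, class_accuracy, labels, prior=None):
    """Posterior over labels given votes and per-class accuracies.

    Assumptions (illustrative only): independent votes, one accuracy per
    volunteer class strictly between 0 and 1, and errors spread uniformly
    over the other labels.
    """
    n = len(labels)
    if prior is None:
        prior = {lab: 1.0 / n for lab in labels}  # flat prior over labels
    log_post = {lab: math.log(prior[lab]) for lab in labels}
    for vol_class, voted in votes:
        acc = class_accuracy[vol_class]
        for lab in labels:
            # Likelihood of this vote if `lab` were the true label.
            p = acc if lab == voted else (1.0 - acc) / (n - 1)
            log_post[lab] += math.log(p)
    # Normalize in a numerically stable way.
    m = max(log_post.values())
    unnorm = {lab: math.exp(v - m) for lab, v in log_post.items()}
    total = sum(unnorm.values())
    return {lab: u / total for lab, u in unnorm.items()}


# Hypothetical example: one vote from a reliable class, two from a weak one.
labels = ["cat_a", "cat_b"]
class_accuracy = {"high": 0.9, "low": 0.55}
votes = [("high", "cat_a"), ("low", "cat_b"), ("low", "cat_b")]

posterior = bayesian_vote(votes, class_accuracy, labels)
print(majority_vote(votes), max(posterior, key=posterior.get))
```

In this toy case the two aggregation rules disagree: the single vote from the more accurate class outweighs two votes from the less accurate one.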

2020 ◽  
Vol 2020 (10) ◽  
pp. 64-1-64-5
Author(s):  
Mustafa I. Jaber ◽  
Christopher W. Szeto ◽  
Bing Song ◽  
Liudmila Beziaeva ◽  
Stephen C. Benz ◽  
...  

In this paper, we propose a patch-based system to classify non-small cell lung cancer (NSCLC) diagnostic whole slide images (WSIs) into two major histopathological subtypes: adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC). Classifying patients accurately is important for prognosis and therapy decisions. The proposed system was trained and tested on 876 subtyped NSCLC gigapixel-resolution diagnostic WSIs from 805 patients – 664 in the training set and 141 in the test set. The algorithm has modules for: 1) auto-generated tumor/non-tumor masking using a trained residual neural network (ResNet34), 2) cell-density map generation (based on color deconvolution, local drain segmentation, and watershed transformation), 3) patch-level feature extraction using a pre-trained ResNet34, 4) a tower of linear SVMs for different cell ranges, and 5) a majority voting module for aggregating subtype predictions in unseen testing WSIs. The proposed system was trained and tested on several WSI magnifications ranging from x4 to x40 with a best ROC AUC of 0.95 and an accuracy of 0.86 in test samples. This fully-automated histopathology subtyping method outperforms similar published state-of-the-art methods for diagnostic WSIs.


2021 ◽  
pp. 101313
Author(s):  
Julie Mugford ◽  
Elena Moltchanova ◽  
Michael Plank ◽  
Jon Sullivan ◽  
Andrea Byrom ◽  
...  

2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i675-i683
Author(s):  
Sudhir Kumar ◽  
Antonia Chroni ◽  
Koichiro Tamura ◽  
Maxwell Sanderford ◽  
Olumide Oladeinde ◽  
...  

Summary: Metastases cause a vast majority of cancer morbidity and mortality. Metastatic clones are formed by dispersal of cancer cells to secondary tissues, and are not medically detected or visible until later stages of cancer development. Clone phylogenies within patients provide a means of tracing the otherwise inaccessible dynamic history of migrations of cancer cells. Here, we present a new Bayesian approach, PathFinder, for reconstructing the routes of cancer cell migrations. PathFinder uses the clone phylogeny, the number of mutational differences among clones, and the information on the presence and absence of observed clones in primary and metastatic tumors. By analyzing simulated datasets, we found that PathFinder performs well in reconstructing clone migrations from the primary tumor to new metastases as well as between metastases. It was more challenging to trace migrations from metastases back to primary tumors. We found that a vast majority of errors can be corrected by sampling more clones per tumor and by increasing the number of genetic variants assayed per clone. We also identified situations in which phylogenetic approaches alone are not sufficient to reconstruct migration routes. In conclusion, we anticipate that the use of PathFinder will enable a more reliable inference of migration histories and their posterior probabilities, which is required to assess the relative preponderance of seeding of new metastases by clones from primary tumors and/or existing metastases. Availability and implementation: PathFinder is available on the web at https://github.com/SayakaMiura/PathFinder.


2014 ◽  
Vol 55 ◽  
Author(s):  
Jonas Mockus ◽  
Irina Vinogradova

Many real applications use uncertain data. This includes expert decisions based on subjective opinions. The uncertainty can be evaluated by applying fuzzy set theory or the methods of mathematical statistics. In this paper, we propose using the Bayesian approach, with different distribution functions defining the expert opinion and some prior information. The results are illustrated by evaluating the quality of distance education courses.


Stats ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 111-120 ◽  
Author(s):  
Dewi Rahardja

We construct point and interval estimates using a Bayesian approach for the difference of two population proportion parameters based on two independent samples of binomial data subject to one type of misclassification. Specifically, we derive an easy-to-implement closed-form algorithm for drawing from the posterior distributions. For illustration, we apply our algorithm to a real data example. Finally, we conduct simulation studies to demonstrate the efficiency of our algorithm for Bayesian inference.
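To make the idea of drawing from posterior distributions concrete, here is a minimal sketch for the difference of two binomial proportions. It deliberately ignores misclassification, which is the central feature of the paper, and simply uses conjugate Beta posteriors with a uniform Beta(1, 1) prior. The function name `diff_of_proportions` and the counts in the example are hypothetical.

```python
import random


def diff_of_proportions(x1, n1, x2, n2, draws=20000, a=1.0, b=1.0, seed=0):
    """Posterior mean and 95% equal-tailed interval for p1 - p2.

    Simplifying assumption: no misclassification, so each proportion has a
    conjugate Beta(a + x, b + n - x) posterior under a Beta(a, b) prior.
    """
    rng = random.Random(seed)  # local generator for reproducible draws
    diffs = sorted(
        rng.betavariate(a + x1, b + n1 - x1)
        - rng.betavariate(a + x2, b + n2 - x2)
        for _ in range(draws)
    )
    mean = sum(diffs) / draws
    lower = diffs[int(0.025 * draws)]
    upper = diffs[int(0.975 * draws) - 1]
    return mean, (lower, upper)


# Hypothetical data: 60 successes out of 100 versus 40 out of 100.
mean, (lower, upper) = diff_of_proportions(60, 100, 40, 100)
print(f"posterior mean difference: {mean:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

Handling misclassified observations would require extending the likelihood with error-rate parameters, which this sketch does not attempt.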


Author(s):  
A. TETERUKOVSKIY

The problem of automatic detection of tracks in aerial photos is considered. We adopt a Bayesian approach and base our inference on a priori knowledge of the structure of tracks. The probability that a pixel belongs to a track depends on how the pixel's gray level differs from the gray levels of neighboring pixels and on additional prior information. Several suggestions are made on how to formalize the prior knowledge about the shape of the tracks. The Gibbs sampler is used to construct the most probable configuration of tracks in the area. The method is applied to aerial photos with a cell size of 1 sq. m. Even for the detection of trails whose width is comparable with or smaller than the cell size, positive results can be achieved.


2015 ◽  
Vol 10 (S314) ◽  
pp. 67-68
Author(s):  
Jinhee Lee ◽  
Inseok Song

We present a refined moving group membership diagnostics scheme based on Bayesian inference. Compared to the BANYAN II method, we improved the calculation by updating bona fide members of a moving group, field star treatment, and uniform spatial distribution of moving group members. Here, we present the detailed description of our method and the new results for Bayesian membership calculation. Comparison of our method with BANYAN II shows probability differences up to ~90%. We conclude that more cautious consideration is needed in moving group membership based on Bayesian inference.


2005 ◽  
Vol 08 (01) ◽  
pp. 1-12 ◽  
Author(s):  
FRANCISCO VENEGAS-MARTÍNEZ

This paper develops a Bayesian model for pricing derivative securities with prior information on volatility. Prior information is given in terms of expected values of levels and rates of precision (the inverse of variance). We provide several approximate formulas for valuing European call options on the basis of asymptotic and polynomial approximations of Bessel functions.
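One simple way to illustrate pricing under prior uncertainty about volatility is to average the standard Black-Scholes call price over draws of the precision from a prior. This is only a sketch under stated assumptions: the Gamma prior on precision, the Monte Carlo averaging, and the names `bs_call` and `bayes_call` are illustrative choices, not the paper's model or its Bessel-function approximations.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, t):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def bayes_call(s, k, r, t, shape, rate, draws=20000, seed=0):
    """Average Black-Scholes price over a Gamma(shape, rate) prior on precision.

    Assumption for illustration: precision h = 1 / sigma^2 follows a Gamma
    prior; the price is estimated by plain Monte Carlo averaging.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # random.gammavariate takes a scale parameter, i.e. 1 / rate.
        precision = rng.gammavariate(shape, 1.0 / rate)
        total += bs_call(s, k, r, 1.0 / math.sqrt(precision), t)
    return total / draws


# Hypothetical inputs: prior mean precision 100 / 4 = 25, i.e. sigma near 0.2.
fixed = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
mixed = bayes_call(100.0, 100.0, 0.05, 1.0, shape=100.0, rate=4.0)
print(f"fixed-volatility price: {fixed:.4f}")
print(f"prior-averaged price:   {mixed:.4f}")
```

A tighter prior (larger `shape` at the same mean) pulls the averaged price toward the fixed-volatility price, while a diffuse prior gives more weight to extreme volatilities.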

