Inferring tree causal models of cancer progression with probability raising

2013 ◽  
Author(s):  
Loes Olde Loohuis ◽  
Giulio Caravagna ◽  
Alex Graudenzi ◽  
Daniele Ramazzotti ◽  
Giancarlo Mauri ◽  
...  

Existing techniques to reconstruct tree models of progression for accumulative processes, such as cancer, seek to estimate causation by combining correlation and a frequentist notion of temporal priority. In this paper, we define a novel theoretical framework called CAPRESE (CAncer PRogression Extraction with Single Edges) to reconstruct such models based on the notion of probabilistic causation defined by Suppes. We consider a general reconstruction setting complicated by the presence of noise in the data due to biological variation, as well as experimental or measurement errors. To improve tolerance to noise we define and use a shrinkage-like estimator. We prove the correctness of our algorithm by showing asymptotic convergence to the correct tree under mild constraints on the level of noise. Moreover, on synthetic data, we show that our approach outperforms the state-of-the-art, that it is efficient even with a relatively small number of samples, and that its performance quickly converges to its asymptote as the number of samples increases. For real cancer datasets obtained with different technologies, we highlight biologically significant differences in the inferred progressions with respect to competing techniques, and we show how to validate conjectured biological relations with progression models.
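To make the idea concrete, here is a minimal Python sketch of Suppes-style probability-raising scoring with a shrinkage-like combination, assuming a binary mutation matrix; the function name, the exact normalization of the correlation term, and the choice of λ are illustrative assumptions, not the paper's published estimator.

```python
import numpy as np

def caprese_scores(X, lam=0.5):
    """Score candidate causal edges i -> j via Suppes' conditions.

    A minimal sketch of the idea behind CAPRESE, not the published
    implementation. X is an (n_samples, n_events) binary 0/1 mutation
    matrix; lam is the shrinkage weight balancing the probability
    raising score against a correlation-like term.
    """
    n_events = X.shape[1]
    p = X.mean(axis=0)                        # marginal frequencies P(i)
    scores = np.zeros((n_events, n_events))
    for i in range(n_events):
        for j in range(n_events):
            if i == j or p[i] <= p[j]:        # temporal priority: P(i) > P(j)
                continue
            sel = X[:, i] == 1
            pj_i = X[sel, j].mean() if sel.any() else 0.0      # P(j | i)
            pj_ni = X[~sel, j].mean() if (~sel).any() else 0.0  # P(j | not i)
            if pj_i + pj_ni == 0:
                continue
            # Probability-raising ratio: positive iff P(j|i) > P(j|not i)
            alpha = (pj_i - pj_ni) / (pj_i + pj_ni)
            # Correlation-like term used to shrink the noisy estimate
            # (the exact normalization in CAPRESE differs)
            pij = (X[:, i] * X[:, j]).mean()
            beta = (pij - p[i] * p[j]) / (pij + p[i] * p[j] + 1e-12)
            scores[i, j] = (1 - lam) * alpha + lam * beta
    return scores
```

The progression tree can then be assembled by attaching each event to its highest-scoring candidate parent, or to the root when no candidate scores positively.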


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Background: Three-way data have gained popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions along time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with the state-of-the-art is paramount. These comparisons are usually performed on real data without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of providing the ground truth (the triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric makes it possible to combine intrinsic and extrinsic metrics when comparing solutions, producing more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state-of-the-art by easing the evaluation of new triclustering approaches.
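As a hedged illustration of what planting a tricluster involves, the following Python sketch generates a numeric three-way dataset with one constant-pattern tricluster plus configurable noise and missing values; G-Tric itself is far more configurable, and all names and defaults here are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def plant_tricluster(shape=(100, 50, 8), size=(10, 5, 3),
                     noise=0.1, missing=0.02):
    """Generate a numeric 3-way dataset (observations x features x contexts)
    with one constant-pattern tricluster planted against a uniform background.
    Returns the data and the planted subspace as ground truth."""
    data = rng.uniform(0, 1, shape)                      # background
    rows = rng.choice(shape[0], size[0], replace=False)  # tricluster observations
    cols = rng.choice(shape[1], size[1], replace=False)  # tricluster features
    ctxs = rng.choice(shape[2], size[2], replace=False)  # tricluster contexts
    value = rng.uniform(0, 1)                            # constant pattern
    idx = np.ix_(rows, cols, ctxs)
    data[idx] = value + rng.normal(0, noise, size)       # noisy planted values
    mask = rng.uniform(size=shape) < missing             # missing entries
    data[mask] = np.nan
    return data, (rows, cols, ctxs)
```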



2021 ◽  
Vol 11 (10) ◽  
pp. 4570
Author(s):  
Oliver Rothkamm ◽  
Johannes Gürtler ◽  
Jürgen Czarske ◽  
Robert Kuschmierz

Tomographic reconstruction allows for the recovery of 3D information from 2D projection data, which commonly requires a full angular scan of the specimen. Angular restrictions, which arise especially in technical processes, result in reconstruction artifacts and unknown systematic measurement errors. We investigate the use of neural networks for extrapolating the missing projection data from holographic sound pressure measurements. A bias flow liner, used for active sound damping in aviation, was studied. We employed a dense U-Net trained on synthetic data and compared reconstructions of simulated and measured data with and without extrapolation. In both cases, the neural-network-based approach decreases the mean and maximum measurement deviations by a factor of two. These findings can enable quantitative measurements in other applications suffering from limited angular access as well.
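The following PyTorch sketch shows the overall shape of such an extrapolation network: an encoder-decoder with a skip connection that maps a limited-angle sinogram to a completed one. The paper uses a dense U-Net; the layer sizes and two-level depth here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SinogramExtrapolator(nn.Module):
    """Minimal U-Net-style sketch for filling in missing projection
    angles of a sinogram (angles x detector pixels, even dimensions)."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))

    def forward(self, x):
        e1 = self.enc1(x)                        # full-resolution features
        e2 = self.enc2(self.down(e1))            # downsampled features
        u = self.up(e2)                          # upsample back
        return self.dec(torch.cat([u, e1], 1))   # skip connection, then decode

# Usage on a masked 64x64 sinogram (batch, channel, angles, pixels)
completed = SinogramExtrapolator()(torch.randn(1, 1, 64, 64))
```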



Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 674
Author(s):  
Kushani De Silva ◽  
Carlo Cafaro ◽  
Adom Giffin

Attaining reliable gradient profiles is of utmost relevance for many physical systems, yet in many situations the estimation of the gradient is inaccurate due to noise. It is common practice to first estimate the underlying system, by fitting or smoothing the data, and then compute the gradient profile by taking the analytic derivative of the estimate. Taking the analytic derivative of an estimated function, however, can be ill-posed, and this worsens as the noise in the system increases, inflating the uncertainty of the gradient estimate. In this paper, a theoretical framework for a method to estimate the gradient profile of discrete noisy data is presented. The method was developed within a Bayesian framework. Comprehensive numerical experiments were conducted on synthetic data at different levels of noise, and the accuracy of the proposed method was quantified. Our findings suggest that the proposed gradient profile estimation method outperforms state-of-the-art methods.
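One common Bayesian route to such gradient estimates, shown here only as a sketch and not as the paper's exact method, is Gaussian-process regression: the posterior mean of the derivative follows analytically from differentiating the kernel, sidestepping the ill-posed numerical derivative. The kernel choice and hyperparameters below are illustrative assumptions.

```python
import numpy as np

def gp_gradient(x, y, xs, ell=0.5, sf=1.0, sn=0.1):
    """Posterior mean of f'(xs) for noisy data (x, y) under a GP prior
    with an RBF kernel; ell, sf, sn are length scale, signal and noise
    standard deviations (illustrative defaults)."""
    def k(a, b):
        return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

    K = k(x, x) + sn**2 * np.eye(len(x))     # noisy training covariance
    alpha = np.linalg.solve(K, y)
    # Analytic derivative of the RBF kernel w.r.t. the query points xs
    dK = -(xs[:, None] - x[None, :]) / ell**2 * k(xs, x)
    return dK @ alpha                        # posterior mean of the gradient

# Usage: recover cos(x) from noisy samples of sin(x)
x = np.linspace(0, 2 * np.pi, 40)
y = np.sin(x) + 0.1 * np.random.randn(40)
dy = gp_gradient(x, y, x)
```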



Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 227
Author(s):  
Eckart Michaelsen ◽  
Stéphane Vujasinovic

Representative input data are a necessary requirement for the assessment of machine-vision systems. For symmetry-seeing machines in particular, such imagery should provide symmetries as well as asymmetric clutter. Moreover, there must be reliable ground truth with the data. It should be possible to estimate the recognition performance and the computational efforts by providing different grades of difficulty and complexity. Recent competitions used real imagery labeled by human subjects with appropriate ground truth. The paper at hand proposes to use synthetic data instead. Such data contain symmetry, clutter, and nothing else. This is preferable because interference from other perceptual capabilities, such as object recognition or prior knowledge, can be avoided. The data are given sparsely, i.e., as sets of primitive objects. However, images can be generated from them, so that the same data can also be fed into machines requiring dense input, such as multilayered perceptrons. Sparse representations are preferred because the authors' own system requires such data, and in this way any influence of the primitive extraction method is excluded. The presented format allows hierarchies of symmetries. This is important because hierarchy constitutes a natural and dominant part of symmetry-seeing. The paper reports some experiments using the authors' Gestalt algebra system as the symmetry-seeing machine. Additionally included is a comparative test run with the state-of-the-art symmetry-seeing deep-learning convolutional perceptron of the PSU. The computational efforts and recognition performance are assessed.
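A minimal Python sketch of what such a sparse, hierarchy-capable format could look like; the class and field names are our own assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Primitive:
    """One sparse primitive object: pose and scale (illustrative fields)."""
    x: float
    y: float
    orientation: float  # radians
    scale: float

@dataclass
class SymmetryGroup:
    """A symmetry relating primitives or nested groups, allowing the
    hierarchies of symmetries the proposed format supports."""
    kind: str  # e.g. "reflection", "rotation"
    members: List[Union["Primitive", "SymmetryGroup"]] = field(default_factory=list)

# A reflection whose two wings are themselves rotational groups
scene = SymmetryGroup("reflection", [
    SymmetryGroup("rotation", [Primitive(-1.0, 0.0, 0.0, 1.0)]),
    SymmetryGroup("rotation", [Primitive(1.0, 0.0, 3.14159, 1.0)]),
])
```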



2020 ◽  
Author(s):  
Alceu Bissoto ◽  
Sandra Avila

Melanoma is the most lethal type of skin cancer. Early diagnosis is crucial to increase the survival rate of patients, given the possibility of metastasis. Automated skin lesion analysis can play an essential role by reaching people who do not have access to a specialist. However, since deep learning became the state-of-the-art for skin lesion analysis, data became a decisive factor in pushing solutions further. The core objective of this M.Sc. dissertation is to tackle the problems that arise from having limited datasets. In the first part, we use generative adversarial networks to generate synthetic data that augment our classification model's training datasets to boost performance. Our method generates high-resolution, clinically meaningful skin lesion images which, when added to our classification model's training dataset, consistently improved performance in different scenarios and for distinct datasets. We also investigate how our classification models perceived the synthetic samples and how they can aid the models' generalization. Finally, we investigate a problem that usually arises from having few, relatively small datasets that are thoroughly re-used in the literature: bias. For this, we designed experiments to study how our models use data, verifying how they exploit correct correlations (based on medical algorithms) and spurious ones (based on artifacts introduced during image acquisition). Disturbingly, even in the absence of any clinical information regarding the lesion being diagnosed, our classification models performed much better than chance (even competing with specialist benchmarks), strongly suggesting inflated performances.
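As a toy illustration of the augmentation idea, the sketch below defines a tiny DCGAN-style generator and mixes its synthetic samples into a real training batch; the dissertation uses high-resolution GAN architectures and real dermoscopy data, so every name, size, and ratio here is an illustrative assumption.

```python
import torch
import torch.nn as nn

class LesionGenerator(nn.Module):
    """Toy DCGAN-style generator producing 16x16 RGB images from noise
    (purely illustrative; real lesion synthesis is high-resolution)."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.ReLU(),  # -> 4x4
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),     # -> 8x8
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),       # -> 16x16
        )

    def forward(self, z):
        return self.net(z)

# Augment a training batch with synthetic lesions at a fixed ratio
gen = LesionGenerator()
real = torch.rand(32, 3, 16, 16)                  # stand-in for real images
synth = gen(torch.randn(8, 64, 1, 1)).detach()    # 25% synthetic samples
batch = torch.cat([real, synth])                  # mixed training batch
```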



Data mining is the process of extracting useful information from various repositories, such as relational databases, transaction databases, spatial databases, temporal and time-series databases, data warehouses, and the World Wide Web. Its functionalities include characterization and discrimination, classification and prediction, association rule mining, cluster analysis, and evolutionary analysis. Association rule mining is one of the most important data mining techniques; it aims at extracting interesting relationships within the data. In this paper, we study various association rule mining algorithms, compare them using synthetic datasets, and present the results obtained from the experimental analysis.
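As a concrete reference point for the kind of algorithm such comparisons cover, here is a minimal pure-Python sketch of Apriori, the classic level-wise frequent-itemset algorithm; the names and toy data are our own, and real evaluations use optimized implementations with full candidate pruning.

```python
from itertools import combinations

def apriori(transactions, min_support=0.5):
    """Find frequent itemsets by level-wise candidate generation and
    support counting. transactions: list of frozensets of items."""
    n = len(transactions)

    def support(c):
        return sum(c <= t for t in transactions) / n

    items = {frozenset([i]) for t in transactions for i in t}
    level = {c for c in items if support(c) >= min_support}
    frequent, k = {}, 1
    while level:
        for c in level:
            frequent[c] = support(c)
        # Join step: build (k+1)-item candidates from frequent k-itemsets
        level = {a | b for a in level for b in level if len(a | b) == k + 1}
        level = {c for c in level if support(c) >= min_support}
        k += 1
    return frequent

# Toy market-basket data
tx = [frozenset(t) for t in (["milk", "bread"], ["milk", "diapers"],
                             ["milk", "bread", "diapers"], ["bread"])]
print(apriori(tx, min_support=0.5))
```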



MaRBLe ◽  
2019 ◽  
Vol 1 ◽  
Author(s):  
Roelien Van der Wel

This paper discusses different strategies of climate change denial and focuses on the specific case of Dutch politician Thierry Baudet. Much of the literature concerning climate change denial focuses on Anglo-American cases; therefore, more research on non-English-speaking countries is necessary. The theoretical framework describes the state of the art concerning climate change denialism and its links to phenomena occurring in Western societies and politics, such as post-truth and populism. Then, through a deductive analysis of Thierry Baudet's climate denialism in the Netherlands, a more thorough understanding of the different strategies proposed by Stefan Rahmstorf and Engels et al. is reached. Although all four categories are detected in Baudet's denialism, consensus denial seems to be the most prevalent. The analysis of his use of the notion of a climate apocalypse, combined with the analysis of his specific focus on consensus denial, broadens the understanding of how climate change denial can relate to populism.



2021 ◽  
Vol 7 ◽  
pp. e495
Author(s):  
Saleh Albahli ◽  
Hafiz Tayyab Rauf ◽  
Abdulelah Algosaibi ◽  
Valentina Emilia Balas

Artificial intelligence (AI) has played a significant role in image analysis and feature extraction, applied to detect and diagnose a wide range of chest-related diseases. Although several researchers have used current state-of-the-art approaches and produced impressive clinical outcomes, a technique offers limited benefit if it detects only one type of disease while the rest go unidentified, and attempts to identify multiple chest-related diseases have been hampered by insufficient and imbalanced data. This research contributes to the healthcare industry and the research community by proposing synthetic data augmentation for three deep convolutional neural network (CNN) architectures to detect 14 chest-related diseases. The employed models are DenseNet121, InceptionResNetV2, and ResNet152V2; after training and validation, an average ROC-AUC score of 0.80 was obtained, competitive with previous models trained for multi-class classification to detect anomalies in X-ray images. This research illustrates how the proposed approach applies state-of-the-art deep neural networks to classify 14 chest-related diseases with better accuracy.
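A hedged Keras sketch of one of the three architectures, DenseNet121 with a 14-way sigmoid head for multi-label detection; the input size, optimizer, and metric are illustrative assumptions rather than the paper's exact training setup.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet121

# DenseNet121 backbone with global average pooling, pretrained on ImageNet
base = DenseNet121(include_top=False, weights="imagenet",
                   input_shape=(224, 224, 3), pooling="avg")
# One sigmoid per disease: an image may carry several positive labels
out = layers.Dense(14, activation="sigmoid")(base.output)
model = Model(base.input, out)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(multi_label=True)])
```

Using independent sigmoid outputs with binary cross-entropy, rather than a single softmax, is what lets one chest X-ray be flagged for several diseases at once, which matches the 14-disease multi-label setting described above.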



2021 ◽  
pp. 1-20
Author(s):  
Hanna Bäck ◽  
Marc Debus ◽  
Jorge M. Fernandes

The contribution of this chapter to our volume is fourfold. First, we look at why we should study legislative debates and how scholars of representation, legislative politics, party politics, and electoral studies may benefit from incorporating debates into their analyses. In doing so, we unpack the functions of debates in liberal democracies. Second, the chapter surveys the state of the art of the burgeoning field of legislative debates. We focus on the normative scholarly discussion about legislative debates and their importance for deliberation and democratic outputs. In addition, we dwell on Proksch and Slapin's model as a watershed in the empirical study of legislative debates, particularly due to its capacity to travel and its usefulness in understanding how different institutional settings have an impact on speechmaking. Third, the chapter presents the theoretical framework, the key hypotheses guiding the volume, and our empirical approach to legislative debates. Fourth, the chapter concludes with the plan of the book.


