End-to-end simulations of the MUon RAdiography of VESuvius experiment

2022 ◽  
Vol 17 (01) ◽  
pp. C01015
Author(s):  
A. Samalan ◽  
S. Basnet ◽  
L. Bonechi ◽  
L. Cimmino ◽  
R. D’Alessandro ◽  
...  

Abstract The MUon RAdiography of VESuvius (MURAVES) project aims at the study of the summital cone of Mt. Vesuvius, an active volcano near Naples (Italy), by measuring its density profile through muon flux attenuation. Its data, combined with those from gravimetric and seismic measurement campaigns, will be used to better define the volcanic plug at the bottom of the crater. We report on the development of an end-to-end simulation framework, built to perform accurate investigations of the effects of the experimental constraints and to compare simulations, under various model hypotheses, with the actual observations. The detector simulation setup is developed using GEANT4, and a study of cosmic particle generators has been conducted to identify the most suitable one for our simulation framework. To mimic the real data, GEANT4 raw hits are converted to clusters through a simulated digitization: energy deposits are first summed per scintillator bar and then converted to numbers of photoelectrons with a data-driven procedure. This is followed by the same clustering algorithm and the same tracking code as used on real data. We also report on the study of muon transport through rock using PUMAS and GEANT4. In this paper we elaborate on the rationale for our technical choices, including the trade-off between speed and accuracy. The developments reported here are of general interest in muon radiography and can be applied in similar cases.
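The digitization step described above can be illustrated with a minimal Python sketch: energy deposits are summed per scintillator bar and converted to a photoelectron count with a fixed conversion factor and threshold. The function name, conversion factor, and threshold are all illustrative assumptions, not MURAVES values, and the actual data-driven procedure (including light-yield smearing) is not reproduced here.

```python
def digitize(hits, pe_per_mev=30.0, threshold_pe=2.0):
    """Toy digitization sketch: sum energy deposits per scintillator bar,
    then convert to a photoelectron count with a fixed factor.
    pe_per_mev and threshold_pe are illustrative, not MURAVES values."""
    edep = {}
    for bar, energy_mev in hits:  # hits: list of (bar_id, energy in MeV)
        edep[bar] = edep.get(bar, 0.0) + energy_mev
    # keep only bars whose photoelectron count passes the threshold
    return {bar: e * pe_per_mev for bar, e in edep.items()
            if e * pe_per_mev >= threshold_pe}
```

Clustering of adjacent above-threshold bars would then proceed exactly as in the real-data pipeline.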

2016 ◽  
Vol 2016 ◽  
pp. 1-15
Author(s):  
N. Vanello ◽  
E. Ricciardi ◽  
L. Landini

Independent component analysis (ICA) of functional magnetic resonance imaging (fMRI) data can be employed as an exploratory method. The lack of strong a priori assumptions about the signal or the noise in the ICA model makes the results difficult to interpret. Moreover, the statistical independence of the components is only approximated. Residual dependencies among the components can reveal informative structure in the data. A major problem is related to model order selection, that is, the number of components to be extracted. Specifically, overestimation may lead to component splitting. In this work, a method based on hierarchical clustering of ICA applied to fMRI datasets is investigated. The clustering algorithm uses a metric based on the mutual information between the ICs. To estimate the similarity measure, a histogram-based technique and one based on kernel density estimation are tested on simulated datasets. Simulation results indicate that the method could be used to cluster components related to the same task and resulting from a splitting process occurring at different model orders. Different performances of the similarity measures were found and discussed. Preliminary results on real data are reported and show that the method can group task-related and transiently task-related components.
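The histogram-based mutual-information estimate mentioned above can be sketched as follows. The bin count and the equal-width binning are illustrative assumptions, not the authors' exact implementation.

```python
import math

def mutual_information(x, y, bins=8):
    """Histogram-based estimate of the mutual information between two
    equal-length 1-D signals, using equal-width bins over each range."""
    n = len(x)

    def bin_index(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    lox, hix = min(x), max(x)
    loy, hiy = min(y), max(y)
    joint = {}
    px = [0] * bins
    py = [0] * bins
    for xv, yv in zip(x, y):
        i, j = bin_index(xv, lox, hix), bin_index(yv, loy, hiy)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    # I(X;Y) = sum_ij p(i,j) * log( p(i,j) / (p(i) p(j)) )
    mi = 0.0
    for (i, j), c in joint.items():
        pij = c / n
        mi += pij * math.log(pij / ((px[i] / n) * (py[j] / n)))
    return mi
```

A hierarchical clustering of the independent components would then use, for example, 1/(1+MI) or a normalized variant as the distance between components.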


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Yiwen Zhang ◽  
Yuanyuan Zhou ◽  
Xing Guo ◽  
Jintao Wu ◽  
Qiang He ◽  
...  

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only acquire efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means. The first phase executes the CA. CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm. This algorithm can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of existing algorithms under both sequential and parallel conditions.
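The two-phase structure can be sketched in a few lines of Python. This 1-D toy keeps only the essentials: a greedy covering pass that fixes k from a coverage radius (the radius, the greedy strategy, and the use of 1-D points are simplifying assumptions, not the paper's CA), followed by a standard Lloyd iteration.

```python
def covering_init(points, radius):
    """Greedy covering pass: each uncovered point seeds a new center and
    covers all points within `radius`. The number of centers (k) emerges
    from the data rather than being prespecified."""
    centers = []
    covered = [False] * len(points)
    for i, p in enumerate(points):
        if covered[i]:
            continue
        centers.append(p)
        for j, q in enumerate(points):
            if abs(p - q) <= radius:  # 1-D distance for brevity
                covered[j] = True
    return centers

def lloyd(points, centers, iters=20):
    """Standard Lloyd iteration: assign points to the nearest center,
    then move each center to the mean of its group."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            k = min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            groups[k].append(p)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers
```

The covering pass plays the role of phase one (determining k and the seeds), and Lloyd's iteration then refines the centers as in ordinary K-means.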


2021 ◽  
Vol 251 ◽  
pp. 03013
Author(s):  
Leonardo Cristella ◽  

To sustain the harsher conditions of the high-luminosity LHC, the CMS collaboration is designing a novel endcap calorimeter system. The new calorimeter will predominantly use silicon sensors to achieve sufficient radiation tolerance and will maintain highly-granular information in the readout to help mitigate the effects of pileup. In regions characterised by lower radiation levels, small scintillator tiles with individual on-tile SiPM readout are employed. A unique reconstruction framework (TICL: The Iterative CLustering) is being developed to fully exploit the granularity and other significant detector features, such as particle identification and precision timing, with a view to mitigating pileup in the very dense environment of HL-LHC. The inputs to the framework are clusters of energy deposited in individual calorimeter layers. Clusters are formed by a density-based algorithm. Recent developments and tunes of the clustering algorithm will be presented. To help reduce the expected pressure on the computing resources in the HL-LHC era, the algorithms and their data structures are designed to be executed on GPUs. Preliminary results will be presented on decreases in clustering time when using GPUs versus CPUs. Ideas for machine-learning techniques to further improve the speed and accuracy of reconstruction algorithms will be presented.
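A density-based layer-clustering step of the kind described can be sketched as below. This is a toy, 1-D, CLUE-style procedure, not the actual TICL code: hits become seeds when no denser hit lies nearby, and otherwise follow their nearest denser neighbour. The density radius dc and seed-promotion distance delta_c are illustrative parameters.

```python
def density_cluster(hits, dc=1.0, delta_c=2.0):
    """Toy density-based clustering of (position, energy) hits on one layer.
    rho: energy within dc of each hit; a hit whose nearest denser hit is
    farther than delta_c becomes a seed, others join that neighbour."""
    n = len(hits)
    pos = [p for p, e in hits]
    rho = [sum(e2 for p2, e2 in hits if abs(p2 - pos[i]) <= dc)
           for i in range(n)]

    def denser(j, i):  # break density ties by index for a unique ordering
        return (rho[j], -j) > (rho[i], -i)

    labels = [-1] * n
    next_id = 0
    # process in decreasing density so denser hits are labelled first
    for i in sorted(range(n), key=lambda i: (-rho[i], i)):
        higher = [j for j in range(n) if denser(j, i)]
        if higher:
            nh = min(higher, key=lambda j: abs(pos[j] - pos[i]))
            delta = abs(pos[nh] - pos[i])
        else:
            nh, delta = None, float("inf")
        if delta > delta_c:
            labels[i] = next_id  # promote to seed
            next_id += 1
        else:
            labels[i] = labels[nh]  # follow nearest denser hit
    return labels
```

The same structure (density computation, nearest-higher search, seed promotion) is what maps naturally onto GPU execution, since the per-hit work is independent.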


2021 ◽  
Vol 18 (1) ◽  
pp. 34-57
Author(s):  
Weifeng Pan ◽  
Xinxin Xu ◽  
Hua Ming ◽  
Carl K. Chang

Mashup technology has become a promising way to develop and deliver applications on the web. Automatically organizing Mashups into functionally similar clusters helps improve the performance of Mashup discovery. Although there are many approaches aiming to cluster Mashups, they solely focus on utilizing semantic similarities to guide the Mashup clustering process and are unable to utilize both the structural and semantic information in Mashup profiles. In this paper, a novel approach to cluster Mashups into groups is proposed, which integrates structural similarity and semantic similarity using fuzzy AHP (fuzzy analytic hierarchy process). The structural similarity is computed from usage histories between Mashups and Web APIs using the SimRank algorithm. The semantic similarity is computed from the descriptions and tags of Mashups using LDA (Latent Dirichlet Allocation). A clustering algorithm based on the genetic algorithm is employed to cluster Mashups. Comprehensive experiments are performed on a real data set collected from ProgrammableWeb. The results show the effectiveness of the approach when compared with two kinds of conventional approaches.
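The fusion of the two similarity matrices can be sketched as a weighted sum. In the paper the weights come from fuzzy AHP; here they are fixed illustrative values, and the SimRank and LDA computations that produce the input matrices are assumed, not reproduced.

```python
def combine_similarity(sim_struct, sim_sem, w_struct=0.4, w_sem=0.6):
    """Fuse a structural and a semantic similarity matrix (same shape)
    into one matrix by a weighted sum. The weights are illustrative;
    in the paper they would be derived via fuzzy AHP."""
    n = len(sim_struct)
    return [[w_struct * sim_struct[i][j] + w_sem * sim_sem[i][j]
             for j in range(n)] for i in range(n)]
```

The fused matrix is then what the genetic-algorithm-based clustering operates on.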


2021 ◽  
pp. 1-18
Author(s):  
Angeliki Koutsimpela ◽  
Konstantinos D. Koutroumbas

Several well-known clustering algorithms have their own online counterparts, in order to deal effectively with the big data issue, as well as with the case where the data become available in a streaming fashion. However, very few of them follow the stochastic gradient descent philosophy, despite the fact that the latter enjoys certain practical advantages (such as the possibility of (a) running faster than their batch processing counterparts and (b) escaping from local minima of the associated cost function), while, in addition, strong theoretical convergence results have been established for it. In this paper a novel stochastic gradient descent possibilistic clustering algorithm, called O-PCM2, is introduced. The algorithm is presented in detail and it is rigorously proved that the gradient of the associated cost function tends to zero in the L2 sense, based on general convergence results established for the family of stochastic gradient descent algorithms. Furthermore, an additional discussion is provided on the nature of the points where the algorithm may converge. Finally, the performance of the proposed algorithm is tested against other related algorithms, on the basis of both synthetic and real data sets.
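The stochastic gradient descent philosophy can be illustrated with a 1-D online k-means sketch, in which each streamed sample nudges its nearest center with a decaying step size. This is not the proposed possibilistic algorithm, only an illustration of the SGD-style update that the abstract contrasts with batch processing.

```python
def online_kmeans(stream, centers, lr0=1.0):
    """SGD-style online k-means sketch: for each sample, move the nearest
    center toward it. With step size lr0/n_k (n_k = samples seen by
    center k), each center tracks the running mean of its samples."""
    centers = list(centers)
    counts = [0] * len(centers)
    for x in stream:
        k = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        counts[k] += 1
        eta = lr0 / counts[k]  # decaying step size
        centers[k] += eta * (x - centers[k])
    return centers
```

Because each sample is processed once and discarded, the memory footprint is constant regardless of stream length, which is the practical advantage the abstract refers to.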


2011 ◽  
Vol 34 (7) ◽  
pp. 850-861 ◽  
Author(s):  
Guan Yuan ◽  
Shixiong Xia ◽  
Lei Zhang ◽  
Yong Zhou ◽  
Cheng Ji

With the development of location-based services, such as the Global Positioning System and Radio Frequency Identification, a great deal of trajectory data can be collected. Therefore, how to mine knowledge from these data has become an attractive topic. In this paper, we propose an efficient trajectory-clustering algorithm based on an index tree. Firstly, an index tree is proposed to store trajectories and their similarity matrix, with which trajectories can be retrieved efficiently; secondly, a new conception of trajectory structure is introduced to analyse both the internal and external features of trajectories; then, trajectories are partitioned into trajectory segments according to their corners; furthermore, the similarity between every pair of trajectory segments is computed using the proposed structural similarity function; finally, trajectory segments are grouped into different clusters according to their location in the different levels of the index tree. Experimental results on real data sets demonstrate not only the efficiency and effectiveness of our algorithm, but also its flexibility: feature sensitivity can be adjusted through different parameters, and the clustering results are of greater practical significance.
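The corner-based partitioning step can be sketched as follows: a trajectory is split wherever the heading change between consecutive segments exceeds a threshold. The angle threshold is an illustrative parameter, and the paper's index tree and structural similarity function are not reproduced here.

```python
import math

def partition_at_corners(traj, angle_thresh_deg=30.0):
    """Split a 2-D trajectory (list of (x, y) points) into segments at
    'corners', i.e. points where the heading change exceeds a threshold."""
    def heading(p, q):
        return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

    cuts = [0]
    for i in range(1, len(traj) - 1):
        turn = abs(heading(traj[i], traj[i + 1]) - heading(traj[i - 1], traj[i]))
        turn = min(turn, 360 - turn)  # wrap the angle into [0, 180]
        if turn > angle_thresh_deg:
            cuts.append(i)
    cuts.append(len(traj) - 1)
    # consecutive cut points bound the segments (endpoints shared)
    return [traj[cuts[k]:cuts[k + 1] + 1] for k in range(len(cuts) - 1)]
```

The resulting segments are what the structural similarity function would then compare pairwise.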


2015 ◽  
Vol 2015 ◽  
pp. 1-13
Author(s):  
Woojoo Lee ◽  
Andrey Alexeyenko ◽  
Maria Pernemalm ◽  
Justine Guegan ◽  
Philippe Dessen ◽  
...  

Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability—the basis of cluster generation—is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
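For reference, the standard FDR machinery that the improved estimator is measured against can be sketched with Benjamini-Hochberg adjusted p-values. This is a generic sketch of the baseline, not the paper's heterogeneity-corrected procedure.

```python
def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values: sort p-values, scale the
    rank-r value by m/r, then enforce monotonicity from the largest
    rank downward."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    prev = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        prev = min(prev, pvals[i] * m / rank)
        adj[i] = prev
    return adj
```

Rejecting all hypotheses whose adjusted p-value is below a level q then controls the FDR at q under the standard (homogeneous) assumptions.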


2011 ◽  
Vol 332-334 ◽  
pp. 1505-1510
Author(s):  
Xiao Bo Yang

In this paper, a new method based on subtractive clustering and adaptive network fuzzy inference systems is proposed to assess the degree of wrinkling in fabric. The clustering centers are obtained through the subtractive clustering algorithm, which forms the basis for setting up the adaptive network inference system. Firstly, the subtractive clustering algorithm is used to determine the structure of the fuzzy neural network; then, the fuzzy inference system is used for pattern recognition. Finally, four kinds of fabric wrinkle feature parameters are used to verify the results on real fabric. The results show the applicability of the proposed method to real data.
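The subtractive clustering pass that supplies the centers can be sketched in 1-D as follows: each point's potential is a sum of Gaussian-like contributions from all points, the highest-potential point becomes a center, and potentials near it are then suppressed. The neighbourhood radius ra and stopping fraction eps are illustrative parameters, not the values used in the paper.

```python
import math

def subtractive_centers(points, ra=1.0, eps=0.5):
    """1-D subtractive clustering sketch. Stops when the best remaining
    potential falls below eps times the first (maximum) potential."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (1.5 * ra) ** 2  # suppression radius 1.5 * ra, as is customary
    potential = [sum(math.exp(-alpha * (p - q) ** 2) for q in points)
                 for p in points]
    p_max0 = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=lambda i: potential[i])
        if potential[k] < eps * p_max0:
            break
        c = points[k]
        centers.append(c)
        # subtract the new center's influence from every potential
        potential = [potential[i] - potential[k] * math.exp(-beta * (points[i] - c) ** 2)
                     for i in range(len(points))]
    return centers
```

The number of centers found this way fixes the number of fuzzy rules, which is how the algorithm determines the structure of the fuzzy neural network.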


2016 ◽  
Vol 63 (3) ◽  
pp. 1874-1881
Author(s):  
David Krapohl ◽  
Armin Schubel ◽  
Erik Frojdh ◽  
Goran Thungstrom ◽  
Christer Frojdh
