Fusion of Scores in a Detection Context Based on Alpha Integration

2015 ◽  
Vol 27 (9) ◽  
pp. 1983-2010 ◽  
Author(s):  
Antonio Soriano ◽  
Luis Vergara ◽  
Bouziane Ahmed ◽  
Addisson Salazar

We present a new method for fusing scores corresponding to different detectors (two-hypotheses case). It is based on alpha integration, which we have adapted to the detection context. Three optimization methods are presented: least mean square error, maximization of the area under the ROC curve, and minimization of the probability of error. Gradient algorithms are proposed for the three methods. Different experiments with simulated and real data are included. Simulated data consider the two-detector case to illustrate the factors influencing alpha integration and to demonstrate the improvements obtained by score fusion with respect to individual detector performance. Two real data cases have been considered. In the first, multimodal biometric data were processed; this case is representative of scenarios in which the probability of detection is to be maximized for a given probability of false alarm. The second case is the automatic analysis of electroencephalogram and electrocardiogram records with the aim of reproducing the medical expert's detections of arousal during sleep; it is representative of scenarios in which the probability of error is to be minimized. The generally superior performance of alpha integration confirms the value of optimizing the fusion parameters.
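Concretely, alpha integration fuses positive scores by mapping each score through an α-representation function, averaging with weights, and mapping back. The sketch below is a minimal pure-Python illustration of that fusion rule with fixed α and weights; in the paper both are learned by the gradient algorithms, which are not reproduced here.

```python
import math

def f_alpha(x, alpha):
    # alpha-representation of a positive score
    if alpha == 1.0:
        return math.log(x)
    return x ** ((1.0 - alpha) / 2.0)

def f_alpha_inv(y, alpha):
    # inverse mapping back to the score domain
    if alpha == 1.0:
        return math.exp(y)
    return y ** (2.0 / (1.0 - alpha))

def alpha_integrate(scores, weights, alpha):
    """Alpha integration (weighted alpha-mean) of positive detector scores."""
    mixed = sum(w * f_alpha(s, alpha) for s, w in zip(scores, weights))
    return f_alpha_inv(mixed, alpha)
```

Special cases make the rule easy to check: α = -1 recovers the weighted arithmetic mean, α = 1 the geometric mean, and α = 3 the harmonic mean.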

2017 ◽  
Vol 67 (3) ◽  
pp. 325 ◽  
Author(s):  
Chhagan Charan ◽  
Rajoo Pandey

A novel adaptive threshold spectrum sensing technique based on the covariance matrix of received signal samples is proposed. The adaptive threshold is derived in terms of the signal-to-noise ratio (SNR) and the spectrum utilisation ratio of the primary user. It considers both the probability of detection and the probability of false alarm to minimise the overall decision error probability. Energy-based spectrum sensing is highly vulnerable under noise uncertainty and low SNR. The existing covariance-based spectrum sensing technique overcomes the noise uncertainty problem, but its performance deteriorates under low SNR. The proposed covariance-based scheme effectively addresses the low SNR problem. Simulation results confirm its superior performance over the existing covariance-based detection method in terms of probability of detection, probability of error, and the number of samples required for reliable spectrum detection.
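Covariance-based detectors of this family typically compare the ratio of total to diagonal covariance energy against a threshold: under white noise the ratio stays near one, while a correlated primary-user signal pushes it higher. Below is a minimal sketch of such a covariance absolute value (CAV) test statistic; the smoothing factor `L` and the threshold are illustrative, and the paper's adaptive threshold derivation is not reproduced.

```python
import numpy as np

def cav_statistic(samples, L=8):
    """Covariance absolute value (CAV) test statistic.

    samples: 1-D array of received samples; L: smoothing factor
    (number of consecutive samples stacked into one vector).
    """
    n = len(samples) - L + 1
    X = np.stack([samples[i:i + L] for i in range(n)], axis=1)  # L x n
    C = X @ X.T / n                        # sample covariance matrix
    t1 = np.abs(C).sum()                   # all entries
    t2 = np.abs(np.diag(C)).sum()          # diagonal only
    return t1 / t2

def decide(samples, threshold, L=8):
    # ratio near 1 under white noise, larger when samples are correlated
    return cav_statistic(samples, L) > threshold
```

A perfectly correlated input drives the statistic to its maximum value L, while uncorrelated noise keeps it close to one.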


2019 ◽  
Vol 35 (23) ◽  
pp. 4955-4961
Author(s):  
Yongzhuang Liu ◽  
Jian Liu ◽  
Yadong Wang

Abstract

Motivation: Whole-genome sequencing (WGS) of tumor–normal sample pairs is a powerful approach for comprehensively characterizing germline copy number variations (CNVs) and somatic copy number alterations (SCNAs) in cancer research and clinical practice. Existing computational approaches for detecting copy number events cannot detect germline CNVs and SCNAs simultaneously, and they yield low accuracy for SCNAs.

Results: In this study, we developed TumorCNV, a novel approach for jointly detecting germline CNVs and SCNAs from WGS data of a matched tumor–normal sample pair. We compared TumorCNV with existing copy number event detection approaches using simulated data and real data for the COLO-829 melanoma cell line. The experimental results showed that TumorCNV achieved superior performance compared with existing approaches.

Availability and implementation: TumorCNV is implemented in a combination of Java and R and is freely available at https://github.com/yongzhuang/TumorCNV.

Supplementary information: Supplementary data are available at Bioinformatics online.
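TumorCNV itself is distributed as a Java/R tool; as a purely illustrative toy (not the TumorCNV algorithm), the sketch below shows the basic read-depth signal that copy-number callers build on: the per-bin log2 tumor/normal depth ratio compared against hypothetical gain/loss thresholds.

```python
import math

def call_scna(tumor_depth, normal_depth, gain=0.3, loss=-0.3):
    """Toy somatic copy-number calls from per-bin read depths.

    Returns 'gain', 'loss', or 'neutral' per bin based on the
    log2 tumor/normal depth ratio (thresholds are illustrative).
    """
    calls = []
    for t, n in zip(tumor_depth, normal_depth):
        r = math.log2((t + 1) / (n + 1))   # pseudo-count avoids log(0)
        if r >= gain:
            calls.append('gain')
        elif r <= loss:
            calls.append('loss')
        else:
            calls.append('neutral')
    return calls
```

Real joint germline/somatic calling is far more involved (segmentation, purity, ploidy); this only illustrates the underlying depth-ratio signal.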


2021 ◽  
Author(s):  
Zuzana Rošťáková ◽  
Roman Rosipal

Background and Objective: Parallel factor analysis (PARAFAC) is a powerful tool for detecting latent components in higher-order arrays (tensors). The number of latent components is an essential input parameter that must be set in advance. However, the component number selection methods proposed in the literature so far amount to rules of thumb. This study demonstrates the advantages and disadvantages of twelve such methods applied to well-controlled simulated data with a nonnegative structure that mimics the character of a real electroencephalogram.

Methods: Existing studies have compared the methods' performance on simulated data with a simplified structure, and it was shown that the obtained results are not directly generalizable to real data. Using a real head model and cortical activation, our study focuses on nontrivial, nonnegative simulated data that resemble real electroencephalogram properties as closely as possible. Different noise levels and disruptions from the optimal structure are considered. Moreover, we validate a new method for component number selection, which we have already applied successfully to real electroencephalogram tasks. We also demonstrate that the existing approaches must be adapted whenever a nonnegative data structure is assumed.

Results: We identified four methods that produce promising, though not ideal, results on nontrivial simulated data and show superior performance in electroencephalogram analysis practice.

Conclusions: Component number selection in PARAFAC is a complex and unresolved problem, and the nonnegative data structure assumption makes it more challenging. Although several methods have shown promising results, the issue remains open, and new approaches are needed.
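For orientation, the simplest baseline that many component-number heuristics refine is to fit CP/PARAFAC models of increasing rank and keep the smallest rank whose relative reconstruction error drops below a tolerance. The sketch below implements that baseline with a minimal unconstrained alternating-least-squares solver; it is an illustration under simplifying assumptions, not one of the twelve methods compared in the study, and it imposes no nonnegativity constraint.

```python
import numpy as np

def khatri_rao(A, B):
    # column-wise Kronecker product, (I*J) x R
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def cp_als(X, rank, n_iter=50, seed=0):
    """Plain alternating least squares for a 3-way CP decomposition."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((d, rank)) for d in X.shape]
    unfold = lambda m: np.moveaxis(X, m, 0).reshape(X.shape[m], -1)
    for _ in range(n_iter):
        for m in range(3):
            others = [factors[k] for k in range(3) if k != m]
            kr = khatri_rao(others[0], others[1])
            # least-squares update of the mode-m factor
            factors[m] = np.linalg.lstsq(kr, unfold(m).T, rcond=None)[0].T
    approx = factors[0] @ khatri_rao(factors[1], factors[2]).T
    err = np.linalg.norm(unfold(0) - approx) / np.linalg.norm(X)
    return factors, err

def select_rank(X, max_rank=3, tol=1e-6):
    # smallest rank whose relative reconstruction error is below tol
    for r in range(1, max_rank + 1):
        if cp_als(X, r)[1] < tol:
            return r
    return max_rank
```

On noisy data this naive criterion overfits, which is exactly why the more refined selection methods compared in the study exist.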


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 4992
Author(s):  
Mauro Leonardi ◽  
Gheorghe Sirbu

Automatic Dependent Surveillance-Broadcast (ADS-B) is an Air Traffic Control system in which aircraft transmit their own information (identity, position, velocity, etc.) to ground sensors for surveillance purposes. This system has many advantages compared to classical surveillance radars: easy and low-cost implementation, high data accuracy, and low renewal time. It also has limitations: dependency on the Global Navigation Satellite System (GNSS) and a simple, unencrypted, unauthenticated protocol. For these reasons, the system is exposed to attacks such as jamming/spoofing of the on-board GNSS receiver or injection of false ADS-B messages. After deriving a mathematical model of the different types of attacks, we propose the use of a crowd sensor network capable of estimating the Time Difference Of Arrival (TDOA) of ADS-B messages, together with a two-step Kalman filter, to detect these attacks (on-board GNSS/ADS-B tampering, false ADS-B message injection, and GNSS spoofing/jamming). Tests with real data and simulations showed that the algorithm can detect all of these attacks with a very high probability of detection and a low probability of false alarm.
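The core consistency check behind the crowd-sensor idea can be sketched simply: each pair of ground sensors measures a TDOA for the same ADS-B message, and that measurement must agree with the TDOA implied by the position claimed inside the message. The toy below assumes a 2-D geometry and ideal sensor clocks; the paper's two-step Kalman filter is not reproduced.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def predicted_tdoa(claimed_pos, s_i, s_j):
    """TDOA (seconds) between sensors i and j implied by the claimed position."""
    return (math.dist(claimed_pos, s_i) - math.dist(claimed_pos, s_j)) / C

def is_consistent(claimed_pos, sensors, measured_tdoas, tol=1e-7):
    """Flag a message whose measured TDOAs disagree with its claimed position.

    measured_tdoas[(i, j)] is the measured arrival-time difference
    between sensors i and j for the same ADS-B message.
    """
    return all(
        abs(predicted_tdoa(claimed_pos, sensors[i], sensors[j]) - t) <= tol
        for (i, j), t in measured_tdoas.items()
    )
```

A spoofed position a few kilometres from the true one shifts the predicted TDOAs by tens of microseconds, far above the tolerance, so the message is flagged.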


Entropy ◽  
2020 ◽  
Vol 23 (1) ◽  
pp. 62
Author(s):  
Zhengwei Liu ◽  
Fukang Zhu

Thinning operators play an important role in the analysis of integer-valued autoregressive models, and the most widely used is binomial thinning. Inspired by the theory of extended Pascal triangles, a new thinning operator, named extended binomial thinning, is introduced as a generalization of binomial thinning. Compared to the binomial thinning operator, the extended binomial thinning operator has two parameters and is more flexible in modeling. Based on the proposed operator, a new integer-valued autoregressive model is introduced, which can accurately and flexibly capture the dispersion features of count time series. Two-step conditional least squares (CLS) estimation is investigated for the innovation-free case, and conditional maximum likelihood estimation is also discussed. We also obtain the asymptotic property of the two-step CLS estimator. Finally, three overdispersed or underdispersed real data sets are considered to illustrate the superior performance of the proposed model.
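For readers unfamiliar with thinning, the classical one-parameter binomial thinning operator and an INAR(1) recursion built on it can be sketched as follows; the two-parameter extended operator itself follows the paper and is not reproduced here, and the Poisson innovations are an illustrative assumption.

```python
import math
import random

def binomial_thinning(alpha, x, rng):
    """Binomial thinning: alpha ∘ x = sum of x Bernoulli(alpha) draws."""
    return sum(rng.random() < alpha for _ in range(x))

def simulate_inar1(alpha, lam, n, seed=0):
    """INAR(1): X_t = alpha ∘ X_{t-1} + eps_t with Poisson(lam) innovations."""
    rng = random.Random(seed)
    x, series = 0, []
    for _ in range(n):
        # Poisson draw via Knuth's method (fine for small lam)
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                break
            k += 1
        x = binomial_thinning(alpha, x, rng) + k
        series.append(x)
    return series
```

Since α ∘ x is a binomial count, it always stays between 0 and x, which keeps the simulated series integer-valued and nonnegative.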


Metabolites ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 214
Author(s):  
Aneta Sawikowska ◽  
Anna Piasecka ◽  
Piotr Kachlicki ◽  
Paweł Krajewski

Peak overlapping is a common problem in chromatography, mainly in the case of complex biological mixtures such as metabolite extracts. Because different compounds with similar chromatographic properties co-elute, peak separation becomes challenging. In this paper, two computational methods for separating peaks, applied for the first time to large chromatographic datasets, are described, compared, and experimentally validated. The methods lead from raw observations to data that can form inputs for statistical analysis. In both methods, the data are first normalized by the mass of the sample, the baseline is removed, retention time alignment is conducted, and peaks are detected. Then, in the first method, clustering is used to separate overlapping peaks, whereas in the second method, functional principal component analysis (FPCA) is applied for the same purpose. Simulated data and experimental results are used to present and compare both methods. The real data were obtained in a study of metabolomic changes in barley (Hordeum vulgare) leaves under drought stress. The results suggest that both methods are suitable for the separation of overlapping peaks, but an additional advantage of FPCA is the possibility of assessing the variability of individual compounds present within the same peaks of different chromatograms.
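The preprocessing steps shared by both methods can be sketched minimally: normalization by sample mass, then peak detection as a local-maximum search above a noise threshold (the threshold is a hypothetical parameter; the clustering and FPCA separation stages are not reproduced).

```python
def normalize_by_mass(signal, sample_mass):
    """First preprocessing step in both methods: scale by sample mass."""
    return [v / sample_mass for v in signal]

def detect_peaks(signal, min_height=0.0):
    """Indices of strict local maxima above min_height."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > min_height and signal[i - 1] < signal[i] > signal[i + 1]:
            peaks.append(i)
    return peaks
```

Overlapping peaks show up here as a single detected maximum (or a shoulder), which is precisely what the clustering and FPCA stages are designed to split apart.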


2021 ◽  
Vol 10 (7) ◽  
pp. 435
Author(s):  
Yongbo Wang ◽  
Nanshan Zheng ◽  
Zhengfu Bian

Since pairwise registration is a necessary step for the seamless fusion of point clouds from neighboring stations, a closed-form solution to planar feature-based registration of LiDAR (Light Detection and Ranging) point clouds is proposed in this paper. Based on the Plücker coordinate-based representation of linear features in three-dimensional space, a quad tuple-based representation of planar features is introduced, which makes it possible to directly determine the difference between any two planar features. Dual quaternions are employed to represent spatial transformation and operations between dual quaternions and the quad tuple-based representation of planar features are given, with which an error norm is constructed. Based on L2-norm-minimization, detailed derivations of the proposed solution are explained step by step. Two experiments were designed in which simulated data and real data were both used to verify the correctness and the feasibility of the proposed solution. With the simulated data, the calculated registration results were consistent with the pre-established parameters, which verifies the correctness of the presented solution. With the real data, the calculated registration results were consistent with the results calculated by iterative methods. Conclusions can be drawn from the two experiments: (1) The proposed solution does not require any initial estimates of the unknown parameters in advance, which assures the stability and robustness of the solution; (2) Using dual quaternions to represent spatial transformation greatly reduces the additional constraints in the estimation process.
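The paper's closed-form solution is built on dual quaternions; as a smaller, independently checkable ingredient of plane-feature registration, the sketch below recovers the rotation aligning corresponding unit plane normals with the SVD-based Kabsch/Arun method. This is a swapped-in standard technique for illustration, not the dual-quaternion solution itself.

```python
import numpy as np

def rotation_from_normals(src, dst):
    """Least-squares rotation R with dst_i ≈ R @ src_i (Kabsch/Arun method).

    src, dst: (n, 3) arrays of corresponding unit plane normals.
    """
    H = src.T @ dst                                     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    return Vt.T @ D @ U.T
```

Like the dual-quaternion formulation, this estimate needs no initial guess: the SVD yields the optimal rotation directly, one reason closed-form registration is attractive compared with iterative methods.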


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Abstract

Background: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand the common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose novel gene- and pathway-level approaches for the case where several independent GWAS on independent traits are available. The method is based on a generalization of sparse group Partial Least Squares (sgPLS) that takes into account groups of variables, with a Lasso penalization linking all independent data sets. This method, called joint-sgPLS, convincingly detects signal at both the variable level and the group level.

Results: Our method has the advantage of proposing a globally readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and give an example of application to real data, with the aim of highlighting common susceptibility variants for breast and thyroid cancers.

Conclusion: Joint-sgPLS shows interesting properties for detecting a signal. As an extension of PLS, the method is suited to data with a large number of variables. The Lasso penalization copes with architectures of variable groups and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is adapted to any data with a high number of variables and a known a priori group structure in other application fields.
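Sparsity in sgPLS-type methods is induced by soft-thresholding at the variable level and group-wise shrinkage at the group level. The sketch below shows the two standard proximal operators in isolation; it is a generic illustration of Lasso and group-Lasso shrinkage, not the joint-sgPLS algorithm itself.

```python
import math

def soft_threshold(v, lam):
    """Lasso proximal operator: shrink each coordinate toward zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]

def group_soft_threshold(v, lam):
    """Group-Lasso proximal operator: shrink the whole group's norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        return [0.0] * len(v)          # the entire group is zeroed out
    scale = 1.0 - lam / norm
    return [scale * x for x in v]
```

The group operator either zeroes a whole block of coefficients (e.g., a gene or pathway) or rescales it, which is what produces interpretable group-level selection.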


2021 ◽  
Vol 11 (2) ◽  
pp. 582
Author(s):  
Zean Bu ◽  
Changku Sun ◽  
Peng Wang ◽  
Hang Dong

Calibration between multiple sensors is a fundamental procedure for data fusion. To address the problems of large errors and tedious operation, we present a novel method for calibration between light detection and ranging (LiDAR) and a camera. We designed a calibration target: an arbitrary triangular pyramid with chessboard patterns on its three planes. The target contains both 3D and 2D information, which can be used to obtain the intrinsic parameters of the camera and the extrinsic parameters of the system. In the proposed method, the world coordinate system is established through the triangular pyramid. We extract the equations of the triangular pyramid planes to find the relative transformation between the two sensors. A single capture from the camera and LiDAR is sufficient for calibration, and errors are reduced by minimizing the distance between points and planes; accuracy can be further increased with more captures. We carried out experiments on simulated data with varying degrees of noise and numbers of frames. Finally, the calibration results were verified on real data through incremental validation and analysis of the root mean square error (RMSE), demonstrating that our calibration method is robust and provides state-of-the-art performance.
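The point-to-plane objective rests on two small pieces that can be sketched directly: the unsigned distance from a point to a plane in (a, b, c, d) form, and the RMSE over a set of points. The optimization over the LiDAR–camera transform itself is not reproduced.

```python
import math

def point_to_plane_distance(p, plane):
    """Unsigned distance from point p=(x,y,z) to plane ax+by+cz+d=0."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] + d) / math.sqrt(a*a + b*b + c*c)

def plane_rmse(points, plane):
    """Root mean square point-to-plane error, as used to assess calibration."""
    sq = [point_to_plane_distance(p, plane) ** 2 for p in points]
    return math.sqrt(sum(sq) / len(sq))
```

In the calibration loop, LiDAR points are transformed by the candidate extrinsics and this RMSE against the chessboard planes is driven toward zero.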


2021 ◽  
Vol 13 (5) ◽  
pp. 2426
Author(s):  
David Bienvenido-Huertas ◽  
Jesús A. Pulido-Arcas ◽  
Carlos Rubio-Bellido ◽  
Alexis Pérez-Fargallo

In recent times, studies of the accuracy of algorithms for predicting different aspects of energy use in the building sector have flourished, with energy poverty being one of the issues that has received considerable attention. Previous studies in this field have characterized it using different indicators, but they have failed to develop instruments to predict the risk of low-income households falling into energy poverty. This research explores how accurately six regression algorithms can forecast the risk of energy poverty by means of the fuel poverty potential risk index. Using data from the national survey of socioeconomic conditions of Chilean households and generating data for different typologies of social dwellings (e.g., form ratio or roof surface area), this study simulated 38,880 cases and compared the accuracy of the six algorithms. Multilayer perceptron, M5P, and support vector regression delivered the best accuracy, with correlation coefficients over 99.5%; in terms of computing time, M5P outperforms the rest. Although these results suggest that energy poverty can be accurately predicted using simulated data, it remains necessary to test the algorithms against real data. These results can be useful in devising policies to tackle energy poverty in advance.
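The headline accuracy figure, a correlation coefficient over 99.5%, is Pearson's r between predicted and observed index values. A minimal sketch of that metric (the data in the assertions are hypothetical, not the study's):

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mt = sum(y_true) / n
    mp = sum(y_pred) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    st = math.sqrt(sum((a - mt) ** 2 for a in y_true))
    sp = math.sqrt(sum((b - mp) ** 2 for b in y_pred))
    return cov / (st * sp)
```

Note that r measures linear association, not absolute error, which is one reason the authors still call for validation against real survey data.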

