Technical note: Uncertainty in multi-source partitioning using large tracer data sets

2019 ◽  
Vol 23 (12) ◽  
pp. 5059-5068 ◽  
Author(s):  
Alicia Correa ◽  
Diego Ochoa-Tocachi ◽  
Christian Birkel

Abstract. The availability of large tracer data sets has opened up the opportunity to investigate multiple source contributions to a mixture. However, the source contributions may be uncertain and, apart from Bayesian approaches, to date there are only solid methods to estimate such uncertainties for two and three sources. We introduce an alternative uncertainty estimation method for four sources based on multiple tracers as input data. A Taylor series approximation is used to solve the set of linear mass balance equations. We illustrate the method, which computes individual uncertainties in the source contributions to a mixture, with an example from hydrology, using a 14-tracer set of water sources and streamflow from a tropical, high-elevation catchment. Moreover, the method has the potential to be generalized to any number of tracers across a range of disciplines.
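The core of such a partitioning is a linear mass balance system; first-order (Taylor series) error propagation then maps tracer uncertainty onto the source fractions. A minimal sketch, assuming a hypothetical three-source, two-tracer system with invented concentrations and holding the source signatures fixed (the paper's method handles four sources, 14 tracers, and uncertainty in all inputs):

```python
import numpy as np

# Hypothetical tracer concentrations (rows: mass balance + two tracers,
# columns: three sources). All values are illustrative only.
A = np.array([
    [1.0,  1.0,  1.0],   # source fractions sum to one
    [12.0, 45.0, 80.0],  # tracer 1 concentration in each source
    [3.0,  9.0,  1.5],   # tracer 2 concentration in each source
])
b = np.array([1.0, 40.0, 5.0])       # mixture: [1, tracer 1, tracer 2]
sigma_b = np.array([0.0, 2.0, 0.4])  # 1-sigma uncertainty of mixture values

f = np.linalg.solve(A, b)            # source fractions

# First-order (Taylor) propagation of the mixture uncertainty onto the
# fractions, holding the source signatures A fixed:
Ainv = np.linalg.inv(A)
cov_f = Ainv @ np.diag(sigma_b**2) @ Ainv.T
sigma_f = np.sqrt(np.diag(cov_f))
```

With source-signature uncertainty included, the Jacobian would also carry derivatives with respect to the entries of A; the structure of the propagation stays the same.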


Author(s):  
Cong Gao ◽  
Ping Yang ◽  
Yanping Chen ◽  
Zhongmin Wang ◽  
Yue Wang

Abstract. With the large-scale deployment of wireless sensor networks, anomaly detection for sensor data is becoming increasingly important in various fields. As a vital form of sensor data, time series exhibit three main types of anomaly: point anomalies, pattern anomalies, and sequence anomalies. In production environments, the analysis of pattern anomalies is the most rewarding. However, the traditional cloud computing processing model struggles with large amounts of widely distributed data. This paper presents an edge-cloud collaboration architecture for pattern anomaly detection in time series. A task migration algorithm is developed to alleviate the backlog of detection tasks at edge nodes. In addition, the detection tasks related to long-term and short-term correlations in time series are allocated to the cloud and edge nodes, respectively. A multi-dimensional feature representation scheme is devised for efficient dimension reduction. Two key components of the feature representation, trend identification and feature point extraction, are elaborated. Based on the feature representation, pattern anomaly detection is performed with an improved kernel density estimation method. Finally, extensive experiments are conducted on synthetic and real-world data sets.
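The final step, density-based anomaly scoring, can be sketched with a plain (unimproved) Gaussian KDE; the features, bandwidth, and 1% threshold below are illustrative assumptions, not the paper's scheme:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical 2-D feature vectors extracted from time-series windows
normal = rng.normal(0.0, 1.0, size=(2, 500))  # training features (d, n)
kde = gaussian_kde(normal)                     # density of normal behaviour

# Flag a window as a pattern anomaly when its density falls below a low
# quantile of the densities at the training points (threshold is an
# assumption; the paper uses an improved KDE variant).
threshold = np.quantile(kde(normal), 0.01)

def is_anomaly(feature):
    return kde(feature)[0] < threshold
```

A far-off feature vector scores a near-zero density and is flagged, while a vector near the bulk of the training data is not.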


2021 ◽  
Vol 13 (3) ◽  
pp. 530
Author(s):  
Junjun Yin ◽  
Jian Yang

Pseudo quad-polarimetric (quad-pol) image reconstruction from hybrid dual-pol (or compact polarimetric (CP)) synthetic aperture radar (SAR) imagery is an important category of techniques for radar polarimetric applications. The literature addresses three key aspects of the reconstruction methods: the scattering symmetry assumption, the reconstruction model, and the approach for solving the unknowns. Since CP measurements depend on the CP mode configuration, different reconstruction procedures have been designed for different transmitted waves, i.e., the procedures were not unified. In this study, we propose a unified reconstruction framework for the general CP mode that is applicable to an arbitrary transmitted elliptical wave. The unified reconstruction procedure is based on the formalized CP descriptors. The general CP symmetric-scattering-model-based three-component decomposition method is also employed to fit the reconstruction model parameter. Finally, a least squares (LS) estimation method, originally proposed for linear π/4 CP data, is extended to the arbitrary CP mode to solve the resulting system of non-linear equations. Validation is carried out on polarimetric data sets from both RADARSAT-2 (C-band) and ALOS-2/PALSAR (L-band), comparing the performance of reconstruction models, methods, and CP modes.
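The last step, an LS solution of a system of non-linear equations, can be illustrated generically; the toy residual function below is a hypothetical stand-in for the CP reconstruction equations, not the actual polarimetric model:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical over-determined non-linear system standing in for the
# reconstruction equations (the real model relates CP covariance terms).
def residuals(x, observed):
    a, b = x
    model = np.array([a * b, a + b**2, np.sin(a) + b])
    return model - observed

# Observations generated from the exact solution (a, b) = (1, 2),
# so a perfect fit drives the residuals to zero.
observed = np.array([2.0, 5.0, np.sin(1.0) + 2.0])
sol = least_squares(residuals, x0=np.array([1.0, 1.0]), args=(observed,))
```

The same pattern (residual vector plus `least_squares`) applies whatever the non-linear reconstruction model supplies as equations.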


2020 ◽  
Vol 9 (1) ◽  
pp. 61-81
Author(s):  
Lazhar BENKHELIFA

A new lifetime model with four positive parameters, called the Weibull Birnbaum-Saunders distribution, is proposed. The proposed model extends the Birnbaum-Saunders distribution and provides great flexibility in modeling data in practice. Some mathematical properties of the new distribution are obtained, including expansions for the cumulative and density functions, moments, the generating function, mean deviations, order statistics, and reliability. Estimation of the model parameters is carried out by the maximum likelihood method. A simulation study is presented to show the performance of the maximum likelihood estimates of the model parameters. The flexibility of the new model is examined by applying it to two real data sets.
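Maximum likelihood fitting of a lifetime model reduces to minimizing a negative log-likelihood. Since the four-parameter Weibull Birnbaum-Saunders density is not reproduced here, the sketch below fits a plain two-parameter Weibull to simulated data as a stand-in for the general procedure:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
# Simulated lifetimes: Weibull with shape k = 1.5, scale lam = 3.0
data = rng.weibull(1.5, size=2000) * 3.0

def neg_log_lik(params):
    k, lam = params
    if k <= 0 or lam <= 0:          # keep the search in the valid region
        return np.inf
    z = data / lam
    # Weibull log-density: log(k/lam) + (k-1) log(x/lam) - (x/lam)^k
    return -np.sum(np.log(k / lam) + (k - 1) * np.log(z) - z**k)

res = minimize(neg_log_lik, x0=[1.0, 1.0], method="Nelder-Mead")
k_hat, lam_hat = res.x
```

For the four-parameter model, only the log-density inside `neg_log_lik` changes; the optimization step is identical.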


2018 ◽  
Author(s):  
Michael Nute ◽  
Ehsan Saleh ◽  
Tandy Warnow

Abstract. The estimation of multiple sequence alignments of protein sequences is a basic step in many bioinformatics pipelines, including protein structure prediction, protein family identification, and phylogeny estimation. Statistical co-estimation of alignments and trees under stochastic models of sequence evolution has long been considered the most rigorous technique for estimating alignments and trees, but little is known about the accuracy of such methods on biological benchmarks. We report the results of an extensive study evaluating the most popular protein alignment methods, as well as the statistical co-estimation method BAli-Phy, on 1192 protein data sets from established benchmarks and on 120 simulated data sets. Our study (which used more than 230 CPU years for the BAli-Phy analyses alone) shows that BAli-Phy is dramatically more accurate than the other alignment methods on the simulated data sets, but is among the least accurate on the biological benchmarks. There are several potential causes for this discordance, including model misspecification, errors in the reference alignments, and conflicts between structural and evolutionary alignments; future research is needed to identify the most likely explanation for our observations.

Keywords: multiple sequence alignment, BAli-Phy, protein sequences, structural alignment, homology


2020 ◽  
Vol 34 (04) ◽  
pp. 5620-5627 ◽  
Author(s):  
Murat Sensoy ◽  
Lance Kaplan ◽  
Federico Cerutti ◽  
Maryam Saleki

Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions. Some recent approaches quantify classification uncertainty directly by training the model to output high uncertainty for data samples close to class boundaries or from outside the training distribution. These approaches use an auxiliary data set during training to represent out-of-distribution samples. However, the selection or creation of such an auxiliary data set is non-trivial, especially for high-dimensional data such as images. In this work we develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty to distinguish decision-boundary and out-of-distribution regions of the feature space. To this end, variational autoencoders and generative adversarial networks are incorporated to automatically generate out-of-distribution exemplars for training. Through extensive analysis, we demonstrate that the proposed approach provides better uncertainty estimates for in-distribution, out-of-distribution, and adversarial examples on well-known data sets than state-of-the-art approaches, including recent Bayesian approaches for neural networks and anomaly detection methods.


Author(s):  
Anteneh Ayanso ◽  
Paulo B. Goes ◽  
Kumar Mehta

Relational databases have increasingly become the basis for a wide range of applications that require efficient methods for exploratory search and retrieval. Top-k retrieval addresses this need and involves finding a limited number of records whose attribute values are the closest to those specified in a query. One of the approaches in the recent literature is query mapping, which deals with converting top-k queries into equivalent range queries that relational database management systems (RDBMSs) normally support. This approach combines simplicity with practicality by avoiding the need for modifications to the query engine, or for specialized data structures and indexing techniques to handle top-k queries separately. This paper reviews existing query-mapping techniques in the literature and presents a range query estimation method based on cost modeling. Experiments on real-world and synthetic data sets show that the cost-based range estimation method performs at least as well as prior methods and avoids the need to calibrate workloads on specific database contents.
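The query-mapping idea can be sketched in a few lines: answer a top-k query by issuing range queries that an RDBMS supports natively, widening the range until at least k rows return, then ranking by distance. Here `run_range_query` is a hypothetical stand-in for a SQL `BETWEEN` predicate, and the doubling strategy is an illustrative choice; the paper's contribution is a cost-model-based estimate of the initial range, which determines how many retries are needed:

```python
# Stand-in for: SELECT * FROM t WHERE v BETWEEN target-radius AND target+radius
def run_range_query(table, target, radius):
    return [row for row in table if abs(row - target) <= radius]

def top_k(table, target, k, initial_radius=1.0):
    # Assumes the table holds at least k rows; otherwise this would loop.
    radius = initial_radius
    rows = run_range_query(table, target, radius)
    while len(rows) < k:          # range too tight: widen and retry
        radius *= 2.0
        rows = run_range_query(table, target, radius)
    # Rank the retrieved rows by closeness to the query value
    return sorted(rows, key=lambda r: abs(r - target))[:k]
```

A well-estimated initial radius makes the first range query sufficient, which is exactly what the cost-based estimation targets.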


2018 ◽  
Vol 7 (5) ◽  
pp. 120
Author(s):  
T. H. M. Abouelmagd

A new version of the Lomax model is introduced andstudied. The major justification for the practicality of the new model isbased on the wider use of the Lomax model. We are also motivated tointroduce the new model since the density of the new distribution exhibitsvarious important shapes such as the unimodal, the right skewed and the leftskewed. The new model can be viewed as a mixture of the exponentiated Lomaxdistribution. It can also be considered as a suitable model for fitting thesymmetric, left skewed, right skewed, and unimodal data sets. The maximumlikelihood estimation method is used to estimate the model parameters. Weprove empirically the importance and flexibility of the new model inmodeling two types of aircraft windshield lifetime data sets. The proposedlifetime model is much better than gamma Lomax, exponentiated Lomax, Lomaxand beta Lomax models so the new distribution is a good alternative to thesemodels in modeling aircraft windshield data.


2006 ◽  
Vol 63 (9) ◽  
pp. 1674-1681 ◽  
Author(s):  
Hélène de Pontual ◽  
Anne Laure Groison ◽  
Carmen Piñeiro ◽  
Michel Bertignac

Abstract. In 2002, a pilot experiment on hake tagging was carried out using methodology specifically developed to catch and handle fish in good condition. By the end of 2005, 36 hake and five tags had been returned to the laboratory (a 3.1% return rate), with a maximum time at liberty of 1066 days. The somatic growth of the recovered fish proved to be twofold higher than that expected from published von Bertalanffy growth functions for the species in the Bay of Biscay. This growth underestimation was related to age overestimation, as demonstrated by two independent analyses. The first was based on a blind interpretation of marked otoliths conducted independently by two European experts involved in the routine age estimation of hake; the results show that the age estimates were neither accurate (they were inconsistent with the oxytetracycline mark positions) nor precise. The second approach compared predicted otolith growth with observed growth, and the discrepancy between the two data sets was large. Both analyses invalidate the internationally agreed age estimation method and demonstrate a need for further research. Although based on limited data, the study highlights the need to improve biological knowledge of the species in order to improve assessment and management advice. It also strengthens the argument for age validation.
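For reference, the von Bertalanffy growth function mentioned above predicts length at age as L(t) = L∞(1 − e^(−K(t − t₀))). The parameter values in this sketch are illustrative, not the published Bay of Biscay estimates:

```python
import math

# von Bertalanffy growth function: L(t) = Linf * (1 - exp(-K * (t - t0)))
# Linf: asymptotic length; K: growth coefficient; t0: theoretical age at
# zero length. The defaults below are illustrative only.
def vbgf_length(age, linf=100.0, k=0.2, t0=-0.1):
    return linf * (1.0 - math.exp(-k * (age - t0)))

# Overestimating age spreads the same length increment over more years,
# so the growth rate (and the fitted K) inferred from age-length data
# is underestimated, which is the pattern the tagging data revealed.
```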


Agriculture ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 55 ◽  
Author(s):  
Miles Grafton ◽  
Therese Kaul ◽  
Alan Palmer ◽  
Peter Bishop ◽  
Michael White

This work examines two large data sets to demonstrate that hyperspectral proximal devices may be able to measure soil nutrients. One data set comprises 3189 soil samples from four hill-country pastoral farms; the second comprises 883 soil samples taken from a stratified nested grid survey. These were regressed against spectra from a proximal hyperspectral device measured on the same samples, with the aim of identifying wavelengths that may serve as proxy indicators for soil nutrient measurements. Olsen P and pH were regressed against 2150 wavebands between 350 nm and 2500 nm to find significant indicator wavebands. The 100 most significant wavebands for each property were used to regress both data sets, and the regression equations from the smaller data set were used to predict pH and Olsen P values in the larger data set for validation. These predictions were as good as the regression analyses performed on the large data set itself. This may mean that, in the future, hyperspectral analysis could serve as a proxy for soil chemical analysis, or increase the intensity of soil testing by cheaply finding markers of fertility in the field.
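The waveband-screening step can be sketched as a per-band correlation ranking followed by ordinary least squares on the selected bands. The synthetic data, band counts, and selection rule below are illustrative assumptions, not the study's calibration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_bands = 200, 300          # stand-ins for 3189 samples x 2150 bands
spectra = rng.normal(size=(n_samples, n_bands))
# Hypothetical soil property driven by a few "informative" bands plus noise
informative = [10, 50, 120]
y = spectra[:, informative].sum(axis=1) + 0.1 * rng.normal(size=n_samples)

# Rank bands by absolute correlation with the property, keep the top 100
corr = np.array([abs(np.corrcoef(spectra[:, j], y)[0, 1])
                 for j in range(n_bands)])
top = np.argsort(corr)[::-1][:100]

# Ordinary least squares on the selected bands (plus an intercept column)
X = np.column_stack([spectra[:, top], np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef
```

Validation as in the study would apply `coef` fitted on one data set to the spectra of the other and compare predictions against measured values.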

