gaussian mixture Latest Research Papers

The UK commissions about £100 billion in infrastructure construction works every year. More than 50% of them finish later than planned, causing damage to the interests of stakeholders. The estimation of time-risk on construction projects is currently done subjectively, largely by experience despite there are many existing techniques available to analyse risk on the construction schedules. Unlike conventional methods that tend to depend on the accurate estimation of risk boundaries for each task, this research aims to proposes a hybrid method to assist planners in undertaking risk analysis using baseline schedules with improved accuracy. The proposed method is endowed with machine intelligence and is trained using a database of 293,263 tasks from a diverse sample of 302 completed infrastructure construction projects in the UK. It combines a Gaussian Mixture Modelling-based Empirical Bayesian Network and a Support Vector Machine followed by performing a Monte Carlo risk simulation. The former is used to investigate the uncertainty, correlated risk factors, and predict task duration deviations while the latter is used to return a time-risk simulated prediction. This study randomly selected 10 projects as case studies followed by comparing their results of the proposed hybrid method with Monte Carlo Simulation. Results indicated 54.4% more accurate prediction on project delays.

Download Full-text

baredSC: Bayesian approach to retrieve expression distribution of single-cell data

BMC Bioinformatics ◽

10.1186/s12859-021-04507-8 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Lucille Lopez-Delisle ◽

Jean-Baptiste Delisle

Keyword(s):

Single Cell ◽

Bayesian Approach ◽

Genetic Interaction ◽

Gaussian Mixture ◽

Two Dimensions ◽

Biological Data ◽

Specific Gene ◽

Trimodal Distribution ◽

Embryonic Limb ◽

Cell Data

Abstract Background The number of studies using single-cell RNA sequencing (scRNA-seq) is constantly growing. This powerful technique provides a sampling of the whole transcriptome of a cell. However, sparsity of the data can be a major hurdle when studying the distribution of the expression of a specific gene or the correlation between the expressions of two genes. Results We show that the main technical noise associated with these scRNA-seq experiments is due to the sampling, i.e., Poisson noise. We present a new tool named baredSC, for Bayesian Approach to Retrieve Expression Distribution of Single-Cell data, which infers the intrinsic expression distribution in scRNA-seq data using a Gaussian mixture model. baredSC can be used to obtain the distribution in one dimension for individual genes and in two dimensions for pairs of genes, in particular to estimate the correlation in the two genes’ expressions. We apply baredSC to simulated scRNA-seq data and show that the algorithm is able to uncover the expression distribution used to simulate the data, even in multi-modal cases with very sparse data. We also apply baredSC to two real biological data sets. First, we use it to measure the anti-correlation between Hoxd13 and Hoxa11, two genes with known genetic interaction in embryonic limb. Then, we study the expression of Pitx1 in embryonic hindlimb, for which a trimodal distribution has been identified through flow cytometry. While other methods to analyze scRNA-seq are too sensitive to sampling noise, baredSC reveals this trimodal distribution. Conclusion baredSC is a powerful tool which aims at retrieving the expression distribution of few genes of interest from scRNA-seq data.

Download Full-text

The heritability of BMI varies across the range of BMI: a heritability curve analysis in a twin cohort

10.1101/2022.01.06.475210 ◽

2022 ◽

Author(s):

Francesca Azzolini ◽

Geir Berentsen ◽

Hans Skaug ◽

Jacob Hjelmborg ◽

Jaakko Kaprio

Keyword(s):

Repeated Measures ◽

Molecular Genetic ◽

Mixture Distribution ◽

Gaussian Mixture ◽

Genetic Effects ◽

Twin Data ◽

Normal Range ◽

Genetic Studies ◽

Mean Values ◽

Pairwise Correlations

The heritability of traits such as body mass index (BMI), a measure of obesity, is generally estimated using family, twin, and increasingly by molecular genetic approaches. These studies generally assume that genetic effects are uniform across all trait values, yet there is emerging evidence that this may not always be the case. This paper analyzes twin data using a recently developed measure of heritability called the heritability curve. Under the assumption that trait values in twin pairs are governed by a flexible Gaussian mixture distribution, heritability curves may vary across trait values. The data consist of repeated measures of BMI on 1506 monozygotic (MZ) and 2843 like-sexed dizygotic (DZ) adult twin pairs, gathered from multiple surveys in older Finnish Twin Cohorts. The heritability curve and BMI value-specific MZ and DZ pairwise correlations were estimated, and these varied across the range of BMI. MZ correlations were highest at BMI values from 21 to 24, with a stronger decrease for women than for men at higher values. Models with additive and dominance effects fit best at low and high BMI values, while models with additive genetic and common environmental effects fit best in the normal range of BMI. Thus, we demonstrate that twin and molecular genetic studies need to consider how genetic effects vary across trait values. Such variation may reconcile findings of traits with high heritabilities and major differences in mean values between countries or over time.

Download Full-text

Hidden Demographics Barriers of the Economic Growth: A Psychometric Approach

International Journal of Business, Management & Economics Research ◽

10.47747/ijbme.v3i1.471 ◽

2022 ◽

Vol 3 (1) ◽

pp. 24-51

Author(s):

Ntogwa N. Bundala

Keyword(s):

Economic Growth ◽

Weighted Least Squares ◽

Well Being ◽

Gaussian Mixture ◽

Education Level ◽

Primary Data ◽

Subjective Well Being ◽

Employment Policy ◽

Cross Sectional Survey ◽

Cross Sectional

This paper examined the hidden demographic barriers of economic growth. The study used a cross-sectional survey researches design. The primary data were collected by using a psychometric scale from 211 individuals who were randomly sampled from the Mwanza and Kagera regions in Tanzania. The data were linearly analysed by the weighted least squares (WLS) and Analysis weighted- automatic linear modelling (AW-ALM), and non-linearly analysed by Gaussian mixture model (GMM) and neural network analysis (NNA). The study found that the main hidden demographic barrier to economic growth is the negative subjective well-being of an individual’s current age and education level. Moreover, the GMM revealed that there is no significant data or regional clusters or classes in the study population. Furthermore, NNA evidenced the most effective predictor of economic growth is age, followed by education. The study concluded that the most hidden demographic factors that hinder economic growth are negative perceptions of an individual on his/her current age and level of education, not the age maturity, and education level. Operationally or practically, the paper implicates several socio-economical policies, mostly the national aging policy (NAP), the National Education and Training policy (NETP), the National Employment Policy (NEP), and regulations /laws on national social security funds schemes at national, regional and global levels. Therefore, the paper recommended that government and other education stakeholders increase the policy commitment on the mathematics, science, and technology subjects to be compulsory for primary and secondary schools, and the extension of the retirement age from 60 years (voluntary) to 65 years (compulsory)

Download Full-text

Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences

Machine Learning: Science and Technology ◽

10.1088/2632-2153/ac4a3b ◽

2022 ◽

Author(s):

Stephen Burns Menary ◽

Darren David Price

Keyword(s):

Particle Physics ◽

Scientific Discovery ◽

Hadron Collider ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

High Energy ◽

High Energy Particle ◽

Calibration Data ◽

Likelihood Ratios ◽

Dimensional Simulation

Abstract We show that density models describing multiple observables with (i) hard boundaries and (ii) dependence on external parameters may be created using an auto-regressive Gaussian mixture model. The model is designed to capture how observable spectra are deformed by hypothesis variations, and is made more expressive by projecting data onto a configurable latent space. It may be used as a statistical model for scientific discovery in interpreting experimental observations, for example when constraining the parameters of a physical model or tuning simulation parameters according to calibration data. The model may also be sampled for use within a Monte Carlo simulation chain, or used to estimate likelihood ratios for event classification. The method is demonstrated on simulated high-energy particle physics data considering the anomalous electroweak production of a $Z$ boson in association with a dijet system at the Large Hadron Collider, and the accuracy of inference is tested using a realistic toy example. The developed methods are domain agnostic; they may be used within any field to perform simulation or inference where a dataset consisting of many real-valued observables has conditional dependence on external parameters.

Download Full-text

An Ensemble Prognostic Method of Francis Turbine Units Using Low-Quality Data under Variable Operating Conditions

Sensors ◽

10.3390/s22020525 ◽

2022 ◽

Vol 22 (2) ◽

pp. 525

Author(s):

Ran Duan ◽

Jie Liu ◽

Jianzhong Zhou ◽

Pei Wang ◽

Wei Liu

Keyword(s):

Prediction Model ◽

Prediction Interval ◽

Gaussian Mixture ◽

Operating Conditions ◽

Monitoring Data ◽

Francis Turbine ◽

Quality Data ◽

Percentage Error ◽

Data Set ◽

Operation Conditions

The prognostic is the key to the state-based maintenance of Francis turbine units (FTUs), which consists of performance state evaluation and degradation trend prediction. In practical engineering environments, there are three significant difficulties: low data quality, complex variable operation conditions, and prediction model parameter optimization. In order to effectively solve the above three problems, an ensemble prognostic method of FTUs using low-quality data under variable operation conditions is proposed in this study. Firstly, to consider the operation condition parameters, the running data set of the FTU is constructed by the water head, active power, and vibration amplitude of the top cover. Then, to improve the robustness of the proposed model against anomaly data, the density-based spatial clustering of applications with noise (DBSCAN) is introduced to clean outliers and singularities in the raw running data set. Next, considering the randomness of the monitoring data, the healthy state model based on the Gaussian mixture model is constructed, and the negative log-likelihood probability is calculated as the performance degradation indicator (PDI). Furthermore, to predict the trend of PDIs with confidence interval and automatically optimize the prediction model on both accuracy and certainty, the multiobjective prediction model is proposed based on the non-dominated sorting genetic algorithm and Gaussian process regression. Finally, monitoring data from an actual large FTU was used for effectiveness verification. The stability and smoothness of the PDI curve are improved by 3.2 times and 1.9 times, respectively, by DBSCAN compared with 3-sigma. The root-mean-squared error, the prediction interval normalized average, the prediction interval coverage probability, the mean absolute percentage error, and the R2 score of the proposed method achieved 0.223, 0.289, 1.000, 0.641%, and 0.974, respectively. The comparison experiments demonstrate that the proposed method is more robust to low-quality data and has better accuracy, certainty, and reliability for the prognostic of the FTU under complex operating conditions.

Download Full-text

Impacts of COVID-19 on Electric Vehicle Charging Behavior: Data Analytics, Visualization, and Clustering

Applied System Innovation ◽

10.3390/asi5010012 ◽

2022 ◽

Vol 5 (1) ◽

pp. 12

Author(s):

Sakib Shahriar ◽

A. R. Al-Ali

Keyword(s):

Electric Vehicle ◽

Clustering Algorithms ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Performance Comparison ◽

Internal Validation ◽

Electric Vehicle Charging ◽

The Usa ◽

And Performance ◽

Ev Charging

COVID-19 pandemic has infected millions and led to a catastrophic loss of lives globally. It has also significantly disrupted the movement of people, businesses, and industries. Additionally, electric vehicle (EV) users have faced challenges in charging their vehicles in public charging locations where there is a risk of COVID-19 exposure. However, a case study of EV charging behavior and its impacts during the SARS-CoV-2 is not addressed in the existing literature. This paper investigates the impacts of COVID-19 on EV charging behavior by analyzing the charging activity during the pandemic using a dataset from a public charging facility in the USA. Data visualization of charging behavior alongside significant timelines of the pandemic was utilized for analysis. Moreover, a cluster analysis using k-means, hierarchical clustering, and Gaussian mixture models was performed to identify common groups of charging behavior based on the vehicle arrival and departure times. Although the number of vehicles using the charging station was reduced significantly due to lockdown restrictions, the charging activity started to pick up again since May 2021 due to an increase in vaccination and easing of public restrictions. However, the charging activity currently still remains around half of the activity pre-pandemic. A noticeable decline in charging session length and an increase in energy consumption can be observed as well. Clustering algorithms identified three groups of charging behavior during the pandemic and their analysis and performance comparison using internal validation measures were also presented.

Download Full-text

gaussian mixture
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection

Double Gaussian mixture model-based terahertz wave dispersion compensation method using convex optimization technique

Low-rank Gaussian mixture modeling of space-snapshot representation of microphone array measurements for acoustic imaging in a complex noisy environment

Construction schedule risk analysis – a hybrid machine learning approach

baredSC: Bayesian approach to retrieve expression distribution of single-cell data

The heritability of BMI varies across the range of BMI: a heritability curve analysis in a twin cohort

Hidden Demographics Barriers of the Economic Growth: A Psychometric Approach

Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences

An Ensemble Prognostic Method of Francis Turbine Units Using Low-Quality Data under Variable Operating Conditions

Impacts of COVID-19 on Electric Vehicle Charging Behavior: Data Analytics, Visualization, and Clustering

Export Citation Format

gaussian mixtureRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection

Double Gaussian mixture model-based terahertz wave dispersion compensation method using convex optimization technique

Low-rank Gaussian mixture modeling of space-snapshot representation of microphone array measurements for acoustic imaging in a complex noisy environment

Construction schedule risk analysis – a hybrid machine learning approach

baredSC: Bayesian approach to retrieve expression distribution of single-cell data

The heritability of BMI varies across the range of BMI: a heritability curve analysis in a twin cohort

Hidden Demographics Barriers of the Economic Growth: A Psychometric Approach

Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences

An Ensemble Prognostic Method of Francis Turbine Units Using Low-Quality Data under Variable Operating Conditions

Impacts of COVID-19 on Electric Vehicle Charging Behavior: Data Analytics, Visualization, and Clustering

gaussian mixture
Recently Published Documents