regression imputation
Recently Published Documents


Total documents: 49 (five years: 24)

H-index: 8 (five years: 2)

Author(s):  
Sugondo Hadiyoso ◽  
Heru Nugroho ◽  
Tati Latifah Erawati Rajab ◽  
Kridanto Surendro

The development of a mesh topology in multi-node electrocardiogram (ECG) monitoring based on the ZigBee protocol still has limitations. When more than one active ECG node sends a data stream, data can be corrupted or damaged because synchronization fails. The incorrect data affect signal interpretation, so a mechanism is needed to correct or predict the damaged data. In this study, the expectation-maximization (EM) and regression imputation (RI) methods are proposed to overcome these problems. Real data from previous studies are the main data source used in this study. The predicted ECG signal data are compared with the actual ECG data stored in the main controller memory, and the root mean square error (RMSE) is calculated to measure system performance. The simulation was performed on 13 ECG waves, each with 1000 samples. The simulation results show that the EM method has a lower prediction error than the RI method: the average RMSE for the EM and RI methods is 4.77 and 6.63, respectively. The proposed method is expected to be useful for multi-node ECG monitoring, especially in ZigBee applications, to minimize errors.
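As a rough illustration of the RI-plus-RMSE evaluation described in this abstract, the sketch below fills missing samples of a signal by ordinary least squares on a fully observed predictor and scores the result against the known truth. The synthetic linear signal, the single predictor `x`, and the function names are assumptions made for illustration only; the abstract does not say which regressors the authors used, and this is not their implementation.

```python
import numpy as np

def regression_impute(x, y):
    """Fill NaNs in y with predictions from an OLS fit of y on x (observed rows only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept]
    slope, intercept = np.polyfit(x[observed], y[observed], deg=1)
    filled = y.copy()
    filled[~observed] = slope * x[~observed] + intercept
    return filled

def rmse(actual, predicted):
    """Root mean square error between two equally sized arrays."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in for one 1000-sample wave (assumption: not real ECG data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 1000)
truth = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=x.size)

# Damage 100 samples, then restore them by regression imputation.
damaged = truth.copy()
missing = rng.choice(x.size, size=100, replace=False)
damaged[missing] = np.nan

restored = regression_impute(x, damaged)
error = rmse(truth[missing], restored[missing])
print(f"RMSE on imputed samples: {error:.4f}")
```

The RMSE is computed only over the imputed positions, which mirrors comparing predicted samples with the stored reference signal.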


Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3252
Author(s):  
Encarnación Álvarez-Verdejo ◽  
Pablo J. Moya-Fernández ◽  
Juan F. Muñoz-Rosas

The problem of missing data is a common feature in any study, and a single imputation method is often applied to deal with this problem. The first contribution of this paper is to analyse the empirical performance of some traditional single imputation methods when they are applied to the estimation of the Gini index, a popular measure of inequality used in many studies. Various methods for constructing confidence intervals for the Gini index are also empirically evaluated. We consider several empirical measures to analyse the performance of estimators and confidence intervals, allowing us to quantify the magnitude of the non-response bias problem. We find extremely large biases under certain non-response mechanisms, and this problem gets noticeably worse as the proportion of missing data increases. For a large correlation coefficient between the target and auxiliary variables, the regression imputation method may notably mitigate this bias problem, yielding appropriate mean square errors. We also find that confidence intervals have poor coverage rates when the probability of data being missing is not uniform, and that the regression imputation method substantially improves the handling of this problem as the correlation coefficient increases.
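The effect described in this abstract can be sketched on simulated data: compute the Gini index on complete data, then again after mean imputation and after regression imputation when non-response depends on an auxiliary variable. Everything here is an assumption for illustration (the gamma-distributed auxiliary variable, the linear income model, the response probabilities, and the function names); it is not the simulation design of the paper.

```python
import numpy as np

def gini(values):
    """Gini index of non-negative values using the sorted-rank formula."""
    y = np.sort(np.asarray(values, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * y) / (n * np.sum(y)) - (n + 1) / n)

def mean_impute(y):
    """Replace NaNs with the mean of the observed values."""
    filled = y.copy()
    filled[np.isnan(y)] = np.nanmean(y)
    return filled

def regression_impute(x, y):
    """Replace NaNs in y with OLS predictions from the auxiliary variable x."""
    observed = ~np.isnan(y)
    slope, intercept = np.polyfit(x[observed], y[observed], deg=1)
    filled = y.copy()
    filled[~observed] = slope * x[~observed] + intercept
    return filled

rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=2.0, size=2000)               # auxiliary variable
y_full = 5.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)  # strongly correlated target

# Non-uniform non-response: units with large x are more likely to be missing.
p_missing = np.where(x > np.median(x), 0.6, 0.1)
y_obs = y_full.copy()
y_obs[rng.random(x.size) < p_missing] = np.nan

g_full = gini(y_full)
g_mean = gini(mean_impute(y_obs))
g_reg = gini(regression_impute(x, y_obs))
print(f"complete: {g_full:.4f}  mean-imputed: {g_mean:.4f}  regression-imputed: {g_reg:.4f}")
```

Because the target is almost a linear function of the auxiliary variable in this setup, the regression-imputed estimate stays close to the complete-data value, while mean imputation compresses the distribution.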


2021 ◽  
Author(s):  
Konstantinos Slavakis ◽  
Gaurav Shetty ◽  
Loris Cannelli ◽  
Gesualdo Scutari ◽  
Ukash Nakarmi ◽  
...  

This paper introduces a non-parametric approximation framework for imputation-by-regression on data with missing entries. The proposed framework, coined kernel regression imputation in manifolds (KRIM), is built on the hypothesis that features, generated by the measured data, lie close to an unknown-to-the-user smooth manifold. The feature space, in which the smooth manifold is embedded, takes the form of a reproducing kernel Hilbert space (RKHS). Aiming at concise data descriptions, KRIM identifies a small number of "landmark points" to define approximating "linear patches" in the feature space which mimic tangent spaces to smooth manifolds. This geometric information is infused into the design through a novel bi-linear model that allows for multiple approximating RKHSs. To effect imputation-by-regression, a bi-linear inverse problem is solved by an iterative algorithm with guaranteed convergence to a stationary point of a non-convex loss function. To showcase KRIM's modularity, the application of KRIM to dynamic magnetic resonance imaging (dMRI) is detailed, where reconstruction of images from severely under-sampled dMRI data is desired. Extensive numerical tests on synthetic and real dMRI data demonstrate the superior performance of KRIM over state-of-the-art approaches under several metrics and with a small computational footprint.


2021 ◽  
Author(s):  
Trenton J. Davis ◽  
Tarek R. Firzli ◽  
Emily A. Higgins Keppler ◽  
Matt Richardson ◽  
Heather D. Bean

Missing data is a significant issue in metabolomics that is often neglected when conducting data pre-processing, particularly when it comes to imputation. This can have serious implications for downstream statistical analyses and lead to misleading or uninterpretable inferences. In this study, we aim to identify the primary types of missingness that affect untargeted metabolomics data and compare strategies for imputation using two real-world comprehensive two-dimensional gas chromatography (GC×GC) data sets. We also present these goals in the context of experimental replication whereby imputation is conducted in a within-replicate-based fashion (the first description and evaluation of this strategy) and introduce an R package MetabImpute to carry out these analyses. Our results indicate that, in these two data sets, missingness was most likely of the missing at-random (MAR) and missing not-at-random (MNAR) types as opposed to missing completely at-random (MCAR). Gibbs sampler imputation and Random Forest gave the best results when imputing MAR and MNAR compared against single-value imputation (zero, minimum, mean, median, and half-minimum) and other more sophisticated approaches (Bayesian principal components analysis and quantile regression imputation for left-censored data). When samples are replicated, within-replicate imputation approaches led to an increase in the reproducibility of peak quantification compared to imputation that ignores replication, suggesting that imputing with respect to replication may preserve potentially important features in downstream analyses for biomarker discovery.
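The single-value strategies named in this abstract (zero, minimum, mean, median, half-minimum) and the idea of imputing within replicate groups can be sketched as follows. This is a Python illustration under stated assumptions, not the MetabImpute R package or its API: the function names, the `FILLERS` table, and the choice to apply a single-value rule separately inside each replicate group are all hypothetical.

```python
import numpy as np

# Each rule maps the observed values of one feature to a single fill value.
FILLERS = {
    "zero": lambda v: 0.0,
    "minimum": np.nanmin,
    "mean": np.nanmean,
    "median": np.nanmedian,
    "half_minimum": lambda v: 0.5 * np.nanmin(v),
}

def single_value_impute(values, method):
    """Replace every NaN in one feature vector with a single summary value."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(values)] = FILLERS[method](values)
    return filled

def within_replicate_impute(values, replicate_ids, method="mean"):
    """Apply a single-value rule separately inside each replicate group."""
    values = np.asarray(values, dtype=float)
    replicate_ids = np.asarray(replicate_ids)
    filled = values.copy()
    for rid in np.unique(replicate_ids):
        group = replicate_ids == rid
        if np.all(np.isnan(values[group])):
            continue  # nothing observed in this group; leave the gap untouched
        filled[group] = single_value_impute(values[group], method)
    return filled

# Hypothetical peak areas for one metabolite across two replicate groups.
peaks = np.array([4.0, np.nan, 6.0, 20.0, 22.0, np.nan])
replicates = np.array(["A", "A", "A", "B", "B", "B"])

print(single_value_impute(peaks, "mean"))
print(within_replicate_impute(peaks, replicates, "mean"))
```

In this toy input the global mean pulls both gaps toward the same value, whereas the within-replicate version fills each gap from its own group, which is the intuition behind the reproducibility gain the abstract reports.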


2021 ◽  
Author(s):  
Konstantinos Slavakis ◽  
Gaurav Shetty ◽  
Loris Cannelli ◽  
Gesualdo Scutari ◽  
Ukash Nakarmi ◽  
...  

This paper introduces a non-parametric kernel-based modeling framework for imputation by regression on data that are assumed to lie close to an unknown-to-the-user smooth manifold in a Euclidean space. The proposed framework, coined kernel regression imputation in manifolds (KRIM), needs no training data to operate. Aiming at computationally efficient solutions, KRIM utilizes a small number of "landmark" data-points to extract geometric information from the measured data via parsimonious affine combinations ("linear patches"), which mimic the concept of tangent spaces to smooth manifolds and take place in functional approximation spaces, namely reproducing kernel Hilbert spaces (RKHSs). Multiple complex RKHSs are combined in a data-driven way to surmount the obstacle of pin-pointing the "optimal" parameters of a single kernel through cross-validation. The extracted geometric information is incorporated into the design via a novel bi-linear data-approximation model, and the imputation-by-regression task takes the form of an inverse problem which is solved by an iterative algorithm with guaranteed convergence to a stationary point of the non-convex loss function. To showcase the modular character and wide applicability of KRIM, this paper highlights the application of KRIM to dynamic magnetic resonance imaging (dMRI), where reconstruction of high-resolution images from severely under-sampled dMRI data is desired. Extensive numerical tests on synthetic and real dMRI data demonstrate the superior performance of KRIM over state-of-the-art approaches under several metrics and with a small computational footprint.
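To give a feel for imputation by regression in an RKHS, the sketch below uses plain kernel ridge regression with a single Gaussian kernel to fill missing samples of a one-dimensional signal. It is deliberately much simpler than KRIM: there are no landmark points, no linear patches, no multiple kernels, and no bi-linear model. The bandwidth, the ridge weight, and the function names are assumptions chosen for this toy example.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between two 1-D arrays of sample locations."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_ridge_impute(t, y, bandwidth=0.1, ridge=1e-3):
    """Fill NaNs in y(t) with kernel ridge regression fitted on the observed samples."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    K = gaussian_kernel(t[observed], t[observed], bandwidth)
    # Solve (K + ridge * I) alpha = y_observed for the expansion coefficients.
    alpha = np.linalg.solve(K + ridge * np.eye(K.shape[0]), y[observed])
    filled = y.copy()
    filled[~observed] = gaussian_kernel(t[~observed], t[observed], bandwidth) @ alpha
    return filled

# Smooth synthetic signal with every fifth interior sample removed.
t = np.linspace(0.0, 1.0, 200)
truth = np.sin(2.0 * np.pi * t)
y = truth.copy()
missing = np.arange(5, 195, 5)
y[missing] = np.nan

restored = kernel_ridge_impute(t, y)
max_error = float(np.max(np.abs(restored[missing] - truth[missing])))
print(f"largest absolute imputation error: {max_error:.4f}")
```

The bandwidth here is fixed by hand; the abstract's point about combining multiple RKHSs is precisely that such a choice would otherwise have to be tuned, for example by cross-validation.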


2021 ◽  
Vol 11 (3) ◽  
pp. 224-229
Author(s):  
Xiang Gao ◽  
Guanghui Li ◽  
Rong Tan ◽  
Leijiang Yao

With the rapid development of machine learning, it is possible to use neural networks to build models that predict the performance of Ceramic Matrix Composites (CMCs) from raw materials and environments. In traditional materials science and engineering, developing a new CMC has always taken a long time. Furthermore, there is still no theoretical basis to guide the design of experiments for developing CMCs with ideal performance. This work proposes a model that predicts the bending strength of CMCs with a Convolutional Neural Network (CNN), using 8 factors considered to be the main influences on bending strength. Because the data were all collected from papers published in journals and conferences, and there is no standard for describing an experiment, the incompleteness of the data seriously degrades the performance of our model. We therefore tried several methods to fill in the missing data; regression imputation with a dual-hidden-layer neural network yielded a significant improvement in the CNN bending strength prediction model.

