Model-based clustering for flow and mass cytometry data with clinical information

2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Ko Abe ◽  
Kodai Minoura ◽  
Yuka Maeda ◽  
Hiroyoshi Nishikawa ◽  
Teppei Shimamura

Abstract Background High-dimensional flow cytometry and mass cytometry allow systemic-level characterization of more than 10 protein profiles at single-cell resolution and provide a much broader landscape in many biological applications, such as disease diagnosis and prediction of clinical outcome. When associating clinical information with cytometry data, traditional approaches require two distinct steps: identification of cell populations and a statistical test to determine whether the difference between two population proportions is significant. These two-step approaches can lead to information loss and analysis bias. Results We propose a novel statistical framework, called LAMBDA (Latent Allocation Model with Bayesian Data Analysis), for simultaneous identification of unknown cell populations and discovery of associations between these populations and clinical information. LAMBDA uses probabilistic models tailored to the distinct distributional characteristics of flow and mass cytometry data, respectively. We use a zero-inflated distribution for the mass cytometry data based on the characteristics of the data. A simulation study confirms the usefulness of this model by evaluating the accuracy of the estimated parameters. We also demonstrate that LAMBDA can identify associations between cell populations and their clinical outcomes by analyzing real data. LAMBDA is implemented in R and is available from GitHub (https://github.com/abikoushi/lambda).
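The zero-inflated idea behind LAMBDA's mass-cytometry model can be illustrated with a minimal sketch. This is not the paper's actual model (LAMBDA is implemented in R); the mixture weight `pi_zero`, mean `mu`, and scale `sigma` below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_zero_inflated(pi_zero, mu, sigma, n):
    """Draw n values from a zero-inflated Gaussian: with probability
    pi_zero the value is exactly 0 (e.g. a non-detected marker),
    otherwise it is drawn from N(mu, sigma^2)."""
    values = rng.normal(mu, sigma, n)
    values[rng.random(n) < pi_zero] = 0.0
    return values

x = sample_zero_inflated(pi_zero=0.3, mu=2.0, sigma=0.5, n=10_000)
zero_fraction = np.mean(x == 0.0)  # close to pi_zero
```

The excess mass at exactly zero is what a plain Gaussian cannot capture and what motivates the zero-inflated component.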

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Wasan Katip ◽  
Jukapun Yoodee ◽  
Suriyon Uitrakul ◽  
Peninnah Oberdorfer

Abstract Colistin provides in vitro activity against numerous ESBL-producing and carbapenem-resistant bacteria. However, clinical information on its use in infections caused by ESBL producers is limited. The aim of this study was to compare mortality rates of loading-dose (LD) colistin and carbapenems as definitive therapies in a cohort of patients with infections caused by ESBL-producing Escherichia coli and Klebsiella pneumoniae. A retrospective cohort study of 396 patients with ESBL-producing E. coli and K. pneumoniae infection at a university-affiliated hospital was conducted between 1 January 2005 and 30 June 2015 to compare outcomes of infected patients who received LD colistin (95 patients) with those who received carbapenems (301 patients). The three primary outcomes were 30-day mortality, clinical response and microbiological response. The most common infection types were urinary tract infection (49.49%), followed by pneumonia (40.66%), bacteremia (13.64%), skin and soft tissue infections (4.80%) and intra-abdominal infection (3.03%). The LD colistin group had higher 30-day mortality than the carbapenem group (HR 7.97; 95% CI 3.68 to 17.25; P = 0.001). LD colistin was also independently associated with clinical failure (HR 4.30; 95% CI 1.93 to 9.57; P = 0.001) and bacteriological failure (HR 9.49; 95% CI 3.76 to 23.96; P = 0.001) when compared with carbapenems. LD colistin treatment was thus associated with poorer outcomes in mortality rate, clinical response and microbiological response, and remained less effective than carbapenems after adjustment for confounding factors. It should be noted, however, that the use of Vitek-2 to assess colistin susceptibility could give inaccurate results, and that differences in baseline characteristics could still remain in a retrospective study despite hazard ratio adjustment.
Therefore, clinical use of LD colistin should be recommended as an alternative for the treatment of ESBL-producing Enterobacteriaceae only in circumstances where carbapenems cannot be used, and this recommendation must be considered carefully.


2021 ◽  
Vol 13 (12) ◽  
pp. 6914
Author(s):  
Frikkie Alberts Maré ◽  
Henry Jordaan

The high water intake and wastewater discharge of slaughterhouses have been a concern for many years. One neglected factor in previous research is allocating the water footprint (WF) to beef production’s different products and by-products. The objective of this article was to estimate the WF of different cattle breeds at a slaughterhouse and cutting plant and allocate it according to the different cuts (products) and by-products of beef based on the value fraction of each. The results indicated a negative relationship between the carcass weight and the processing WF when the different breeds were compared. Regarding a specific cut of beef, a kilogram of rib eye from the heaviest breed had a processing WF of 614.57 L/kg, compared to the 919.91 L/kg for the rib eye of the lightest breed. A comparison of the different cuts indicated that high-value cuts had higher WFs than low-value cuts. The difference between a kilogram of rib eye and flank was 426.26 L/kg for the heaviest breed and 637.86 L/kg for the lightest breed. An option to reduce the processing WF of beef is to slaughter heavier animals. This requires no extra investment from the slaughterhouse, while returns should increase, as the average production input per kilogram of output (carcass) should fall when the slaughterhouse processes more kilograms.
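The value-fraction allocation described above can be sketched as follows. All masses, prices, and the total water volume below are hypothetical illustration values, not the article's data.

```python
# Allocate a processing water footprint (WF) across beef cuts by value
# fraction. All numbers are hypothetical, not the article's data.
total_wf_litres = 100_000.0  # total processing water for one carcass batch

cuts = {
    "rib eye": (10.0, 30.0),  # (mass_kg, price_per_kg)
    "flank":   (15.0, 10.0),
}

total_value = sum(mass * price for mass, price in cuts.values())

# Each cut receives a share of the total WF equal to its share of total
# economic value; dividing by the cut's mass expresses it in L/kg.
wf_per_kg = {}
for name, (mass, price) in cuts.items():
    value_fraction = (mass * price) / total_value
    wf_per_kg[name] = total_wf_litres * value_fraction / mass
```

Because the allocation is by value rather than by mass, the higher-priced cut ends up with the higher WF per kilogram, mirroring the article's finding that high-value cuts carry higher WFs than low-value cuts.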


2021 ◽  
Vol 22 (6) ◽  
pp. 2822
Author(s):  
Efstathios Iason Vlachavas ◽  
Jonas Bohn ◽  
Frank Ückert ◽  
Sylvia Nürnberg

Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.


Biometrika ◽  
2020 ◽  
Author(s):  
S Na ◽  
M Kolar ◽  
O Koyejo

Abstract Differential graphical models are designed to represent the difference between the conditional dependence structures of two groups, and are thus of particular interest for scientific investigation. Motivated by modern applications, this manuscript considers an extended setting where each group is generated by a latent variable Gaussian graphical model. Due to the existence of latent factors, the differential network is decomposed into sparse and low-rank components, both of which are symmetric indefinite matrices. We estimate these two components simultaneously using a two-stage procedure: (i) an initialization stage, which computes a simple, consistent estimator, and (ii) a convergence stage, implemented using a projected alternating gradient descent algorithm applied to a nonconvex objective, initialized using the output of the first stage. We prove that given the initialization, the estimator converges linearly with a nontrivial, minimax optimal statistical error. Experiments on synthetic and real data illustrate that the proposed nonconvex procedure outperforms existing methods.
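A naive alternating scheme in the spirit of the sparse-plus-low-rank decomposition can be sketched as below. This is not the authors' two-stage estimator, only an illustration of the two proximal/projection steps such methods alternate between.

```python
import numpy as np

def soft_threshold(A, tau):
    """Entrywise soft-thresholding (proximal step for an l1 penalty)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def low_rank_project(A, r):
    """Project A onto matrices of rank at most r via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s[r:] = 0.0
    return (U * s) @ Vt

def decompose(D, r, tau, n_iter=50):
    """Alternately update the low-rank part L and sparse part S of the
    decomposition D ≈ L + S."""
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(n_iter):
        L = low_rank_project(D - S, r)
        S = soft_threshold(D - L, tau)
    return L, S

# Rank-1 background plus a single large "sparse" entry.
D = np.ones((5, 5))
D[0, 0] += 5.0
L, S = decompose(D, r=1, tau=0.5)
```

The paper's procedure instead runs projected alternating gradient descent on a nonconvex objective from a consistent initializer, which is what yields the linear convergence guarantee.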


2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Jia-Rou Liu ◽  
Po-Hsiu Kuo ◽  
Hung Hung

Large-p-small-n datasets are commonly encountered in modern biomedical studies. To detect the difference between two groups, conventional methods fail to apply due to the instability of variance estimates in the t-test and a high proportion of tied values in AUC (area under the receiver operating characteristic curve) estimates. The significance analysis of microarrays (SAM) may also be unsatisfactory, since its performance is sensitive to the tuning parameter, whose selection is not straightforward. In this work, we propose a robust rerank approach to overcome the above-mentioned difficulties. In particular, we obtain a rank-based statistic for each feature based on the concept of “rank-over-variable.” Techniques of “random subset” and “rerank” are then iteratively applied to rank features, and the leading features are selected for further study. The proposed rerank approach is especially applicable to large-p-small-n datasets. Moreover, it is insensitive to the selection of tuning parameters, which is an appealing property for practical implementation. Simulation studies and real data analysis of pooling-based genome-wide association (GWA) studies demonstrate the usefulness of our method.
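A per-feature rank-based two-group statistic can be sketched as below. This is only an illustration of ranking features by pooled sample ranks; the actual method additionally iterates random subsets and reranking, which the sketch omits, and its "rank-over-variable" statistic may differ in detail.

```python
import numpy as np

def rank_statistic(X, labels):
    """For each feature (row of X), rank the pooled samples and score
    the feature by the difference in mean rank between group 1 and
    group 0 (a rank-sum-style statistic; ties are ignored here)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    scores = np.empty(X.shape[0])
    for j, row in enumerate(X):
        ranks = row.argsort().argsort() + 1  # 1-based ranks
        scores[j] = ranks[labels == 1].mean() - ranks[labels == 0].mean()
    return scores

# Feature 0 cleanly separates the groups; feature 1 is noise.
X = np.array([[1, 2, 3, 10, 11, 12],
              [5, 1, 4, 2, 6, 3]])
labels = np.array([0, 0, 0, 1, 1, 1])
scores = rank_statistic(X, labels)
```

Working on ranks rather than raw values avoids the unstable variance estimates that plague the t-test when n is small.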


Stroke ◽  
2013 ◽  
Vol 44 (suppl_1) ◽  
Author(s):  
Nandakumar Nagaraja ◽  
Steven Warach ◽  
Amie W Hsia ◽  
Sungyoung Auh ◽  
Lawrence L Latour ◽  
...  

Background: Blood pressure (BP) drop in the first 24 hours after stroke onset may occur in response to vessel recanalization. Clinical improvement could be due to recanalization or to better collateral flow with persistent occlusion. We hypothesized that the combination of significant improvement on the NIHSS and a drop in BP at 24 hours post tPA is associated with recanalization. Methods: We included intravenous tPA patients from the Lesion Evolution of Stroke Ischemia On Neuroimaging (LESION) registry who had pre-treatment and 24-hour MRA scans, NIHSS scores at those times, and an M1 MCA occlusion at baseline, but excluded those on pressors, with pre-tPA SBP<120, or with tandem ICA occlusion. We classified recanalization status on the 24-hour MRA as none, partial or complete. We abstracted all BP measurements for the first 24 hours from the chart and calculated BP drop as the difference between the triage pre-tPA BP and the average of the last 3 hours of readings preceding the 24-hour MRI. NIHSS improvement was defined as ≥4-point improvement on the NIHSS or an NIHSS of 0 at 24 hours. Patients with the combination of a drop in BP and NIHSS improvement were compared with the others for recanalization status on the 24-hour MRA by the Kendall tau-b test. Results: Seventeen patients met the study criteria. There were 13 women, the mean age was 76 years and the median baseline NIHSS was 15. On the 24-hour MRA, 3, 8 and 6 patients had none, partial and complete recanalization, respectively. Patients with NIHSS improvement and an SBP drop ≥20 mmHg were more likely to have recanalization at 24 hours (57% vs 0%, p=0.03). Similar patterns were seen for patients with NIHSS improvement and a DBP drop ≥5 mmHg (50% vs 0%, p=0.04) or a MAP drop ≥20 mmHg (50% vs 0%, p=0.04). Complete recanalization was associated only with the combination of NIHSS improvement and an SBP drop ≥20 mmHg (66% vs 0%, p=0.04). No significant association was found for recanalization with NIHSS improvement alone or a drop in BP alone.
Conclusion: There is an association of clinical improvement and BP drop in patients who recanalize. Bedside clinical information may be useful in the management of stroke patients.
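The Kendall tau-b association test used above can be sketched with SciPy on hypothetical data; the variable values below are illustrative, not the study's patient records.

```python
import numpy as np
from scipy.stats import kendalltau

# Hypothetical per-patient data: 1 if the patient had both NIHSS
# improvement and an SBP drop >= 20 mmHg, else 0; recanalization coded
# ordinally as 0 = none, 1 = partial, 2 = complete.
improved_and_dropped = np.array([1, 1, 1, 0, 0, 0, 0, 1])
recanalization = np.array([2, 2, 1, 0, 0, 1, 0, 2])

# scipy.stats.kendalltau computes the tau-b variant by default, which
# handles the many ties a binary-vs-ordinal comparison produces.
tau, p_value = kendalltau(improved_and_dropped, recanalization)
```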


2019 ◽  
Vol 11 (24) ◽  
pp. 7185 ◽  
Author(s):  
Jongsoo Kang ◽  
Marko Majer ◽  
Hyun-Jung Kim

This study examines the effect of omnichannel usage patterns on customers’ purchasing amounts by empirically testing whether purchasing amounts differ significantly across online and offline channel usage patterns. The data were collected from a health and lifestyle company operated by a major pharmaceutical company in Korea, which sells health supplements and skincare products through its own online and offline channels. Customers’ channel usage patterns were categorized into four groups: customers using the online channel only, customers using the offline channel only, customers who first joined membership online and use both channels, and customers who joined membership offline and use both channels. The trading period, total number of purchases, average purchasing amount per transaction and total purchasing amount during the trading period were then compared across the four groups. The results showed statistically significant differences in the number of purchases, average purchasing amount and total purchasing amount for the omnichannel customer groups who used both online and offline channels. However, the difference in purchasing amount between customers who joined membership online and then used the offline channel and customers who joined membership offline and then used the online channel was not statistically significant. By applying empirical data from real customers in online and offline channels rather than survey-based data, this study overcomes a limitation of conventional studies and provides meaningful insight that customers with purchasing experience in both channels show higher performance.


2020 ◽  
Vol 117 (46) ◽  
pp. 28784-28794
Author(s):  
Sisi Chen ◽  
Paul Rivaud ◽  
Jong H. Park ◽  
Tiffany Tsou ◽  
Emeric Charles ◽  
...  

Single-cell measurement techniques can now probe gene expression in heterogeneous cell populations from the human body across a range of environmental and physiological conditions. However, new mathematical and computational methods are required to represent and analyze gene-expression changes that occur in complex mixtures of single cells as they respond to signals, drugs, or disease states. Here, we introduce a mathematical modeling platform, PopAlign, that automatically identifies subpopulations of cells within a heterogeneous mixture and tracks gene-expression and cell-abundance changes across subpopulations by constructing and comparing probabilistic models. Probabilistic models provide a low-error, compressed representation of single-cell data that enables efficient large-scale computations. We apply PopAlign to analyze the impact of 40 different immunomodulatory compounds on a heterogeneous population of donor-derived human immune cells as well as patient-specific disease signatures in multiple myeloma. PopAlign scales to comparisons involving tens to hundreds of samples, enabling large-scale studies of natural and engineered cell populations as they respond to drugs, signals, or physiological change.
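The idea of fitting a probabilistic model to a reference sample and tracking subpopulation abundance shifts in another sample can be sketched with a plain Gaussian mixture. This is only an illustration on synthetic data; PopAlign's own probabilistic models and alignment procedure are more elaborate.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

def make_sample(n, weight_a):
    """Synthetic 'cells' in a 2-gene space: a mixture of two
    well-separated subpopulations with abundances weight_a / 1 - weight_a."""
    n_a = int(n * weight_a)
    pop_a = rng.normal([0.0, 0.0], 0.3, size=(n_a, 2))
    pop_b = rng.normal([3.0, 3.0], 0.3, size=(n - n_a, 2))
    return np.vstack([pop_a, pop_b])

reference = make_sample(1000, weight_a=0.7)
test_sample = make_sample(1000, weight_a=0.3)

# Fit a probabilistic model to the reference sample, then score the
# test sample against the same components to detect abundance shifts.
gmm = GaussianMixture(n_components=2, random_state=0).fit(reference)
ref_weights = gmm.weights_
test_weights = np.bincount(gmm.predict(test_sample), minlength=2) / len(test_sample)
```

Comparing `ref_weights` and `test_weights` component by component reveals which subpopulation expanded or contracted, which is the kind of cell-abundance change the framework tracks across samples.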


Stats ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 111-120 ◽  
Author(s):  
Dewi Rahardja

We construct a point and interval estimation using a Bayesian approach for the difference of two population proportion parameters based on two independent samples of binomial data subject to one type of misclassification. Specifically, we derive an easy-to-implement closed-form algorithm for drawing from the posterior distributions. For illustration, we apply our algorithm to a real data example. Finally, we conduct simulation studies to demonstrate the efficiency of our algorithm for Bayesian inference.
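A minimal version of this setup, without the paper's misclassification adjustment, admits closed-form Beta posteriors that are easy to sample; the counts below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts: successes / trials in each group.
x1, n1 = 45, 100
x2, n2 = 30, 100

# Independent Beta(1, 1) priors give Beta posteriors in closed form;
# the posterior of the difference p1 - p2 is obtained by sampling each
# proportion and subtracting.
p1 = rng.beta(1 + x1, 1 + n1 - x1, size=50_000)
p2 = rng.beta(1 + x2, 1 + n2 - x2, size=50_000)
delta = p1 - p2

point_estimate = delta.mean()
ci_low, ci_high = np.quantile(delta, [0.025, 0.975])
```

The paper's contribution is to retain this kind of easy posterior sampling while additionally modeling a misclassification mechanism in the observed binomial counts.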


Author(s):  
Yakup Ari

Financial time series are observed at high frequency and at irregularly spaced times, so continuous-time models can be used instead of discrete-time series models. The purpose of this chapter is to define Lévy-driven continuous autoregressive moving average (CARMA) models and their applications. The CARMA model is an explicit solution to a stochastic differential equation and is the continuous-time analogue of the discrete ARMA models. To form a basis for CARMA processes, the structure of discrete-time process models is examined first. Then stochastic differential equations, Lévy processes, compound Poisson processes, and variance gamma processes are defined. Finally, the parameter estimation of CARMA(2,1) is discussed as an example. The most common method for parameter estimation of a CARMA process is pseudo maximum likelihood estimation (PMLE), which maps the ARMA coefficient estimates to the corresponding estimates of the CARMA coefficients. Furthermore, a simulation study and a real data application are given as examples.
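The simplest case, a CAR(1) process driven by Brownian motion (a special case of the Lévy drivers discussed in the chapter), is the Ornstein-Uhlenbeck process; an Euler-Maruyama simulation sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# CAR(1) with a Brownian driver is the Ornstein-Uhlenbeck process
#   dX_t = -a * X_t dt + sigma dW_t,
# whose stationary variance is sigma^2 / (2 * a).
a, sigma = 2.0, 1.0
dt, n_steps = 0.01, 100_000

x = np.empty(n_steps)
x[0] = 0.0
noise = rng.normal(0.0, np.sqrt(dt), n_steps - 1)
for t in range(n_steps - 1):
    x[t + 1] = x[t] - a * x[t] * dt + sigma * noise[t]

# Discard a burn-in and compare the sample variance to 0.25.
sample_var = x[n_steps // 10:].var()
```

Sampling this path at the discrete grid points recovers an AR(1) series, which is the sense in which CARMA models are the continuous-time analogues of ARMA models.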

