Use of gene expression studies to investigate the human immunological response to malaria infection

Abstract

Background: Transcriptional profiling of the human immune response to malaria has been used to identify diagnostic markers, understand the pathogenicity of severe disease and dissect the mechanisms of naturally acquired immunity (NAI). However, interpreting this body of work is difficult given considerable variation in study design, definition of disease, patient selection and methodology. This work presents a comprehensive review of gene expression profiling (GEP) of the human immune response to malaria to determine how this technology has been applied to date, where it has advanced understanding of NAI, and how much methodology varies between studies, so that data can be compared and results interpreted on an informed basis.

Methods: Datasets in the Gene Expression Omnibus (GEO) matching the search terms ‘plasmodium’ or ‘malaria’ or ‘sporozoite’ or ‘merozoite’ or ‘gametocyte’ and ‘Homo sapiens’ were identified and their publications analysed. Datasets of gene expression changes in relation to malaria vaccines were excluded.

Results: Twenty-three GEO datasets and 25 related publications were included in the final review. All datasets related to Plasmodium falciparum infection, except two that related to Plasmodium vivax infection. Most datasets included samples from individuals infected with malaria ‘naturally’ in the field (n = 13, 57%); others came from controlled human malaria infection (CHMI) studies (n = 6, 26%) or from cells stimulated with Plasmodium in vitro (n = 6, 26%). Most studies examined gene expression changes relating to the blood stage of the parasite. Significant heterogeneity between datasets was identified in terms of study design, sample type, platform used and method of analysis. Seven datasets specifically investigated transcriptional changes associated with NAI to malaria; in most of these, the evidence supported suppression of the innate pro-inflammatory response as an important mechanism. However, further interpretation of this body of work was limited by heterogeneity between studies and small sample sizes.

Conclusions: GEP in malaria is a potentially powerful tool, but to date studies have been hypothesis-generating, with small sample sizes and widely varying methodology. As CHMI studies are increasingly performed in endemic settings, there will be growing opportunity to use GEP to characterise time-course changes in host response and to understand the mechanisms of NAI in greater detail.


Assessing Differential Gene Expression with Small Sample Sizes in Oligonucleotide Arrays Using a Mean-Variance Model

Biometrics, 2007, Vol 63 (1), pp. 41-49. DOI: 10.1111/j.1541-0420.2006.00675.x. Cited by ~17.

Author(s): Jianhua Hu, Fred A. Wright

Keyword(s): Gene Expression, Differential Gene Expression, Small Sample, Sample Sizes, Variance Model, Oligonucleotide Arrays, Differential Gene, Small Sample Sizes, Mean Variance


A quantum leap in the reproducibility, precision, and sensitivity of gene expression profile analysis even when sample size is extremely small

Journal of Bioinformatics and Computational Biology, 2015, Vol 13 (04), pp. 1550018. DOI: 10.1142/s0219720015500183. Cited by ~16.

Author(s): Kevin Lim, Zhenhua Li, Kwok Pui Choi, Limsoon Wong

Keyword(s): Gene Expression, Sample Size, Small Sample Size, Statistical Tests, Transcript Level, Small Sample, Sample Sizes, Quantum Leap, Two Samples, Small Sample Sizes

Transcript-level quantification is often measured across two groups of patients to aid the discovery of biomarkers and detection of biological mechanisms involving these biomarkers. Statistical tests lack power and false discovery rate is high when sample size is small. Yet, many experiments have very few samples (≤ 5). This creates the impetus for a method to discover biomarkers and mechanisms under very small sample sizes. We present a powerful method, ESSNet, that is able to identify subnetworks consistently across independent datasets of the same disease phenotypes even under very small sample sizes. The key idea of ESSNet is to fragment large pathways into smaller subnetworks and compute a statistic that discriminates the subnetworks in two phenotypes. We do not greedily select genes to be included based on differential expression but rely on gene-expression-level ranking within a phenotype, which is shown to be stable even under extremely small sample sizes. We test our subnetworks on null distributions obtained by array rotation; this preserves the gene–gene correlation structure and is suitable for datasets with small sample size allowing us to consistently predict relevant subnetworks even when sample size is small. For most other methods, this consistency drops to less than 10% when we test them on datasets with only two samples from each phenotype, whereas ESSNet is able to achieve an average consistency of 58% (72% when we consider genes within the subnetworks) and continues to be superior when sample size is large. We further show that the subnetworks identified by ESSNet are highly correlated to many references in the biological literature. ESSNet and supplementary material are available at: http://compbio.ddns.comp.nus.edu.sg:8080/essnet .


Evaluation of TagSeq, a reliable low-cost alternative for RNAseq

2016. DOI: 10.1101/036426. Cited by ~5.

Author(s): Brian Keith Lohman, Jesse N Weber, Daniel I Bolnick

Keyword(s): Gene Expression, Statistical Power, Low Cost, Ecological Genetics, Small Sample, Sample Sizes, Experimental Conditions, Efficient Alternative, Highly Correlated, Small Sample Sizes

RNAseq is a relatively new tool for ecological genetics that offers researchers insight into changes in gene expression in response to a myriad of natural or experimental conditions. However, standard RNAseq methods (e.g., Illumina TruSeq® or NEBNext®) can be cost prohibitive, especially when study designs require large sample sizes. Consequently, RNAseq is often underused as a method, or is applied to small sample sizes that confer poor statistical power. Low-cost RNAseq methods could therefore enable far greater and more powerful applications of transcriptomics in ecological genetics and beyond. Standard mRNAseq is costly partly because one sequences portions of the full length of all transcripts. Such whole-mRNA data are redundant for estimates of relative gene expression. TagSeq is an alternative method that focuses sequencing effort on the 3-prime end of mRNAs, thereby reducing the necessary sequencing depth per sample, and thus cost. Here we present a revised TagSeq protocol and compare its performance against NEBNext®, the gold-standard whole mRNAseq method. We built both TagSeq and NEBNext® libraries from the same biological samples, each spiked with control RNAs. We found that TagSeq measured the control RNA distribution more accurately than NEBNext®, for a fraction of the cost per sample (~10%). The higher accuracy of TagSeq was particularly apparent for transcripts of moderate to low abundance. Technical replicates of TagSeq libraries were highly correlated with each other, and were also correlated with NEBNext® results. Overall, we show that our modified TagSeq protocol is an efficient alternative to traditional whole mRNAseq, offering researchers comparable data at greatly reduced cost.


Problems with small sample sizes in psychophysiological research

PsycEXTRA Dataset, 1996. DOI: 10.1037/e526132012-267.

Author(s): Todd C. Riniolo, Stephen W. Porges

Keyword(s): Small Sample, Sample Sizes, Psychophysiological Research, Small Sample Sizes


Bayesian Latent Growth Mixture-Modeling With Small Sample Sizes

PsycEXTRA Dataset, 2014. DOI: 10.1037/e568142014-001.

Author(s): Sarah Depaoli

Keyword(s): Growth Mixture Modeling, Mixture Modeling, Small Sample, Sample Sizes, Latent Growth, Growth Mixture, Latent Growth Mixture Modeling, Small Sample Sizes


No Evidence that Experiencing Physical Warmth Promotes Interpersonal Warmth: Two Failures to Replicate Williams and Bargh (2008)

2018. DOI: 10.31234/osf.io/mvn9b. Cited by ~1.

Author(s): Christopher Chabris, Patrick Ryan Heck, Jaclyn Mandart, Daniel Jacob Benjamin, Daniel J. Simons

Keyword(s): Null Hypothesis, Small Sample, Sample Sizes, Double Blind, Bayesian Analyses, Physical Warmth, Small Sample Sizes, Interpersonal Warmth

Williams and Bargh (2008) reported that holding a hot cup of coffee caused participants to judge a person’s personality as warmer, and that holding a therapeutic heat pad caused participants to choose rewards for other people rather than for themselves. These experiments featured large effects (r = .28 and .31), small sample sizes (41 and 53 participants), and barely statistically significant results. We attempted to replicate both experiments in field settings with more than triple the sample sizes (128 and 177) and double-blind procedures, but found near-zero effects (r = –.03 and .02). In both cases, Bayesian analyses suggest there is substantially more evidence for the null hypothesis of no effect than for the original physical warmth priming hypothesis.


Assessing the human immune response to SARS-CoV-2 variants

Nature Medicine, 2021, Vol 27 (4), pp. 571-572. DOI: 10.1038/s41591-021-01290-0. Cited by ~1.

Author(s): Roberto Burioni, Eric J. Topol

Keyword(s): Immune Response, Human Immune Response, Human Immune


High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis

Human Genomics, 2021, Vol 15 (1). DOI: 10.1186/s40246-021-00308-5.

Author(s): Weitong Cui, Huaru Xue, Lei Wei, Jinghua Jin, Xuewen Tian, ...

Keyword(s): Gene Expression, Differential Expression, Small Sample, Differentially Expressed, Cancer Type, Rna Seq, Sample Sizes, Large Sample, Expression Levels, Gene Expression Levels

Abstract

Background: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible.

Results: Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis.

Conclusions: High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability, and DE results should be interpreted cautiously unless soundly validated.


G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes

Scientific Reports, 2021, Vol 11 (1). DOI: 10.1038/s41598-021-81110-0.

Author(s): Florent Le Borgne, Arthur Chatton, Maxime Léger, Rémi Lenain, Yohann Foucher

Keyword(s): Machine Learning, Support Vector Machine, Statistical Power, Small Sample, Causal Effects, Small Samples, Support Vector, Sample Sizes, Super Learner, Small Sample Sizes

Abstract: In clinical research, there is a growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary and that is able to deal with small samples. We evaluated the performance of several methods, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner, through simulations. We proposed six different scenarios characterised by various sample sizes, numbers of covariates and relationships between covariates, exposure statuses, and outcomes. We have also illustrated the application of these methods, in which they were used to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in two counterfactual worlds, we reported that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation associated with the super learner performed well for drawing causal inferences, even from small sample sizes.
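To make the idea of predicting each individual's outcome probability "in two counterfactual worlds" concrete, here is a minimal sketch of G-computation for a binary exposure and a binary outcome. It is not the authors' method: instead of a machine-learning outcome model such as the super learner, it assumes a single binary confounder and estimates outcome risks by simple stratification. The function name, the toy counts and the helper `make_stratum` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def g_computation(records):
    """Standardised risks under exposure and non-exposure.

    records: list of (confounder, exposure, outcome) tuples, where
    exposure and outcome are coded 0/1. The outcome model here is a
    plain stratified proportion (an assumption made for this sketch).
    """
    records = list(records)
    totals = defaultdict(int)   # number of people per (confounder, exposure)
    events = defaultdict(int)   # number of outcomes per (confounder, exposure)
    for l, a, y in records:
        totals[(l, a)] += 1
        events[(l, a)] += y

    def risk(l, a):
        # Estimated P(Y = 1 | A = a, L = l); needs at least one person
        # in every (confounder, exposure) cell (positivity).
        if totals[(l, a)] == 0:
            raise ValueError(f"no data for confounder={l}, exposure={a}")
        return events[(l, a)] / totals[(l, a)]

    # Predict every individual's risk in both counterfactual worlds,
    # then average over the whole sample.
    n = len(records)
    risk_exposed = sum(risk(l, 1) for l, _, _ in records) / n
    risk_unexposed = sum(risk(l, 0) for l, _, _ in records) / n
    return risk_exposed, risk_unexposed, risk_exposed - risk_unexposed


def make_stratum(l, a, n, n_events):
    """Build n records for one (confounder, exposure) cell (toy data)."""
    return [(l, a, 1)] * n_events + [(l, a, 0)] * (n - n_events)


# Toy data: exposure is more common when the confounder is 1,
# and the confounder also raises the outcome risk.
toy = (
    make_stratum(0, 1, 20, 4)
    + make_stratum(0, 0, 80, 8)
    + make_stratum(1, 1, 80, 40)
    + make_stratum(1, 0, 20, 6)
)

risk1, risk0, risk_diff = g_computation(toy)

# Crude (unadjusted) comparison of exposed versus unexposed, for contrast.
exposed = [y for _, a, y in toy if a == 1]
unexposed = [y for _, a, y in toy if a == 0]
crude_diff = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

print(f"standardised risks: {risk1:.2f} vs {risk0:.2f}, difference {risk_diff:.2f}")
print(f"crude difference: {crude_diff:.2f}")
```

With these toy counts the standardised risk difference is about 0.15, whereas the crude difference is about 0.30, because exposed individuals are concentrated in the higher-risk stratum. Swapping the stratified proportions in `risk` for the predictions of a fitted model is where a machine-learning learner would enter in the approach the abstract describes.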


Approaches to Modelling the Human Immune Response in Transition of Candidates from Research to Development

Journal of Immunology Research, 2014, Vol 2014, pp. 1-6. DOI: 10.1155/2014/395302. Cited by ~3.

Author(s): Diane Williamson

Keyword(s): Immune Response, Human Immune Response, Human Immune, Animal Rule

This review considers the steps required to evaluate a candidate biodefense vaccine or therapy as it emerges from the research phase, in order to transition it to development. The options for preclinical modelling of efficacy are considered in the context of the FDA’s Animal Rule.
