Inferring transmission bottleneck size from viral sequence data using a novel haplotype reconstruction method

The transmission bottleneck is defined as the number of viral particles transmitted from one host to another. Genome sequence data have been used to evaluate the size of the transmission bottleneck between humans infected with the influenza virus; however, the methods used to make these estimates have some limitations. Specifically, approaches using viral allele frequency data may not fully capture a process which involves the transmission of entire viral genomes. Here we set out a novel approach for inferring viral transmission bottlenecks; our method combines haplotype reconstruction, a method for inferring the composition of genomes in a viral population, with two maximum likelihood methods for bottleneck inference, tailored for small and large bottleneck sizes respectively. Our method allows for rapid calculation, and performs well when applied to data from simulated transmission events, being robust to errors in the haplotype reconstruction process. Applied to data from a previous household transmission study of influenza A infection, we confirm the result that the majority of transmission events involve a small number of viruses, albeit with slightly looser bottlenecks being inferred, with between 1 and 13 particles transmitted in the majority of cases. While influenza A transmission involves a tight population bottleneck, the bottleneck is not so tight as to universally prevent the transmission of within-host viral diversity.

Inferring Transmission Bottleneck Size from Viral Sequence Data Using a Novel Haplotype Reconstruction Method

Journal of Virology ◽

10.1128/jvi.00014-20 ◽

2020 ◽

Vol 94 (13) ◽

Cited By ~ 2

Author(s):

Mahan Ghafari ◽

Casper K. Lumby ◽

Daniel B. Weissman ◽

Christopher J. R. Illingworth

Keyword(s):

Influenza A ◽

Evolutionary Dynamics ◽

Sequence Data ◽

Viral Evolution ◽

Population Bottleneck ◽

Reconstruction Method ◽

Haplotype Reconstruction ◽

Viral Sequence ◽

Viral Particles ◽

Transmission Bottleneck

ABSTRACT The transmission bottleneck is defined as the number of viral particles that transmit from one host to establish an infection in another. Genome sequence data have been used to evaluate the size of the transmission bottleneck between humans infected with the influenza virus; however, the methods used to make these estimates have some limitations. Specifically, viral allele frequencies, which form the basis of many calculations, may not fully capture a process which involves the transmission of entire viral genomes. Here, we set out a novel approach for inferring viral transmission bottlenecks; our method combines an algorithm for haplotype reconstruction with maximum likelihood methods for bottleneck inference. This approach allows for rapid calculation and performs well when applied to data from simulated transmission events; errors in the haplotype reconstruction step did not adversely affect inferences of the population bottleneck. Applied to data from a previous household transmission study of influenza A infection, we confirm the result that the majority of transmission events involve a small number of viruses, albeit with slightly looser bottlenecks being inferred, with between 1 and 13 particles transmitted in the majority of cases. While influenza A transmission involves a tight population bottleneck, the bottleneck is not so tight as to universally prevent the transmission of within-host viral diversity.

IMPORTANCE Viral populations undergo a repeated cycle of within-host growth followed by transmission. Viral evolution is affected by each stage of this cycle. The number of viral particles transmitted from one host to another, known as the transmission bottleneck, is an important factor in determining how the evolutionary dynamics of the population play out, restricting the extent to which the evolved diversity of the population can be passed from one host to another. Previous study of viral sequence data has suggested that the transmission bottleneck size for influenza A transmission between human hosts is small. Reevaluating these data using a novel and improved method, we largely confirm this result, although we infer a slightly higher bottleneck size in some cases, of between 1 and 13 virions. While a tight bottleneck operates in human influenza transmission, it is not extreme in nature; some diversity can be meaningfully retained between hosts.
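The idea of maximum likelihood bottleneck inference can be illustrated with a deliberately simplified sketch. It is not the method of the paper above: it assumes a population with only two haplotypes, founders drawn binomially from the donor, no within-host growth or selection, and binomial sequencing noise in the recipient. The function names, the `max_nb` search limit, and the example read counts are all hypothetical.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def log_likelihood(n_b, donor_freq, variant_reads, total_reads):
    """Log-likelihood of a bottleneck of n_b founders (toy two-haplotype model).

    Assumptions (illustrative only): k founders carry the variant haplotype
    with k ~ Binomial(n_b, donor_freq); the recipient frequency is exactly
    k / n_b; reads are sampled binomially from that frequency.
    """
    total = 0.0
    for k in range(n_b + 1):
        founders = binom_pmf(k, n_b, donor_freq)
        observed = binom_pmf(variant_reads, total_reads, k / n_b)
        total += founders * observed
    return log(total) if total > 0 else float("-inf")


def ml_bottleneck(donor_freq, variant_reads, total_reads, max_nb=50):
    """Return the bottleneck size in 1..max_nb with the highest likelihood."""
    return max(
        range(1, max_nb + 1),
        key=lambda n_b: log_likelihood(n_b, donor_freq, variant_reads, total_reads),
    )


if __name__ == "__main__":
    # A haplotype at 50% in the donor that is absent from the recipient
    # is best explained by a single founder under this toy model.
    print(ml_bottleneck(0.5, 0, 100))
```

In this toy model a haplotype that is common in the donor but missing from the recipient favours a very tight bottleneck, whereas a haplotype seen at intermediate frequency in both hosts rules out a single founder.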

The Linkage Method: A Novel Approach for SNP Detection and Haplotype Reconstruction from a Single Diploid Individual Using Next-Generation Sequence Data

Molecular Biology and Evolution ◽

10.1093/molbev/mst103 ◽

2013 ◽

Vol 30 (9) ◽

pp. 2187-2196 ◽

Cited By ~ 5

Author(s):

E. Sasaki ◽

R. P. Sugino ◽

H. Innan

Keyword(s):

Sequence Data ◽

Haplotype Reconstruction ◽

Next Generation ◽

Snp Detection ◽

Novel Approach ◽

Linkage Method ◽

Diploid Individual

Differential Expression of Serum Exosome microRNAs and Cytokines in Influenza A and B Patients Collected in the 2016 and 2017 Influenza Seasons

Pathogens ◽

10.3390/pathogens10020149 ◽

2021 ◽

Vol 10 (2) ◽

pp. 149

Author(s):

Sreekumar Othumpangat ◽

William G. Lindsley ◽

Donald H. Beezhold ◽

Michael L. Kashon ◽

Carmen N. Burrell ◽

...

Keyword(s):

Healthy Volunteers ◽

Influenza A ◽

Influenza Infection ◽

Cell Types ◽

Airway Epithelial Cells ◽

Differentially Expressed ◽

Influenza B ◽

Airway Epithelial ◽

Monocyte Chemoattractant Protein 1 ◽

Novel Approach

MicroRNAs (miRNAs) have remarkable stability and are key regulators of mRNA transcripts for several essential proteins required for the survival of cells and replication of the virus. Exosomes are thought to play an essential role in intercellular communications by transporting proteins and miRNAs, making them ideal in the search for biomarkers. Evidence suggests that miRNAs are involved in the regulation of influenza virus replication in many cell types. During the 2016 and 2017 influenza seasons, we collected blood samples from 54 patients infected with influenza and from 30 healthy volunteers to identify the potential role of circulating serum miRNAs and cytokines in influenza infection. Data comparing the exosomal miRNAs in patients with influenza B to healthy volunteers showed 76 miRNAs that were differentially expressed (p < 0.05). In contrast, 26 miRNAs were differentially expressed between patients with influenza A (p < 0.05) and the controls. Of these miRNAs, 11 were commonly expressed in both the influenza A and B patients. Interferon (IFN)-inducing protein 10 (IP-10), which is involved in IFN synthesis during influenza infection, showed the highest level of expression in both influenza A and B patients. Influenza A patients showed increased expression of IFNα, GM-CSF, interleukin (IL)-13, IL-17A, IL-1β, IL-6 and TNFα, while influenza B induced increased levels of EGF, G-CSF, IL-1α, MIP-1α, and TNF-β. In addition, hsa-miR-326, hsa-miR-15b-5p, hsa-miR-885, hsa-miR-122-5p, hsa-miR-133a-3p, and hsa-miR-150-5p showed high correlations to IL-6, IL-15, IL-17A, IL-1β, and monocyte chemoattractant protein-1 (MCP-1) with both strains of influenza. Next-generation sequencing studies of H1N1-infected human lung small airway epithelial cells also showed a similar pattern of expression of miR-375-5p, miR-143-3p, miR-199a-3p, and miR-199a-5p compared to influenza A patients. In summary, this study provides insights into the miRNA profiling in both influenza A and B virus in circulation and a novel approach to identify the early infections through a combination of cytokines and miRNA expression.

An Integrated Framework for the Inference of Viral Population History From Reconstructed Genealogies

Genetics ◽

10.1093/genetics/155.3.1429 ◽

2000 ◽

Vol 155 (3) ◽

pp. 1429-1437

Author(s):

Oliver G Pybus ◽

Andrew Rambaut ◽

Paul H Harvey

Keyword(s):

Maximum Likelihood ◽

Sequence Data ◽

Demographic History ◽

Population History ◽

Maximum Likelihood Estimates ◽

Viral Population ◽

True Parameter ◽

Subtype B ◽

Exponential Growth Model ◽

Parameter Values

Abstract We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
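The classic skyline plot mentioned in this abstract can be sketched in a few lines. For each coalescent interval of duration w during which the genealogy has k lineages, the estimate of the scaled population size is w * k(k - 1) / 2, expressed in the same time units as the tree. The function name and the input format (a list of `(lineages, duration)` pairs) are assumptions made for illustration, and the example intervals are invented numbers, not data from the paper.

```python
def classic_skyline(intervals):
    """Classic skyline estimates from coalescent intervals.

    `intervals` is a list of (k, w) pairs: k lineages persist for a
    duration w before the next coalescent event (hypothetical input format).
    Returns a list of (k, estimate) pairs, where each estimate is the
    scaled population size for that interval: w * k * (k - 1) / 2.
    """
    estimates = []
    for k, w in intervals:
        if k < 2:
            raise ValueError("a coalescent interval needs at least 2 lineages")
        if w < 0:
            raise ValueError("interval durations must be non-negative")
        estimates.append((k, w * k * (k - 1) / 2))
    return estimates


if __name__ == "__main__":
    # Invented intervals for a four-tip genealogy, ordered from tips to root.
    for k, estimate in classic_skyline([(4, 0.1), (3, 0.2), (2, 0.5)]):
        print(k, estimate)
```

Plotting the estimates as a step function against time gives the nonparametric picture of demographic history; because each step rests on a single interval, the estimates are noisy.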

Improving pandemic influenza risk assessment

eLife ◽

10.7554/elife.03883 ◽

2014 ◽

Vol 3 ◽

Cited By ~ 43

Author(s):

Colin A Russell ◽

Peter M Kasson ◽

Ruben O Donis ◽

Steven Riley ◽

John Dunbar ◽

...

Keyword(s):

Risk Assessment ◽

Pandemic Influenza ◽

Influenza A ◽

Sequence Data ◽

Virus Genome ◽

Influenza Viruses ◽

Influenza Surveillance ◽

Human Influenza ◽

Influenza A Viruses ◽

Virus Genotype

Assessing the pandemic risk posed by specific non-human influenza A viruses is an important goal in public health research. As influenza virus genome sequencing becomes cheaper, faster, and more readily available, the ability to predict pandemic potential from sequence data could transform pandemic influenza risk assessment capabilities. However, the complexities of the relationships between virus genotype and phenotype make such predictions extremely difficult. The integration of experimental work, computational tool development, and analysis of evolutionary pathways, together with refinements to influenza surveillance, has the potential to transform our ability to assess the risks posed to humans by non-human influenza viruses and lead to improved pandemic preparedness and response.

VGEA: an RNA viral assembly toolkit

PeerJ ◽

10.7717/peerj.12129 ◽

2021 ◽

Vol 9 ◽

pp. e12129

Author(s):

Paul E. Oluniyi ◽

Fehintola Ajogbasile ◽

Judith Oguzie ◽

Jessica Uwanibe ◽

Adeyemi Kayode ◽

...

Keyword(s):

De Novo ◽

Sequence Data ◽

Workflow Management ◽

Viral Population ◽

Lassa Virus ◽

Viral Genomes ◽

Bioinformatics Tools ◽

Reference Sequences ◽

Genome Assemblies

Next generation sequencing (NGS)-based studies have vastly increased our understanding of viral diversity. Viral sequence data obtained from NGS experiments are a rich source of information; these data can be used to study viral epidemiology, evolution, and transmission patterns, and can also inform drug and vaccine design. Viral genomes, however, represent a great challenge to bioinformatics due to their high mutation rate and the formation of quasispecies in the same infected host, bringing about the need to implement advanced bioinformatics tools to assemble consensus genomes well-representative of the viral population circulating in individual patients. Many tools have been developed to preprocess sequencing reads, carry out de novo or reference-assisted assembly of viral genomes and assess the quality of the genomes obtained. Most of these tools, however, exist as standalone workflows and usually require huge computational resources. Here we present VGEA (Viral Genomes Easily Analyzed), a Snakemake workflow for analyzing RNA viral genomes. VGEA enables users to map sequencing reads to the human genome to remove human contaminants, split bam files into forward and reverse reads, carry out de novo assembly of forward and reverse reads to generate contigs, pre-process reads for quality and contamination, map reads to a reference tailored to the sample using corrected contigs supplemented by the user’s choice of reference sequences and evaluate/compare genome assemblies. We designed a project with the aim of creating a flexible, easy-to-use and all-in-one pipeline from existing/stand-alone bioinformatics tools for viral genome analysis that can be deployed on a personal computer. VGEA was built on the Snakemake workflow management system and utilizes existing tools for each step: fastp (Chen et al., 2018) for read trimming and read-level quality control, BWA (Li & Durbin, 2009) for mapping sequencing reads to the human reference genome, SAMtools (Li et al., 2009) for extracting unmapped reads and also for splitting bam files into fastq files, IVA (Hunt et al., 2015) for de novo assembly to generate contigs, shiver (Wymant et al., 2018) to pre-process reads for quality and contamination, then map to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences, SeqKit (Shen et al., 2016) for cleaning shiver assembly for QUAST, QUAST (Gurevich et al., 2013) to evaluate/assess the quality of genome assemblies and MultiQC (Ewels et al., 2016) for aggregation of the results from fastp, BWA and QUAST. Our pipeline was successfully tested and validated with SARS-CoV-2 (n = 20), HIV-1 (n = 20) and Lassa Virus (n = 20) datasets, all of which have been made publicly available. VGEA is freely available on GitHub at: https://github.com/pauloluniyi/VGEA under the GNU General Public License.

Household Transmission of 2009 Pandemic Influenza A (H1N1) Virus in the United States

New England Journal of Medicine ◽

10.1056/nejmoa0905498 ◽

2009 ◽

Vol 361 (27) ◽

pp. 2619-2627 ◽

Cited By ~ 300

Author(s):

Simon Cauchemez ◽

Christl A. Donnelly ◽

Carrie Reed ◽

Azra C. Ghani ◽

Christophe Fraser ◽

...

Keyword(s):

United States ◽

Pandemic Influenza ◽

Influenza A ◽

H1n1 Virus ◽

The United States ◽

Household Transmission ◽

Influenza A H1n1

SAR Observation Error Estimation Based on Maximum Relative Projection Matching

International Journal of Antennas and Propagation ◽

10.1155/2020/3517834 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14

Author(s):

Y. Zhang ◽

B. P. Wang ◽

Y. Fang ◽

Z. X. Song

Keyword(s):

Error Estimation ◽

Estimation Method ◽

Estimation Methods ◽

Estimation Accuracy ◽

Reconstruction Method ◽

Observation Error ◽

Estimation Model ◽

Reconstruction Process ◽

Single Observation ◽

Imaging Results

Existing sparse imaging observation error estimation methods usually estimate the error of each observation position by substituting the error parameters into the iterative reconstruction process, which has a huge calculation cost. In this paper, by analysing the relationship between imaging results of single-observation sampling data and error parameters, a SAR observation error estimation method based on maximum relative projection matching is proposed. First, the method estimates the precise position parameters of the reference position by the sparse reconstruction method of joint error parameters. Second, a relative error estimation model is constructed based on the maximum correlation of base-space projection. Finally, the accurate error parameters are estimated by the Broyden–Fletcher–Goldfarb–Shanno method. Simulation and measured data of microwave anechoic chambers show that, compared to the existing methods, the proposed method has higher estimation accuracy, lower noise sensitivity, and higher computational efficiency.

Reconstruction Techniques in IceCube using Convolutional and Generative Neural Networks

EPJ Web of Conferences ◽

10.1051/epjconf/201920705005 ◽

2019 ◽

Vol 207 ◽

pp. 05005 ◽

Cited By ~ 2

Author(s):

Mirco Huennefeld

Keyword(s):

Neural Networks ◽

High Energy Physics ◽

High Energy ◽

Reconstruction Method ◽

Network Architectures ◽

Reconstruction Methods ◽

Physics Potential ◽

Reconstruction Performance ◽

Maximum Likelihood Methods ◽

First Results

Reliable and accurate reconstruction methods are vital to the success of high-energy physics experiments such as IceCube. Machine learning based techniques, in particular deep neural networks, can provide a viable alternative to maximum-likelihood methods. However, most common neural network architectures were developed for other domains such as image recognition. While these methods can enhance the reconstruction performance in IceCube, there is much potential for tailored techniques. In the typical physics use-case, many symmetries, invariances and prior knowledge exist in the data, which are not fully exploited by current network architectures. Novel and specialized deep learning based reconstruction techniques are desired which can leverage the physics potential of experiments like IceCube. A reconstruction method using convolutional neural networks is presented which can significantly increase the reconstruction accuracy while greatly reducing the runtime in comparison to standard reconstruction methods in IceCube. In addition, first results are discussed for future developments based on generative neural networks.

A fast and novel approach based on grouping and weighted mRMR for feature selection and classification of protein sequence data

International Journal of Data Mining and Bioinformatics ◽

10.1504/ijdmb.2020.105435 ◽

2020 ◽

Vol 23 (1) ◽

pp. 47

Author(s):

Kiranpreet Kaur ◽

Nagamma Patil

Keyword(s):

Feature Selection ◽

Protein Sequence ◽

Sequence Data ◽

Novel Approach ◽

Protein Sequence Data
