TRONCO: an R package for the inference of cancer progression models from heterogeneous genomic data

2015 ◽  
Author(s):  
Luca De Sano ◽  
Giulio Caravagna ◽  
Daniele Ramazzotti ◽  
Alex Graudenzi ◽  
Giancarlo Mauri ◽  
...  

Abstract
Motivation: We introduce TRONCO (TRanslational ONCOlogy), an open-source R package that implements state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-sectional samples, e.g., retrieved from publicly available databases, and individual-level models that reveal the clonal evolutionary history in single cancer patients, when multiple samples, e.g., multiple biopsies or single-cell sequencing data, are available. The resulting models can provide key hints for uncovering the evolutionary trajectories of cancer, especially for precision medicine or personalized therapy.
Availability: TRONCO is released under the GPL license; it is hosted in the Software section at http://bimib.disco.unimib.it/ and archived also at [email protected]
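
A minimal sketch in R of the population-level workflow described above; the function names (import.genotypes, tronco.capri, tronco.plot) are recalled from the TRONCO documentation and should be treated as assumptions to check against the package vignette.

```r
# Hedged sketch of a population-level TRONCO analysis on a toy dataset;
# function names are assumed from the package documentation.
library(TRONCO)

# Binary alteration matrix: samples in rows, events (genes) in columns.
set.seed(1)
geno <- matrix(rbinom(200, 1, 0.3), nrow = 20,
               dimnames = list(paste0("sample", 1:20),
                               c("KRAS", "TP53", "APC", "PIK3CA", "SMAD4",
                                 "BRAF", "FBXW7", "TCF7L2", "NRAS", "ATM")))

data  <- import.genotypes(geno)   # wrap the matrix into a TRONCO object
model <- tronco.capri(data)       # infer an ensemble-level progression model
tronco.plot(model)                # display the inferred model
```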

2016 ◽  
Vol 113 (28) ◽  
pp. E4025-E4034 ◽  
Author(s):  
Giulio Caravagna ◽  
Alex Graudenzi ◽  
Daniele Ramazzotti ◽  
Rebeca Sanz-Pamplona ◽  
Luca De Sano ◽  
...  

The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the “selective advantage” relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc’s ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses.
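
The "selective advantage" relation referenced above is, in the Suppes-based formulation used by CAPRI-style inference (the algorithmic core of PiCnIc), typically expressed through two conditions on a pair of driver events i and j; the notation below is a sketch of that standard formulation, not a verbatim quote of the paper.

```latex
% Selective advantage of driver event i over driver event j
% (Suppes-style conditions; notation illustrative):
\[
  P(i) > P(j) \quad \mbox{(temporal priority)}
\]
\[
  P(j \mid i) > P(j \mid \neg i) \quad \mbox{(probability raising)}
\]
```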


2021 ◽  
Vol 13 (1) ◽  
pp. 368
Author(s):  
Dillon T. Fitch ◽  
Hossain Mohiuddin ◽  
Susan L. Handy

One way cities are looking to promote bicycling is by providing publicly or privately operated bike-share services, which enable individuals to rent bicycles for one-way trips. Although many studies have examined the use of bike-share services, little is known about how these services influence individual-level travel behavior more generally. In this study, we examine the behavior of users and non-users of a dockless, electric-assisted bike-share service in the Sacramento region of California. This service, operated by Jump until suspended due to the coronavirus pandemic, was one of the largest of its kind in the U.S., and spanned three California cities: Sacramento, West Sacramento, and Davis. We combine data from a repeat cross-sectional before-and-after survey of residents and a longitudinal panel survey of bike-share users with the goal of examining how the service influenced individual-level bicycling and driving. Results from multilevel regression models suggest that the effect of bike-share on average bicycling and driving at the population level is likely small. However, our results indicate that people who have used bike-share are likely to have increased their bicycling because of bike-share.
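
As a rough illustration of the modelling approach described above, the sketch below fits a multilevel (mixed-effects) regression with random intercepts for person and city in R; the dataset and variable names are simulated and hypothetical, not the authors' survey data.

```r
# Hedged sketch of a multilevel before/after comparison of bicycling
# between bike-share users and non-users; all data are simulated.
library(lme4)

set.seed(1)
survey_data <- data.frame(
  person_id = factor(rep(1:100, each = 2)),
  city      = factor(rep(sample(c("Sacramento", "West Sacramento", "Davis"),
                                100, replace = TRUE), each = 2)),
  wave      = factor(rep(c("before", "after"), times = 100)),
  user      = factor(rep(sample(c("non-user", "user"), 100, replace = TRUE),
                         each = 2)),
  bike_min  = rpois(200, 30)   # weekly minutes of bicycling (simulated)
)

# The wave x user interaction captures the change among users relative to
# non-users; person and city enter as random intercepts.
fit <- lmer(bike_min ~ wave * user + (1 | person_id) + (1 | city),
            data = survey_data)
summary(fit)
```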


2018 ◽  
Vol 148 (12) ◽  
pp. 1946-1953 ◽  
Author(s):  
Magali Rios-Leyvraz ◽  
Pascal Bovet ◽  
René Tabin ◽  
Bernard Genin ◽  
Michel Russo ◽  
...  

ABSTRACT
Background: The gold standard to assess salt intake is the 24-h urine collection. Use of a urine spot sample can be a simpler alternative, especially when the goal is to assess sodium intake at the population level. Several equations to estimate 24-h urinary sodium excretion from urine spot samples have been tested in adults, but not in children.
Objective: The objective of this study was to assess the ability of several equations applied to urine spot samples to estimate 24-h urinary sodium excretion in children.
Methods: A cross-sectional study of children between 6 and 16 y of age was conducted. Each child collected one 24-h urine sample and 3 timed urine spot samples, i.e., evening (last void before going to bed), overnight (first void in the morning), and morning (second void in the morning). Eight equations (i.e., Kawasaki, Tanaka, Remer, Mage, Brown with and without potassium, Toft, and Meng) were used to estimate 24-h urinary sodium excretion. The estimates from the different spot samples and equations were compared with the measured excretion through the use of several statistics.
Results: Among the 101 children recruited, 86 had a complete 24-h urine collection and were included in the analysis (mean age: 10.5 y). The mean measured 24-h urinary sodium excretion was 2.5 g (range: 0.8–6.4 g). The different spot samples and equations provided highly heterogeneous estimates of the 24-h urinary sodium excretion. The overnight spot samples with the Tanaka and Brown equations provided the most accurate estimates (mean bias: −0.20 to −0.12 g; correlation: 0.48–0.53; precision: 69.7–76.5%; sensitivity: 76.9–81.6%; specificity: 66.7%; misclassification: 23.0–27.7%). The other equations, irrespective of the timing of the spot sample, provided less accurate estimates.
Conclusions: Urine spot samples, with selected equations, might provide accurate estimates of 24-h sodium excretion in children at a population level. At an individual level, they could be used to identify children with high sodium excretion. This study was registered at clinicaltrials.gov as NCT02900261.
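
For concreteness, the sketch below implements the Tanaka equation, one of the eight equations evaluated in the study, in R; the coefficients are the commonly cited form from Tanaka et al. (2002) and are reproduced from memory, so they should be verified against the original publication before any use.

```r
# Hedged sketch of the Tanaka estimate of 24-h urinary sodium excretion
# from a spot sample; coefficients reproduced from memory, verify before use.
estimate_na24h_tanaka <- function(spot_na_meq_l, spot_cr_mg_dl,
                                  age_y, weight_kg, height_cm) {
  # Predicted 24-h urinary creatinine excretion (mg/day)
  pr_cr <- -2.04 * age_y + 14.89 * weight_kg + 16.14 * height_cm - 2244.45
  # Estimated 24-h sodium excretion (mEq/day), then converted to grams
  na_meq <- 21.98 * (spot_na_meq_l / (spot_cr_mg_dl * 10) * pr_cr)^0.392
  na_meq * 23 / 1000
}

# Illustrative overnight spot sample from a 10-y-old child (values made up):
estimate_na24h_tanaka(spot_na_meq_l = 120, spot_cr_mg_dl = 90,
                      age_y = 10, weight_kg = 35, height_cm = 140)
# ~2.6 g/day, of the same order as the measured mean of 2.5 g reported above
```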


2015 ◽  
Author(s):  
Giulio Caravagna ◽  
Alex Graudenzi ◽  
Daniele Ramazzotti ◽  
Rebeca Sanz-Pamplona ◽  
Luca De Sano ◽  
...  

The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing (NGS) data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the "selective advantage" relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications as it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations and progression model inference. We demonstrate PiCnIc's ability to reproduce much of the current knowledge on colorectal cancer progression, as well as to suggest novel experimentally verifiable hypotheses.


2019 ◽  
Author(s):  
Anthony Federico ◽  
Stefano Monti

Abstract
Summary: Geneset enrichment is a popular method for annotating high-throughput sequencing data. Existing tools fall short in providing the flexibility to tackle the varied challenges researchers face in such analyses, particularly when analyzing many signatures across multiple experiments. We present a comprehensive R package for geneset enrichment workflows that offers multiple enrichment, visualization, and sharing methods in addition to novel features such as hierarchical geneset analysis and built-in markdown reporting. hypeR is a one-stop solution for performing geneset enrichment for a wide audience and range of use cases.
Availability and implementation: The most recent version of the package is available at https://github.com/montilab/hypeR.
Supplementary information: Comprehensive documentation and tutorials are available at https://montilab.github.io/hypeR-docs.
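
The over-representation test at the heart of tools like hypeR can be sketched with base R's hypergeometric test; this is a generic illustration of the statistic, not the hypeR API, and the gene names are made up.

```r
# Generic over-representation (hypergeometric) test of a signature against
# a geneset; illustration of the statistic, not the hypeR interface.
overrep_p <- function(signature, geneset, background) {
  k <- length(intersect(signature, geneset))      # observed overlap
  K <- length(intersect(geneset, background))     # geneset size in background
  n <- length(signature)                          # signature size
  N <- length(background)                         # background size
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)  # P(overlap >= k)
}

set.seed(1)
background <- paste0("gene", 1:2000)
geneset    <- sample(background, 100)
signature  <- unique(c(sample(geneset, 20), sample(background, 80)))
overrep_p(signature, geneset, background)
```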


2019 ◽  
Vol 57 (1) ◽  
pp. 55-77 ◽  
Author(s):  
Ryan Dew ◽  
Asim Ansari ◽  
Yang Li

Marketing research relies on individual-level estimates to understand the rich heterogeneity of consumers, firms, and products. While much of the literature focuses on capturing static cross-sectional heterogeneity, little research has been done on modeling dynamic heterogeneity, or the heterogeneous evolution of individual-level model parameters. In this work, the authors propose a novel framework for capturing the dynamics of heterogeneity, using individual-level, latent, Bayesian nonparametric Gaussian processes. Similar to standard heterogeneity specifications, this Gaussian process dynamic heterogeneity (GPDH) specification models individual-level parameters as flexible variations around population-level trends, allowing for sharing of statistical information both across individuals and within individuals over time. This hierarchical structure provides precise individual-level insights regarding parameter dynamics. The authors show that GPDH nests existing heterogeneity specifications and that not flexibly capturing individual-level dynamics may result in biased parameter estimates. Substantively, they apply GPDH to understand preference dynamics and to model the evolution of online reviews. Across both applications, they find robust evidence of dynamic heterogeneity and illustrate GPDH’s rich managerial insights, with implications for targeting, pricing, and market structure analysis.
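
A compact way to read the hierarchical structure described above is the following sketch, in which an individual-level parameter path varies around a population-level trend and both components are Gaussian processes; the notation is ours and only approximates the authors' specification.

```latex
% Sketch of a Gaussian-process dynamic heterogeneity (GPDH) specification;
% notation illustrative, not the authors' exact parameterization.
\[
  \theta_{it} = \bar{\theta}(t) + \delta_i(t), \qquad
  \bar{\theta}(\cdot) \sim \mathcal{GP}\bigl(0, k_{\mathrm{pop}}\bigr), \qquad
  \delta_i(\cdot) \sim \mathcal{GP}\bigl(0, k_{\mathrm{ind}}\bigr)
\]
```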


2017 ◽  
Vol 21 (5) ◽  
pp. 948-956 ◽  
Author(s):  
Nicholas RV Jones ◽  
Tammy YN Tong ◽  
Pablo Monsivais

Abstract
Objective: To test whether diets achieving recommendations from the UK's Scientific Advisory Committee on Nutrition (SACN) were associated with higher monetary costs in a nationally representative sample of UK adults.
Design: A cross-sectional study linking 4 d diet diaries in the National Diet and Nutrition Survey (NDNS) to contemporaneous food price data from a market research firm. The monetary cost of diets was assessed in relation to whether or not they met eight food- and nutrient-based recommendations from SACN. Regression models adjusted for potential confounding factors. The primary outcome measure was individual dietary cost per day and per 2000 kcal (8368 kJ).
Setting: UK.
Subjects: Adults (n 2045) sampled between 2008 and 2012 in the NDNS.
Results: On an isoenergetic basis, diets that met the recommendations for fruit and vegetables, oily fish, non-milk extrinsic sugars, fat, saturated fat and salt were estimated to be between 3 and 17 % more expensive. Diets meeting the recommendation for red and processed meats were 4 % less expensive, while meeting the recommendation for fibre was cost-neutral. Meeting multiple targets was also associated with higher costs; on average, diets meeting six or more SACN recommendations were estimated to be 29 % more costly than isoenergetic diets that met no recommendations.
Conclusions: Food costs may be a population-level barrier limiting the adoption of dietary recommendations in the UK. Future research should focus on identifying systems- and individual-level strategies to enable consumers to achieve dietary recommendations without increasing food costs. Such strategies may improve the uptake of healthy eating in the population.
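
The primary outcome above is diet cost standardized to a fixed energy intake; a trivial sketch of that standardization follows, with invented numbers.

```r
# Energy-standardized diet cost: cost per 2000 kcal (8368 kJ).
# The figures below are invented for illustration only.
cost_per_day <- 6.20     # monetary cost of the recorded diet per day
energy_kcal  <- 2450     # energy recorded per day (kcal)
cost_per_2000kcal <- cost_per_day / energy_kcal * 2000
cost_per_2000kcal
```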


2017 ◽  
Author(s):  
Andrea Martinez-Vernon ◽  
Frederick Farrell ◽  
Orkun S. Soyer

Abstract
Summary: With the rapid accumulation of sequencing data from genomic and metagenomic studies, there is an acute need for better tools that facilitate their analyses against biological functions. To this end, we developed MetQy, an open-source R package designed for query-based analysis of functional units in [meta]genomes and/or sets of genes using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Furthermore, MetQy contains visualization and analysis tools and facilitates manipulation of KEGG's flat files. Thus, MetQy enables better understanding of the metabolic capabilities of known genomes or user-specified [meta]genomes by using the available information, and can help guide studies in microbial ecology, metabolic engineering and synthetic biology.
Availability and implementation: The MetQy R package is freely available and can be downloaded from our group's website (http://osslab.lifesci.warwick.ac.uk) or GitHub (https://github.com/OSS-Lab/MetQy).
Contact: [email protected]
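
As a conceptual illustration of the kind of query such a tool answers, the sketch below computes how complete a KEGG module is given the KO identifiers annotated in a (meta)genome; it is not the MetQy API, and the KO lists are examples only.

```r
# Fraction of a KEGG module's KOs present in a genome's annotation set;
# conceptual sketch only, not the MetQy interface.
module_completeness <- function(genome_kos, module_kos) {
  length(intersect(genome_kos, module_kos)) / length(module_kos)
}

module_kos <- c("K00844", "K01810", "K00850", "K01623", "K01803")  # example KOs
genome_kos <- c("K00844", "K01810", "K01623", "K02777")
module_completeness(genome_kos, module_kos)   # 0.6
```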


2021 ◽  
Author(s):  
Mason Youngblood ◽  
David Lahti

In this study, we used a longitudinal dataset of house finch (Haemorhous mexicanus) song recordings spanning four decades in the introduced eastern range to assess how individual-level cultural transmission mechanisms drive population-level changes in birdsong. First, we developed an agent-based model (available as a new R package called TransmissionBias) that simulates the cultural transmission of house finch song given different parameters related to transmission biases, or biases in social learning that modify the probability of adoption of particular cultural variants. Next, we used approximate Bayesian computation and machine learning to estimate what parameter values likely generated the temporal changes in diversity in our observed data. We found evidence that strong content bias, likely targeted towards syllable complexity, plays a central role in the cultural evolution of house finch song in western Long Island. Frequency and demonstrator biases appear to be neutral or absent. Additionally, we estimated that house finch song is transmitted with extremely high fidelity. Future studies should use our simulation framework to better understand how cultural transmission and population declines influence song diversity in wild populations.
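
A toy sketch of the content-biased adoption process described above (not the TransmissionBias package itself): each naive learner adopts a song variant with probability weighted by its frequency and by a bias toward complex syllables; parameter names and values are illustrative.

```r
# Toy content-biased transmission step; illustrative only, not the
# TransmissionBias package.
set.seed(42)
n_variants <- 50
freq       <- rep(1 / n_variants, n_variants)   # current variant frequencies
complexity <- runif(n_variants)                 # syllable complexity per variant

adopt_prob <- function(freq, complexity, f = 1, b = 2) {
  # f > 1 would add conformity (frequency bias); b > 0 favours complex variants
  w <- freq^f * exp(b * complexity)
  w / sum(w)
}

# One learning generation: 500 naive birds sample variants with biased weights
next_gen <- sample(seq_len(n_variants), size = 500, replace = TRUE,
                   prob = adopt_prob(freq, complexity))
head(sort(table(next_gen), decreasing = TRUE))
```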


2019 ◽  
Author(s):  
Shu Tadaka ◽  
Fumiki Katsuoka ◽  
Masao Ueki ◽  
Kaname Kojima ◽  
Satoshi Makino ◽  
...  

Abstract
The first step towards realizing personalized healthcare is to catalog the genetic variations in a population. Since the dissemination of individual-level genomic information is strictly controlled, it will be useful to construct population-level allele frequency panels and to provide them through easy-to-use interfaces. In the Tohoku Medical Megabank Project, we have sequenced nearly 4,000 individuals from a Japanese population and constructed an allele frequency panel of 3,552 individuals after removing related samples. The panel is called 3.5KJPNv2. It was constructed using a standard pipeline, including the 1KGP and gnomAD algorithms, to reduce technical biases and to allow comparisons to other populations. Our database is the first large-scale panel providing the frequencies of variants on the X chromosome and the mitochondrial genome in the Japanese population. All the data are available in our original database at https://jmorp.megabank.tohoku.ac.jp.
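
The core computation behind an allele frequency panel can be sketched in a few lines of R: count ALT alleles across unrelated individuals and divide by the number of chromosomes. The matrix below is simulated, and ploidy-aware handling of the X chromosome and mitochondria is omitted for brevity.

```r
# Minimal sketch of computing autosomal allele frequencies from diploid
# genotypes (0/1/2 ALT allele counts); simulated data, illustration only.
set.seed(7)
n_samples <- 3552
genotypes <- matrix(rbinom(10 * n_samples, size = 2, prob = 0.1),
                    nrow = 10,
                    dimnames = list(paste0("variant", 1:10), NULL))

# ALT allele frequency = ALT allele count / number of chromosomes (2N);
# X-chromosome and mitochondrial variants would need ploidy-aware counts.
allele_freq <- rowSums(genotypes) / (2 * n_samples)
round(allele_freq, 4)
```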

