A decade of research in statistics: a topic model approach

2015 ◽  
Vol 103 (2) ◽  
pp. 413-433 ◽  
Author(s):  
Francesca De Battisti ◽  
Alfio Ferrara ◽  
Silvia Salini
2019 ◽  
Vol 1 (1) ◽  
pp. 45-78
Author(s):  
Chankyung Pak

Abstract: To disseminate their stories efficiently via social media, news organizations make decisions that resemble traditional editorial decisions. However, the decisions for social media may deviate from traditional ones because they are often made outside the newsroom and guided by audience metrics. This study focuses on selective link sharing as quasi-gatekeeping on Twitter: conditioning a link sharing decision about news content. It illustrates how selective link sharing resembles and deviates from gatekeeping for the publication of news stories. Using a computational data collection method and a machine learning technique called Structural Topic Model (STM), this study shows that selective link sharing generates a different topic distribution between news websites and Twitter and thus significantly revokes the specialty of news organizations. This finding implies that emergent logic, which governs news organizations’ decisions for social media, can undermine the provision of diverse news.


2019 ◽  
Vol 28 (01) ◽  
pp. 179-180

Abdellaoui R, Foulquié P, Texier N, Faviez C, Burgun A, Schück S. Detection of Cases of Noncompliance to Drug Treatment in Patient Forum Posts: Topic Model Approach. J Med Internet Res 2018;20(3):e85 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5874436/
Jones J, Pradhan M, Hosseini M, Kulanthaivel A, Hosseini M. Novel Approach to Cluster Patient-Generated Data Into Actionable Topics: Case Study of a Web-Based Breast Cancer. JMIR Med Inform 2018;6(4):e45 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6293240/
Park A, Conway M, Chen AT. Examining Thematic Similarity, Difference, and Membership in Three Online Mental Health Communities from Reddit: A Text Mining and Visualization Approach. Comput Human Behav 2018 Jan;78:98-112 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5810583/


2020 ◽  
Author(s):  
Stephanie Fern Hudon

Our planet is undergoing rapid change due to the expanding human population and climate change, which leads to extreme weather events and habitat loss. It is more important than ever to develop methods which can monitor the impact we are having on the biodiversity of our planet. To influence policy changes in wildlife and resource management practices we need to provide measurable evidence of how we are affecting animal health and fitness and the ecosystems needed for their survival. We also need to pool our resources and work in interdisciplinary teams to find common threads which can help preserve biodiversity and vital habitats. This dissertation showcases how improved molecular biology assays and data analysis approaches can help monitor the fitness of animal populations within changing ecosystems. Chapter 1 details the development of a universal telomere assay for vertebrates. Recent work has shown the utility of telomere assays in tracking animal health. Telomere lengths can predict extinction events in animal populations, life span, and fitness consequences of anthropogenic activity. Telomere length assays are an improvement over other methods of measuring animal stress, such as cortisol levels, since they are stable during capture and sampling of animals. This dissertation provides a telomere length assay which can be used for any vertebrate. The assay was developed using a quantitative polymerase chain reaction platform which requires low DNA input and is rapid. This dissertation also demonstrates how this assay improves on current telomere assays developed for mice and can be used in a vertebrate not previously assayed for telomere lengths, the American kestrel. This work has the potential to propel research in vertebrate systems forward as it alleviates the need to develop new reference primers for each species of interest. 
This improved assay has shown promise in studies of mouse cell lines, American kestrels, golden eagles, five species of passerine birds, osprey, northern goshawks and bighorn sheep. Chapter 2 presents a machine learning analysis, using a topic model approach, to integrate big data from remote sensing, leaf area index surveys, metabolomics and metagenomics to analyze community composition in cross-disciplinary datasets. Topic models were applied to understand community organization across a range of distinct, but connected, biological scales within the sagebrush steppe. The sagebrush steppe is home to several threatened species, including the pygmy rabbit (Brachylagus idahoensis) and sage-grouse (Centrocercus urophasianus). It covers vast swaths of the western United States and is subject to habitat fragmentation and land use conversion for both farming and rangeland use. It is also threatened by increases in fire events which can dramatically alter the landscape. Restoration efforts have been hampered by a lack of resources and often by inadequate collaboration between stakeholders and scientists. This work brought together scientists from four disciplines: remote sensing, field ecology, metabolomics and metagenomics, to provide a framework for how studies can be designed and analyzed that integrate patterns of biodiversity from multiple scales, from the molecular to the landscape scale. A topic model approach was used to group features (chemicals, bacterial and plant taxa, and light spectrum) into “communities”, which in turn can be analyzed for their presence within individual samples and time points. Within the landscape, I found communities which contain encroaching plant species, such as juniper (Juniperus spp.) and cheatgrass (Bromus tectorum). Within plants, I found chemicals which are known toxins to herbivores. Within herbivores, I identified differences in bacterial taxonomic communities associated with changes in diet.
This work will help to inform restoration efforts and provide a road map for designing interdisciplinary studies.


2017 ◽  
Author(s):  
Stephen Woloszynek ◽  
Zhengqiao Zhao ◽  
Gideon Simpson ◽  
Michael P. O’Connor ◽  
Joshua Chang Mell ◽  
...  

Abstract: The increasing availability of microbiome survey data has led to the use of complex machine learning and statistical approaches to measure taxonomic diversity and extract relationships between taxa and their host or environment. However, many approaches inadequately account for the difficulties inherent to microbiome data. These difficulties include (1) insufficient sequencing depth resulting in sparse count data, (2) a large feature space relative to sample space, resulting in data prone to overfitting, (3) library size imbalance, requiring normalization strategies that lead to compositional artifacts, and (4) zero-inflation. Recent work has used probabilistic topic models to more appropriately model microbiome data, but a thorough inspection of just how well topic models capture underlying microbiome signal is lacking. Also, no work has determined whether library size or variance normalization improves model fitting. Here, we assessed a topic model approach on 16S rRNA gene survey data. Through simulation, we show that, for small sample sizes, library-size or variance normalization is unnecessary prior to fitting the topic model. In addition, by exploiting topic-to-topic correlations, the topic model successfully captured dynamic time-series behavior of simulated taxonomic subcommunities. Lastly, when the topic model was applied to the David et al. time-series dataset, three distinct gut configurations emerged. However, unlike the David et al. approach, we characterized the events in terms of topics, which captured taxonomic co-occurrence, and posterior uncertainty, which facilitated the interpretation of how the taxonomic configurations evolved over time.


Author(s):  
Gabrielle Lavenir ◽  
Nicolas Bourgeois

Over the past few years, the French mainstream press has paid more and more attention to "silver gamers", adults over sixty who play video games. This article investigates the discursive and normative paradigms that underlie the unexpected enthusiasm of the French mainstream press for older adults who play video games. We use mixed methods on a corpus of French, Swiss and Belgian articles that mention both older people and video games. First, we produce topics, that is, sets of words related by their meanings and identified with a Bayesian statistical algorithm. Second, we cross the topic model results with a discursive analysis of selected articles. We preface the topic modeling's conclusions with a discussion of the representations of older people and video games in European French-language mainstream media. Our analysis explores how the press coverage of older people who play video games simultaneously erases moral panic about video games and reinforces the discourse of "successful ageing".


2016 ◽  
Author(s):  
Linqing Liu ◽  
Yao Lu ◽  
Ye Luo ◽  
Renxian Zhang ◽  
Laurent Itti ◽  
...  

2018 ◽  
Vol 20 (3) ◽  
pp. e85 ◽  
Author(s):  
Redhouane Abdellaoui ◽  
Pierre Foulquié ◽  
Nathalie Texier ◽  
Carole Faviez ◽  
Anita Burgun ◽  
...  
