Extending the Latent Dirichlet Allocation model to presence/absence data: A case study on North American breeding birds and biogeographical shifts expected from climate change

2018 ◽  
Vol 24 (11) ◽  
pp. 5560-5572 ◽  
Author(s):  
Denis Valle ◽  
Pedro Albuquerque ◽  
Qing Zhao ◽  
Albert Barberan ◽  
Robert J. Fletcher
2021 ◽  
Author(s):  
Jorge Arturo Lopez

Extracting topics from large text corpora helps improve Software Engineering (SE) processes. Latent Dirichlet Allocation (LDA) is one of the algorithmic tools used to understand, search, exploit, and summarize a large corpus of documents, and it is often applied to this kind of analysis. However, calibrating the models is computationally expensive, especially when iterating over a large number of topics. Our goal is to create a simple formula that lets analysts estimate the number of topics such that the top X topics include the desired proportion of the documents under study. We derived the formula from an empirical analysis of three SE-related text corpora. We believe that practitioners can use our formula to expedite LDA analysis. The formula is also of interest to theoreticians, as it suggests that different SE text corpora have similar underlying properties.
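The abstract does not state the formula itself, so the sketch below does not reproduce it. It only illustrates the quantity the formula is meant to estimate, namely the share of documents covered by the top X topics, measured directly on an already fitted model. It assumes each document is assigned to a single dominant topic, which is one possible reading of "include" and is not confirmed by the source. The function names `top_topic_coverage` and `smallest_x_for_coverage` are hypothetical.

```python
from collections import Counter


def top_topic_coverage(dominant_topics, x):
    """Fraction of documents whose dominant topic is among the x largest topics.

    dominant_topics: one topic id per document (assumption: each document
    is counted under its single most probable topic).
    """
    if not dominant_topics:
        raise ValueError("need at least one document")
    counts = Counter(dominant_topics)
    # most_common(x) returns the x topics with the most documents.
    covered = sum(n for _, n in counts.most_common(x))
    return covered / len(dominant_topics)


def smallest_x_for_coverage(dominant_topics, target):
    """Smallest number of top topics whose documents reach the target share."""
    counts = Counter(dominant_topics)
    total = len(dominant_topics)
    covered = 0
    for x, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return x
    # Target not reachable (for example, target > 1): return all topics.
    return len(counts)


# Toy data: ten documents spread over four topics.
dominant = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
print(top_topic_coverage(dominant, 2))          # share held by the two largest topics
print(smallest_x_for_coverage(dominant, 0.85))  # topics needed to cover 85% of documents
```

Measuring coverage this way requires a fitted model for every candidate topic count, which is the expensive step the abstract says the formula is intended to shortcut.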


2017 ◽  
Vol 10 ◽  
pp. 403-421 ◽  
Author(s):  
Putu Manik Prihatini ◽  
I Ketut Gede Darma Putra ◽  
Ida Ayu Dwi Giriantari ◽  
Made Sudarma
