Smart literature review: a practical topic modelling approach to exploratory literature review

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Claus Boye Asmussen ◽  
Charles Møller

Abstract Manual exploratory literature reviews should be a thing of the past, as technology and development of machine learning methods have matured. The learning curve for using machine learning methods is rapidly declining, enabling new possibilities for all researchers. A framework is presented on how to use topic modelling on a large collection of papers for an exploratory literature review and how that can be used for a full literature review. The aim of the paper is to enable the use of topic modelling for researchers by presenting a step-by-step framework on a case and sharing a code template. The framework consists of three steps: pre-processing, topic modelling, and post-processing, where the topic model Latent Dirichlet Allocation is used. The framework enables large numbers of papers to be reviewed in a transparent, reliable, faster, and reproducible way.
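The three steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' shared code template: the stopword list, the function names (`preprocess`, `fit_lda`, `postprocess`), the hyperparameters `alpha` and `beta`, and the four toy "papers" are all assumptions made for the example. Latent Dirichlet Allocation is fitted here with a plain collapsed Gibbs sampler so that the block needs only the standard library; a real review would more likely use an established topic modelling package.

```python
import random
import re

# Assumed, deliberately tiny stopword list (hypothetical, for illustration only).
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is", "on", "with"}


def preprocess(text):
    """Step 1 (pre-processing): lowercase, tokenize, drop stopwords and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Step 2 (topic modelling): LDA fitted by collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []                                            # topic of every token
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    return vocab, doc_topic, topic_word


def postprocess(vocab, doc_topic, topic_word, top_n=3):
    """Step 3 (post-processing): top words per topic and dominant topic per paper."""
    top_words = [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:top_n]]
        for row in topic_word
    ]
    dominant = [max(range(len(row)), key=row.__getitem__) for row in doc_topic]
    return top_words, dominant


# Toy corpus standing in for a collection of paper abstracts (made up for the sketch).
papers = [
    "Topic modelling of literature with latent Dirichlet allocation",
    "Literature review using topic modelling and text mining",
    "Random forest and decision tree models for patient outcomes",
    "Predicting patient outcomes with decision tree validation",
]
docs = [preprocess(p) for p in papers]
vocab, doc_topic, topic_word = fit_lda(docs, n_topics=2)
top_words, dominant = postprocess(vocab, doc_topic, topic_word)
print(top_words)
print(dominant)
```

The post-processing output (top words per topic and each paper's dominant topic) is what a reviewer would inspect to label topics and decide which groups of papers to read in full. Results on a corpus this small depend heavily on the seed and hyperparameters, so the printed topics should not be read as meaningful.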

Author(s):  
Andreas Falke ◽  
Harald Hruschka

Abstract The increasing importance of online distribution channels is paralleled by a rising interest in gaining insights into the customer journey and browsing behavior. We evaluate several machine learning methods (latent Dirichlet allocation, correlated topic model, structural topic model, replicated softmax model) with respect to their ability to reproduce the browsing behavior of households across websites. In addition, we compare these machine learning methods to a related classical technique, singular value decomposition. In our study, the replicated softmax model outperforms latent Dirichlet allocation, but the correlated topic model attains the overall best performance. Compared to singular value decomposition, both the correlated topic model and the replicated softmax model lead to a more efficient compression of web browsing data. On the other hand, singular value decomposition surpasses latent Dirichlet allocation. We interpret results of the correlated topic model and the replicated softmax model by determining combinations of topics or hidden variables that are heterogeneous with respect to visited websites. We show that decision makers should not rely on bivariate measures of site visits, as these do not agree with measures of interdependences between sites that can be inferred from the correlated topic model or the replicated softmax model. We investigate how well topics or hidden variables measured by these methods predict yearly household expenditures. The correlated topic model leads to the best predictive performance, followed by the replicated softmax model. We also discuss how the replicated softmax model can be used to support online marketing decisions of websites.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract Background Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2012 ◽  
Author(s):  
Hashem Koohy

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.


Informatics ◽  
2021 ◽  
Vol 8 (3) ◽  
pp. 56
Author(s):  
Deepika Verma ◽  
Kerstin Bach ◽  
Paul Jarle Mork

The field of patient-centred healthcare has, during recent years, adopted machine learning and data science techniques to support clinical decision making and improve patient outcomes. We conduct a literature review with the aim of summarising the existing methodologies that apply machine learning methods on patient-reported outcome measures datasets for predicting clinical outcomes to support further research and development within the field. We identify 15 articles published within the last decade that employ machine learning methods at various stages of exploiting datasets consisting of patient-reported outcome measures for predicting clinical outcomes, presenting promising research and demonstrating the utility of patient-reported outcome measures data for developmental research, personalised treatment and precision medicine with the help of machine learning-based decision-support systems. Furthermore, we identify and discuss the gaps and challenges, such as inconsistency in reporting the results across different articles, use of different evaluation metrics, legal aspects of using the data, and data unavailability, among others, which can potentially be addressed in future studies.


Author(s):  
M.V. Buinevich ◽  
K.E. Izrailov

In recent years, the use of unsafe software has remained the main threat to the infosphere; finding vulnerabilities in such software relies on static and dynamic analysis. Manual static analysis is extremely time-consuming and requires highly qualified, and therefore scarce, specialists. An alternative is to automate the process using artificial intelligence. This work seeks solutions for applying machine learning methods at all stages of the static analysis of program code; to that end, the formal needs of the stages and the capabilities of the methods are studied and correlated. The main result of the study is a generalized domain model, along with 14 specific solutions to the “key” problems of static analysis of program code using machine learning methods.

