Topic modeling in software engineering research

2021 ◽  
Vol 26 (6) ◽  
Author(s):  
Camila Costa Silva ◽  
Matthias Galster ◽  
Fabian Gilson

Abstract: Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling needs to be applied carefully (e.g., depending on the type of textual data analyzed and modeling parameters). Our study aims to describe how topic modeling has been applied in software engineering research, with a focus on four aspects: (1) which topic models and modeling techniques have been applied, (2) which textual inputs have been used for topic modeling, (3) how textual data was “prepared” (i.e., pre-processed) for topic modeling, and (4) how generated topics (i.e., word clusters) were named to give them a human-understandable meaning. We analyzed topic modeling as applied in 111 papers from ten highly-ranked software engineering venues (five journals and five conferences) published between 2009 and 2020. We found that (1) LDA and LDA-based techniques are the most frequent topic modeling techniques, (2) developer communication and bug reports have been modeled most often, (3) data pre-processing and modeling parameters vary considerably and are often vaguely reported, and (4) manual topic naming (such as deducing names based on frequent words in a topic) is common.

2017 ◽  
Vol 14 (1) ◽  
pp. 23-45 ◽  
Author(s):  
Andrey Maglyas ◽  
Uolevi Nikula ◽  
Kari Smolander ◽  
Samuel A. Fricker

Purpose: Software product management (SPM) unites disciplines related to product strategy, planning, development, and release. Releasing a software product involves many organizational activities that address technical, social, and market issues. Owing to the high number of activities involved, SPM remains a complex discipline to adopt. The purpose of this paper is to understand what the core and supporting SPM activities are. Design/methodology/approach: The authors adopted meta-ethnography, a research method that provides a set of techniques for synthesizing individual qualitative studies to increase the degree of conceptualization. The results obtained from three empirical studies were synthesized using the meta-ethnography approach to enhance, rethink, and create a higher-level abstraction of the findings. Findings: The results show that the study makes both theoretical and practical contributions. As meta-ethnography synthesis has not been widely applied in software engineering, the authors illustrate how to use this research method in the practice of software engineering research. The practical contribution of the study is the identification of five core and six supporting SPM activities. Originality/value: The practical value of this paper is in the identification of core SPM activities that should be present in any company practicing SPM. The list of supporting SPM activities consists of activities that are not reported to the product manager but affect the product's success.


2019 ◽  
Vol 44 (3) ◽  
pp. 41-42
Author(s):  
Sai Anirudh Karre ◽  
Lalit Mohan ◽  
Y. Raghu Reddy ◽  
K.V. Raghavan ◽  
R.D. Naik ◽  
...  

Proceedings ◽  
2021 ◽  
Vol 74 (1) ◽  
pp. 13
Author(s):  
Hatice Koç ◽  
Ali Mert Erdoğan ◽  
Yousef Barjakly ◽  
Serhat Peker

Software engineering makes extensive use of Unified Modelling Language (UML) diagrams, which are accepted as a standard for depicting object-oriented design models. By providing visual models, UML diagrams make it easier to identify the requirements and scope of systems and applications. Accordingly, this study aims to systematically review the literature on UML diagram utilization in software engineering research. The review covers two decades of literature, from 2000 to 2019. From the papers retrieved, 128 were selected and examined. The main findings showed that UML diagrams were mostly used for design and modeling, and that class diagrams were the most commonly used diagram type.

