Discovering Data and Information Quality Research Insights Gained through Latent Semantic Analysis
Over the past decade, the field of data and information quality (DQ) has grown into a research area that spans multiple disciplines. This study aims to identify the core topics and themes that constitute the area and to determine how they relate to business intelligence (BI). To do so, the authors mine the abstracts of DQ articles published over the last decade. Using Latent Semantic Analysis (LSA), they identify six core themes of DQ research and the twelve dominant topics that comprise them. Five of these topics--decision support, database design and data mining, data querying and cleansing, data integration, and DQ for analytics--all relate to BI, underscoring the importance of research that combines DQ with BI. The DQ topics from these results are profiled in relation to BI and used to suggest several opportunities for researchers.