A Data-Driven Text Mining and Semantic Network Analysis for Design Information Retrieval

2017 ◽  
Vol 139 (11) ◽  
Author(s):  
Feng Shi ◽  
Liuqing Chen ◽  
Ji Han ◽  
Peter Childs

With the advent of the big-data era, the massive amount of information stored in electronic and digital forms on the internet has become a valuable resource for knowledge discovery in engineering design. Traditional document retrieval methods based on document indexing focus on retrieving individual documents related to the query but are incapable of discovering the various associations between individual knowledge concepts. Ontology-based technologies, which can extract the inherent relationships between concepts by using advanced text mining tools, can be applied to improve design information retrieval in large-scale unstructured textual data environments. However, few of the publicly available ontology databases take a design and engineering perspective in establishing the relations between knowledge concepts. This paper develops a “WordNet” focusing on design and engineering associations by integrating text mining approaches to construct an unsupervised learning ontology network. Subsequent probability and velocity network analyses are applied with different statistical behaviors to evaluate the degree of correlation between concepts for design information retrieval. The validation results show that probability and velocity analysis on the constructed ontology network can help recognize highly related, complex design and engineering associations between elements. Finally, an engineering design case study demonstrates the use of the constructed semantic network in a real-world project for design relations retrieval.

Author(s):  
Feng Shi ◽  
Liuqing Chen ◽  
Ji Han ◽  
Peter Childs

With the advent of the big-data era, the massive amount of textual information stored in electronic and digital documents has become a valuable resource for knowledge discovery in the fields of design and engineering. Ontology technologies and semantic networks have been widely applied together with text mining techniques, including Natural Language Processing (NLP), to extract structured knowledge associations from large-scale unstructured textual data. However, most existing works focus on how to construct semantic networks by developing various text mining methods, such as statistical and semantic approaches, while few studies address how to subsequently analyze and fully utilize already well-established semantic networks. In this paper, a specific network analysis method is proposed to discover implicit knowledge associations in an existing semantic network in order to improve knowledge discovery and design innovation. Pythagorean means are applied with Dijkstra’s shortest path algorithm to discover the implicit knowledge associations either around a single knowledge concept or between two concepts. Six criteria are established to evaluate and rank the degree of correlation of the implicit associations. Two engineering case studies were conducted to illustrate the proposed knowledge discovery process, and the results showed that the retrieved implicit knowledge associations are effective in providing relevant knowledge from various aspects and in provoking creative ideas for engineering innovation.
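
As a rough illustration of the kind of computation the abstract describes, the sketch below runs Dijkstra's shortest path algorithm on a tiny hypothetical semantic network and then summarizes the edge weights along the resulting path with the three Pythagorean means (arithmetic, geometric, harmonic). The graph `GRAPH`, its weights, and the function names `shortest_path` and `pythagorean_means` are assumptions made for illustration; the paper's actual weighting scheme and its six ranking criteria are not stated in the abstract and are not reproduced here.

```python
import heapq
from statistics import fmean, geometric_mean, harmonic_mean


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (list of nodes, list of edge weights along the path),
    or (None, None) if the target cannot be reached.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None, None
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    weights = [graph[a][b] for a, b in zip(path, path[1:])]
    return path, weights


def pythagorean_means(weights):
    """Arithmetic, geometric, and harmonic means of positive edge weights."""
    return {
        "arithmetic": fmean(weights),
        "geometric": geometric_mean(weights),
        "harmonic": harmonic_mean(weights),
    }


# Hypothetical concept network: a lower weight means a stronger association.
GRAPH = {
    "battery": {"electrode": 1.0, "graphene": 6.0},
    "electrode": {"battery": 1.0, "graphene": 4.0},
    "graphene": {"electrode": 4.0, "battery": 6.0},
}

path, weights = shortest_path(GRAPH, "battery", "graphene")
print(path, weights, pythagorean_means(weights))
```

Here the indirect route through `electrode` (total weight 5.0) is preferred over the direct edge (weight 6.0), and the three means give different summaries of how evenly the association strength is spread along that route.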


Author(s):  
Patrice Bellot ◽  
Ludovic Bonnefoy ◽  
Vincent Bouvier ◽  
Frédéric Duvert ◽  
Young-Min Kim

2021 ◽  
Vol 21 (3) ◽  
pp. 49-60
Author(s):  
Tae Jin Kim ◽  
Mi Ryeong Eum ◽  
Sang Hyun Park

Recently, the government has been increasingly communicating with the public in response to their opinions on state administration and policy projects. To examine the practicality of the public’s suggestions, this study investigated issues by disaster type, based on information from major media channels and comment data from the news. An analysis of the frequency of appearance, text mining (TF-IDF, LDA, and sentiment analysis), and the semantic network was performed by extracting the comment data of articles on the themes of “disaster” and “evacuation,” published from January 2010 to May 2020. The analysis results showed that news articles centered on these themes increased rapidly from 2017. The main disasters in Korea were those of “fire,” “typhoon,” “forest fire,” “radioactivity,” and “earthquake,” in order of enormity. Of the total negative words pertaining to “radioactivity” disasters, 43% were negative-sentiment words, and the semantic network analysis revealed that the terms “typhoon,” “forest fire,” and “earthquake” were connected to “radioactivity” disasters. This study is meaningful as it identifies issues by type of disaster and factors of anxiety expressed by the public using news and comment data, without conducting surveys and interviews.
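
For readers unfamiliar with TF-IDF, one of the text mining measures named above, the following is a minimal sketch of a common textbook variant (term frequency times the natural log of inverse document frequency). The abstract does not state which TF-IDF variant, tokenizer, or preprocessing the study used, so the function `tf_idf` and the sample `COMMENTS` below are hypothetical and purely illustrative.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document.

    tf  = count of the term in the document / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Made-up, pre-tokenized comments; not data from the study.
COMMENTS = [
    ["fire", "evacuation", "delay"],
    ["typhoon", "evacuation", "shelter"],
    ["fire", "fire", "alarm"],
]

for doc_scores in tf_idf(COMMENTS):
    print(sorted(doc_scores.items(), key=lambda item: -item[1]))
```

Terms that occur in every document receive a score of zero under this variant, which is why TF-IDF tends to surface words that distinguish one group of comments from another.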

