Hand-Written and Machine-Printed Text Classification in Architecture, Engineering & Construction Documents

Author(s):  
Supriya Das ◽  
Purnendu Banerjee ◽  
Bhagesh Seraogi ◽  
Himadri Majumder ◽  
Srinivas Mukkamala ◽  
...  
2022 ◽  
Vol 2022 ◽  
pp. 1-16
Author(s):  
Shu Chen ◽  
Junbo Xi ◽  
Yun Chen ◽  
Jinfan Zhao

Accidents of various types occur frequently in the construction of hydropower engineering projects, leading to significant casualties and economic losses. Identifying and eliminating near misses is an important means of preventing accidents, and mining near-miss data can provide valuable information on how to mitigate and control hazards. However, most of the data generated in the construction of hydropower engineering projects are semi-structured text without a unified standard of expression, so data association analysis is time-consuming and labor-intensive. Thus, an artificial intelligence (AI) automatic classification method based on a convolutional neural network (CNN) is adopted to obtain structured data on near-miss locations and near-miss types from safety records. The Apriori algorithm is then used to mine the associations between “locations” and “types” by scanning the structured data. The association results are visualized using a network diagram, and a Sankey diagram is used to reveal the information flow of specific near-miss objects under the “location ⟶ type” strong association rule. The proposed method combines text classification, association rules, and Sankey diagrams, and provides a novel approach for mining semi-structured text. Moreover, the method is shown to be useful and efficient for exploring near-miss distribution laws in hydropower engineering construction, which can reduce the possibility of accidents and improve the safety level of construction sites.
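The “location ⟶ type” association step can be illustrated with a minimal sketch. It is not the authors' implementation and not the full Apriori candidate-generation procedure: it only applies the support and confidence thresholds that Apriori relies on to single location-type pairs. The function name `location_type_rules`, the thresholds, and the sample records are all hypothetical.

```python
from collections import Counter


def location_type_rules(records, min_support=0.2, min_confidence=0.6):
    """Return (location, type, support, confidence) rules that pass both thresholds.

    records is a list of (location, near_miss_type) pairs, assumed to be the
    structured output of an upstream text classifier.
    """
    n = len(records)
    pair_counts = Counter(records)
    location_counts = Counter(location for location, _ in records)

    rules = []
    for (location, near_miss_type), count in pair_counts.items():
        support = count / n                            # share of all records with this pair
        confidence = count / location_counts[location]  # share of this location's records
        if support >= min_support and confidence >= min_confidence:
            rules.append((location, near_miss_type, support, confidence))

    # Strongest rules first: by confidence, then by support.
    return sorted(rules, key=lambda rule: (-rule[3], -rule[2]))


# Hypothetical structured records, invented for illustration only.
records = [
    ("tunnel", "falling object"),
    ("tunnel", "falling object"),
    ("tunnel", "electric shock"),
    ("dam crest", "fall from height"),
    ("dam crest", "fall from height"),
]

rules = location_type_rules(records)
for location, near_miss_type, support, confidence in rules:
    print(f"{location} -> {near_miss_type}: support={support:.2f}, confidence={confidence:.2f}")
```

With these sample records, the "tunnel" to "electric shock" pair is filtered out because its confidence falls below the threshold, while the two remaining pairs survive as strong rules.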


Author(s):  
Padmavathi S. ◽  
M. Chidambaram

Text classification has become increasingly important for managing and organizing text data, owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two approaches to text classification: rule-based and machine learning-based. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning-based approach, the classification rules, or the classifier, are learned automatically from example documents. This approach offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
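The contrast between the two approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from the paper: the keyword rules, the category names, the training examples, and the choice of a multinomial Naive Bayes classifier with add-one smoothing are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def rule_based_classify(text):
    """Rule-based approach: categories come from manually written keyword rules."""
    words = text.lower().split()
    if "goal" in words or "match" in words:
        return "sports"
    if "election" in words or "vote" in words:
        return "politics"
    return "unknown"


def train_naive_bayes(examples):
    """Machine learning approach: word statistics are learned from labeled examples."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def nb_classify(model, text):
    """Pick the label with the highest log-probability, using add-one smoothing."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled example documents.
examples = [
    ("the striker scored a late goal", "sports"),
    ("the team won the match", "sports"),
    ("voters went to the election polls", "politics"),
    ("parliament will vote on the bill", "politics"),
]
model = train_naive_bayes(examples)

print(rule_based_classify("the striker won"))  # no keyword rule matches this text
print(nb_classify(model, "the striker won"))   # learned word statistics still apply
```

The rule-based function returns a category only when one of its hand-written keywords appears, whereas the learned classifier can still label a text that contains none of those keywords, which is one way to read the higher recall the abstract attributes to the machine learning approach.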

