A Novel Statistic-Based Corpus Machine Processing Approach to Refine a Big Textual Data: An ESP Case of COVID-19 News Reports

2020 ◽  
Vol 10 (16) ◽  
pp. 5505
Author(s):  
Liang-Ching Chen ◽  
Kuei-Hu Chang ◽  
Hsiang-Yu Chung

With the development of modern information and communication technologies (ICTs), Industry 4.0 has ushered in big data analysis, natural language processing (NLP), and artificial intelligence (AI). Corpus analysis is also a part of big data analysis. In many studies that apply statistic-based corpus techniques to English for specific purposes (ESP), researchers extract critical information by retrieving domain-oriented lexical units. However, even when corpus software incorporates algorithms such as log-likelihood tests, log ratios, and BIC scores, the machine still cannot understand linguistic meaning. In many ESP cases, function words reduce the efficiency of corpus analysis, yet many studies still eliminate them manually. Manual annotation is inefficient and time-consuming, and it can easily distort information. To enhance the efficiency of big textual data analysis, this paper proposes a novel statistic-based corpus machine processing approach to refine big textual data. The paper uses COVID-19 news reports as a simulation example of big textual data to verify the efficacy of the machine optimizing process. The refined data show that the proposed approach can rapidly remove function words and meaningless words by machine processing and provide decision-makers with domain-specific corpus data for further use.
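The two ideas named in the abstract, machine removal of function words and a log-likelihood keyness score, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tiny `FUNCTION_WORDS` list, the `tokenize` and `refine` helpers, and the sample sentence are all hypothetical, and the score uses the common two-corpus log-likelihood formulation, which may differ from what the paper's software computes.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration); a real study would use a much fuller list.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "are",
    "for", "on", "with", "that", "by",
}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens, including hyphenated ones."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def refine(text, stop_words=FUNCTION_WORDS):
    """Count tokens after dropping every token found in the stop-word list."""
    return Counter(t for t in tokenize(text) if t not in stop_words)


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-corpus log-likelihood keyness score for a single word.

    freq_* is the word's frequency in each corpus; size_* is the total
    token count of each corpus.
    """
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_target > 0:
        score += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * score


sample = (
    "The spread of COVID-19 is a concern for the hospitals "
    "and the patients in the city."
)
counts = refine(sample)
```

In this sketch `counts` keeps only the content-bearing tokens of the sample sentence, and `log_likelihood` could then rank those tokens against a reference corpus; a word with identical relative frequency in both corpora scores zero.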

2019 ◽  
Vol 9 (1) ◽  
pp. 01-12 ◽  
Author(s):  
Kristy F. Tiampo ◽  
Javad Kazemian ◽  
Hadi Ghofrani ◽  
Yelena Kropivnitskaya ◽  
Gero Michel

2020 ◽  
Vol 25 (2) ◽  
pp. 18-30
Author(s):  
Seung Wook Oh ◽  
Jin-Wook Han ◽  
Min Soo Kim

2020 ◽  
Vol 14 (1) ◽  
pp. 151-163
Author(s):  
Joon-Seo Choi ◽  
Su-in Park

2020 ◽  
Vol 29 (4) ◽  
pp. 29-38
Author(s):  
Jeong-Hyeon Kwak ◽  
Sun-Hee Lee
