Analyzing mass media influence using natural language processing and time series analysis

Interest in artificial intelligence-based analysis of health and medical information, especially with deep learning techniques, has recently been increasing. Most research in this field has focused on finding new knowledge for predicting and diagnosing disease by revealing the relations between disease and various features of the data. These features are extracted by analyzing clinical pathology data, such as EHR (electronic health records), and academic literature using data analysis, natural language processing, and related techniques. However, more research is still needed on applying the latest artificial intelligence-based data analysis techniques to bio-signal data, which are continuous physiological records such as EEG (electroencephalography) and ECG (electrocardiogram). Unlike other types of data, bio-signal data take the form of real-valued time series, and applying deep learning to them raises many issues in preprocessing, learning, and analysis. These issues include feature selection and learning stages that remain black boxes, difficulty in recognizing and identifying effective features, and high computational complexity. In this paper, to address these issues, we propose an encoding-based Wave2vec time series classifier model, which combines signal processing with deep learning-based natural language processing techniques. To demonstrate its advantages, we report the results of three experiments conducted on the EEG data of the University of California, Irvine, a real-world benchmark bio-signal dataset. The proposed model first encodes the bio-signals (waves, which are real-valued time series) into a sequence of symbols, or into a sequence of wavelet patterns that are in turn converted into symbols, and then vectorizes the symbols by learning the sequence with deep learning-based natural language processing.
A model for each class can be constructed by learning from the vectorized wavelet patterns and the training data. The resulting models can be used for the prediction and diagnosis of diseases by classifying new data. By converting real-valued time series into sequences of symbols, the proposed method improves the readability of the data and makes the feature selection and learning processes more intuitive. It also makes influential patterns easier to recognize and identify. Furthermore, because the encoding process simplifies the data, it drastically reduces computational complexity without degrading analysis performance, which facilitates the real-time analysis of large volumes of data that is essential for developing real-time diagnosis systems.
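
The encoding step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding scheme, so the code below assumes a SAX-like approach (z-normalization, segment averaging, and fixed breakpoints for a four-letter alphabet); the names `encode_series` and `to_words`, the breakpoints, and the window sizes are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Assumed breakpoints: approximate quartiles of a standard normal distribution,
# as commonly used for a four-symbol SAX alphabet. Not taken from the paper.
BREAKPOINTS = [-0.67, 0.0, 0.67]
ALPHABET = "abcd"


def encode_series(values, segment_len=2):
    """Convert a real-valued series into a string of symbols (SAX-like sketch)."""
    mu, sigma = mean(values), pstdev(values)
    # Z-normalize; a constant series is mapped to all zeros.
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    # Average each fixed-length segment to shorten and smooth the series.
    segments = [mean(z[i:i + segment_len]) for i in range(0, len(z), segment_len)]
    # Map each segment mean to the symbol of the interval it falls into.
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, s)] for s in segments)


def to_words(symbols, word_len=3):
    """Slide a window over the symbol string to form 'words' for an embedding model."""
    return [symbols[i:i + word_len] for i in range(len(symbols) - word_len + 1)]
```

The resulting words could then be passed to any word-embedding model in place of natural-language tokens, which is one plausible reading of the "vectorizes the symbols" step.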


Explainability in Time Series Forecasting, Natural Language Processing, and Computer Vision

Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning ◽ 10.1007/978-3-030-83356-5_7 ◽ 2021 ◽ pp. 261-302

Author(s): Uday Kamath ◽ John Liu

Keyword(s): Computer Vision ◽ Time Series ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Time Series Forecasting


Using time series and natural language processing to identify viral moments in the 2016

10.18653/v1/w19-2107 ◽ 2019

Author(s): Josephine Lukito ◽ Prathusha K Sarma ◽ Jordan Foley ◽ Aman Abhishek

Keyword(s): Time Series ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing


Comparing the effects of mass media and telecommunications on economic development: A pooled time series analysis

Gazette (Leiden Netherlands) ◽ 10.1177/001654929605700102 ◽ 1996 ◽ Vol 57 (1) ◽ pp. 17-28 ◽ Cited By ~ 3

Author(s): Jianguo Zhu

Keyword(s): Time Series ◽ Economic Development ◽ Mass Media ◽ Time Series Analysis ◽ Series Analysis ◽ Pooled Time Series


Impacts of mass media coverage of the economy during normal times and recessions on the Index of Consumer Confidence using time series analysis and Granger causal analysis

10.31274/rtd-180813-16559 ◽ 2008

Author(s): Lishan Su

Keyword(s): Time Series ◽ Mass Media ◽ Time Series Analysis ◽ Media Coverage ◽ Causal Analysis ◽ Consumer Confidence ◽ Series Analysis


Wildfire Emergency Response Hazard Extraction and Analysis of Trends (HEAT) through Natural Language Processing and Time Series

10.1109/dasc52595.2021.9594501 ◽ 2021

Author(s): Sequoia R. Andrade ◽ Hannah S. Walsh

Keyword(s): Time Series ◽ Natural Language Processing ◽ Natural Language ◽ Emergency Response ◽ Language Processing


Overview of Algorithms for Natural Language Processing and Time Series Analyses

Acta Neurochirurgica Supplement - Machine Learning in Clinical Neuroscience ◽ 10.1007/978-3-030-85292-4_26 ◽ 2021 ◽ pp. 221-242

Author(s): James Feghali ◽ Adrian E. Jimenez ◽ Andrew T. Schilling ◽ Tej D. Azad

Keyword(s): Time Series ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Time Series Analyses


Natural Language Processing Applied to Reduction of False and Missed Alarms in Kick and Lost Circulation Detection

10.2118/206340-ms ◽ 2021

Author(s): Michael Yi ◽ Pradeepkumar Ashok ◽ Dawson Ramos ◽ Taylor Thetford ◽ Spencer Bohlander ◽ ...

Keyword(s): Time Series ◽ Natural Language Processing ◽ North America ◽ Natural Language ◽ Language Processing ◽ Time Series Data ◽ Series Data ◽ False Alarms ◽ Lost Circulation ◽ Good Flow

Abstract: Kick and lost circulation events are large contributors to non-productive time, so early detection of these events is crucial. In the absence of good flow-in and flow-out sensors, pit volume trends offer the best possibility for influx/loss detection, but errors occur because external mud addition to or removal from the pits is not monitored or sensed. The goal is to reduce the false alarms caused by such mud additions and removals. Data analyzed from hundreds of wells in North America show that mud addition and removal result in distinctive pit volume gain/loss trends, and these trends are quite different from those of a kick, a lost circulation event, or a wellbore breathing event. Additionally, drillers enter text memos into the data aggregation system (EDR), and these memos often provide information about pit operations. In this paper, we introduce a method that uses a Bayesian network to aggregate trends detected in time-series data with events identified by natural language processing (NLP) of driller memos, greatly improving the accuracy and robustness of kick and lost circulation detection. The methodology was implemented in software that is currently running on rigs in North America. During the test phase, we applied it to several historical wells with lost circulation events and several historical wells with kick events. We were able to identify and quantify the losses even during connections and mud additions, when pit volume was usually increasing despite continual losses. The real-time simultaneous analysis of driller memos also provides context for pit volume trends and further reduces false alarms. The algorithm can also account for pit volume that was reduced due to drilling. Quantifying the losses offers more insight into which lost circulation material to use and into changes in the rate of loss while drilling.
This approach was also very robust in discovering kicks and differentiating them from mud removal and wellbore breathing events. These historical case studies are detailed in this paper. This is the first time that patterns in mud volume addition and removal detected from time-series data have been used together with NLP of driller memos to reduce false alerts in kick and lost circulation detection. The approach is particularly useful for identifying kick and lost circulation events from pit volume data, especially when good flow-in and flow-out sensors are not available. The paper provides guidance on how real-time sensor data can be combined with textual data to improve the outputs of an advisory system.
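
The idea of combining a pit volume trend with evidence from driller memos can be sketched in a few lines. This is not the paper's Bayesian network: it is a simplified two-evidence Bayes update that assumes the two signals are independent, with keyword matching standing in for the NLP step. The keyword list, the prior, and the likelihood ratios are all made-up illustrative values, and every function name is hypothetical.

```python
# Hypothetical keywords; a real system would use a trained NLP model.
MUD_TRANSFER_TERMS = ("add mud", "added mud", "transfer", "dump")


def memo_mentions_mud_transfer(memo: str) -> bool:
    """Return True if the memo text contains any assumed mud-transfer keyword."""
    text = memo.lower()
    return any(term in text for term in MUD_TRANSFER_TERMS)


def posterior(prior: float, likelihood_ratios: list[float]) -> float:
    """Update a prior probability with independent likelihood ratios (odds form)."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


def kick_probability(pit_gain_detected: bool, memo: str, prior: float = 0.05) -> float:
    """Estimate the probability of a kick from a pit gain flag and a driller memo."""
    # Illustrative likelihood ratios, not values from the paper.
    trend_lr = 8.0 if pit_gain_detected else 0.5
    # A memo about adding mud explains the pit gain, so it lowers the kick odds.
    memo_lr = 0.1 if memo_mentions_mud_transfer(memo) else 1.0
    return posterior(prior, [trend_lr, memo_lr])
```

With these assumed numbers, a pit gain accompanied by a memo such as "Added mud to pit 2" yields a lower kick probability than the same pit gain with an unrelated memo, which mirrors how the memos are described as reducing false alarms.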
