Argument annotation and analysis using deep learning with attention mechanism in Bahasa Indonesia

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Derwin Suhartono ◽  
Aryo Pradipta Gema ◽  
Suhendro Winton ◽  
Theodorus David ◽  
Mohamad Ivan Fanany ◽  
...  

Argumentation mining is a research field that focuses on argumentative sentences. Such sentences are common in daily communication and play an important role in every decision-making or conclusion-drawing process. This research examines the use of deep learning combined with an attention mechanism for argument annotation and analysis. Argument annotation classifies the argument components of a discourse into several classes: major claim, claim, premise, and non-argumentative. Argument analysis concerns the characteristics and validity of the arguments organized under one topic; one such analysis assesses whether an established argument is sufficient or not. The dataset used for argument annotation and analysis consists of 402 persuasive essays, translated into Bahasa Indonesia (the national language of Indonesia) to show how the approach works with a language other than English. Several deep learning models, namely CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit), are used for both argument annotation and analysis, while HAN (Hierarchical Attention Network) is used only for argument analysis. The attention mechanism is combined with each model to weight the model's access to its inputs and thereby improve performance. Across all experiments, the combination of deep learning and an attention mechanism for argument annotation and analysis achieves better results than previous research.

2021 ◽  
Vol 16 (3) ◽  
pp. 54-69
Author(s):  
Pier Giuseppe Giribone ◽  
Duccio Martelli

An Inflation-Indexed Swap (IIS) is a derivative in which, at every payment date, the counterparties swap an inflation rate for a fixed rate. Calculating the Inflation Leg cash flows requires a mathematical model suitable for projecting the Consumer Price Index (CPI). For this purpose, quants typically start from market quotes for Zero-Coupon swaps to derive the future trend of the inflation index, together with a seasonality model that captures the typical periodic effects. In this study, we propose a forecasting model for inflation seasonality based on a Long Short-Term Memory (LSTM) network, a deep learning methodology particularly useful for forecasting purposes. The CPI predictions are conducted using a FinTech paradigm while respecting the traditional quantitative finance theory developed in this research field. The paper is structured as follows: the first two sections illustrate the pricing methodologies for the most popular IISs, the Zero Coupon Inflation-Indexed Swap (ZCIIS) and the Year-on-Year Inflation-Indexed Swap (YYIIS); section 3 deals with the traditional standard method for forecasting CPI values (trend + seasonality); section 4 describes the LSTM architecture; and section 5 focuses on CPI projections, also called the inflation bootstrap. Section 6 then describes a robustness check that implements a traditional SARIMA model to improve the interpretation of the LSTM outputs. Finally, section 7 concludes with a real market case, in which the two methodologies are used to compute the fair value of a YYIIS and the model risk is quantified.
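
As a rough illustration of the first step mentioned above (deriving the future trend of the index from zero-coupon swap quotes), the sketch below uses the standard break-even relation for a par ZCIIS, in which the fixed leg (1 + K)^T - 1 equals the expected inflation leg I(T)/I(0) - 1. The base index, the quotes, and the function name `projected_cpi` are hypothetical values chosen for illustration; indexation lag, interpolation conventions, and seasonality are ignored, and this is not the authors' LSTM model.

```python
def projected_cpi(base_cpi: float, zc_rate: float, years: float) -> float:
    """CPI level implied by a par zero-coupon inflation swap quote.

    At par, (1 + K)^T - 1 = I(T)/I(0) - 1, so I(T) = I(0) * (1 + K)^T.
    """
    return base_cpi * (1.0 + zc_rate) ** years


# Hypothetical inputs (assumptions, not market data): base index level
# and par ZCIIS fixed rates keyed by maturity in years.
base_index = 100.0
zc_quotes = {1: 0.020, 2: 0.021, 5: 0.022}

# Trend component of the projection: one implied CPI level per maturity.
cpi_trend = {t: projected_cpi(base_index, k, t) for t, k in zc_quotes.items()}

for maturity, level in sorted(cpi_trend.items()):
    print(f"{maturity}Y implied CPI: {level:.3f}")
```

A seasonality model, whether the traditional one or an LSTM-based one as in the paper, would then adjust the levels between these annual points.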


Author(s):  
B. Premjith ◽  
K. P. Soman

Morphological synthesis is one of the main components of Machine Translation (MT) frameworks, especially when either or both of the source and target languages are morphologically rich. Morphological synthesis is the process of combining two words or two morphemes according to the Sandhi rules of the morphologically rich language. Malayalam and Tamil are two languages of India that are morphologically abundant as well as agglutinative. Morphological synthesis of a word in these two languages is challenging for the following reasons: (1) abundance in morphology; (2) complex Sandhi rules; (3) the possibility in Malayalam of forming words by combining words that belong to different syntactic categories (for example, noun and verb); and (4) the construction of a sentence by combining multiple words. We formulated the task of the morphological generation of nouns and verbs of Malayalam and Tamil as a character-to-character sequence tagging problem. In this article, we used deep learning architectures such as the Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), Gated Recurrent Unit (GRU), and their stacked and bidirectional versions to implement morphological synthesis at the character level. In addition, we investigated the performance of combining these deep learning architectures with the Conditional Random Field (CRF) in the morphological synthesis of nouns and verbs in Malayalam and Tamil. We observed that adding a CRF to the Bidirectional LSTM/GRU architecture achieved more than 99% accuracy in the morphological synthesis of Malayalam and Tamil nouns and verbs.


2021 ◽  
Vol 7 (2) ◽  
pp. 133
Author(s):  
Widi Hastomo ◽  
Adhitio Satyo Bayangkari Karno ◽  
Nawang Kalbuana ◽  
Ervina Nisfiani ◽  
Lussiana ETP

This study aims to improve accuracy by lowering the prediction error for five blue-chip stocks in Indonesia. It does so by combining four-hidden-layer neural network designs built from Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers. For each stock, an RMSE-versus-epoch plot is produced that shows the layer combination with the best accuracy, as follows: (a) BBCA with LSTM-GRU-LSTM-GRU layers (RMSE = 1120.651, e = 15); (b) BBRI with LSTM-GRU-LSTM-GRU layers (RMSE = 110.331, e = 25); (c) INDF with GRU-GRU-GRU-GRU layers (RMSE = 156.297, e = 35); (d) ASII with GRU-GRU-GRU-GRU layers (RMSE = 134.551, e = 20); (e) TLKM with GRU-LSTM-GRU-LSTM layers (RMSE = 71.658, e = 35). A challenge in Deep Learning (DL) data processing is choosing the epoch parameter value that yields high prediction accuracy.


Information ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 280
Author(s):  
Shaoxiu Wang ◽  
Yonghua Zhu ◽  
Wenjing Gao ◽  
Meng Cao ◽  
Mengyao Li

The sentiment analysis of microblog text has always been a challenging research field because of the limited and complex contextual information. However, most existing sentiment analysis methods for microblogs focus on classifying the polarity of emotional keywords while ignoring the transitional or progressive impact on global sentiment of words in different positions in the Chinese syntactic structure, as well as the use of emojis. To this end, we propose the emotion-semantic-enhanced bidirectional long short-term memory (BiLSTM) network with the multi-head attention mechanism model (EBILSTM-MH) for sentiment analysis. This model uses BiLSTM to learn a feature representation of the input texts, given the word embeddings. Subsequently, the attention mechanism assigns an attentive weight to each word for the sentiment analysis, based on the impact of emojis. The attentive weights can be combined with the output of the hidden layer to obtain the feature representation of posts. Finally, the sentiment polarity of a microblog post is obtained through the dense connection layer. The experimental results show the feasibility of our proposed model for microblog sentiment analysis when compared with other baseline models.


2021 ◽  
Vol 2 ◽  
Author(s):  
Yongliang Qiao ◽  
Cameron Clark ◽  
Sabrina Lomax ◽  
He Kong ◽  
Daobilige Su ◽  
...  

Individual cattle identification is a prerequisite and foundation for precision livestock farming. Existing methods for cattle identification require radio frequency or visual ear tags, all of which are prone to loss or damage. Here, we propose and implement a new unified deep learning approach to cattle identification using video analysis. The proposed deep learning framework is composed of a Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) with a self-attention mechanism. More specifically, the Inception-V3 CNN was used to extract features from a rear-view cattle video dataset taken in a feedlot. Extracted features were then fed to a BiLSTM layer to capture spatio-temporal information. Self-attention was then employed to provide a different focus on the features captured by the BiLSTM for the final step of cattle identification. We used a total of 363 rear-view videos from 50 cattle, collected at three different times with an interval of 1 month between data collection periods. The proposed method achieved 93.3% identification accuracy using a 30-frame video length, which outperformed current state-of-the-art methods (Inception-V3, MLP, SimpleRNN, LSTM, and BiLSTM). Furthermore, two different attention schemes, namely additive and multiplicative attention mechanisms, were compared. Our results show that the additive attention mechanism achieved 93.3% accuracy and 91.0% recall, higher than the multiplicative attention mechanism with 90.7% accuracy and 87.0% recall. Video length also impacted accuracy, with video sequence lengths up to 30 frames enhancing identification performance. Overall, our approach can capture key spatio-temporal features to improve cattle identification accuracy, enabling automated cattle identification for precision livestock farming.
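
To make the comparison of attention schemes concrete, the sketch below contrasts the two textbook scoring functions: multiplicative attention, score = qᵀ W k, and additive attention, score = vᵀ tanh(W_q q + W_k k). The abstract does not give the exact formulation used in the paper, so the vectors, the identity weight matrices, and all function names here are illustrative assumptions rather than the authors' implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, x):
    return [dot(row, x) for row in matrix]


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiplicative_scores(query, keys, w):
    # score_t = query^T W key_t
    return [dot(query, matvec(w, key)) for key in keys]


def additive_scores(query, keys, w_q, w_k, v):
    # score_t = v^T tanh(W_q query + W_k key_t)
    q_proj = matvec(w_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(w_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(dot(v, hidden))
    return scores


def attend(scores, values):
    # Normalize scores into weights, then return the weighted sum of values.
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * val[i] for w, val in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy per-frame features standing in for BiLSTM outputs (hypothetical values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

mul_scores = multiplicative_scores(query, frames, identity)
add_scores = additive_scores(query, frames, identity, identity, [1.0, 1.0])
mul_weights, mul_context = attend(mul_scores, frames)
add_weights, add_context = attend(add_scores, frames)
```

In both schemes the softmax weights sum to one, so the context vector is a weighted average of the frame features; the schemes differ only in how each frame's score is computed.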


CONVERTER ◽  
2021 ◽  
pp. 328-338
Author(s):  
Jinming Hu

Globalization has been a major contributor to the economic boom. At the same time, it has spurred the development of criminal methods, since it enables frequent cross-border communication. With advances in big data and predictive policing systems, building an efficient crime prediction model has become a new research field; with such a model, police departments could clamp down on criminal activity more accurately. Such a model would also help in commanding and dispatching police forces, improving work efficiency. This paper proposes a combined model that uses a Long Short-Term Memory (LSTM) network and a Graph Convolutional Network (GCN) to predict the crime rate, and uses an attention mechanism to improve the experimental results. By extracting the spatio-temporal characteristics of crimes and increasing the weight of typical features, the model can not only predict the number of crimes but also detect the degree of crime risk in each region. A rolling forecast on about three years of crime data from Boston, United States, shows that the model has good prediction performance.


Author(s):  
Yudi Widhiyasana ◽  
Transmissia Semiawan ◽  
Ilham Gibran Achmad Mudzakir ◽  
Muhammad Randi Noor

Text classification has become a widely studied field, particularly in connection with Natural Language Processing (NLP). Many methods can be used for text classification, one of which is deep learning. RNN, CNN, and LSTM are among the deep learning methods commonly used to classify text. This paper analyzes the application of a combination of two deep learning methods, CNN and LSTM (C-LSTM). The combination is used to classify Indonesian-language news text. The data consist of Indonesian-language news texts collected from Indonesian-language news portals. The collected data are grouped into three news categories by scope: “Nasional” (national), “Internasional” (international), and “Regional” (regional). Experiments are conducted on three research variables: the number of documents, the batch size, and the learning rate of the C-LSTM model. The experimental results show that the F1-score obtained with the C-LSTM method is 93.27%. This is higher than the F1-score of CNN, at 89.85%, and of LSTM, at 90.87%. It can therefore be concluded that the combination of the two deep learning methods, CNN and LSTM (C-LSTM), performs better than CNN or LSTM alone.
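
For readers unfamiliar with the metric reported above, here is a minimal sketch of how an F1-score can be computed for a three-class problem using the paper's category names. The abstract does not say which averaging scheme was used, so the macro average shown here is an assumption, and the labels and predictions are invented toy data, not results from the paper.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 = harmonic mean of precision and recall for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy data (hypothetical): true and predicted categories for six news items.
labels = ["Nasional", "Internasional", "Regional"]
y_true = ["Nasional", "Nasional", "Internasional", "Internasional", "Regional", "Regional"]
y_pred = ["Nasional", "Internasional", "Internasional", "Internasional", "Regional", "Nasional"]

per_class = {lab: f1_for_label(y_true, y_pred, lab) for lab in labels}
overall = macro_f1(y_true, y_pred, labels)
print(per_class, round(overall, 4))
```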


Author(s):  
Yuqi Yu ◽  
Hanbing Yan ◽  
Yuan Ma ◽  
Hao Zhou ◽  
Hongchao Guan

Hypertext Transfer Protocol (HTTP) accounts for a large portion of Internet application-layer traffic. Since the payload of HTTP traffic can record website status and user request information, many studies use HTTP traffic for web application attack detection. In this work, we propose DeepHTTP, an HTTP traffic detection framework based on deep learning. Unlike previous studies, this framework not only performs malicious traffic detection but also uses the deep learning model to mine malicious fields of the traffic payload. The detection model, called AT-Bi-LSTM, is based on Bidirectional Long Short-Term Memory (Bi-LSTM) with an attention mechanism. The attention mechanism can improve the discriminative ability of the model and make the result interpretable. To enhance the generalization ability of the model, this paper proposes a novel feature extraction method. Experiments show that DeepHTTP performs excellently in malicious traffic discrimination and pattern mining.


Atmosphere ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 348 ◽  
Author(s):  
Guang Yang ◽  
HwaMin Lee ◽  
Giyeol Lee

Both long- and short-term exposure to high concentrations of airborne particulate matter (PM) severely affect human health. Many countries now regulate PM concentrations. Early-warning systems based on PM concentration levels are urgently required to allow countermeasures that reduce harm and loss. Previous studies sought to establish accurate, efficient predictive models, and many machine-learning methods are used for air pollution forecasting. The long short-term memory and gated recurrent unit methods, both typical deep-learning methods, reliably predict PM levels, with some limitations. In this paper, the authors propose novel hybrid models that combine the strengths of two types of deep-learning methods. Moreover, the authors compare hybrid deep-learning methods (convolutional neural network (CNN)–long short-term memory (LSTM) and CNN–gated recurrent unit (GRU)) with several stand-alone methods (LSTM, GRU) in predicting PM concentrations at 39 stations in Seoul. Hourly air pollution data and meteorological data from January 2015 to December 2018 were used to train these models. The results of the experiment confirmed that the proposed prediction model could predict the PM concentrations for the next 7 days. Hybrid models outperformed single models in five randomly selected areas, with the lowest root mean square error (RMSE) and mean absolute error (MAE) values for both PM10 and PM2.5. For PM10 prediction in Gangnam, the RMSE is 1.688 and the MAE is 1.161. Among the hybrid models, CNN–GRU predicted PM10 better for all selected stations, while CNN–LSTM performed better at predicting PM2.5.
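
The two error measures used to rank the models above can be sketched in a few lines. The observed and predicted concentrations below are invented toy values, not data from the study, and the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly PM10 readings and model forecasts.
observed = [30.0, 42.0, 55.0, 38.0]
forecast = [28.0, 45.0, 50.0, 40.0]

print(f"RMSE = {rmse(observed, forecast):.3f}")
print(f"MAE  = {mae(observed, forecast):.3f}")
```

RMSE is never smaller than MAE on the same errors, which is consistent with the Gangnam figures reported above (1.688 versus 1.161); a large gap between the two indicates that a few large errors dominate.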

