Sarcasm Detection for Japanese Text Using BERT and Emoji

2021 ◽  
pp. 119-124
Author(s):  
Yoshio Okimoto ◽  
Kosuke Suwa ◽  
Jianwei Zhang ◽  
Lin Li
Keyword(s):  
2019 ◽  
Author(s):  
Masataka Nakayama ◽  
Yukiko Uchida

Awe is theorized as an emotion appraised in terms of perceived vastness and a need for accommodation. This theoretical framework was developed from a review of spatially and temporally distributed literature, mostly from the American and European cultural context, and is assumed to be culturally universal. However, awe as described in Japanese literature was not explicitly included in the original theorization. We tested whether this framework generalizes to the Japanese context by analyzing how Japanese awe-related words (e.g., “畏敬/ikei”) are used in Japanese text. A topic model was used to extract topics from the contexts in which these words appear, as an index of meaning. Results show that (1) the meaning of awe was statistically dissociable from the similar but distinct meanings of fear and respect, and (2) the dissociating topics included transcendent entities such as god, spirits/ghosts, and powerful beings. The Japanese meaning of awe includes vastness (i.e., transcendence) that goes beyond typical respect (i.e., power distance) and requires an accommodation of one’s mental framework.


2020 ◽  
Vol 1 (1) ◽  
pp. 75-98
Author(s):  
Koji Wajima ◽  
Kei Koqure ◽  
Toshihiro Furukawa ◽  
Tetsuji Satoh

Ways of disseminating information (Verbreitungsmedien) through different media have changed rapidly owing to technological progress, especially in the field of information and communication technologies. Reflecting these changes in technology, communication methods and abilities have also changed. On the Internet, content covering almost the same subject matter is mixed together at different levels of difficulty of expression. A user searching for new or unfamiliar things may become confused and spend a lot of time selecting content they can understand, because there are large amounts of similar content at different levels of difficulty. The characteristics of relevant simplified corpora are therefore critical for everybody. In this research, we propose a method that compares two types of documents of different difficulty and selects, from various text characteristics, those related to simplicity of expression. In the proposed method, thousands of text characteristics are compressed and converted by Non-negative Matrix Factorization (NMF), and a basis that characterizes the simplified documents is selected. The proposed method combines the characteristics used in most previous research, covering 32 types and 2,196 dimensions. We evaluated the text characteristics in the resulting NMF bases using a classifier. Applying the proposed method to two kinds of environment white papers showed that an effective basis can be selected. Additionally, we showed an estimate of the causal relationships and an optimization of the parameter. Furthermore, we showed the method's flexibility with respect to other media.
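
The NMF step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a toy document-by-feature matrix `V` (hypothetical values, 6 documents by 4 features instead of 2,196 dimensions), uses the standard multiplicative update rules for NMF, and replaces the classifier-based evaluation with a much simpler stand-in: picking the basis whose mean activation differs most between simplified and original documents. All function and variable names are illustrative.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def nmf(V, rank, iters=300, seed=0, eps=1e-9):
    """Factor V (docs x features) into W (docs x rank) and H (rank x features).

    Uses multiplicative updates for the squared-error objective, which keep
    every entry of W and H non-negative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def select_basis(W, labels):
    """Return (basis index, gap) for the basis most active in simplified docs.

    Assumption: a simple mean-activation gap stands in for the classifier
    evaluation mentioned in the abstract.
    """
    simple = [row for row, y in zip(W, labels) if y == 1]
    original = [row for row, y in zip(W, labels) if y == 0]
    gaps = []
    for k in range(len(W[0])):
        mean_simple = sum(r[k] for r in simple) / len(simple)
        mean_original = sum(r[k] for r in original) / len(original)
        gaps.append(mean_simple - mean_original)
    best = max(range(len(gaps)), key=lambda k: gaps[k])
    return best, gaps[best]


# Hypothetical toy data: rows are documents, columns are text features.
V = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [5, 5, 0, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
    [0, 0, 5, 5],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = simplified document, 0 = original document

W, H = nmf(V, rank=2)
k, gap = select_basis(W, labels)

# Features that carry the most weight in the selected basis.
top_features = sorted(range(len(H[k])), key=lambda j: -H[k][j])[:2]

# Squared reconstruction error of the factorization.
R = matmul(W, H)
err = sum((V[i][j] - R[i][j]) ** 2
          for i in range(len(V)) for j in range(len(V[0])))

print("selected basis:", k, "top features:", top_features)
```

In practice a library implementation of NMF and a trained classifier would replace the hand-written updates and the mean-gap heuristic. The sketch only shows how a compressed basis can be tied back to the features that distinguish simplified documents.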

