DRDF: A Deceptive Review Detection Framework of Combining Word-Level, Chunk-Level, And Sentence-Level Topic-Sentiment Models

Author(s):  
Xiaodong Du ◽  
Fuqiang Zhao ◽  
Zhengyu Zhu ◽  
Ping Han
2021 ◽  
Vol 14 (4) ◽  
pp. 1-24
Author(s):  
Sushant Kafle ◽  
Becca Dingman ◽  
Matt Huenerfauth

There are style guidelines for authors who highlight important words in static text, e.g., bolded words in student textbooks, yet little research has investigated highlighting in dynamic texts, e.g., captions during educational videos for Deaf or Hard of Hearing (DHH) users. In our experimental study, DHH participants subjectively compared design parameters for caption highlighting, including: decoration (underlining vs. italicizing vs. boldfacing), granularity (sentence level vs. word level), and whether to highlight only the first occurrence of a repeating keyword. In partial contrast to recommendations in prior research, which had not been based on experimental studies with DHH users, we found that DHH participants preferred boldface, word-level highlighting in captions. Our empirical results provide guidance for the design of keyword highlighting during captioned videos for DHH users, especially in educational video genres.


Author(s):  
Yazan Shaker Almahameed ◽  
May Al-Shaikhli

The current study aimed to investigate the salient syntactic and semantic errors made by Jordanian English foreign language learners when writing in English. Writing poses a great challenge for both native and non-native speakers of English, since writing involves employing most language sub-systems such as grammar, vocabulary, spelling and punctuation. A total of 30 Jordanian English foreign language learners participated in the study. The participants were instructed to write a composition of no more than one hundred and fifty words on a selected topic. Essays were collected and analyzed statistically to obtain the needed results. The results of the study showed that the syntactic errors produced by the participants were varied, in that eleven types of syntactic errors were committed, as follows: verb-tense, agreement, auxiliary, conjunctions, word order, resumptive pronouns, null-subject, double-subject, superlative, comparative and possessive pronouns. Amongst syntactic errors, verb tense errors were the most frequent, at 33%. The results additionally revealed that two types of semantic errors were made: errors at sentence level and errors at word level. Errors at word level far outnumbered errors at sentence level, at 82% and 18% respectively. It can be concluded that the syntactic and semantic knowledge of Jordanian learners of English is still insufficient.


2019 ◽  
Vol 16 (2) ◽  
pp. 359-380
Author(s):  
Zhehua Piao ◽  
Sang-Min Park ◽  
Byung-Won On ◽  
Gyu Choi ◽  
Myong-Soon Park

Product reputation mining systems can help customers make buying decisions about a product of interest. In addition, they can help investigate preferences for products recently released by enterprises. Unlike a conventional manual survey, such a system gives quick survey results at low cost. In this article, we propose a novel product reputation mining approach based on three points of view: the word, sentence, and aspect levels. Given a target product, the aspect-level method assigns the sentences of a review document to the desired aspects. The sentence-level method is a graph-based model for quantifying the importance of sentences. The word-level method computes both the importance and the sentiment orientation of words. Aggregating these scores, the proposed approach measures the reputation tendency and preferred intensity and selects the top-k informative review documents about the product. To validate the proposed method, we experimented with review documents relevant to the K5 from Kia Motors. Our experimental results show that our method is more helpful than the existing lexicon-based approach in the empirical and statistical studies.


2020 ◽  
Vol 201 ◽  
pp. 103068
Author(s):  
Haiyang Wei ◽  
Zhixin Li ◽  
Canlong Zhang ◽  
Huifang Ma

2017 ◽  
Vol 5 ◽  
pp. 205-218 ◽  
Author(s):  
André F. T. Martins ◽  
Marcin Junczys-Dowmunt ◽  
Fabio N. Kepler ◽  
Ramón Astudillo ◽  
Chris Hokamp ◽  
...  

Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways. However, this potential is currently limited by the relatively low accuracy of existing systems. In this paper, we achieve remarkable improvements by exploiting synergies between the related tasks of word-level quality estimation and automatic post-editing. First, we stack a new, carefully engineered, neural model into a rich feature-based word-level quality estimation system. Then, we use the output of an automatic post-editing system as an extra feature, obtaining striking results on WMT16: a word-level FMULT1 score of 57.47% (an absolute gain of +7.95% over the current state of the art), and a Pearson correlation score of 65.56% for sentence-level HTER prediction (an absolute gain of +13.36%).


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2671
Author(s):  
Yu Zhang ◽  
Junan Yang ◽  
Xiaoshuai Li ◽  
Hui Liu ◽  
Kun Shao

Recent studies have shown that natural language processing (NLP) models are vulnerable to adversarial examples, which are maliciously designed by adding small perturbations to benign inputs that are imperceptible to the human eye, leading to false predictions by the target model. Compared to character- and sentence-level textual adversarial attacks, word-level attacks can generate higher-quality adversarial examples, especially in a black-box setting. However, existing attack methods usually require a huge number of queries to successfully deceive the target model, which is costly in a real adversarial scenario. Hence, finding appropriate models is difficult. Therefore, we propose a novel attack method, the main idea of which is to fully utilize the adversarial examples generated by the local model and transfer part of the attack to the local model to complete ahead of time, thereby reducing costs related to attacking the target model. Extensive experiments conducted on three public benchmarks show that our attack method can not only improve the success rate but also reduce the cost, while outperforming the baselines by a significant margin.


Humaniora ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 7-12
Author(s):  
Umi Farichah ◽  
Ani Rakhmawati ◽  
Nugraheni Eko Wardani

The research aimed to examine the relevance of the preservation of the Javanese language in Javanese conversations that Ganjar Pranowo carried out during the COVID-19 pandemic. The resulting data concerned the levels of the language, which included word level, phrase level, and sentence level. Also, several manners, unggah ungguh, and ethics were produced that could become examples or role models for the people of Central Java. The research applied a qualitative method. The data source was the utterances contained in the uploads of Ganjar Pranowo in the form of video recordings that included primary data in the form of utterances or parts of spoken speech from various speeches and communications from the people of Central Java with Ganjar Pranowo. The results show that preservation of the Javanese language through conversations between leaders and the community has positive implications. This means that the preservation of the Javanese language is carried out optimally in the social sphere. This activity is well recorded and uploaded on social media by Ganjar Pranowo, a figure who has high credibility. The social sphere is an important component used to preserve Javanese language, culture, and traditions.


Over the past decade, research on sentiment analysis and opinion mining has evolved slowly while emerging widely with greater perspectives and objectives. Sentiment analysis is an important task for gaining insight into the huge number of opinions that are generated on a daily basis. This analysis relies on the opinions expressed by individuals. These opinions are text, which may be positive or negative, or a phrase that gives significance to the context. These opinions also have the power to express the context and to draw the attention of new readers. Such opinions are expressed at levels ranging from the document level to the sentence level, phrase level, word level, and special-symbol level. All these opinion types are labelled with the common name Sentiment Analysis. Sentiment analysis in health care is evolving narrowly, with wider research strands. This paper mainly focuses on identifying sentiments in health care. These sentiments can be medical test values, which may be numeric or nominal, and sometimes textual. Such sentiments are identified using pre-fragmentation of the data set and the Pointwise Mutual Information measure. To accomplish this, data on hypertensive pregnant women is considered.
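
The Pointwise Mutual Information measure named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pmi_scores`, the record-level probability estimates, and the toy records are all assumptions made for illustration. It uses the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))).

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(records):
    """Return PMI for every pair of items that co-occur in at least one record.

    Probabilities are estimated as the fraction of records that contain
    the item (or both items of a pair). This estimator is an assumption.
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = sorted(set(record))  # ignore duplicates within a record
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    scores = {}
    for (x, y), c_xy in pair_counts.items():
        p_xy = c_xy / n
        p_x = item_counts[x] / n
        p_y = item_counts[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy records, not data from the paper.
records = [
    ["high_bp", "risk"],
    ["high_bp", "risk"],
    ["normal_bp", "no_risk"],
    ["normal_bp", "no_risk"],
]
scores = pmi_scores(records)
```

A positive score means the two items co-occur more often than they would if they were independent; pairs that never co-occur are simply absent from the result.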


2017 ◽  
Vol 24 (1) ◽  
pp. 37
Author(s):  
Sariputri Ni Putu Trisna

This study discusses code switching used by English Language Education students in their daily communication. This study also tries to find out the factors behind the use of code switching. The data were collected from around forty participants using two methods: observation and interview. The participants were English Language Education students at Universitas Pendidikan Ganesha. Across all of the participants, twenty expressions of code switching were revealed. The results show that the students use three linguistic levels in code switching: word level, phrase level, and clause or sentence level. It is also found that there are two factors that made the students switch from one language to another.


2020 ◽  
Vol 34 (05) ◽  
pp. 9725-9732
Author(s):  
Xiaorui Zhou ◽  
Senlin Luo ◽  
Yunfang Wu

In reading comprehension, generating sentence-level distractors is a significant task, which requires a deep understanding of the article and question. The traditional entity-centered methods can only generate word-level or phrase-level distractors. Although recently proposed neural-based methods like the sequence-to-sequence (Seq2Seq) model show great potential in generating creative text, the previous neural methods for distractor generation ignore two important aspects. First, they didn't model the interactions between the article and question, making the generated distractors tend to be too general or not relevant to the question context. Second, they didn't emphasize the relationship between the distractor and article, making the generated distractors not semantically relevant to the article and thus failing to form a set of meaningful options. To solve the first problem, we propose a co-attention enhanced hierarchical architecture to better capture the interactions between the article and question, thus guiding the decoder to generate more coherent distractors. To alleviate the second problem, we add an additional semantic similarity loss to push the generated distractors to be more relevant to the article. Experimental results show that our model outperforms several strong baselines on automatic metrics, achieving state-of-the-art performance. Further human evaluation indicates that our generated distractors are more coherent and more educative compared with those distractors generated by baselines.

