Could a Conversational AI Identify Offensive Language?

Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 418
Author(s):  
Daniela America da Silva ◽  
Henrique Duarte Borges Louro ◽  
Gildarcio Sousa Goncalves ◽  
Johnny Cardoso Marques ◽  
Luiz Alberto Vieira Dias ◽  
...  

In recent years, we have seen wide use of Artificial Intelligence (AI) applications on the Internet and elsewhere. Natural Language Processing and Machine Learning are important sub-fields of AI that have made chatbots and Conversational AI applications possible. These algorithms build language models from historical data; however, historical data can be intrinsically discriminatory. This article investigates whether a Conversational AI could identify offensive language, and it shows how large language models often produce unethical behavior because of bias in the historical data. Our low-level proof-of-concept presents the challenges of detecting offensive language in social media and discusses some steps toward strong results in the detection of offensive language and unethical behavior using a Conversational AI.

2021 ◽  
Author(s):  
Md Anawar Hossen Wadud ◽  
Md Ashraf Uddin

Abstract The popularity of social media has exploded worldwide over the last few decades, and it has become the most preferred mode of social interaction. The internet also provides a new platform through which adolescents are being bullied. Appropriate means of cyberbullying detection are still partial and in some cases very limited. Moreover, research on cyberbullying focuses extensively on surveys and on its psychological impact on victims; prevention has not been widely addressed. To bridge the gap, this paper aims to detect cyberbullying efficiently. It employs a standard machine learning method and natural language processing techniques as part of the detection process in a decentralized, blockchain-leveraged architecture. We provide a fog-based architecture for cyberbullying detection, aiming to relieve the server's load by placing the detection and prevention of cyberbullying at the fog layer. The proposal might offer a probable solution to save users, particularly adolescents, from the severe consequences of cyberbullying.


Author(s):  
K. P. Moholkar et al.

Natural Language Processing (NLP), a subfield of Artificial Intelligence (AI), helps machines understand and manipulate human languages in different sectors. Question answering (QA) using machine learning is a challenging task. For an efficient QA system, understanding the category of a question plays a pivotal role in extracting a suitable answer. Computers can answer questions requiring single, verifiable answers but fail to answer subjective questions that demand a deeper understanding of the question. Subjective questions can take different forms entailing deeper, multidimensional understanding of context. Identifying the intent of the question helps to extract the expected answer from a given passage. Pretrained language models (LMs) have demonstrated excellent results on many language tasks. The paper proposes a hierarchical deep learning architecture to learn the semantics of the question and extract an appropriate answer. The proposed method converts the given context to fine-grained embeddings to capture semantic and positional representations, identifies user intent, and employs an encoder model to concentrate on the answer span. The proposed method shows a remarkable improvement over existing systems.


2021 ◽  
Author(s):  
Revathi B. S. ◽  
A. Meena Kowshalya

Abstract Image Captioning is the process of generating textual descriptions of an image. These descriptions need to be syntactically and semantically correct. Image Captioning has potential advantages in many applications, such as image indexing techniques, devices for visually impaired persons, social media and several other natural language processing applications. Image Captioning is a popular research area with ample scope for new findings in preparing datasets, generating language models, and developing and evaluating the models. This paper extensively surveys the literature from its earliest stages, covering the advent of Artificial Intelligence, the Machine Learning pathway, the photography era, early Deep Learning and the current Deep Learning methodology for Image Captioning. This survey should help novice researchers understand the roadmap to current techniques.


2020 ◽  
Author(s):  
Ali Al-Garadi Mohammed ◽  
Yuan-Chi Yang ◽  
Haitao Cai ◽  
Yucheng Ruan ◽  
Karen O’Connor ◽  
...  

ABSTRACT Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging, requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter. We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, AlBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning, including deep learning, approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority “abuse/misuse” class. Our proposed fusion-based model performs significantly better than the best traditional model (F1-score [95% CI]: 0.67 [0.64-0.69] vs. 0.45 [0.42-0.48]). We illustrate, via experimentation using differing training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches make it suitable for automated continuous monitoring of nonmedical PM use from Twitter.


2019 ◽  
Vol 16 (10) ◽  
pp. 4125-4134 ◽  
Author(s):  
Deepak Vats ◽  
Avinash Sharma

The data overload faced by internet users stems from the excessive amount of web information and the billions of users worldwide. Because of this, providing internet users with the data they actually intend to find is a challenging task in web applications. The large amount of information available on the internet is a fertile field for applying data mining techniques; this is what we call Web Mining (WM). Research in WM draws on many fields, such as databases, Artificial Intelligence (machine learning [supervised, semi-supervised, unsupervised and reinforcement], neural networks and natural language processing (NLP)) and information retrieval. Here, research related to web mining and its categories is highlighted. We also compare the most popular data mining algorithms used in the pattern discovery phase of WM.


2020 ◽  
Author(s):  
Md Anawar Hossen Wadud ◽  
Md Ashraf Uddin ◽  
Shamima Parvez ◽  
Mohammad Motiur Rahman ◽  
Ammar Alazab ◽  
...  

Abstract The popularity of social media has exploded worldwide over the last few decades, and it has become the most preferred mode of social interaction. The internet also provides a new platform through which adolescents are being bullied. Appropriate means of cyberbullying detection are still partial and in some cases very limited. Moreover, research on cyberbullying focuses extensively on surveys and on its psychological impact on victims; prevention has not been widely addressed. To bridge the gap, this paper aims to detect cyberbullying efficiently. It employs a standard machine learning method and natural language processing techniques as part of the detection process in a decentralized, blockchain-leveraged architecture. We provide a fog-based architecture for cyberbullying detection, aiming to relieve the server's load by placing the detection and prevention of cyberbullying at the fog layer. The proposal might offer a probable solution to save users, particularly adolescents, from the severe consequences of cyberbullying.


2020 ◽  
Author(s):  
Shreya Reddy ◽  
Lisa Ewen ◽  
Pankti Patel ◽  
Prerak Patel ◽  
Ankit Kundal ◽  
...  

As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has indicated that machine learning methods are accurate and the gold standard of bot identification on social media. Unfortunately, machine learning models do not come without their negative aspects, such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we propose a blockchain framework for bot identification. At the current time, it is unknown how this method will perform, but it serves to demonstrate the existence of a substantial research gap in this area.


2020 ◽  
Vol 114 ◽  
pp. 242-245
Author(s):  
Jootaek Lee

The term Artificial Intelligence (AI) has changed since it was first coined by John McCarthy in 1956. AI, believed to have been created with Kurt Gödel's unprovable computational statements in 1931, is now called deep learning or machine learning. AI is defined as a computer machine with the ability to make predictions about the future and solve complex tasks using algorithms. AI algorithms are enhanced and become effective with big data capturing the present and the past, while still necessarily reflecting human biases in models and equations. AI is also capable of making choices like humans, mirroring human reasoning. AI can help robots efficiently repeat the same labor-intensive procedures in factories and can analyze historic and present data efficiently through deep learning, natural language processing, and anomaly detection. Thus, AI covers a spectrum of augmented intelligence relating to prediction, autonomous intelligence relating to decision making, automated intelligence for labor robots, and assisted intelligence for data analysis.


2021 ◽  
Vol 28 (1) ◽  
pp. e100262
Author(s):  
Mustafa Khanbhai ◽  
Patrick Anyadi ◽  
Joshua Symons ◽  
Kelsey Flott ◽  
Ara Darzi ◽  
...  

Objectives: Unstructured free-text patient feedback contains rich information, and analysing these data manually would require a lot of personnel resources which are not available in most healthcare organisations. To undertake a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data.
Methods: Databases were systematically searched to identify articles published between January 2000 and December 2019 examining NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded.
Results: Nineteen articles were included. The majority (80%) of studies applied language analysis techniques on patient feedback from social media sites (unsolicited) followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included the precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers.
Conclusion: NLP and ML have emerged as an important tool for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.

