A Survey on Bias in Deep NLP

2021 ◽  
Vol 11 (7) ◽  
pp. 3184
Author(s):  
Ismael Garrido-Muñoz  ◽  
Arturo Montejo-Ráez  ◽  
Fernando Martínez-Santiago  ◽  
L. Alfonso Ureña-López 

Deep neural networks are hegemonic approaches to many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms through self-supervised learning (also known as “pre-training”), versatile and high-performing models are released continuously for every new network design. These networks, in effect, learn a probability distribution over the words and relations in the training collection, inheriting the potential flaws, inconsistencies and biases contained in that collection. As pre-trained models have proven very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.
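
To make the notion of encoded bias concrete, the sketch below runs a WEAT-style association probe over word embeddings, of the kind covered by bias-detection surveys: it compares how strongly target words associate with two attribute word sets. The vectors here are random placeholders standing in for real pre-trained embeddings, and the word lists are illustrative choices, not those used in the article.

```python
# Minimal sketch of an embedding-association bias probe (WEAT-style).
# Assumes word vectors are available as a dict: word -> numpy array.
# The toy random vectors below are placeholders, not real pre-trained embeddings.
import numpy as np

rng = np.random.default_rng(0)
words = ["engineer", "nurse", "he", "she", "man", "woman"]
emb = {w: rng.normal(size=50) for w in words}  # placeholder vectors

def cos(u, v):
    # cosine similarity between two vectors
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, group_a, group_b):
    # mean similarity to attribute group A minus mean similarity to group B
    a = np.mean([cos(emb[word], emb[g]) for g in group_a])
    b = np.mean([cos(emb[word], emb[g]) for g in group_b])
    return a - b

male_terms, female_terms = ["he", "man"], ["she", "woman"]
for target in ["engineer", "nurse"]:
    # a large positive or negative gap would suggest a gendered association
    print(target, association(target, male_terms, female_terms))
```

With real embeddings, a consistently positive score for "engineer" and negative score for "nurse" would be the kind of association gap that debiasing methods try to detect and correct.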


2020 ◽  
pp. 1-10
Author(s):  
Roser Morante ◽  
Eduardo Blanco

Negation is a complex linguistic phenomenon present in all human languages. It can be seen as an operator that transforms an expression into another expression whose meaning is in some way opposed to the original. In this article, we survey previous work on negation with an emphasis on computational approaches. We start by defining negation and two important concepts: the scope and focus of negation. Then, we survey work in natural language processing that considers negation primarily as a means to improve results on some task. We also provide information about corpora containing negation annotations in English and other languages, which usually include a combination of annotations of negation cues, scopes, foci, and negated events. We continue the survey with a description of automated approaches to processing negation, ranging from early rule-based systems to systems built with traditional machine learning and neural networks. Finally, we conclude with some reflections on current progress and future directions.
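
As a concrete illustration of the rule-based end of that spectrum, here is a toy negation detector: a small cue lexicon plus a naive scope heuristic that extends from the cue to the next punctuation mark. The cue list and the punctuation-bounded scope rule are simplifications for illustration, not a system described in the survey.

```python
# Toy rule-based sketch of negation cue detection with a naive scope heuristic:
# the scope is taken to be every token after the cue up to the next punctuation.
import re

NEGATION_CUES = {"not", "no", "never", "without", "neither", "nor"}

def detect_negation(sentence):
    # crude tokenization: words (possibly with an apostrophe) and punctuation
    tokens = re.findall(r"\w+'?\w*|[.,;!?]", sentence.lower())
    results = []
    for i, tok in enumerate(tokens):
        if tok in NEGATION_CUES or tok.endswith("n't"):
            scope = []
            for nxt in tokens[i + 1:]:
                if nxt in ".,;!?":
                    break
                scope.append(nxt)
            results.append({"cue": tok, "scope": scope})
    return results

print(detect_negation("The drug did not reduce pain, but it was well tolerated."))
# [{'cue': 'not', 'scope': ['reduce', 'pain']}]
```

Real systems annotate scope and focus far more carefully (and increasingly with neural sequence labellers), but the example shows why cue and scope are treated as separate subproblems.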


2021 ◽  
Author(s):  
Sanjar Adilov

Generative neural networks have shown promising results in de novo drug design. Recent studies suggest that an efficient way to produce novel molecules matching target properties is to model SMILES sequences with deep learning, in a way similar to language modeling in natural language processing. In this paper, we present a survey of various machine learning methods for SMILES-based language modeling and report our benchmarking results on a standardized subset of the ChEMBL database.
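
To illustrate the language-modeling framing, the sketch below treats SMILES strings as character sequences, fits a bigram model with start and end tokens, and samples new strings. The two training SMILES are toy examples of my choosing; the paper benchmarks neural language models (RNN- and Transformer-style) on a ChEMBL subset, not this stand-in.

```python
# Minimal sketch: SMILES as a character-level language modeling problem.
# A bigram count model with start (^) and end ($) tokens, sampled greedily at random.
import random
from collections import defaultdict

train_smiles = ["CCO", "c1ccccc1O"]  # illustrative only: ethanol, phenol

counts = defaultdict(lambda: defaultdict(int))
for s in train_smiles:
    seq = ["^"] + list(s) + ["$"]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1  # count character transitions

def sample(max_len=40):
    # generate a new string by walking the bigram transition table
    out, cur = [], "^"
    for _ in range(max_len):
        nxt = random.choices(list(counts[cur]), weights=list(counts[cur].values()))[0]
        if nxt == "$":
            break
        out.append(nxt)
        cur = nxt
    return "".join(out)

print(sample())
```

A neural language model replaces the bigram table with a learned conditional distribution over the next character (or token), but the training and sampling loop follows the same sequence-modeling recipe.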


2020 ◽  
pp. 1-38
Author(s):  
Amandeep Kaur ◽  
Anjum Mohammad Aslam

In this chapter, we discuss the core concepts of Artificial Intelligence. We define the term Artificial Intelligence and related terms such as machine learning, deep learning, and neural networks, and describe these concepts from the perspective of their use in business. We further analyze various applications and case studies that can be realized with Artificial Intelligence and its subfields. Numerous Artificial Intelligence applications are already in use in business, and more are expected in the future, as machines augment the Artificial Intelligence, natural language processing, and machine learning abilities of humans across various domains.


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2810
Author(s):  
Chahat Raj ◽  
Ayush Agarwal ◽  
Gnana Bharathy ◽  
Bhuva Narayan ◽  
Mukesh Prasad

The rise in web and social media interactions has resulted in the effortless proliferation of offensive language and hate speech. Such online harassment, insults, and attacks are commonly termed cyberbullying. The sheer volume of user-generated content has made it challenging to identify such illicit content. Machine learning has wide applications in text classification, and researchers are shifting towards deep neural networks for detecting cyberbullying due to the several advantages they have over traditional machine learning algorithms. This paper proposes a novel neural network framework with parameter optimization and an algorithmic comparative study of eleven classification methods: four traditional machine learning algorithms and seven shallow neural networks, evaluated on two real-world cyberbullying datasets. In addition, this paper examines the effect of feature extraction and word-embedding techniques on algorithmic performance. Key observations from this study show that bidirectional neural networks and attention models provide high classification results. Logistic Regression was the best among the traditional machine learning classifiers used. Term Frequency-Inverse Document Frequency (TF-IDF) features demonstrate consistently high accuracies with traditional machine learning techniques, while Global Vectors (GloVe) embeddings perform better with neural network models. Bi-GRU and Bi-LSTM worked best amongst the neural networks used. The extensive experiments performed on the two datasets establish the importance of this work by comparing eleven classification methods and seven feature extraction techniques. Our proposed shallow neural networks outperform existing state-of-the-art approaches for cyberbullying detection, with accuracy and F1-scores as high as ~95% and ~98%, respectively.
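
As a point of reference for the traditional side of that comparison, below is a minimal TF-IDF plus Logistic Regression pipeline, the classical pairing the study found strongest. The three labelled messages are invented placeholders, not samples from the datasets used in the paper, and the pipeline settings are illustrative rather than the authors' configuration.

```python
# Hedged baseline sketch: TF-IDF features fed to Logistic Regression.
# The tiny labelled corpus below is made up purely for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "you are so stupid and worthless",
    "great job on the presentation today",
    "nobody likes you, just leave",
]
labels = [1, 0, 1]  # 1 = bullying, 0 = benign (illustrative labels)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["leave her alone, you loser"]))
```

The neural alternatives in the study swap the TF-IDF vectorizer for GloVe embeddings and the linear classifier for recurrent (Bi-GRU/Bi-LSTM) or attention-based models, which is where the reported accuracy gains come from.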

