How Language Shapes Prejudice Against Women: An Examination Across 45 World Languages

2020
Author(s):  
David DeFranza ◽  
Himanshu Mishra ◽  
Arul Mishra

Language provides an ever-present context for our cognitions and has the ability to shape them. Languages across the world can be gendered (languages in which the form of a noun, verb, or pronoun is marked as female or male) or genderless. In an ongoing debate, one stream of research suggests that gendered languages are more likely to display gender prejudice than genderless languages; another stream suggests that language does not have the ability to shape gender prejudice. In this research, we contribute to the debate by using a natural language processing (NLP) method that captures the meaning of a word from the context in which it occurs. Using text data from Wikipedia and the Common Crawl project (which contains text from billions of publicly facing websites) across 45 world languages, covering the majority of the world's population, we test for gender prejudice in gendered and genderless languages. We find that gender prejudice occurs more in gendered than in genderless languages. Moreover, we examine whether the genderedness of a language influences the stereotypic dimensions of warmth and competence using the same NLP method.
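The kind of context-based measurement the abstract describes can be illustrated with a toy sketch: compare how strongly a target word's embedding associates with male versus female attribute vectors via cosine similarity. The vectors and words below are entirely hypothetical placeholders; the study's actual pipeline, trained on Wikipedia and Common Crawl across 45 languages, is far richer.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association_score(word_vec, male_vecs, female_vecs):
    # Mean similarity to male attribute words minus mean similarity to
    # female ones; positive values indicate a male-leaning association.
    male = sum(cosine(word_vec, m) for m in male_vecs) / len(male_vecs)
    female = sum(cosine(word_vec, f) for f in female_vecs) / len(female_vecs)
    return male - female

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
male_vecs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]
female_vecs = [[0.1, 1.0, 0.0], [0.2, 0.9, 0.1]]
career = [0.8, 0.3, 0.2]  # a hypothetical vector for a career-related word

print(association_score(career, male_vecs, female_vecs))
```

With these made-up vectors the score comes out positive, i.e. the toy "career" word sits closer to the male attribute set.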

Author(s):  
TIAN-SHUN YAO

Based on a word-based theory of natural language processing, a word-based Chinese language understanding system has been developed. In light of psychological language analysis and the features of the Chinese language, this theory of natural language processing is presented along with a description of the computer programs based on it. At the heart of the system is the definition of a Total Information Dictionary and the World Knowledge Source used in the system. The purpose of this research is to develop a system that can understand not only Chinese sentences but also whole texts.


Vector representations of language have been shown to be useful in a number of natural language processing tasks. In this paper, we investigate the effectiveness of word vector representations for the problem of sentiment analysis. In particular, we target three sub-tasks: sentiment word extraction, sentiment word polarity detection, and text sentiment prediction. We investigate the effectiveness of vector representations over different text data and evaluate the quality of domain-dependent vectors. Vector representations are used to compute various vector-based features, and systematic experiments demonstrate their effectiveness. Using simple vector-based features can achieve better results for app text sentiment analysis.
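A common way to turn word vectors into a vector-based feature for text sentiment prediction is to average the embeddings of a text's words. This is only a minimal sketch of that idea, with made-up embedding values, not the feature set used in the paper.

```python
def sentence_vector(tokens, embeddings, dim=3):
    # Average the vectors of known words; zero vector if none are known.
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

# Toy 3-dimensional embeddings (hypothetical values for illustration).
emb = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "app": [0.0, 0.5, 0.5],
}
print(sentence_vector(["good", "app"], emb))
```

The resulting averaged vector can then be fed to any standard classifier as a feature.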


Author(s):  
Laura Buszard-Welcher

This chapter presents three technologies essential to enabling any language in the digital domain: language identifiers (ISO 639-3), Unicode (including fonts and keyboards), and the building of corpora to enable natural language processing. Just a few major languages of the world are well-enabled for use with electronically mediated communication. Another few hundred languages are arguably on their way to being well-enabled, if for market reasons alone. For all the remaining languages of the world, inclusion in the digital domain remains a distant possibility, and one that likely requires sustained interest, attention, and resources on the part of the language community itself. The good news is that the same technologies that enable the more widespread languages can also enable the less widespread, and even endangered ones, and bootstrapping is possible for all of them. The examples and resources described in this chapter can serve as inspiration and guidance in getting started.
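One concrete reason Unicode support matters for enabling a language digitally: the "same" accented character can be encoded as different code point sequences depending on the keyboard or input method, so text must be normalized before it can be searched or compared. A small illustration using Python's standard `unicodedata` module:

```python
import unicodedata

# Two visually identical strings: precomposed "é" (U+00E9) versus
# "e" followed by a combining acute accent (U+0301).
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"

print(precomposed == decomposed)  # False: different code point sequences
# NFC normalization makes text comparable regardless of how it was typed.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

Corpus-building efforts for less widespread languages typically apply this kind of normalization as a first step.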


Author(s):  
Amalia Todirascu ◽  
Marion Cargill

We present SimpleApprenant, a platform aiming to improve French L2 learners’ knowledge of Multi Word Expressions (MWEs). SimpleApprenant integrates an MWE database annotated with the Common European Framework of Reference for languages (CEFR) level and several Natural Language Processing (NLP) tools: a spelling checker, a parser, and a set of transformation rules. NLP tools and resources are used to build training and writing exercises to improve MWE knowledge and writing skills of French L2 learners. We present the user scenarios, the platform’s architecture, as well as the preliminary evaluation of its NLP tools.


News is a routine part of everyone's life, enhancing our knowledge of what happens around the world. Fake news is fictional information made up with the intention to delude, so the knowledge acquired from it is of no use. Because fake news spreads extensively and has a negative impact on society, fake news detection has become an emerging research area. This paper presents a solution to fake news detection using deep learning and natural language processing. The dataset is trained using a deep neural network. The dataset needs to be well formatted before being given to the network, which is made possible using natural language processing techniques; the system then predicts whether a news item is fake or not.
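The formatting step mentioned above typically means lowercasing, stripping punctuation, tokenizing, and removing stopwords before the text reaches the network. A minimal stdlib-only sketch of such a step (the stopword list here is an illustrative placeholder, not the paper's actual preprocessing):

```python
import re

def preprocess(text, stopwords=frozenset({"the", "a", "an", "is", "to"})):
    # Lowercase, keep only alphabetic tokens, and drop stopwords --
    # the kind of formatting applied before training a classifier.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in stopwords]

print(preprocess("The BREAKING news is fabricated!"))
# → ['breaking', 'news', 'fabricated']
```

The cleaned token lists are then converted to numeric form (e.g. word indices or vectors) for the neural network.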


Author(s):  
Nazmun Nessa Moon ◽  
Imrus Salehin ◽  
Masuma Parvin ◽  
Md. Mehedi Hasan ◽  
Iftakhar Mohammad Talha ◽  
...  

In this study we describe the process of identifying unnecessary video using an advanced combined method of natural language processing and machine learning. The system also includes a framework containing analytics databases, which helps to measure statistical accuracy and to detect and then accept or reject unnecessary and unethical video content. In our video detection system, we extract text data from video content in two steps: first from video to MPEG-1 Audio Layer 3 (MP3) format, and then from MP3 to WAV format. We use the text-processing part of natural language processing to analyze and prepare the dataset. We apply both naive Bayes and logistic regression classification algorithms in this detection system to determine the best accuracy for our system. In our research, the MP4 video data is converted to plain text using advanced Python library functions. This brief study discusses the identification of unauthorized, unsocial, unnecessary, unfinished, and malicious videos from their spoken audio records. By analyzing our datasets with this model, we can decide which videos should be accepted or rejected for further action.
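To make the classification stage concrete, here is a from-scratch multinomial naive Bayes sketch over word counts with add-one smoothing, one of the two algorithms the study compares. The documents and labels are invented stand-ins for transcribed video text; the study's actual data and feature pipeline differ.

```python
from collections import Counter
from math import log

class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.priors = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}

    def predict(self, doc):
        def log_posterior(c):
            total = sum(self.counts[c].values())
            score = log(self.priors[c])
            for w in doc.lower().split():
                # Add-one smoothing avoids zero probabilities for unseen words.
                score += log((self.counts[c][w] + 1) / (total + len(self.vocab)))
            return score
        return max(self.classes, key=log_posterior)

# Hypothetical transcribed snippets and accept/reject labels.
docs = ["violent harmful clip", "abusive harmful speech",
        "family cooking tutorial", "calm music lesson"]
labels = ["reject", "reject", "accept", "accept"]

clf = NaiveBayes()
clf.fit(docs, labels)
print(clf.predict("harmful abusive clip"))  # → reject
```

Logistic regression would be trained on the same word-count features, and the two models' accuracies compared, as the abstract describes.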


2018
Author(s):  
Jeremy Petch ◽  
Jane Batt ◽  
Joshua Murray ◽  
Muhammad Mamdani

BACKGROUND: The increasing adoption of electronic health records (EHRs) in clinical practice holds the promise of improving care and advancing research by serving as a rich source of data, but most EHRs allow clinicians to enter data in a text format without much structure. Natural language processing (NLP) may reduce reliance on manual abstraction of these text data by extracting clinical features directly from unstructured clinical digital text data and converting them into structured data.

OBJECTIVE: This study aimed to assess the performance of a commercially available NLP tool for extracting clinical features from free-text consult notes.

METHODS: We conducted a pilot, retrospective, cross-sectional study of the accuracy of NLP from dictated consult notes from our tuberculosis clinic with manual chart abstraction as the reference standard. Consult notes for 130 patients were extracted and processed using NLP. We extracted 15 clinical features from these consult notes and grouped them a priori into categories of simple, moderate, and complex for analysis.

RESULTS: For the primary outcome of overall accuracy, NLP performed best for features classified as simple, achieving an overall accuracy of 96% (95% CI 94.3-97.6). Performance was slightly lower for features of moderate clinical and linguistic complexity at 93% (95% CI 91.1-94.4), and lowest for complex features at 91% (95% CI 87.3-93.1).

CONCLUSIONS: The findings of this study support the use of NLP for extracting clinical features from dictated consult notes in the setting of a tuberculosis clinic. Further research is needed to fully establish the validity of NLP for this and other purposes.
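For intuition about what extracting a "simple" clinical feature from free text involves, here is a rule-based sketch using regular expressions. The feature names and patterns are hypothetical illustrations; the commercial NLP tool evaluated in the study uses far richer linguistic processing than pattern matching.

```python
import re

def extract_features(note):
    # Hypothetical extraction rules for a few simple features.
    features = {}
    m = re.search(r"\b(\d{1,3})[- ]year[- ]old\b", note, re.I)
    features["age"] = int(m.group(1)) if m else None
    features["smoker"] = bool(re.search(r"\bsmok(er|ing)\b", note, re.I)) \
        and not re.search(r"\bnon[- ]?smoker\b", note, re.I)
    features["tb_mentioned"] = bool(
        re.search(r"\b(tuberculosis|TB)\b", note))
    return features

note = "A 54-year-old man, former smoker, referred for suspected tuberculosis."
print(extract_features(note))
```

Note the negation check for "non-smoker": handling negation and context is exactly what separates moderate and complex features from simple ones.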


Sentiment classification is one of the best-known and most popular domains of machine learning and natural language processing, in which an algorithm is developed to understand the opinion of an entity the way human beings do. This research article presents such an approach. Concepts from natural language processing are used for text representation, and a novel word embedding model is then proposed for effective classification of the data. TF-IDF and common bag-of-words (BoW) models are considered for representing the text data; the importance of these models is discussed in the respective sections. The proposed model is tested on the IMDB dataset, with a 50% training / 50% testing split and three random shufflings of the dataset used for evaluation.
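The TF-IDF representation mentioned above weights each term by its frequency in a document, discounted by how many documents contain it. A from-scratch sketch with invented toy documents (not the article's actual implementation or data):

```python
from math import log

def tfidf(docs):
    # docs: list of token lists. Returns one {term: weight} dict per document.
    n = len(docs)
    df = {}  # document frequency of each term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        w = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = log(n / df[term])
            w[term] = tf * idf
        weights.append(w)
    return weights

docs = [["great", "movie", "great"], ["boring", "movie"]]
w = tfidf(docs)
print(w[0]["great"] > w[0]["movie"])  # "movie" occurs everywhere, so idf = 0
```

Unlike plain BoW counts, terms that appear in every document get zero weight, which is why TF-IDF often helps sentiment classifiers focus on discriminative words.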

