Computational Techniques in Political Language Processing: AnaDiP-2011

Speech is the natural communication means, however, it is not the typical input means afforded by computers. The interaction between humans and machines would have become easier, if speech were an alternative effective input means to the keyboard and mouse. With advancement in techniques for signal processing and model building and the empowerment of computing devices, significant progress has been made in speech recognition research, and various speech based applications have been developed. With rapid advancement of the speech recognition technology, telephone speech technology are getting more involved in many new applications of spoken language processing. From the literature it has been found that the spectro-temporal features gives a significant performance improvement for telephone speech recognition system in comparison to the robust feature techniques used for the recognition purpose. In this chapter, the authors have reported the use of various spectral and temporal features and the soft computing techniques that have been used for the telephonic speech recognition.

Download Full-text

MorphoBr: an open source large-coverage full-form lexicon for morphological analysis of Portuguese

Texto Livre Linguagem e Tecnologia ◽

10.17851/1983-3652.11.3.1-25 ◽

2018 ◽

Vol 11 (3) ◽

pp. 1-25

Author(s):

Leonel Figueiredo de Alencar ◽

Bruno Cuconato ◽

Alexandre Rademaker

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Open Source ◽

Computational Linguistics ◽

Language Processing ◽

Morphological Analysis ◽

Computational Techniques ◽

Processing Technologies ◽

Finite State ◽

Full Form

ABSTRACT: One of the prerequisites for many natural language processing technologies is the availability of large lexical resources. This paper reports on MorphoBr, an ongoing project aiming at building a comprehensive full-form lexicon for morphological analysis of Portuguese. A first version of the resource is already freely available online under an open source, free software license. MorphoBr combines analogous free resources, correcting several thousand errors and gaps, and systematically adding new entries. In comparison to the integrated resources, lexical entries in MorphoBr follow a more user-friendly format, which can be straightforwardly compiled into finite-state transducers for morphological analysis, e.g. in the context of syntactic parsing with a grammar in the LFG formalism using the XLE system. MorphoBr results from a combination of computational techniques. Errors and the more obvious gaps in the integrated resources were automatically corrected with scripts. However, MorphoBr's main contribution is the expansion in the inventory of nouns and adjectives. This was carried out by systematically modeling diminutive formation in the paradigm of finite-state morphology. This allowed MorphoBr to significantly outperform analogous resources in the coverage of diminutives. The first evaluation results show MorphoBr to be a promising initiative which will directly contribute to the development of more robust natural language processing tools and applications which depend on wide-coverage morphological analysis.KEYWORDS: computational linguistics; natural language processing; morphological analysis; full-form lexicon; diminutive formation. RESUMO: Um dos pré-requisitos para muitas tecnologias de processamento de linguagem natural é a disponibilidade de vastos recursos lexicais. Este artigo trata do MorphoBr, um projeto em desenvolvimento voltado para a construção de um léxico de formas plenas abrangente para a análise morfológica do português. Uma primeira versão do recurso já está disponível gratuitamente on-line sob uma licença de software livre e de código aberto. MorphoBr combina recursos livres análogos, corrigindo vários milhares de erros e lacunas. Em comparação com os recursos integrados, as entradas lexicais do MorphoBr seguem um formato mais amigável, o qual pode ser compilado diretamente em transdutores de estados finitos para análise morfológica, por exemplo, no contexto do parsing sintático com uma gramática no formalismo da LFG usando o sistema XLE. MorphoBr resulta de uma combinação de técnicas computacionais. Erros e lacunas mais óbvias nos recursos integrados foram automaticamente corrigidos com scripts. No entanto, a principal contribuição de MorphoBr é a expansão no inventário de substantivos e adjetivos. Isso foi alcançado pela modelação sistemática da formação de diminutivos no paradigma da morfologia de estados finitos. Isso possibilitou a MorphoBr superar de forma significativa recursos análogos na cobertura de diminutivos. Os primeiros resultados de avaliação mostram que o MorphoBr constitui uma iniciativa promissora que contribuirá de forma direta para conferir robustez a ferramentas e aplicações de processamento de linguagem natural que dependem de análise morfológica de ampla cobertura.PALAVRAS-CHAVE: linguística computacional; processamento de linguagem natural; análise morfológica; léxico de formas plenas; formação de diminutivos.

Download Full-text

Investigating the role of Word Embeddings in sentiment analysis

International Multidisciplinary Research Journal ◽

10.25081/imrj.2020.v10.6476 ◽

2020 ◽

pp. 18-24

Author(s):

Samuele Martinelli ◽

Gloria Gonella ◽

Dario Bertolino

Keyword(s):

Sentiment Analysis ◽

Customer Service ◽

Language Processing ◽

Question Answering ◽

Classification Task ◽

Mining Machine ◽

Computational Techniques ◽

Human Language ◽

Simple Classification ◽

Learning Techniques

During decades, Natural language processing (NLP) expanded its range of tasks, from document classification to automatic text summarization, sentiment analysis, text mining, machine translation, automatic question answering and others. In 2018, T. Young described NLP as a theory-motivated range of computational techniques for the automatic analysis and representation of human language. Outside and before AI, human language has been studied by specialists from various disciplines: linguistics, philosophy, logic, psychology. The aim of this work is to build a neural network to perform a sentiment analysis on Italian reviews from the chatbot customer service. Sentiment analysis is a data mining process which identifies and extracts subjective information from text. It could help to understand the social sentiment of clients, respect a business product or service. It could be a simple classification task that analyses a sentence and tells whether the underlying sentiment is positive or negative. The potentiality of deep learning techniques made this simple classification task evolve, creating new, more complex sentiment analysis, e.g. Intent Analysis and Contextual Semantic Search.

Download Full-text

Enabling qualitative research data sharing using a natural language processing pipeline for deidentification: moving beyond HIPAA Safe Harbor identifiers

JAMIA Open ◽

10.1093/jamiaopen/ooab069 ◽

2021 ◽

Vol 4 (3) ◽

Author(s):

Aditi Gupta ◽

Albert Lai ◽

Jessica Mozersky ◽

Xiaoteng Ma ◽

Heidi Walsh ◽

...

Keyword(s):

Qualitative Research ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Health Research ◽

Qualitative Data ◽

Research Data ◽

Computational Techniques ◽

Safe Harbor ◽

Qualitative Health Research

Abstract Objective Sharing health research data is essential for accelerating the translation of research into actionable knowledge that can impact health care services and outcomes. Qualitative health research data are rarely shared due to the challenge of deidentifying text and the potential risks of participant reidentification. Here, we establish and evaluate a framework for deidentifying qualitative research data using automated computational techniques including removal of identifiers that are not considered HIPAA Safe Harbor (HSH) identifiers but are likely to be found in unstructured qualitative data. Materials and Methods We developed and validated a pipeline for deidentifying qualitative research data using automated computational techniques. An in-depth analysis and qualitative review of different types of qualitative health research data were conducted to inform and evaluate the development of a natural language processing (NLP) pipeline using named-entity recognition, pattern matching, dictionary, and regular expression methods to deidentify qualitative texts. Results We collected 2 datasets with 1.2 million words derived from over 400 qualitative research data documents. We created a gold-standard dataset with 280K words (70 files) to evaluate our deidentification pipeline. The majority of identifiers in qualitative data are non-HSH and not captured by existing systems. Our NLP deidentification pipeline had a consistent F1-score of ∼0.90 for both datasets. Conclusion The results of this study demonstrate that NLP methods can be used to identify both HSH identifiers and non-HSH identifiers. Automated tools to assist researchers with the deidentification of qualitative data will be increasingly important given the new National Institutes of Health (NIH) data-sharing mandate.

Download Full-text

Natural language processing in mental health applications using non-clinical texts

Natural Language Engineering ◽

10.1017/s1351324916000383 ◽

2017 ◽

Vol 23 (5) ◽

pp. 649-685 ◽

Cited By ~ 41

Author(s):

RAFAEL A. CALVO ◽

DAVID N. MILNE ◽

M. SAZZAD HUSSAIN ◽

HELEN CHRISTENSEN

Keyword(s):

Mental Health ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Mental States ◽

Data Sources ◽

Computational Techniques ◽

Mental Health Support ◽

Psychological Assistance

AbstractNatural language processing (NLP) techniques can be used to make inferences about peoples’ mental states from what they write on Facebook, Twitter and other social media. These inferences can then be used to create online pathways to direct people to health information and assistance and also to generate personalized interventions. Regrettably, the computational methods used to collect, process and utilize online writing data, as well as the evaluations of these techniques, are still dispersed in the literature. This paper provides a taxonomy of data sources and techniques that have been used for mental health support and intervention. Specifically, we review how social media and other data sources have been used to detect emotions and identify people who may be in need of psychological assistance; the computational techniques used in labeling and diagnosis; and finally, we discuss ways to generate and personalize mental health interventions. The overarching aim of this scoping review is to highlight areas of research where NLP has been applied in the mental health literature and to help develop a common language that draws together the fields of mental health, human-computer interaction and NLP.

Download Full-text

On the Gap between Domestic Robotic Applications and Computational Intelligence

Electronics ◽

10.3390/electronics10070793 ◽

2021 ◽

Vol 10 (7) ◽

pp. 793

Author(s):

Junpei Zhong ◽

Chaofan Ling ◽

Angelo Cangelosi ◽

Ahmad Lotfi ◽

Xiaofeng Liu

Keyword(s):

Language Processing ◽

Computational Intelligence ◽

Intelligent Agents ◽

State Of The Art ◽

Daily Life ◽

The State ◽

Computational Techniques ◽

Human Machine Interaction ◽

Domestic Robots ◽

Machine Interaction

Aspired to build intelligent agents that can assist humans in daily life, researchers and engineers, both from academia and industry, have kept advancing the state-of-the-art in domestic robotics. With the rapid advancement of both hardware (e.g., high performance computing, smaller and cheaper sensors) and software (e.g., deep learning techniques and computational intelligence technologies), robotic products have become available to ordinary household users. For instance, domestic robots have assisted humans in various daily life scenarios to provide: (1) physical assistance such as floor vacuuming; (2) social assistance such as chatting; and (3) education and cognitive assistance such as offering partnerships. Crucial to the success of domestic robots is their ability to understand and carry out designated tasks from human users via natural and intuitive human-like interactions, because ordinary users usually have no expertise in robotics. To investigate whether and to what extent existing domestic robots can participate in intuitive and natural interactions, we survey existing domestic robots in terms of their interaction ability, and discuss the state-of-the-art research on multi-modal human–machine interaction from various domains, including natural language processing and multi-modal dialogue systems. We relate domestic robot application scenarios with state-of-the-art computational techniques of human–machine interaction, and discuss promising future directions towards building more reliable, capable and human-like domestic robots.

Download Full-text

Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1607151113 ◽

2016 ◽

Vol 113 (42) ◽

pp. 11823-11828 ◽

Cited By ~ 37

Author(s):

Christopher Andrew Bail

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Network Analysis ◽

Natural Language ◽

Language Processing ◽

Autism Spectrum ◽

Public Deliberation ◽

Computational Techniques ◽

Advocacy Organizations ◽

Media Audiences

Social media sites are rapidly becoming one of the most important forums for public deliberation about advocacy issues. However, social scientists have not explained why some advocacy organizations produce social media messages that inspire far-ranging conversation among social media users, whereas the vast majority of them receive little or no attention. I argue that advocacy organizations are more likely to inspire comments from new social media audiences if they create “cultural bridges,” or produce messages that combine conversational themes within an advocacy field that are seldom discussed together. I use natural language processing, network analysis, and a social media application to analyze how cultural bridges shaped public discourse about autism spectrum disorders on Facebook over the course of 1.5 years, controlling for various characteristics of advocacy organizations, their social media audiences, and the broader social context in which they interact. I show that organizations that create substantial cultural bridges provoke 2.52 times more comments about their messages from new social media users than those that do not, controlling for these factors. This study thus offers a theory of cultural messaging and public deliberation and computational techniques for text analysis and application-based survey research.

Download Full-text

Deep Learning Models for Detection and Diagnosis of Alzheimer's Disease

Advances in Medical Technologies and Clinical Practice - Machine Learning and Data Analytics for Predicting, Managing, and Monitoring Disease ◽

10.4018/978-1-7998-7188-0.ch011 ◽

2021 ◽

pp. 140-149

Author(s):

Gowhar Mohiuddin Dar ◽

Ashok Sharma ◽

Parveen Singh

Keyword(s):

Computer Vision ◽

Deep Learning ◽

Natural Language Processing ◽

Reinforcement Learning ◽

Natural Language ◽

Language Processing ◽

Computational Techniques ◽

Medical Sciences ◽

Data Application ◽

Detection And Diagnosis

The chapter explores the implications of deep learning in medical sciences, focusing on deep learning concerning natural language processing, computer vision, reinforcement learning, big data, and blockchain influence on some areas of medicine and construction of end-to-end systems with the help of these computational techniques. The deliberation of computer vision in the study is mainly concerned with medical imaging and further usage of natural language processing to spheres such as electronic wellbeing record data. Application of deep learning in genetic mapping and DNA sequencing termed as genomics and implications of reinforcement learning about surgeries assisted by robots are also overviewed.

Download Full-text

Large Scale Linguistic Processing of Tweets to Understand Social Interactions among Speakers of Less Resourced Languages: The Basque Case

Information ◽

10.3390/info10060212 ◽

2019 ◽

Vol 10 (6) ◽

pp. 212 ◽

Cited By ~ 1

Author(s):

Joseba Fernandez de Landa ◽

Rodrigo Agerri ◽

Iñaki Alegria

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Social Relations ◽

Large Scale ◽

Social Research ◽

Life Stage ◽

Unstructured Data ◽

Computational Techniques ◽

Processing Techniques

Social networks like Twitter are increasingly important in the creation of new ways of communication. They have also become useful tools for social and linguistic research due to the massive amounts of public textual data available. This is particularly important for less resourced languages, as it allows to apply current natural language processing techniques to large amounts of unstructured data. In this work, we study the linguistic and social aspects of young and adult people’s behaviour based on their tweets’ contents and the social relations that arise from them. With this objective in mind, we have gathered over 10 million tweets from more than 8000 users. First, we classified each user in terms of its life stage (young/adult) according to the writing style of their tweets. Second, we applied topic modelling techniques to the personal tweets to find the most popular topics according to life stages. Third, we established the relations and communities that emerge based on the retweets. We conclude that using large amounts of unstructured data provided by Twitter facilitates social research using computational techniques such as natural language processing, giving the opportunity both to segment communities based on demographic characteristics and to discover how they interact or relate to them.

Download Full-text

redBERT

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2021070103 ◽

2021 ◽

Vol 12 (3) ◽

pp. 32-47

Author(s):

Chaitanya Pandey

Keyword(s):

Decision Making ◽

Social Media ◽

Language Processing ◽

Computational Techniques ◽

Topic Modelling ◽

Automated Extraction ◽

Wide Scale ◽

Human Psyche ◽

Shed Light

A natural language processing (NLP) method was used to uncover various issues and sentiments surrounding COVID-19 from social media and get a deeper understanding of fluctuating public opinion in situations of wide-scale panic to guide improved decision making with the help of a sentiment analyser created for the automated extraction of COVID-19-related discussions based on topic modelling. Moreover, the BERT model was used for the sentiment classification of COVID-19 Reddit comments. These findings shed light on the importance of studying trends and using computational techniques to assess the human psyche in times of distress.

Download Full-text