Natural language processing for web browsing analytics: Challenges, lessons learned, and opportunities

2021 ◽  
pp. 108357
Author(s):  
Daniel Perdices ◽  
Javier Ramos ◽  
José L. García-Dorado ◽  
Iván González ◽  
Jorge E. López de Vergara
2020 ◽  
Vol 34 (09) ◽  
pp. 13397-13403
Author(s):  
Narges Norouzi ◽  
Snigdha Chaturvedi ◽  
Matthew Rutledge

This paper describes an experience of teaching Machine Learning (ML) and Natural Language Processing (NLP) to a group of high school students over an intensive one-month period. In this work, we provide an outline of the AI course curriculum we designed for high school students and then evaluate its effectiveness by analyzing students' feedback and student outcomes. After closely observing students, evaluating their responses to our surveys, and analyzing their contributions to the course project, we identified some possible impediments to teaching AI to high school students and propose measures to avoid them. These measures include employing a combination of objectivist and constructivist pedagogies, reviewing/introducing basic programming concepts at the beginning of the course, and addressing gender discrepancies throughout the course.


Author(s):  
Victoria Rubin

Artificially Intelligent (AI) systems are pervasive, but poorly understood by their users and, at times, their developers. It is often unclear how and why certain algorithms make choices, predictions, or conclusions. What does AI transparency mean? What explanations do AI system users desire? This panel discusses AI opaqueness with examples in applied contexts such as natural language processing, people categorization, judicial decision explanations, and system recommendations. We offer insights from interviews with AI system users about their perceptions and developers' lessons learned. What steps should be taken towards AI transparency and accountability for its decisions?


Author(s):  
William Digan ◽  
Aurélie Névéol ◽  
Antoine Neuraz ◽  
Maxime Wack ◽  
David Baudoin ◽  
...  

Abstract Background The increasing complexity of data streams and computational processes in modern clinical health information systems makes reproducibility challenging. Clinical natural language processing (NLP) pipelines are routinely leveraged for the secondary use of data. Workflow management systems (WMS) have been widely used in bioinformatics to handle the reproducibility bottleneck. Objective To evaluate whether WMS and other bioinformatics practices could impact the reproducibility of clinical NLP frameworks. Materials and Methods Based on the literature across multiple research fields (NLP, bioinformatics, and clinical informatics), we selected articles that (1) review reproducibility practices and (2) highlight a set of rules or guidelines to ensure tool or pipeline reproducibility. We aggregated insights from the literature to define reproducibility recommendations. Finally, we assessed the compliance of 7 NLP frameworks with the recommendations. Results We identified 40 reproducibility features from 8 selected articles. Frameworks based on WMS match more than 50% of the features (26 features for LAPPS Grid, 22 features for OpenMinTeD), compared to 18 features for current clinical NLP frameworks (cTAKES, CLAMP) and 17 features for GATE, ScispaCy, and Textflows. Discussion 34 recommendations are endorsed by at least 2 articles from our selection. Overall, 15 features were adopted by every NLP framework. Nevertheless, frameworks based on WMS showed better compliance with the features. Conclusion NLP frameworks could benefit from lessons learned in the bioinformatics field (e.g., public repositories of curated tools and workflows, or use of containers for shareability) to enhance reproducibility in a clinical setting.
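One recurring reproducibility recommendation the abstract points to is recording enough provenance (tool versions, input checksums, runtime environment) to re-run a pipeline step exactly. As a framework-agnostic sketch in Python (the function and field names here are illustrative assumptions, not the API of any framework discussed above):

```python
import hashlib
import platform
import sys

def file_sha256(path):
    """Content hash of an input file, read in chunks to handle large corpora."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def provenance_record(input_paths, tool_name, tool_version):
    """Capture the minimal facts needed to reproduce a pipeline step:
    which tool ran, at which version, on which platform, over which exact inputs."""
    return {
        "tool": tool_name,
        "tool_version": tool_version,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "inputs": {p: file_sha256(p) for p in input_paths},
    }
```

A record like this, stored alongside each step's outputs, is the kind of metadata that workflow management systems and container-based setups capture automatically, and that ad hoc clinical NLP pipelines often omit.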


2017 ◽  
Vol 24 (5) ◽  
pp. 986-991 ◽  
Author(s):  
David S Carrell ◽  
Robert E Schoen ◽  
Daniel A Leffler ◽  
Michele Morris ◽  
Sherri Rose ◽  
...  

Abstract Objective: Widespread application of clinical natural language processing (NLP) systems requires taking existing NLP systems and adapting them to diverse and heterogeneous settings. We describe the challenges faced and lessons learned in adapting an existing NLP system for measuring colonoscopy quality. Materials and Methods: We used colonoscopy and pathology reports from 4 settings during 2013–2015, varying by geographic location, practice type, compensation structure, and electronic health record. Results: Though successful, adaptation required considerably more time and effort than anticipated. Typical NLP challenges in assembling corpora, diverse report structures, and idiosyncratic linguistic content were greatly magnified. Discussion: Strategies for addressing adaptation challenges include assessing site-specific diversity, setting realistic timelines, leveraging local electronic health record expertise, and undertaking extensive iterative development. More research is needed on how to make it easier to adapt NLP systems to new clinical settings. Conclusions: A key challenge in the widespread application of NLP is adapting existing systems to new clinical settings.


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this knowledge gap, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually related to the source-language input.
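The automatic evaluation above relies on BLEU, which scores a candidate translation by its clipped n-gram overlap with a reference, discounted by a brevity penalty. As a rough illustration of what the metric measures, here is a simplified, self-contained sentence-level BLEU in Python (uniform n-gram weights, no smoothing); real evaluations typically use standard tooling such as sacreBLEU rather than a hand-rolled implementation like this one:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU for one reference.

    Geometric mean of clipped n-gram precisions (n = 1..max_n),
    multiplied by the brevity penalty for short hypotheses."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing: any empty n-gram overlap zeroes the score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    brevity_penalty = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return brevity_penalty * math.exp(log_avg)
```

An exact match scores 1.0 and a hypothesis sharing no 4-grams with the reference scores 0.0, which is why sentence-level BLEU without smoothing is harsh on short outputs; corpus-level BLEU, as reported in the paper, aggregates counts over all sentences before taking the ratio.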

