Using Natural Language Processing to Extract Information from Unstructured Code-Change Version Control Data: Lessons Learned

2021 ◽  
Author(s):  
Elisabetta Ronchieri ◽  
Marco Canaparo ◽  
Yue Yang

2021 ◽
pp. 108357
Author(s):  
Daniel Perdices ◽  
Javier Ramos ◽  
José L. García-Dorado ◽  
Iván González ◽  
Jorge E. López de Vergara

2020 ◽  
Vol 34 (09) ◽  
pp. 13397-13403
Author(s):  
Narges Norouzi ◽  
Snigdha Chaturvedi ◽  
Matthew Rutledge

This paper describes an experience teaching Machine Learning (ML) and Natural Language Processing (NLP) to a group of high school students over an intensive one-month period. We provide an outline of the AI course curriculum we designed for high school students and then evaluate its effectiveness by analyzing students' feedback and outcomes. After closely observing the students, evaluating their survey responses, and analyzing their contributions to the course project, we identified possible impediments to teaching AI to high school students and propose measures to avoid them. These measures include employing a combination of objectivist and constructivist pedagogies, reviewing/introducing basic programming concepts at the beginning of the course, and addressing gender discrepancies throughout the course.


Author(s):  
Victoria Rubin

Artificially Intelligent (AI) systems are pervasive but poorly understood by their users and, at times, by their developers. It is often unclear how and why certain algorithms make choices, predictions, or conclusions. What does AI transparency mean? What explanations do AI system users desire? This panel discusses AI opaqueness with examples in applied contexts such as natural language processing, people categorization, judicial decision explanations, and system recommendations. We offer insights from interviews with AI system users about their perceptions, along with developers' lessons learned. What steps should be taken toward AI transparency and accountability for its decisions?


2015 ◽  
Vol 1 (1) ◽  
Author(s):  
Keith W. Kintigh

Abstract To address archaeology's most pressing substantive challenges, researchers must discover, access, and extract information contained in the reports and articles that codify so much of archaeology's knowledge. These efforts will require application of existing and emerging natural language processing technologies to extensive digital corpora. Automated classification can enable development of the metadata needed for the discovery of relevant documents. Although it is even more technically challenging, automated extraction of and reasoning with information from texts can provide urgently needed access to contextualized information within documents. Effective automated translation is needed for scholars to benefit from research published in other languages.
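The classification step described here is well within reach of standard NLP tooling. The following Python fragment is a minimal sketch, not taken from the article: it trains a TF-IDF text classifier to assign discovery metadata to report excerpts, where the labels and example texts are hypothetical placeholders.

# A minimal sketch of automated classification for discovery metadata.
# The excerpts and subject labels below are hypothetical, not from the article.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: short report excerpts with subject labels.
texts = [
    "ceramic sherds recovered from the pueblo room block",
    "projectile points and lithic debitage from the surface survey",
    "stratigraphic profile of the midden excavation unit",
    "radiocarbon dates from hearth features in the rockshelter",
]
labels = ["ceramics", "lithics", "stratigraphy", "chronometrics"]

# TF-IDF features over unigrams and bigrams, fed to a linear classifier.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, labels)

# Tag a new excerpt with a discovery-metadata label.
print(classifier.predict(["flaked stone tools collected during excavation"]))

In practice such a model would be trained on a large annotated corpus and the predicted labels stored as searchable metadata alongside each document.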


1998 ◽  
Vol 37 (04/05) ◽  
pp. 334-344 ◽  
Author(s):  
G. Hripcsak ◽  
C. Friedman

Abstract Evaluating natural language processing (NLP) systems in the clinical domain is a difficult task that is important for the advancement of the field. A number of NLP systems that extract information from free-text clinical reports have been reported, but few of these systems have been evaluated. Those that were evaluated reported good performance measures, but the results were often weakened by ineffective evaluation methods. In this paper we describe a set of criteria aimed at improving the quality of NLP evaluation studies. We present an overview of NLP evaluations in the clinical domain and also discuss the Message Understanding Conferences (MUC) [1-4]. Although these conferences constitute a series of NLP evaluation studies performed outside the clinical domain, some of their results are relevant within medicine. In addition, we discuss a number of factors that contribute to the complexity inherent in the task of evaluating natural language systems.
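One concrete ingredient of such evaluations is scoring a system's extracted findings against a reference standard. The sketch below is an illustration of that step, not the authors' protocol; the report identifiers and findings are hypothetical.

# A minimal sketch of scoring extracted clinical findings against a
# reference standard. All report IDs and findings are hypothetical.

# Reference-standard findings per clinical report.
gold = {
    "report1": {"pneumonia", "pleural effusion"},
    "report2": {"cardiomegaly"},
}
# Findings extracted by the NLP system under evaluation.
system = {
    "report1": {"pneumonia"},
    "report2": {"cardiomegaly", "edema"},
}

tp = sum(len(gold[r] & system[r]) for r in gold)  # correct extractions
fp = sum(len(system[r] - gold[r]) for r in gold)  # spurious extractions
fn = sum(len(gold[r] - system[r]) for r in gold)  # missed findings

precision = tp / (tp + fp)  # fraction of extracted findings that are correct
recall = tp / (tp + fn)     # fraction of reference findings that were found
f1 = 2 * precision * recall / (precision + recall)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")

Much of the complexity the paper discusses lies upstream of this arithmetic: building a trustworthy reference standard, handling inter-annotator disagreement, and deciding what counts as a match.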


Author(s):  
William Digan ◽  
Aurélie Névéol ◽  
Antoine Neuraz ◽  
Maxime Wack ◽  
David Baudoin ◽  
...  

Abstract
Background: The increasing complexity of data streams and computational processes in modern clinical health information systems makes reproducibility challenging. Clinical natural language processing (NLP) pipelines are routinely leveraged for the secondary use of data. Workflow management systems (WMS) have been widely used in bioinformatics to handle the reproducibility bottleneck.
Objective: To evaluate whether WMS and other bioinformatics practices could improve the reproducibility of clinical NLP frameworks.
Materials and Methods: Based on the literature across multiple research fields (NLP, bioinformatics, and clinical informatics), we selected articles that (1) review reproducibility practices and (2) highlight a set of rules or guidelines to ensure tool or pipeline reproducibility. We aggregated insights from the literature to define reproducibility recommendations. Finally, we assessed the compliance of 7 NLP frameworks with the recommendations.
Results: We identified 40 reproducibility features from 8 selected articles. Frameworks based on WMS matched more than 50% of the features (26 features for LAPPS Grid, 22 for OpenMinted), compared with 18 features for current clinical NLP frameworks (cTAKES, CLAMP) and 17 for GATE, ScispaCy, and Textflows.
Discussion: 34 of the recommendations were endorsed by at least 2 articles from our selection. Overall, 15 features were adopted by every NLP framework. Nevertheless, frameworks based on WMS showed better compliance with the features.
Conclusion: NLP frameworks could benefit from lessons learned in the bioinformatics field (e.g., public repositories of curated tools and workflows, or the use of containers for shareability) to enhance reproducibility in a clinical setting.
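The Results section reduces to a simple compliance tally: each framework is scored by how many of the 40 identified features it matches. The sketch below reproduces that arithmetic; the per-framework counts come from the abstract, while the underlying feature lists are not shown there.

# A minimal sketch of the compliance tally from the Results. Counts are
# taken from the abstract; the 40 individual features are not listed there.
TOTAL_FEATURES = 40

features_matched = {
    "LAPPS Grid": 26,  # WMS-based
    "OpenMinted": 22,  # WMS-based
    "cTAKES": 18,
    "CLAMP": 18,
    "GATE": 17,
    "ScispaCy": 17,
    "Textflows": 17,
}

# Report compliance as a percentage; WMS-based frameworks exceed 50%.
for framework, n in sorted(features_matched.items(), key=lambda kv: -kv[1]):
    print(f"{framework:10s} {n}/{TOTAL_FEATURES} = {n / TOTAL_FEATURES:.0%}")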

