The Impact of NLP on Software Testing

The number of applications being built and deployed everyday are increasing by leaps and bounds. To ensure the best user/client experience, the application needs to be free of bugs and other service issues. This marks the importance of testing phase in application development and deployment phase. Basically, testing is dissected into couple of parts being Manual Testing and Automation Testing. Manual testing, which is usually, an individual tester is given software guidance to execute. The tester would post the findings as “passed” or “failed” as per the guidance. But this kind of testing is very costly and time taking process. To eliminate these short comings, automation testing was introduced but it had very little scope and applications are limited. Now, that Artificial Intelligence has been foraying into many domains and has been showing significant impact over those domains. The core principles of Natural Language Processing that can be used in Software Testing are discussed in this paper. It also provides a glimpse at how Natural Language Processing and Software Testing will evolve in the future. Here we focus mainly on test case prioritization, predicting manual test case failure and generation of test cases from requirements utilizing NLP. The research indicates that NLP will improve software testing outcomes, and NLP-based testing will usher in a coming age of software testers work in the not-too-distant times.

Download Full-text

Building natural language processing tools for Runyakitara

Applied Linguistics Review ◽

10.1515/applirev-2020-2004 ◽

2020 ◽

Vol 0 (0) ◽

Author(s):

Fridah Katushemererwe ◽

Andrew Caines ◽

Paula Buttery

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Learning ◽

Language Processing ◽

Primary Data ◽

Computer Assisted ◽

Endangered Languages ◽

Test Case ◽

Short Supply ◽

Linguistic Resources

AbstractThis paper describes an endeavour to build natural language processing (NLP) tools for Runyakitara, a group of four closely related Bantu languages spoken in western Uganda. In contrast with major world languages such as English, for which corpora are comparatively abundant and NLP tools are well developed, computational linguistic resources for Runyakitara are in short supply. First therefore, we need to collect corpora for these languages, before we can proceed to the design of a spell-checker, grammar-checker and applications for computer-assisted language learning (CALL). We explain how we are collecting primary data for a new Runya Corpus of speech and writing, we outline the design of a morphological analyser, and discuss how we can use these new resources to build NLP tools. We are initially working with Runyankore–Rukiga, a closely-related pair of Runyakitara languages, and we frame our project in the context of NLP for low-resource languages, as well as CALL for the preservation of endangered languages. We put our project forward as a test case for the revitalization of endangered languages through education and technology.

Download Full-text

A Natural Language Processing Approach to Measuring Treatment Adherence and Consistency Using Semantic Similarity

AERA Open ◽

10.1177/23328584211028615 ◽

2021 ◽

Vol 7 ◽

pp. 233285842110286

Author(s):

Kylie L. Anglin ◽

Vivian C. Wong ◽

Arielle Boguslav

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Intervention Implementation ◽

Proof Of Concept ◽

Coaching Intervention ◽

Processing Techniques ◽

Teacher Coaching ◽

The Impact

Though there is widespread recognition of the importance of implementation research, evaluators often face intense logistical, budgetary, and methodological challenges in their efforts to assess intervention implementation in the field. This article proposes a set of natural language processing techniques called semantic similarity as an innovative and scalable method of measuring implementation constructs. Semantic similarity methods are an automated approach to quantifying the similarity between texts. By applying semantic similarity to transcripts of intervention sessions, researchers can use the method to determine whether an intervention was delivered with adherence to a structured protocol, and the extent to which an intervention was replicated with consistency across sessions, sites, and studies. This article provides an overview of semantic similarity methods, describes their application within the context of educational evaluations, and provides a proof of concept using an experimental study of the impact of a standardized teacher coaching intervention.

Download Full-text

Application of natural language processing methods to extract coded data from administrative data held in the Scottish Prescribing Information System

International Journal for Population Data Science ◽

10.23889/ijpds.v1i1.263 ◽

2017 ◽

Vol 1 (1) ◽

Author(s):

Clifford Nangle ◽

Stuart McTaggart ◽

Margaret MacLeod ◽

Jackie Caldwell ◽

Marion Bennie

Keyword(s):

Information System ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Exposure ◽

Drug Dose ◽

Free Text ◽

Wide Range ◽

The Impact ◽

Prescribing Information

ABSTRACT ObjectivesThe Prescribing Information System (PIS) datamart, hosted by NHS National Services Scotland receives around 90 million electronic prescription messages per year from GP practices across Scotland. Prescription messages contain information including drug name, quantity and strength stored as coded, machine readable, data while prescription dose instructions are unstructured free text and difficult to interpret and analyse in volume. The aim, using Natural Language Processing (NLP), was to extract drug dose amount, unit and frequency metadata from freely typed text in dose instructions to support calculating the intended number of days’ treatment. This then allows comparison with actual prescription frequency, treatment adherence and the impact upon prescribing safety and effectiveness. ApproachAn NLP algorithm was developed using the Ciao implementation of Prolog to extract dose amount, unit and frequency metadata from dose instructions held in the PIS datamart for drugs used in the treatment of gastrointestinal, cardiovascular and respiratory disease. Accuracy estimates were obtained by randomly sampling 0.1% of the distinct dose instructions from source records, comparing these with metadata extracted by the algorithm and an iterative approach was used to modify the algorithm to increase accuracy and coverage. ResultsThe NLP algorithm was applied to 39,943,465 prescription instructions issued in 2014, consisting of 575,340 distinct dose instructions. For drugs used in the gastrointestinal, cardiovascular and respiratory systems (i.e. chapters 1, 2 and 3 of the British National Formulary (BNF)) the NLP algorithm successfully extracted drug dose amount, unit and frequency metadata from 95.1%, 98.5% and 97.4% of prescriptions respectively. However, instructions containing terms such as ‘as directed’ or ‘as required’ reduce the usability of the metadata by making it difficult to calculate the total dose intended for a specific time period as 7.9%, 0.9% and 27.9% of dose instructions contained terms meaning ‘as required’ while 3.2%, 3.7% and 4.0% contained terms meaning ‘as directed’, for drugs used in BNF chapters 1, 2 and 3 respectively. ConclusionThe NLP algorithm developed can extract dose, unit and frequency metadata from text found in prescriptions issued to treat a wide range of conditions and this information may be used to support calculating treatment durations, medicines adherence and cumulative drug exposure. The presence of terms such as ‘as required’ and ‘as directed’ has a negative impact on the usability of the metadata and further work is required to determine the level of impact this has on calculating treatment durations and cumulative drug exposure.

Download Full-text

Supporting Test Case Design on Reasoning Scheme with Natural Language Processing Technique

10.1109/bigdata52589.2021.9671718 ◽

2021 ◽

Author(s):

Noriyuki Kushiro ◽

Yusuke Ogata

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Technique ◽

Test Case ◽

Natural Language Processing Technique ◽

Case Design

Download Full-text

Money Makes the World Go Frowned. Analyzing the Impact of Chinese Foreign Aid on States’ Sentiment Using Natural Language Processing

Chinas Rolle in einer neuen Weltordnung ◽

10.5771/9783828876361-241 ◽

2021 ◽

pp. 241-264

Author(s):

Dennis Hammerschmidt ◽

Cosima Meyer

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Foreign Aid ◽

Language Processing ◽

The World ◽

Chinese Foreign Aid ◽

The Impact

Download Full-text

Analysis of the Impact of the US Presidential Election on the US Economy Based on Natural Language Processing and Big Data

Computational and Experimental Simulations in Engineering - Mechanisms and Machine Science ◽

10.1007/978-3-030-67090-0_39 ◽

2021 ◽

pp. 483-494

Author(s):

Mingzhen Li ◽

Xiangdong Liu

Keyword(s):

Big Data ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Presidential Election ◽

Us Economy ◽

The Us ◽

Us Presidential Election ◽

The Impact

Download Full-text

A Natural Language Processing Approach to Automated Highlighting of New Information in Clinical Notes

Applied Sciences ◽

10.3390/app10082824 ◽

2020 ◽

Vol 10 (8) ◽

pp. 2824

Author(s):

Yu-Hsiang Su ◽

Ching-Ping Chao ◽

Ling-Chien Hung ◽

Sheng-Feng Sung ◽

Pei-Ju Lee

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Task Performance ◽

Language Processing ◽

Automated Identification ◽

Clinical Notes ◽

New Information ◽

Perceived Workload ◽

The Impact ◽

User Experiment

Electronic medical records (EMRs) have been used extensively in most medical institutions for more than a decade in Taiwan. However, information overload associated with rapid accumulation of large amounts of clinical narratives has threatened the effective use of EMRs. This situation is further worsened by the use of “copying and pasting”, leading to lots of redundant information in clinical notes. This study aimed to apply natural language processing techniques to address this problem. New information in longitudinal clinical notes was identified based on a bigram language model. The accuracy of automated identification of new information was evaluated using expert annotations as the reference standard. A two-stage cross-over user experiment was conducted to evaluate the impact of highlighting of new information on task demands, task performance, and perceived workload. The automated method identified new information with an F1 score of 0.833. The user experiment found a significant decrease in perceived workload associated with a significantly higher task performance. In conclusion, automated identification of new information in clinical notes is feasible and practical. Highlighting of new information enables healthcare professionals to grasp key information from clinical notes with less perceived workload.

Download Full-text

Extraction of Construction Quality Requirements from Textual Specifications via Natural Language Processing

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/03611981211001385 ◽

2021 ◽

pp. 036119812110013

Author(s):

JungHo Jeon ◽

Xin Xu ◽

Yuxi Zhang ◽

Liu Yang ◽

Hubo Cai

Keyword(s):

Neural Network ◽

South Carolina ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Syntactic Analysis ◽

Test Case ◽

Promising Alternative ◽

Construction Inspection ◽

Construction Specification

Construction inspection is an essential component of the quality assurance programs of state transportation agencies (STAs), and the guidelines for this process reside in lengthy textual specifications. In the current practice, engineers and inspectors must manually go through these documents to plan, conduct, and document their inspections, which is time-consuming, very subjective, inconsistent, and prone to error. A promising alternative to this manual process is the application of natural language processing (NLP) techniques (e.g., text parsing, sentence classification, and syntactic analysis) to automatically extract construction inspection requirements from textual documents and present them as straightforward check questions. This paper introduces an NLP-based method that: 1) extracts individual sentences from the construction specification; 2) preprocesses the resulting sentences; 3) applies Word2Vec and GloVe algorithms to extract vector features; 4) uses a convolutional neural network (CNN) and recurrent neural network to classify sentences; and 5) converts the requirement sentences into check questions via syntactic analysis. The overall methodology was assessed using the Indiana Department of Transportation (DOT) specification as a test case. Our results revealed that the CNN + GloVe combination led to the highest accuracy, at 91.9%, and the lowest loss, at 11.7%. To further validate its use across STAs nationwide, we applied it to the construction specification of the South Carolina DOT as a test case, and our average accuracy was 92.6%.

Download Full-text

An Industrial Study of Natural Language Processing Based Test Case Prioritization

2017 IEEE International Conference on Software Testing, Verification and Validation (ICST) ◽

10.1109/icst.2017.66 ◽

2017 ◽

Cited By ~ 3

Author(s):

Yilin Yang ◽

Xinhai Huang ◽

Xuefei Hao ◽

Zicong Liu ◽

Zhenyu Chen

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Test Case ◽

Test Case Prioritization

Download Full-text

Evaluation of an international medical E-learning course with natural language processing and machine learning

BMC Medical Education ◽

10.1186/s12909-021-02609-8 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Aditya Borakati

Keyword(s):

Machine Learning ◽

Cohort Study ◽

Natural Language Processing ◽

Medical Students ◽

Text Mining ◽

Natural Language ◽

Language Processing ◽

Small Scale ◽

E Learning ◽

The Impact

Abstract Background In the context of the ongoing pandemic, e-learning has become essential to maintain existing medical educational programmes. Evaluation of such courses has thus far been on a small scale at single institutions. Further, systematic appraisal of the large volume of qualitative feedback generated by massive online e-learning courses manually is time consuming. This study aimed to evaluate the impact of an e-learning course targeting medical students collaborating in an international cohort study, with semi-automated analysis of feedback using text mining and machine learning methods. Method This study was based on a multi-centre cohort study exploring gastrointestinal recovery following elective colorectal surgery. Collaborators were invited to complete a series of e-learning modules on key aspects of the study and complete a feedback questionnaire on the modules. Quantitative data were analysed using simple descriptive statistics. Qualitative data were analysed using text mining with most frequent words, sentiment analysis with the AFINN-111 and syuzhet lexicons and topic modelling using the Latent Dirichlet Allocation (LDA). Results One thousand six hundred and eleventh collaborators from 24 countries completed the e-learning course; 1396 (86.7%) were medical students; 1067 (66.2%) entered feedback. 1031 (96.6%) rated the quality of the course a 4/5 or higher (mean 4.56; SD 0.58). The mean sentiment score using the AFINN was + 1.54/5 (5: most positive; SD 1.19) and + 0.287/1 (1: most positive; SD 0.390) using syuzhet. LDA generated topics consolidated into the themes: (1) ease of use, (2) conciseness and (3) interactivity. Conclusions E-learning can have high user satisfaction for training investigators of clinical studies and medical students. Natural language processing may be beneficial in analysis of large scale educational courses.

Download Full-text