Enhancing the extraction of SBVR business vocabularies and business rules from UML use case diagrams with natural language processing

Author(s):  
Paulius Danenas ◽  
Tomas Skersys ◽  
Rimantas Butleris
10.2196/20492 ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. e20492
Author(s):  
Lea Canales ◽  
Sebastian Menke ◽  
Stephanie Marchesseau ◽  
Ariel D’Agostino ◽  
Carlos del Rio-Bermudez ◽  
...  

Background Clinical natural language processing (cNLP) systems are of crucial importance due to their increasing capability in extracting clinically important information from free text contained in electronic health records (EHRs). The conversion of a nonstructured representation of a patient’s clinical history into a structured format enables medical doctors to generate clinical knowledge at a level that was not possible before. Finally, the interpretation of the insights provided by cNLP systems has great potential to drive decisions about clinical practice. However, carrying out robust evaluations of those cNLP systems is a complex task that is hindered by a lack of standard guidance on how to systematically approach them. Objective Our objective was to offer natural language processing (NLP) experts a methodology for the evaluation of cNLP systems to assist them in carrying out this task. By following the proposed phases, the robustness and representativeness of the performance metrics of their own cNLP systems can be assured. Methods The proposed evaluation methodology comprised five phases: (1) the definition of the target population, (2) the statistical document collection, (3) the design of the annotation guidelines and annotation project, (4) the external annotations, and (5) the cNLP system performance evaluation. We presented the application of all phases to evaluate the performance of a cNLP system called “EHRead Technology” (developed by Savana, an international medical company), applied in a study on patients with asthma. As part of the evaluation methodology, we introduced the Sample Size Calculator for Evaluations (SLiCE), a software tool that calculates the number of documents needed to achieve a statistically useful and resourceful gold standard. Results The application of the proposed evaluation methodology to a real use-case study of patients with asthma revealed the benefit of the different phases for cNLP system evaluations. By using SLiCE to adjust the number of documents needed, a meaningful and resourceful gold standard was created. In the presented use case, using as few as 519 EHRs, it was possible to evaluate the performance of the cNLP system and obtain performance metrics for the primary variable within the expected CIs. Conclusions We showed that our evaluation methodology can offer guidance to NLP experts on how to approach the evaluation of their cNLP systems. By following the five phases, NLP experts can assure the robustness of their evaluation and avoid unnecessary investment of human and financial resources. Besides the theoretical guidance, we offer SLiCE as an easy-to-use, open-source Python library.
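
The abstract does not say how SLiCE computes the number of documents. As a rough illustration of the general idea only, the sketch below uses the textbook normal-approximation formula for estimating a proportion (such as precision or recall) within a chosen confidence interval half-width. The function name and parameters are hypothetical and are not taken from the SLiCE library.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected, half_width, confidence=0.95):
    """Number of samples needed so that a proportion near `expected`
    is estimated within +/- `half_width` at the given confidence level.

    Uses the standard normal-approximation formula
    n = z^2 * p * (1 - p) / d^2. This is an assumption for illustration,
    not necessarily the method SLiCE implements.
    """
    if not 0.0 < expected < 1.0:
        raise ValueError("expected must be strictly between 0 and 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n = (z ** 2) * expected * (1.0 - expected) / (half_width ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    # Worst case (p = 0.5) with a +/- 5 percentage point interval.
    print(sample_size_for_proportion(0.5, 0.05))
```

Under this formula, a higher expected performance or a wider acceptable interval lowers the required sample size, which is the kind of trade-off a sample size calculator lets an evaluator explore before annotation starts.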


2015 ◽  
Vol 24 (2) ◽  
pp. 277-286 ◽  
Author(s):  
Nabil Arman ◽  
Sari Jabbarin

Abstract Automated software engineering has attracted a large amount of research effort. The use of object-oriented methods for software systems development has made it necessary to develop approaches that construct different Unified Modeling Language (UML) models from textual user requirements in a semiautomated way. UML use case models represent an essential artifact that provides a perspective of the system under analysis or development. The development of such use case models is crucial in an object-oriented development method. The main principles used in obtaining these models are described. A natural language processing tool is used to parse different statements of the user requirements written in Arabic to obtain lists of nouns, noun phrases, verbs, verb phrases, etc., that aid in finding potential actors and use cases. A set of steps that represents our approach for constructing a use case model is presented. Finally, the proposed approach is validated using an experiment involving a group of graduate students who are familiar with use case modeling.


Author(s):  
Florian Jungmann ◽  
B. Kämpgen ◽  
F. Hahn ◽  
D. Wagner ◽  
P. Mildenberger ◽  
...  

Abstract Objective During the COVID-19 pandemic, the number of patients presenting in hospitals because of emergency conditions decreased. Radiology is thus confronted with the effects of the pandemic. The aim of this study was to use natural language processing (NLP) to automatically analyze the number and distribution of fractures during the pandemic and in the 5 years before the pandemic. Materials and methods We used a pre-trained commercially available NLP engine to automatically categorize 5397 radiological reports of radiographs (hand/wrist, elbow, shoulder, ankle, knee, pelvis/hip) within a 6-week period from March to April in 2015–2020 into “fracture affirmed” or “fracture not affirmed.” The NLP engine achieved an F1 score of 0.81 compared to human annotators. Results In 2020, we found a significant decrease of fractures in general (p < 0.001); the average number of fractures in 2015–2019 was 295, whereas it was 233 in 2020. In children and adolescents (p < 0.001), and in adults up to 65 years (p = 0.006), significantly fewer fractures were reported in 2020. The number of fractures in the elderly did not change (p = 0.15). The number of hand/wrist fractures (p < 0.001) and fractures of the elbow (p < 0.001) was significantly lower in 2020 compared with the average in the years 2015–2019. Conclusion NLP can be used to identify relevant changes in the number of pathologies as shown here for the use case fracture detection. This may trigger root cause analysis and enable automated real-time monitoring in radiology.
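
For readers unfamiliar with the F1 score reported above, the sketch below shows how it is derived from counts of true positives, false positives and false negatives when a classifier is compared against human annotations. The counts in the usage example are invented for illustration and are not the study's data.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_pos + false_pos   # everything the engine flagged
    actual = true_pos + false_neg      # everything annotators flagged
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 81 correct "fracture affirmed" labels,
    # 19 false alarms and 19 missed fractures.
    print(round(f1_score(81, 19, 19), 2))
```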


2021 ◽  
Vol 14 (1) ◽  
pp. 147-156
Author(s):  
Abdellatif Haj ◽  
Youssef Balouki ◽  
Taoufiq Gadi ◽  
...  

Business Rules (BR) are usually written by different stakeholders, which makes them prone to containing different designations for the same concept. Such a problem can be the source of poorly orchestrated behaviors. However, the identification of synonyms is manual or totally neglected in most approaches dealing with natural language Business Rules. In this paper, we present an automated approach to identify semantic similarity between terms in textual BR using Natural Language Processing and a knowledge-based algorithm refined using heuristics. Our method is unique in that it also identifies abbreviations/expansions (as a special case of synonyms), which is not possible using a dictionary. The results are then saved in a standard format (SBVR) for reusability purposes. Our approach was applied to more than 160 BR statements divided into three cases, with an accuracy between 69% and 87%, which suggests it to be an indispensable enhancement for other methods dealing with textual BR.
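
The abstract does not describe the heuristics used to match abbreviations with their expansions. The sketch below shows one simple heuristic of that kind, an initial-letter check, purely as an assumption about what such a rule could look like. The function name is hypothetical and the check is far cruder than a knowledge-based similarity measure.

```python
def is_initialism_of(short_form, phrase):
    """Return True if `short_form` equals the initial letters of the
    words in `phrase`, ignoring case.

    This is a hypothetical heuristic for illustration; it is not the
    method described in the paper.
    """
    words = phrase.split()
    if len(short_form) < 2 or len(words) != len(short_form):
        return False
    initials = "".join(word[0] for word in words)
    return initials.lower() == short_form.lower()


if __name__ == "__main__":
    print(is_initialism_of("BR", "Business Rules"))
    print(is_initialism_of("NLP", "Natural Language Processing"))
    print(is_initialism_of("BR", "Business Process"))
```

A rule like this cannot be replaced by a dictionary lookup because abbreviations in business rule texts are often local to one organization, which is consistent with the abstract's remark that a dictionary alone is not enough.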


The application of Natural Language Processing (NLP) in the Balinese sentence dictionary is an application that provides Indonesian-to-Balinese translation using the words contained in the Indonesian pocket dictionary; the results or meanings obtained are based on the application of NLP in accordance with the stipulated provisions. This application does not translate per word but per sentence. The application of NLP in the Balinese sentence dictionary serves to provide convenience for tourists or users who want to communicate with the Balinese people and understand the Balinese language itself. In addition, it saves users time because the application is built to run offline on the Android mobile operating system, so it can be accessed anywhere and anytime. The waterfall method is used in its implementation, and the application translates Indonesian sentences into Balinese sentences. The application was designed with UML (Unified Modeling Language) tools, consisting of use case diagrams, activity diagrams, sequence diagrams, statechart diagrams and class diagrams. Keywords: Dictionary, Natural Language Processing (NLP), Balinese


2020 ◽  
Vol 67 (4) ◽  
pp. 438-448 ◽  
Author(s):  
Simon B. Goldberg ◽  
Nikolaos Flemotomos ◽  
Victor R. Martinez ◽  
Michael J. Tanana ◽  
Patty B. Kuo ◽  
...  
