Automatic Assessment of Absolute Sentence Complexity

Author(s):  
Sanja Stajner ◽  
Simone Paolo Ponzetto ◽  
Heiner Stuckenschmidt

Lexically and syntactically simpler sentences result in shorter reading times and better understanding for many readers. However, no reliable systems for the automatic assessment of absolute sentence complexity have been proposed so far. Instead, the assessment is usually done manually, requiring expert human annotators. To address this problem, we first define sentence complexity assessment as a five-level classification task and build a ‘gold standard’ dataset. Next, we propose robust systems for sentence complexity assessment, using a novel set of features that leverage lexical properties of freely available corpora, and investigate the impact of feature type and corpus size on classification performance.
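A minimal sketch of how such corpus-based lexical features might feed a five-level classifier: simple frequency statistics stand in for the authors' actual feature set, and the toy corpus, feature names, and labels below are illustrative assumptions only.

```python
# Hypothetical sketch: five-level sentence complexity classification from
# corpus-derived lexical features. Not the authors' feature set.
from collections import Counter

import numpy as np
from sklearn.linear_model import LogisticRegression

# A "freely available corpus" stands in for e.g. Wikipedia; here a toy list.
reference_corpus = "the cat sat on the mat the dog ran".split()
freq = Counter(reference_corpus)
total = sum(freq.values())

def lexical_features(sentence: str) -> list[float]:
    """Simple lexical properties: mean log corpus frequency,
    out-of-corpus rate, mean word length, and sentence length."""
    words = sentence.lower().split()
    log_freqs = [np.log((freq.get(w, 0) + 1) / total) for w in words]
    oov_rate = sum(w not in freq for w in words) / len(words)
    mean_len = float(np.mean([len(w) for w in words]))
    return [float(np.mean(log_freqs)), oov_rate, mean_len, float(len(words))]

# Toy training data: sentences labelled on a 1-5 complexity scale.
sentences = ["the cat sat", "the dog ran on the mat",
             "ontological disquisitions perplex novices",
             "the mat", "epistemology complicates discourse"]
labels = [1, 2, 5, 1, 4]

X = np.array([lexical_features(s) for s in sentences])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict([lexical_features("the cat ran on the mat")]))
```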

2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Kun Zeng ◽  
Yibin Xu ◽  
Ge Lin ◽  
Likeng Liang ◽  
Tianyong Hao

Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features are better distinguished. Soft voting is applied to produce the final classification of the ensemble model. The dataset comes from the standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. Results: Our ensemble method achieved an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 under a standard t-test, indicating that our model achieved a significant improvement. Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representations, and the focal loss reduced the impact of data imbalance on model performance.
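For readers unfamiliar with the two key ingredients, focal loss and soft voting, the following sketch shows generic PyTorch implementations; it is not the authors' code, and the gamma value and toy tensors are assumptions for demonstration.

```python
# Illustrative sketch (not the paper's code): multi-class focal loss and
# soft voting over the probability outputs of several base models.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0) -> torch.Tensor:
    """Focal loss down-weights easy examples: FL = -(1 - p_t)^gamma * log(p_t)."""
    log_probs = F.log_softmax(logits, dim=-1)
    p_t = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1).exp()
    return (-(1.0 - p_t) ** gamma * p_t.log()).mean()

def soft_vote(prob_list: list[torch.Tensor]) -> torch.Tensor:
    """Average class probabilities from several base models (BERT, RoBERTa,
    XLNet, ELECTRA, ERNIE in the paper) and take the argmax."""
    return torch.stack(prob_list).mean(dim=0).argmax(dim=-1)

# Toy example: 3 base models, 2 samples, 4 classes.
logits = torch.randn(2, 4)
targets = torch.tensor([1, 3])
print(focal_loss(logits, targets))
probs = [torch.softmax(torch.randn(2, 4), dim=-1) for _ in range(3)]
print(soft_vote(probs))
```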


2021 ◽  
Vol 11 (2) ◽  
pp. 796
Author(s):  
Alhanoof Althnian ◽  
Duaa AlSaeed ◽  
Heyam Al-Baity ◽  
Amani Samha ◽  
Alanoud Bin Dris ◽  
...  

Dataset size is a major concern in the medical domain, where lack of data is a common occurrence. This study investigates the impact of dataset size on the overall performance of supervised classification models. We examined the performance of six widely used models in the medical field, including support vector machine (SVM), neural networks (NN), C4.5 decision tree (DT), random forest (RF), AdaBoost (AB), and naïve Bayes (NB), on eighteen small medical UCI datasets. We further implemented three dataset size reduction scenarios on two large datasets and analyzed the performance of the models when trained on each resulting dataset with respect to accuracy, precision, recall, F-score, specificity, and area under the ROC curve (AUC). Our results indicate that the overall performance of classifiers depends on how well a dataset represents the original distribution rather than on its size. Moreover, we found that the most robust models for limited medical data are AB and NB, followed by SVM, and then RF and NN, while the least robust model is DT. Furthermore, an interesting observation is that robustness to limited data does not necessarily imply that a model provides the best performance compared to other models.
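The size-reduction experiment can be illustrated with a short sketch: train several of the named classifiers on progressively smaller subsets of a medical dataset and track AUC. The fractions, default model settings, and the choice of the breast cancer dataset are illustrative assumptions, not the study's protocol.

```python
# A minimal sketch of a dataset-size-reduction experiment of the kind
# described: smaller training subsets, same held-out test set.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)
models = {"AB": AdaBoostClassifier(), "NB": GaussianNB(),
          "RF": RandomForestClassifier()}

for frac in (1.0, 0.5, 0.1):  # three size-reduction scenarios (assumed)
    n = int(len(X_tr) * frac)
    for name, model in models.items():
        model.fit(X_tr[:n], y_tr[:n])
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        print(f"frac={frac:.1f} {name}: AUC={auc:.3f}")
```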


2020 ◽  
Vol 41 (S1) ◽  
pp. s188-s189
Author(s):  
Jeffrey Gerber ◽  
Robert Grundmeier ◽  
Keith Hamilton ◽  
Lauri Hicks ◽  
Melinda Neuhauser ◽  
...  

Background: Antibiotic overuse contributes to antibiotic resistance and unnecessary adverse drug effects. Antibiotic stewardship interventions have primarily focused on acute-care settings. Most antibiotic use, however, occurs in outpatients with acute respiratory tract infections such as pharyngitis. The electronic health record (EHR) might provide an effective and efficient tool for outpatient antibiotic stewardship. We aimed to develop and validate an electronic algorithm to identify inappropriate antibiotic use for pediatric outpatients with pharyngitis. Methods: This study was conducted within the Children’s Hospital of Philadelphia (CHOP) Care Network, including 31 pediatric primary care practices and 3 urgent care centers with a shared EHR serving >250,000 children. We used International Classification of Diseases, Tenth Revision (ICD-10) codes to identify encounters for pharyngitis at any CHOP practice from March 15, 2017, to March 14, 2018, excluding those with concurrent infections (eg, otitis media, sinusitis), immunocompromising conditions, or other comorbidities that might influence the need for antibiotics. We randomly selected 450 encounters for detailed chart abstraction, assessing patient demographics as well as practice and prescriber characteristics. Appropriateness of antibiotic use based on chart review served as the gold standard for evaluating the electronic algorithm. Criteria for appropriate use included streptococcal testing, use of penicillin or amoxicillin (absent β-lactam allergy), and a 10-day duration of therapy. Results: In 450 patients, the median age was 8.4 years (IQR, 5.5–9.0) and 54% were female. On chart review, 149 patients (33%) received an antibiotic, of whom 126 had a positive rapid strep result. Thus, based on chart review, 23 subjects (5%) diagnosed with pharyngitis received antibiotics inappropriately. Amoxicillin or penicillin was prescribed for 100 of the 126 children (79%) with a positive rapid strep test. Of the 126 children with a positive test, 114 (90%) received the correct antibiotic: amoxicillin, penicillin, or an appropriate alternative antibiotic due to β-lactam allergy. Duration of treatment was correct for all 126 children. Using the electronic algorithm, the proportion of inappropriate prescribing was 28 of 450 (6%). The test characteristics of the electronic algorithm (compared to gold standard chart review) for identification of inappropriate antibiotic prescribing were sensitivity (100%, 23 of 23); specificity (99%, 422 of 427); positive predictive value (82%, 23 of 28); and negative predictive value (100%, 422 of 422). Conclusions: For children with pharyngitis, an electronic algorithm for identification of inappropriate antibiotic prescribing is highly accurate. Future work should validate this approach in other settings and develop and evaluate the impact of an audit and feedback intervention based on this tool. Funding: None. Disclosures: None.
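The decision logic described (positive strep test, penicillin or amoxicillin unless β-lactam allergy, 10-day course) can be sketched as a simple rule function. The field names below are hypothetical and do not reflect the CHOP EHR schema.

```python
# Hypothetical sketch of the appropriateness rule for pharyngitis
# encounters; field names are illustrative assumptions.
FIRST_LINE = {"penicillin", "amoxicillin"}

def antibiotic_appropriate(enc: dict) -> bool:
    if not enc.get("antibiotic"):            # no prescription: nothing to flag
        return True
    if not enc.get("rapid_strep_positive"):  # antibiotics without a positive test
        return False
    drug_ok = (enc["antibiotic"] in FIRST_LINE
               or enc.get("beta_lactam_allergy", False))
    return drug_ok and enc.get("duration_days") == 10

encounters = [
    {"antibiotic": "amoxicillin", "rapid_strep_positive": True,
     "duration_days": 10},
    {"antibiotic": "azithromycin", "rapid_strep_positive": True,
     "beta_lactam_allergy": False, "duration_days": 5},
]
for enc in encounters:
    print(antibiotic_appropriate(enc))  # True, then False
```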


Author(s):  
Suzanne L. van Winkel ◽  
Alejandro Rodríguez-Ruiz ◽  
Linda Appelman ◽  
Albert Gubern-Mérida ◽  
Nico Karssemeijer ◽  
...  

Objectives: Digital breast tomosynthesis (DBT) increases the sensitivity of mammography and is increasingly implemented in breast cancer screening. However, the large volume of images increases both the risk of reading errors and reading time. This study aims to investigate whether the accuracy of breast radiologists reading wide-angle DBT increases with the aid of an artificial intelligence (AI) support system. Also, the impact on reading time was assessed, and the stand-alone performance of the AI system in the detection of malignancies was compared to that of the average radiologist. Methods: A multi-reader multi-case study was performed with 240 bilateral DBT exams (71 breasts with cancer lesions, 70 breasts with benign findings, 339 normal breasts). Exams were interpreted by 18 radiologists, with and without AI support, providing cancer suspicion scores per breast. Using AI support, radiologists were shown examination-based and region-based cancer likelihood scores. Area under the receiver operating characteristic curve (AUC) and reading time per exam were compared between reading conditions using mixed-models analysis of variance. Results: On average, the AUC was higher using AI support (0.863 vs 0.833; p = 0.0025). Using AI support, reading time per DBT exam was reduced (p < 0.001) from 41 s (95% CI = 39–42 s) to 36 s (95% CI = 35–37 s). The AUC of the stand-alone AI system was non-inferior to the AUC of the average radiologist (+0.007, p = 0.8115). Conclusions: Radiologists improved their cancer detection and reduced reading time when evaluating DBT examinations using an AI reading support system. Key Points: • Radiologists improved their cancer detection accuracy in digital breast tomosynthesis (DBT) when using an AI system for support, while simultaneously reducing reading time. • The stand-alone breast cancer detection performance of an AI system is non-inferior to the average performance of radiologists for reading digital breast tomosynthesis exams. • The use of an AI support system could make advanced and more reliable imaging techniques more accessible and could allow for more cost-effective breast screening programs with DBT.
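A simplified sketch of the reader-study comparison: per-reader AUCs with and without AI support, compared with a paired t-test as a stand-in for the paper's mixed-models analysis of variance. The reader scores below are simulated, not study data.

```python
# Illustrative sketch of an aided-vs-unaided reader comparison; the
# simulated suspicion scores and the paired t-test are assumptions.
import numpy as np
from scipy.stats import ttest_rel
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_readers, n_breasts = 18, 480
truth = rng.integers(0, 2, n_breasts)  # 1 = cancer

auc_unaided, auc_aided = [], []
for _ in range(n_readers):
    unaided = truth + rng.normal(0, 1.2, n_breasts)  # unaided suspicion scores
    aided = truth + rng.normal(0, 1.0, n_breasts)    # AI support: less noise
    auc_unaided.append(roc_auc_score(truth, unaided))
    auc_aided.append(roc_auc_score(truth, aided))

t, p = ttest_rel(auc_aided, auc_unaided)
print(f"mean AUC {np.mean(auc_unaided):.3f} -> {np.mean(auc_aided):.3f}, "
      f"p={p:.4f}")
```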


Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3267
Author(s):  
Ramon C. F. Araújo ◽  
Rodrigo M. S. de Oliveira ◽  
Fernando S. Brasil ◽  
Fabrício J. B. Barros

In this paper, a novel image denoising algorithm and novel input features are proposed. The algorithm is applied to phase-resolved partial discharge (PRPD) diagrams with a single dominant partial discharge (PD) source, preparing them for automatic artificial-intelligence-based classification. It was designed to mitigate several sources of distortion often observed in PRPDs obtained from fully operational hydroelectric generators. The capabilities of the denoising algorithm are the automatic removal of sparse noise and the suppression of non-dominant discharges, including those due to crosstalk. The input features are functions of PD distributions along amplitude and phase, which are calculated in a novel way to mitigate random effects inherent to PD measurements. The impact of the proposed contributions was statistically evaluated and compared to the classification performance obtained using previously published approaches. Higher recognition rates and reduced variances were obtained using the proposed methods, statistically outperforming autonomous classification techniques seen in earlier works. The values of the algorithm’s internal parameters are also validated by comparing the recognition performance obtained with different parameter combinations. All typical PD sources described in hydro-generator PD standards are considered and can be automatically detected.
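As a rough illustration of sparse-noise removal on a PRPD-like diagram, the sketch below applies a generic median filter to a synthetic phase-amplitude histogram. This is a stand-in under stated assumptions, not the paper's purpose-built denoising algorithm.

```python
# Illustrative only: suppressing isolated counts in a synthetic PRPD-like
# 2D histogram with a median filter (a generic denoising stand-in).
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(1)
prpd = np.zeros((64, 64))                       # phase x amplitude bins
prpd[10:20, 30:40] = rng.poisson(20, (10, 10))  # dominant PD cluster
noise = rng.random((64, 64)) < 0.02             # sparse isolated counts
prpd[noise] += 1

denoised = median_filter(prpd, size=3)          # isolated bins vanish
print("noisy nonzero bins:", int((prpd > 0).sum()),
      "-> denoised:", int((denoised > 0).sum()))
```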


2020 ◽  
Vol 41 (S1) ◽  
pp. s32-s32
Author(s):  
Ebbing Lautenbach ◽  
Keith Hamilton ◽  
Robert Grundmeier ◽  
Melinda Neuhauser ◽  
Lauri Hicks ◽  
...  

Background: Antibiotic resistance has increased at alarming rates, driven predominantly by antibiotic overuse. Although most antibiotic use occurs in outpatients, antimicrobial stewardship programs have primarily focused on inpatient settings. A major challenge for outpatient stewardship is the lack of accurate and accessible electronic data to target interventions. We sought to develop and validate an electronic algorithm to identify inappropriate antibiotic use for outpatients with acute bronchitis. Methods: This study was conducted within the University of Pennsylvania Health System (UPHS). We used ICD-10 diagnostic codes to identify encounters for acute bronchitis at any outpatient UPHS practice between March 15, 2017, and March 14, 2018. Exclusion criteria included underlying immunocompromising condition, other comorbidity influencing the need for antibiotics (eg, emphysema), or ICD-10 code at the same visit for a concurrent infection (eg, sinusitis). We randomly selected 300 eligible subjects (150 from academic practices and 150 from nonacademic practices) for detailed chart abstraction that assessed patient demographics and practice and prescriber characteristics. Appropriateness of antibiotic use based on chart review served as the gold standard for assessment of the electronic algorithm. Because antibiotic use is not indicated for this study population, appropriateness was assessed based on whether an antibiotic was prescribed or not. Results: Of 300 subjects, median age was 61 years (interquartile range, 50–68), 62% were women, 74% were seen in internal medicine (vs family medicine) practices, and 75% were seen by a physician (vs an advanced practice provider). On chart review, 167 (56%) subjects received an antibiotic. Of these subjects, 1 had documented concern for pertussis and 4 had excluding conditions for which there were no ICD-10 codes. One received an antibiotic prescription for a planned dental procedure. Thus, based on chart review, 161 (54%) subjects received antibiotics inappropriately. Using the electronic algorithm based on diagnostic codes, underlying and concurrent conditions, and prescribing data, the number of subjects with inappropriate prescribing was 170 (56%) because 3 subjects had antibiotic prescribing that was not noted on chart review. The test characteristics of the electronic algorithm (compared to gold standard chart review) for identification of inappropriate antibiotic prescribing were the following: sensitivity, 100% (161 of 161); specificity, 94% (130 of 139); positive predictive value, 95% (161 of 170); and negative predictive value, 100% (130 of 130). Conclusions: For outpatients with acute bronchitis, an electronic algorithm for identification of inappropriate antibiotic prescribing is highly accurate. This algorithm could be used to efficiently assess prescribing among practices and individual clinicians. The impact of interventions based on this algorithm should be tested in future studies. Funding: None. Disclosures: None.
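The core flagging rule (any antibiotic for uncomplicated acute bronchitis is inappropriate, absent an excluding diagnosis) can be sketched as follows. The ICD-10 code sets and field names are illustrative assumptions, not the UPHS algorithm.

```python
# Hypothetical sketch of the electronic flagging rule for acute bronchitis;
# code sets and field names are assumptions.
BRONCHITIS = {"J20.9"}            # acute bronchitis, unspecified (assumed set)
EXCLUDING = {"J01.90", "J43.9"}   # e.g. sinusitis, emphysema (assumed set)

def inappropriate(enc: dict) -> bool:
    codes = set(enc["icd10"])
    eligible = (codes & BRONCHITIS) and not (codes & EXCLUDING)
    return bool(eligible and enc.get("antibiotic_prescribed"))

encounters = [
    {"icd10": ["J20.9"], "antibiotic_prescribed": True},            # flagged
    {"icd10": ["J20.9", "J01.90"], "antibiotic_prescribed": True},  # excluded
    {"icd10": ["J20.9"], "antibiotic_prescribed": False},           # appropriate
]
print([inappropriate(e) for e in encounters])  # [True, False, False]
```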


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  

Purpose: This paper aims to review the latest management developments across the globe and pinpoint practical implications from cutting-edge research and case studies. Design: This briefing is prepared by an independent writer who adds their own impartial comments and places the articles in context. Findings: The study takes empirical data from three case studies of projects that have faced setbacks to explore the impact of project manager signature strengths on team resilience, and finds that four signature strengths (leadership, open-mindedness, persistence, and hope) were present in project managers across all three case studies. Originality: The briefing saves busy executives and researchers hours of reading time by selecting only the very best, most pertinent information and presenting it in a condensed and easy-to-digest format.


2019 ◽  
Vol 45 (1) ◽  
pp. 1-57 ◽  
Author(s):  
Silvio Cordeiro ◽  
Aline Villavicencio ◽  
Marco Idiart ◽  
Carlos Ramisch

Nominal compounds such as red wine and nut case display a continuum of compositionality, with varying contributions from the components of the compound to its semantics. This article proposes a framework for compound compositionality prediction using distributional semantic models, evaluating to what extent they capture idiomaticity compared to human judgments. For evaluation, we introduce data sets containing human judgments in three languages: English, French, and Portuguese. The results obtained reveal a high agreement between the models and human judgments, suggesting that they are able to incorporate information about idiomaticity. We also present an in-depth evaluation of various factors that can affect prediction, such as model and corpus parameters and compositionality operations. General crosslingual analyses reveal the impact of morphological variation and corpus size on the ability of the model to predict compositionality, and show that a uniform combination of the components yields the best results.
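One standard way such frameworks score compositionality is the cosine between the compound's observed vector and a combination of its components' vectors. The sketch below uses toy random vectors rather than corpus-trained embeddings, and the uniform alpha = 0.5 combination reflects the cross-lingual finding above; everything else is an assumption.

```python
# Illustrative sketch of a cosine-based compositionality score; the toy
# vectors stand in for corpus-trained word embeddings.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def compositionality(v_compound, v_head, v_mod, alpha=0.5):
    """Score against a uniform (alpha = 0.5) combination of the components."""
    composed = alpha * v_mod + (1 - alpha) * v_head
    return cosine(v_compound, composed)

rng = np.random.default_rng(0)
v_red, v_wine = rng.normal(size=50), rng.normal(size=50)
v_red_wine = 0.5 * v_red + 0.5 * v_wine + rng.normal(0, 0.1, 50)  # compositional
v_nut_case = rng.normal(size=50)                                  # idiomatic

print(compositionality(v_red_wine, v_wine, v_red))   # high score
print(compositionality(v_nut_case, rng.normal(size=50),
                       rng.normal(size=50)))         # near zero
```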


2008 ◽  
Vol 18 (1) ◽  
pp. 123-138 ◽  
Author(s):  
Milos Radovanovic ◽  
Mirjana Ivanovic

Motivated by applying text categorization to the classification of Web search results, this paper describes an extensive experimental study of the impact of bag-of-words document representations on the performance of five major classifiers: Naïve Bayes, SVM, Voted Perceptron, kNN, and C4.5. The texts, representing short Web-page descriptions sorted into a large hierarchy of topics, are taken from the dmoz Open Directory Web-page ontology, and classifiers are trained to automatically determine the topics which may be relevant to a previously unseen Web page. Different transformations of the input data (stemming, normalization, logtf, and idf), together with dimensionality reduction, are found to have a statistically significant improving or degrading effect on classification performance measured by classical metrics: accuracy, precision, recall, F1, and F2. The emphasis of the study is not on determining the best document representation for each classifier, but rather on describing the effects of each individual transformation on classification, together with their mutual relationships.
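Several of the studied transformations (logtf, idf, and normalization) map directly onto scikit-learn's TfidfVectorizer options, as the sketch below illustrates; stemming would need a separate preprocessor, and the toy documents are assumptions standing in for dmoz Web-page descriptions.

```python
# Illustrative sketch of logtf/idf/normalization variants via scikit-learn;
# not the paper's actual pipeline or data.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["web directory of web pages web",
        "web search results categorization",
        "naive bayes and svm classifiers"]

for use_idf in (False, True):
    for sublinear in (False, True):   # sublinear_tf is the "logtf" variant
        vec = TfidfVectorizer(use_idf=use_idf, sublinear_tf=sublinear,
                              norm="l2")   # L2 normalization of each document
        X = vec.fit_transform(docs)
        print(f"idf={use_idf} logtf={sublinear}:", X.toarray()[0].round(2))
```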


2020 ◽  
Author(s):  
Wesley Delage ◽  
Julien Thevenon ◽  
Claire Lemaitre

Since 2009, numerous tools have been developed to detect structural variants (SVs) using short-read technologies. Insertions >50 bp are among the hardest SV types to discover and are drastically underrepresented in gold-standard variant callsets. The advent of long-read technologies has completely changed the situation. In 2019, two independent cross-technology studies published the most complete variant callsets with sequence-resolved insertions in human individuals. Among the reported insertions, only 17 to 37% could be discovered with short-read-based tools. In this work, we performed an in-depth analysis of these unprecedented insertion callsets in order to investigate the causes of such failures. We first established a precise classification of insertion variants according to four layers of characterization: the nature and size of the inserted sequence, the genomic context of the insertion site, and the breakpoint junction complexity. Because these levels are intertwined, we then used simulations to characterize the impact of each complexity factor on the recall of several SV callers. Simulations showed that the most impactful factor was the insertion type rather than the genomic context, with the various difficulties being handled differently among the tested SV callers, and they highlighted the lack of sequence resolution for most insertion calls. Our results explain the low recall by pointing out several difficulty factors among the observed insertion features and provide avenues for improving SV caller algorithms and their combinations.
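The per-factor recall analysis can be illustrated with a small sketch that groups simulated insertions by one characterization layer (insertion type) and computes each caller's recall within the group. The variant IDs, type labels, and call sets below are toy assumptions.

```python
# Illustrative sketch of per-insertion-type recall for several SV callers;
# all identifiers and call sets are toy assumptions.
from collections import defaultdict

truth = [  # (variant_id, insertion_type)
    ("v1", "novel"), ("v2", "tandem_dup"), ("v3", "mobile_element"),
    ("v4", "novel"), ("v5", "tandem_dup"),
]
calls = {"callerA": {"v1", "v2", "v5"}, "callerB": {"v2", "v3"}}

by_type = defaultdict(list)
for vid, vtype in truth:
    by_type[vtype].append(vid)

for caller, detected in calls.items():
    for vtype, vids in by_type.items():
        recall = sum(v in detected for v in vids) / len(vids)
        print(f"{caller} {vtype}: recall={recall:.2f}")
```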

