ICD10Net: An Artificial Intelligence Algorithm with Medical Background Conducts ICD-10-CM Coding Task with Outstanding Performance (Preprint)

2019 ◽  
Author(s):  
Chin Lin ◽  
Yu-Sheng Lou ◽  
Chia-Cheng Lee ◽  
Chia-Jung Hsu ◽  
Ding-Chung Wu ◽  
...  

BACKGROUND An artificial intelligence-based algorithm has shown a powerful ability to code the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) in discharge notes. However, its performance still requires improvement compared with human experts. The major disadvantage of the previous algorithm is its lack of understanding of medical terminology. OBJECTIVE We propose several methods based on the human learning process and conduct a series of experiments to validate their improvements. METHODS We compared two data sources for training the word-embedding model: English Wikipedia and PubMed journal abstracts. Moreover, fixed, changeable, and double-channel embedding tables were used to test their performance. Some additional tricks were also applied to improve accuracy. We used these methods to identify the three-character-level ICD-10-CM diagnosis codes in a set of discharge notes. A total of 94,483 labeled discharge notes from June 1, 2015 to June 30, 2017 from the Tri-Service General Hospital in Taipei, Taiwan, were used. To evaluate performance, 24,762 discharge notes from July 1, 2017 to December 31, 2017, from the same hospital were used. Moreover, 74,324 additional discharge notes collected from seven other hospitals were also tested. The F-measure was the major global measure of effectiveness. RESULTS In understanding medical terminology, the PubMed embedding model (Pearson correlation = 0.60/0.57) performed better than the Wikipedia embedding model (Pearson correlation = 0.35/0.31). In the accuracy of ICD-10-CM coding, the changeable model that used both the PubMed and Wikipedia embeddings had the highest testing mean F-measure (0.7311 and 0.6639 in the Tri-Service General Hospital and the seven other hospitals, respectively). 
Moreover, a proposed method called hybrid sampling, an augmentation trick that keeps the algorithm from identifying negated terms, was found to further improve model performance. CONCLUSIONS The proposed model architecture and training method is named ICD10Net; it is the first expert-level model to be applied practically in daily work. This model can also be applied to extracting unstructured information from free-text medical writing. We have developed a web app to demonstrate our work (https://linchin.ndmctsgh.edu.tw/app/ICD10/).
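The Pearson correlations reported above compare model similarities with human similarity ratings over term pairs. A minimal sketch of that style of evaluation, assuming cosine similarity between word vectors; the toy embeddings, terms, and ratings below are purely illustrative, not the study's data:

```python
import math

def pearson(xs, ys):
    # Pearson correlation between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical 3-dimensional embeddings and human similarity ratings.
embeddings = {
    "myocardial": [0.9, 0.1, 0.0],
    "cardiac":    [0.8, 0.2, 0.1],
    "renal":      [0.1, 0.9, 0.2],
    "kidney":     [0.2, 0.8, 0.3],
    "fracture":   [0.0, 0.1, 0.9],
}
pairs = [("myocardial", "cardiac", 9.0),
         ("renal", "kidney", 8.5),
         ("myocardial", "renal", 2.0),
         ("cardiac", "fracture", 1.0)]

model_sims = [cosine(embeddings[a], embeddings[b]) for a, b, _ in pairs]
human_sims = [h for _, _, h in pairs]
print(round(pearson(model_sims, human_sims), 2))
```

A higher correlation means the embedding space orders term pairs more like a human rater would, which is how the PubMed and Wikipedia embeddings are compared above.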

2019 ◽  
Author(s):  
Chin Lin ◽  
Yu-Sheng Lou ◽  
Dung-Jang Tsai ◽  
Chia-Cheng Lee ◽  
Chia-Jung Hsu ◽  
...  

BACKGROUND Most current state-of-the-art models for searching the International Classification of Diseases, Tenth Revision Clinical Modification (ICD-10-CM) codes use word embedding technology to capture useful semantic properties. However, they are limited by the quality of initial word embeddings. Word embedding trained by electronic health records (EHRs) is considered the best, but the vocabulary diversity is limited by previous medical records. Thus, we require a word embedding model that maintains the vocabulary diversity of open internet databases and the medical terminology understanding of EHRs. Moreover, we need to consider the particularity of the disease classification, wherein discharge notes present only positive disease descriptions. OBJECTIVE We aimed to propose a projection word2vec model and a hybrid sampling method. In addition, we aimed to conduct a series of experiments to validate the effectiveness of these methods. METHODS We compared the projection word2vec model and the traditional word2vec model using two corpus sources: English Wikipedia and PubMed journal abstracts. We used seven published datasets to measure the medical semantic understanding of the word2vec models and used these embeddings to identify the three-character-level ICD-10-CM diagnostic codes in a set of discharge notes. Building on the embedding improvements, we also applied the hybrid sampling method to improve accuracy. The 94,483 labeled discharge notes from the Tri-Service General Hospital of Taipei, Taiwan, from June 1, 2015, to June 30, 2017, were used. To evaluate the model performance, 24,762 discharge notes from July 1, 2017, to December 31, 2017, from the same hospital were used. Moreover, 74,324 additional discharge notes collected from seven other hospitals were tested. The F-measure, which is the major global measure of effectiveness, was adopted. 
RESULTS In medical semantic understanding, the original EHR embeddings and PubMed embeddings exhibited superior performance to the original Wikipedia embeddings. After projection training technology was applied, the projection Wikipedia embeddings exhibited an obvious improvement but did not reach the level of original EHR embeddings or PubMed embeddings. In the subsequent ICD-10-CM coding experiment, the model that used both projection PubMed and Wikipedia embeddings had the highest testing mean F-measure (0.7362 and 0.6693 in Tri-Service General Hospital and the seven other hospitals, respectively). Moreover, the hybrid sampling method was found to improve the model performance (F-measure=0.7371/0.6698). CONCLUSIONS The word embeddings trained using EHR and PubMed could understand medical semantics better, and the proposed projection word2vec model improved the ability of medical semantics extraction in Wikipedia embeddings. Although the improvement from the projection word2vec model in the real ICD-10-CM coding task was not substantial, the models could effectively handle emerging diseases. The proposed hybrid sampling method enables the model to behave like a human expert.
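The testing mean F-measures above average per-code F-measures over all ICD-10-CM codes. A minimal sketch of that computation from per-code confusion counts; the chapter codes and counts below are hypothetical, not the study's results:

```python
def f_measure(tp, fp, fn):
    # Precision and recall from counts; F is their harmonic mean.
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Hypothetical (true positive, false positive, false negative) counts per code.
counts = {
    "A00-B99": (80, 10, 20),
    "C00-D49": (50, 5, 25),
    "I00-I99": (90, 15, 10),
}
per_code_f = {code: f_measure(*c) for code, c in counts.items()}
mean_f = sum(per_code_f.values()) / len(per_code_f)
print(round(mean_f, 4))
```

Averaging per-code F (a macro average) weights rare and common codes equally, which is one common way a "mean F-measure" across many diagnosis codes is reported.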


2017 ◽  
Author(s):  
Chin Lin ◽  
Chia-Jung Hsu ◽  
Yu-Sheng Lou ◽  
Shih-Jen Yeh ◽  
Chia-Cheng Lee ◽  
...  

BACKGROUND Automated disease code classification using free-text medical information is important for public health surveillance. However, traditional natural language processing (NLP) pipelines are limited, so we propose a method combining word embedding with a convolutional neural network (CNN). OBJECTIVE Our objective was to compare the performance of traditional pipelines (NLP plus supervised machine learning models) with that of word embedding combined with a CNN in conducting a classification task identifying International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes in discharge notes. METHODS We used 2 classification methods: (1) extracting from discharge notes some features (terms, n-gram phrases, and SNOMED CT categories) that we used to train a set of supervised machine learning models (support vector machine, random forests, and gradient boosting machine), and (2) building a feature matrix with a pretrained word embedding model and using it to train a CNN. We used these methods to identify the chapter-level ICD-10-CM diagnosis codes in a set of discharge notes. We conducted the evaluation using 103,390 discharge notes covering patients hospitalized from June 1, 2015 to January 31, 2017 in the Tri-Service General Hospital in Taipei, Taiwan. We used the receiver operating characteristic curve as an evaluation measure, and calculated the area under the curve (AUC) and F-measure as global measures of effectiveness. RESULTS In 5-fold cross-validation tests, our method had a higher testing accuracy (mean AUC 0.9696; mean F-measure 0.9086) than traditional NLP-based approaches (mean AUC range 0.8183-0.9571; mean F-measure range 0.5050-0.8739). A real-world simulation that split the training sample and the testing sample by date verified this result (mean AUC 0.9645; mean F-measure 0.9003 using the proposed method). 
Further analysis showed that the convolutional layers of the CNN effectively identified a large number of keywords and automatically extracted enough concepts to predict the diagnosis codes. CONCLUSIONS Word embedding combined with a CNN showed outstanding performance compared with traditional methods, needing very little data preprocessing. This shows that future studies will not be limited by incomplete dictionaries. A large amount of unstructured information from free-text medical writing will be extracted by automated approaches in the future, and we believe that the health care field is about to enter the age of big data.
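The keyword-detecting behavior of the convolutional layers described above can be sketched with a single hand-set filter sliding over a note's embedding matrix; in the real model the filters are learned, and the embeddings and note below are toy values:

```python
# A filter of width 2 slides over consecutive word vectors; max pooling keeps
# the strongest response, so each filter acts like a learned phrase detector.
def conv1d_maxpool(embed_matrix, filt, bias=0.0):
    width = len(filt)      # filter width in words
    dim = len(filt[0])     # embedding dimension
    responses = []
    for i in range(len(embed_matrix) - width + 1):
        s = bias
        for j in range(width):
            for d in range(dim):
                s += embed_matrix[i + j][d] * filt[j][d]
        responses.append(max(s, 0.0))   # ReLU activation
    return max(responses)               # max pooling over positions

# Hypothetical 3-dim embeddings for the note "patient denies chest pain".
note = [[0.1, 0.0, 0.2],   # patient
        [0.0, 0.3, 0.1],   # denies
        [0.9, 0.1, 0.0],   # chest
        [0.8, 0.2, 0.1]]   # pain
# A filter that (after training, in the real model) responds to "chest pain".
chest_pain_filter = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(conv1d_maxpool(note, chest_pain_filter))
```

The pooled response is high wherever the filter's target phrase occurs, regardless of position in the note, which is why little preprocessing or dictionary curation is needed.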


2020 ◽  
Author(s):  
Lingling Zhou ◽  
Cheng Cheng ◽  
Dong Ou ◽  
Hao Huang

Abstract Background The International Classification of Diseases, 10th Revision (ICD-10) has been widely used to describe the diagnosis information of patients. Automatic ICD-10 coding is important because manually assigning codes is expensive, time consuming and error prone. Although numerous approaches have been developed to explore automatic coding, few of them have been applied in practice. Our aim is to construct a practical, automatic ICD-10 coding machine to improve coding efficiency and quality in daily work. Methods In this study, we propose the use of regular expressions (regexps) to establish a correspondence between diagnosis codes and diagnosis descriptions in outpatient settings and at admission and discharge. The description models of the regexps were embedded in our upgraded coding system, which queries a diagnosis description and assigns a unique diagnosis code. As in most studies, precision (P), recall (R), F-measure (F) and overall accuracy (A) were used to evaluate the system performance. Our study had two stages. The datasets were obtained from the diagnosis information on the homepage of the discharge medical record. The testing sets were from October 1, 2017 to April 30, 2018 and from July 1, 2018 to January 31, 2019. Results The values of P were 89.27% and 88.38% in the first testing phase and the second testing phase, respectively, which demonstrates high precision. The automatic ICD-10 coding system completed more than 160,000 codes in 16 months, which reduced the workload of the coders. In addition, a comparison of the time needed for manual versus automatic coding indicated the effectiveness of the system: automatic coding takes roughly one-hundredth of the time needed for manual coding. Conclusions Our automatic coding system is well suited for the coding task. Further studies are warranted to perfect the description models of the regexps and to develop synthetic approaches to improve system performance.
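A minimal sketch of regexp-based code assignment in the spirit of the description models above; the patterns and codes here are illustrative stand-ins, not the study's actual regexps:

```python
import re

# Hypothetical description-model regexps mapping free-text diagnosis
# descriptions to ICD-10 category codes (illustrative only).
REGEXP_MODELS = [
    (re.compile(r"type\s*2\s*diabetes|t2dm", re.I), "E11"),
    (re.compile(r"essential\s+hypertension", re.I), "I10"),
    (re.compile(r"acute\s+appendicitis", re.I), "K35"),
]

def assign_code(description):
    """Return the first matching ICD-10 code, or None if no regexp matches."""
    for pattern, code in REGEXP_MODELS:
        if pattern.search(description):
            return code
    return None

print(assign_code("Known case of Type 2 Diabetes mellitus"))
```

Unmatched descriptions fall through to manual coding, which is consistent with a system that automates the bulk of routine codes while coders handle the remainder.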


Stroke ◽  
2020 ◽  
Vol 51 (Suppl_1) ◽  
Author(s):  
Vitor Mendes Pereira ◽  
Yoni Donner ◽  
Gil Levi ◽  
Nicole Cancelliere ◽  
Erez Wasserman ◽  
...  

Cerebral aneurysms (CAs) may occur in 5-10% of the population. They are often missed because detection requires a very methodical diagnostic approach. We developed an artificial intelligence algorithm to assist in and supervise the detection of CAs. Methods: We developed an automated algorithm to detect CAs, based on a 3D convolutional neural network modeled as a U-net. We included all saccular CAs from 2014 to 2016 from a single center. Normal and pathological datasets were prepared and annotated in 3D using an in-house developed platform. To assess accuracy and optimize the model, we evaluated preliminary results on a validation dataset. After the algorithm was trained, a separate dataset was used to evaluate final CA detection and aneurysm measurements. The accuracy of the algorithm was derived using ROC curves and Pearson correlation tests. Results: We used 528 CTAs with 674 aneurysms at the following locations: ACA (3%), ACA/ACOM (26.1%), ICA/MCA (26.3%), MCA (29.4%), PCA/PCOM (2.3%), basilar (6.6%), vertebral (2.3%) and other (3.7%). The training dataset consisted of 189 CA scans. We plotted ROC curves and achieved an AUC of 0.85 for unruptured and 0.88 for ruptured CAs. We improved model performance by enlarging the training dataset with various methods of data augmentation to leverage the data fully. The final model was tested on 528 CTAs using 5-fold cross-validation and an additional set of 2400 normal CTAs. There was a significant improvement over the initial assessment, with an AUC of 0.93 for unruptured and 0.94 for ruptured CAs. The algorithm detected larger aneurysms more accurately, reaching an AUC of 0.97 and 91.5% specificity at 90% sensitivity for aneurysms larger than 7 mm. The algorithm also accurately detected CAs at the following locations: basilar (AUC of 0.97) and MCA/ACOM (AUC of 0.94). Volume measurements (mm3) by the model achieved a Pearson correlation of 99.36 with the annotated measurements. 
Conclusion: The Viz.ai aneurysm algorithm was able to detect and measure ruptured and unruptured CAs in consecutive CTAs. The model has demonstrated that a deep learning AI algorithm can achieve clinically useful levels of accuracy for clinical decision support.
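The AUCs reported above can be computed directly from detector scores: AUC equals the probability that a randomly chosen positive case outscores a randomly chosen negative one (the Mann-Whitney formulation). A minimal sketch with hypothetical scores:

```python
def auc(pos_scores, neg_scores):
    # Fraction of (positive, negative) pairs where the positive scores
    # higher; ties count half. Equivalent to the area under the ROC curve.
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical detector scores on aneurysm-positive and normal CTAs.
pos = [0.9, 0.8, 0.75, 0.4]
neg = [0.7, 0.3, 0.2, 0.1]
print(auc(pos, neg))  # 0.9375
```

The pairwise form is quadratic in the number of scans; production implementations use a sort-based ranking version, but the value is the same.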


2020 ◽  
Vol 51 (3) ◽  
Author(s):  
Ebner Bon G. Maceda ◽  
Maria Melanie Liberty B. Alcausin

Objective. The study aimed to determine the prevalence of birth defects among neonates born at the Philippine General Hospital (PGH) from January 2011 to December 2014. Methods. Monthly censuses of all deliveries from January 2011 to December 2014 were obtained from the Section of Newborn Medicine. All deliveries with birth defects were coded using the International Classification of Diseases-10 (ICD-10). The codes were tallied and classified as either isolated, part of a recognizable syndrome, part of a chromosomal syndrome, or a multi-malformed case (MMC). Period prevalence was then calculated. Results. There were a total of 20,939 deliveries from 2011 to 2014 in PGH, of which 574 babies (2.74%) had a diagnosis of at least one birth defect. Two hundred seventy-three babies (47.56%) had isolated defects; 130 (22.65%) had defects in MMC; 106 (18.47%) had defects as part of recognizable syndromes; and 65 (11.32%) had defects as part of chromosomal syndromes. One in 36 births has at least one birth defect, which is higher than that reported in other Asian countries. Conclusion. Birth defects are significant causes of morbidity and mortality. Results of this study provide baseline data that can be used for future studies on the causation of such birth defects, and can be used to formulate policies on primary and secondary prevention. For a tertiary hospital like PGH, these data can serve as a guide towards allocation of resources and manpower towards the more common birth defects.
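The headline prevalence figures follow directly from the reported counts; a quick check of the arithmetic:

```python
# Period prevalence of birth defects among deliveries (counts from the abstract).
deliveries = 20939
with_defect = 574

prevalence = with_defect / deliveries
print(f"{prevalence:.2%}")                               # about 2.74%
print(f"1 in {round(deliveries / with_defect)} births")  # about 1 in 36
```

Period prevalence here is simply affected births divided by total births over the study window, so both reported forms (2.74%, 1 in 36) come from the same ratio.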



Author(s):  
Sam Sansome ◽  
Iain Turnbull ◽  
John McDonnell

Introduction Using linkage to the Chinese National Health Insurance (HI) system, we identified disease outcomes from a prospective cohort study of 512,000 middle-aged Chinese adults. Mandarin free-text diagnosis data were supplied by over 30 different agencies across 10 areas, often without an accompanying International Classification of Diseases 10th revision (ICD-10) code. Objectives and Approach To facilitate a genome-wide association study (GWAS) of all our genotyped participants, we needed to code as many of our 2.02 million hospitalisation events as possible. We developed software to assign ICD-10 codes to unique disease descriptions and stored the coded diagnoses in an internal corpus. The software used an interface which allowed clinicians to select and code disease descriptions individually, or collectively using Chinese keywords. All coded disease descriptions were subsequently validated by an independent Mandarin-speaking clinician. All new events with descriptions which matched exactly those already in the corpus were automatically coded to ICD-10. Results By the end of 2016, there were 2,021,352 hospitalisation events coded to ICD-10. 436,702 (21.6%) were automatically assigned codes where disease descriptions corresponded to those in the Chinese version of the ICD-10 codebook. A further 1,084,197 (53.6%) were coded by a clinician using our standardisation software; all disease descriptions linked to 200 or more events were included. Finally, a remaining 454,237 (22.5%) events were given the ICD-10 codes supplied by the health insurance agency (after cleaning). In total, 97.7% of all health insurance events were coded to ICD-10. Overall, over 17,000 unique disease descriptions have been clinically classified. Conclusion/Implications Automatic coding of hospitalisation events to ICD-10 has enabled our study to investigate a greater range of diseases and use GWAS to detect novel genetic variants. 
We are now well positioned to test semantic matching and machine learning strategies for coding of the remaining 46,216 (2.3%) uncoded events.


2019 ◽  
Vol 4 (5) ◽  
pp. 936-946
Author(s):  
Dawn Konrad-Martin ◽  
Neela Swanson ◽  
Angela Garinis

Purpose Improved medical care leading to increased survivorship among patients with cancer and infectious diseases has created a need for ototoxicity monitoring programs nationwide. The goal of this report is to promote effective and standardized coding and 3rd-party payer billing practices for the audiological management of symptomatic ototoxicity. Method The approach was to compile the relevant International Classification of Diseases, 10th Revision (ICD-10-CM) codes and Current Procedural Terminology (CPT; American Medical Association) codes and explain their use for obtaining reimbursement from Medicare, Medicaid, and private insurance. Results Each claim submitted to a payer for reimbursement of ototoxicity monitoring must include both ICD-10-CM codes to report the patient's diagnosis and CPT codes to report the services provided by the audiologist. Results address the general 3rd-party payer guidelines for ototoxicity monitoring and ICD-10-CM and CPT coding principles and provide illustrative examples. There is no “stand-alone” CPT code for high-frequency audiometry, an important test for ototoxicity monitoring. The current method of adding a –22 modifier to a standard audiometry code and then submitting a letter rationalizing why the test was done has inconsistent outcomes and is time intensive for the clinician. Similarly, some clinicians report difficulty getting reimbursed for detailed otoacoustic emissions testing in the context of ototoxicity monitoring. Conclusions Ethical practice, not reimbursement, must guide clinical practice. However, appropriate billing and coding resulting in 3rd-party reimbursement for audiology services rendered is critical for maintaining an effective ototoxicity monitoring program. Many 3rd-party payers reimburse for these services. For any CPT code, payment patterns vary widely within and across 3rd-party payers. 
Standardizing coding and billing practices as well as advocacy including letters from audiology national organizations may be necessary to help resolve these issues of coding and coverage in order to support best practice recommendations for ototoxicity monitoring.


Author(s):  
Timo D. Vloet ◽  
Marcel Romanos

Abstract. Background: After 12 years of development, the 11th revision of the International Classification of Diseases (ICD-11) will enter into force at the World Health Organization (WHO) in January 2022. Methods: In this selective review, the changes in the classification of anxiety disorders from ICD-10 to ICD-11 are summarized. Results: The diagnostic criteria for generalized anxiety disorder, agoraphobia, and specific phobias have been revised. ICD-11 is reorganized along a lifespan axis, so the childhood-specific categories of ICD-10 are dissolved. Separation anxiety disorder and selective mutism are thus assigned to the "regular" anxiety disorders and can, in future, also be diagnosed in adulthood. Another novelty is that various symptom dimensions of anxiety can be coded without a categorical diagnosis. Discussion: The changes in the field of anxiety disorders cover several aspects and, taken together, are not insubstantial. The introduction of a lifespan axis and the alignment with the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) are to be welcomed. Conclusions: The developmentally oriented reorganization in ICD-11 will also lead to a stronger longitudinal view of anxiety disorders in clinical practice and research. This brings prevention research, in particular, further into focus.


2020 ◽  
Author(s):  
Shintaro Tsuji ◽  
Andrew Wen ◽  
Naoki Takahashi ◽  
Hongjian Zhang ◽  
Katsuhiko Ogasawara ◽  
...  

BACKGROUND Named entity recognition (NER) plays an important role in extracting descriptive features when mining free-text radiology reports. However, the performance of existing NER tools is limited because the number of recognized entities depends on dictionary lookup. In particular, the recognition of compound terms is complicated because they follow a wide variety of patterns. OBJECTIVE The objective of this study was to develop and evaluate an NER tool for compound terms, using RadLex, for mining free-text radiology reports. METHODS We leveraged the clinical Text Analysis and Knowledge Extraction System (cTAKES) to develop customized pipelines using both RadLex and SentiWordNet (a general-purpose dictionary, GPD). We manually annotated 400 radiology reports for compound terms (Cts) in noun phrases and used them as the gold standard for performance evaluation (precision, recall, and F-measure). Additionally, we created a compound-term-enhanced dictionary (CtED) by analyzing false negatives (FNs) and false positives (FPs), and applied it to another 100 radiology reports for validation. We also evaluated the stem terms of compound terms by defining two measures: the occurrence ratio (OR) and the matching ratio (MR). RESULTS The F-measure of cTAKES+RadLex+GPD was 32.2% (precision 92.1%, recall 19.6%), and that of the pipeline combined with the CtED was 67.1% (precision 98.1%, recall 51.0%). The OR indicated that the stem terms "effusion", "node", "tube", and "disease" were used frequently, but the pipeline still failed to capture many Cts. The MR showed that 71.9% of stem terms matched those in the ontologies, and RadLex improved the MR by about 22% over the cTAKES default dictionary. The OR and MR revealed that the characteristics of stem terms have the potential to help generate synonymous phrases using ontologies. 
CONCLUSIONS We developed a RadLex-based customized pipeline for parsing radiology reports and demonstrated that the CtED and stem term analysis have the potential to improve dictionary-based NER performance toward expanding vocabularies.
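An occurrence-ratio-style count over stem terms can be sketched as follows, assuming (as an illustration only) that the stem term is the head noun, i.e., the last token, of each compound term; the term list is invented, and the study's exact OR definition may differ:

```python
from collections import Counter

# Illustrative compound terms as they might appear in radiology reports.
compound_terms = [
    "pleural effusion", "pericardial effusion", "lymph node",
    "mediastinal lymph node", "chest tube", "interstitial lung disease",
]

# Count how often each stem term (head noun) anchors a compound term.
stem_counts = Counter(term.split()[-1] for term in compound_terms)
total = sum(stem_counts.values())
occurrence_ratio = {stem: count / total for stem, count in stem_counts.items()}
print(occurrence_ratio["effusion"])
```

Ranking stems by occurrence ratio surfaces the productive heads ("effusion", "node", "tube", "disease" in the abstract), which is what suggests they could seed synonymous-phrase generation from ontologies.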

