Lab indicators standardization method for the regional healthcare platform: a case study on heart failure

Abstract Background Laboratory indicator test results in electronic health records have been applied to many clinical big data analysis. However, it is quite common that the same laboratory examination item (i.e., lab indicator) is presented using different names in Chinese due to the translation problem and the habit problem of various hospitals, which results in distortion of analysis results. Methods A framework with a recall model and a binary classification model is proposed, which could reduce the alignment scale and improve the accuracy of lab indicator normalization. To reduce alignment scale, tf-idf is used for candidate selection. To assure the accuracy of output, we utilize enhanced sequential inference model for binary classification. And active learning is applied with a selection strategy which is proposed for reducing annotation cost. Results Since our indicator standardization method mainly focuses on Chinese indicator inconsistency, we perform our experiment on Shanghai Hospital Development Center and select clinical data from 8 hospitals. The method achieves a F1-score 92.08$$\%$$ % in our final binary classification. As for active learning, the new strategy proposed performs better than random baseline and could outperform the result trained on full data with only 43$$\%$$ % training data. A case study on heart failure clinic analysis conducted on the sub-dataset collected from SHDC shows that our proposed method is practical in the application with good performance. Conclusion This work demonstrates that the structure we proposed can be effectively applied to lab indicator normalization. And active learning is also suitable for this task for cost reduction. Such a method is also valuable in data cleaning, data mining, text extracting and entity alignment.

Download Full-text

Multi-Class Support Vector Machine via Maximizing Multi-Class Margins

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/440 ◽

2017 ◽

Cited By ~ 8

Author(s):

Jie Xu ◽

Xianglong Liu ◽

Zhouyuan Huo ◽

Cheng Deng ◽

Feiping Nie ◽

...

Keyword(s):

Support Vector Machine ◽

Binary Classification ◽

Training Data ◽

Classification Model ◽

Support Vector ◽

Frobenius Norm ◽

Great Success ◽

Svm Model ◽

Binary Classifiers ◽

And Training

Support Vector Machine (SVM) is originally proposed as a binary classification model, and it has already achieved great success in different applications. In reality, it is more often to solve a problem which has more than two classes. So, it is natural to extend SVM to a multi-class classifier. There have been many works proposed to construct a multi-class classifier based on binary SVM, such as one versus all strategy, one versus one strategy and Weston's multi-class SVM. One versus all strategy and one versus one strategy split the multi-class problem to multiple binary classification subproblems, and we need to train multiple binary classifiers. Weston's multi-class SVM is formed by ensuring risk constraints and imposing a specific regularization, like Frobenius norm. It is not derived by maximizing the margin between hyperplane and training data which is the motivation in SVM. In this paper, we propose a multi-class SVM model from the perspective of maximizing margin between training points and hyperplane, and analyze the relation between our model and other related methods. In the experiment, it shows that our model can get better or compared results when comparing with other related methods.

Download Full-text

Data Cleaning for Classification Using Misclassification Analysis

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2010.p0297 ◽

2010 ◽

Vol 14 (3) ◽

pp. 297-302 ◽

Cited By ~ 32

Author(s):

Piyasak Jeatrakul ◽

◽

Kok Wai Wong ◽

Chun Che Fung

Keyword(s):

Data Cleaning ◽

Binary Classification ◽

Good Alternative ◽

Training Data ◽

Classification Model ◽

Data Sets ◽

Pima Indians ◽

Classification Problems ◽

Data Set ◽

Preprocessing Technique

In most classification problems, sometimes in order to achieve better results, data cleaning is used as a preprocessing technique. The purpose of data cleaning is to remove noise, inconsistent data and errors in the training data. This should enable the use of a better and representative data set to develop a reliable classification model. In most classification models, unclean data could sometime affect the classification accuracies of a model. In this paper, we investigate the use of misclassification analysis for data cleaning. In order to demonstrate our concept, we have used Artificial Neural Network (ANN) as the core computational intelligence technique. We use four benchmark data sets obtained from the University of California Irvine (UCI) machine learning repository to investigate the results from our proposed data cleaning technique. The experimental data sets used in our experiment are binary classification problems, which are German credit data, BUPA liver disorders, Johns Hopkins Ionosphere and Pima Indians Diabetes. The results show that the proposed cleaning technique could be a good alternative to provide some confidence when constructing a classification model.

Download Full-text

Case study: a patient with dilated cardiomyopathy, leftbundle branch block, recurrent syncope and heart failure

European Heart Journal ◽

10.1016/s0195-668x(02)80040-9 ◽

2002 ◽

Vol 4 ◽

pp. D132-D133

Author(s):

D BOCKER ◽

G BREITHARDT

Keyword(s):

Heart Failure ◽

Dilated Cardiomyopathy

Download Full-text

A Case Study on the Effects of Active Learning Classrooms in Higher Education

Korean Association for Educational Information and Media ◽

10.15833/kafeiam.24.4.733 ◽

2018 ◽

Vol 24 (4) ◽

pp. 733-754

Author(s):

Hyeon Woo Lee ◽

Yoon Mi Cha ◽

Kibeom Kim Kibeom Kim

Keyword(s):

Higher Education ◽

Active Learning

Download Full-text

Case Study: Mini-Case Studies: Small Infusions of Active Learning for Large-Lecture Courses

Journal of College Science Teaching ◽

10.2505/4/jcst17_046_06_63 ◽

2017 ◽

Vol 046 (06) ◽

Cited By ~ 3

Author(s):

Lisa Carloye

Keyword(s):

Active Learning ◽

Case Studies ◽

Large Lecture ◽

Lecture Courses

Download Full-text

High spatial resolution remote sensing image segmentation based on the multiclassification model and the binary classification model

Neural Computing and Applications ◽

10.1007/s00521-020-05561-8 ◽

2021 ◽

Author(s):

Xiaoxiong Zheng ◽

Tao Chen

Keyword(s):

Remote Sensing ◽

Image Segmentation ◽

Spatial Resolution ◽

High Spatial Resolution ◽

Binary Classification ◽

Remote Sensing Image ◽

Classification Model

Download Full-text

Failure mode and effect analysis (FMEA) to improve collaborative project-based learning: Case study of a Study and Research Path in mechanical engineering

International Journal of Mechanical Engineering Education ◽

10.1177/0306419021999046 ◽

2021 ◽

pp. 030641902199904

Author(s):

Elena Bartolomé ◽

Paula Benítez

Keyword(s):

Active Learning ◽

Failure Mode ◽

Mechanical Engineering ◽

Failure Modes ◽

Collaborative Study ◽

Project Based Learning ◽

Educational Process ◽

Effect Analysis ◽

Questions And Answers

Failure Mode and Effect Analysis (FMEA) is a powerful quality tool, widely used in industry, for the identification of failure modes, their effects and causes. In this work, we investigated the utility of FMEA in the education field to improve active learning processes. In our case study, the FMEA principles were adapted to assess the risk of failures in a Mechanical Engineering course on “Theory of Machines and Mechanisms” conducted through a project-based, collaborative “Study and Research Path (SRP)” methodology. The SRP is an active learning instruction format which is initiated by a generating question that leads to a sequence of derived questions and answers, and combines moments of study and inquiry. By applying the FMEA, the teaching team was able to identify the most critical failures of the process, and implement corrective actions to improve the SRP in the subsequent year. Thus, our work shows that FMEA represents a simple tool of risk assesment which can serve to identify criticality in educational process, and improve the quality of active learning.

Download Full-text

Anomaly Identification during Polymerase Chain Reaction for Detecting SARS-CoV-2 Using Artificial Intelligence Trained from Simulated Data

Molecules ◽

10.3390/molecules26010020 ◽

2020 ◽

Vol 26 (1) ◽

pp. 20

Author(s):

Reynaldo Villarreal-González ◽

Antonio J. Acosta-Hoyos ◽

Jaime A. Garzon-Ochoa ◽

Nataly J. Galán-Freyle ◽

Paola Amar-Sepúlveda ◽

...

Keyword(s):

Artificial Intelligence ◽

Big Data ◽

Real Time ◽

Binary Classification ◽

Simulated Data ◽

Classification Model ◽

Rt Pcr ◽

Simon Bolivar ◽

Polymerase Chain ◽

Available Information

Real-time reverse transcription (RT) PCR is the gold standard for detecting Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), owing to its sensitivity and specificity, thereby meeting the demand for the rising number of cases. The scarcity of trained molecular biologists for analyzing PCR results makes data verification a challenge. Artificial intelligence (AI) was designed to ease verification, by detecting atypical profiles in PCR curves caused by contamination or artifacts. Four classes of simulated real-time RT-PCR curves were generated, namely, positive, early, no, and abnormal amplifications. Machine learning (ML) models were generated and tested using small amounts of data from each class. The best model was used for classifying the big data obtained by the Virology Laboratory of Simon Bolivar University from real-time RT-PCR curves for SARS-CoV-2, and the model was retrained and implemented in a software that correlated patient data with test and AI diagnoses. The best strategy for AI included a binary classification model, which was generated from simulated data, where data analyzed by the first model were classified as either positive or negative and abnormal. To differentiate between negative and abnormal, the data were reevaluated using the second model. In the first model, the data required preanalysis through a combination of prepossessing. The early amplification class was eliminated from the models because the numbers of cases in big data was negligible. ML models can be created from simulated data using minimum available information. During analysis, changes or variations can be incorporated by generating simulated data, avoiding the incorporation of large amounts of experimental data encompassing all possible changes. For diagnosing SARS-CoV-2, this type of AI is critical for optimizing PCR tests because it enables rapid diagnosis and reduces false positives. Our method can also be used for other types of molecular analyses.

Download Full-text

Investigating the Potential of Network Optimization for a Constrained Object Detection Problem

Journal of Imaging ◽

10.3390/jimaging7040064 ◽

2021 ◽

Vol 7 (4) ◽

pp. 64

Author(s):

Tanguy Ophoff ◽

Cédric Gullentops ◽

Kristof Van Beeck ◽

Toon Goedemé

Keyword(s):

Computational Complexity ◽

Object Detection ◽

Network Optimization ◽

Real Life ◽

Optimization Techniques ◽

Training Data ◽

Single Shot ◽

Standard Object ◽

Number Of Classes

Object detection models are usually trained and evaluated on highly complicated, challenging academic datasets, which results in deep networks requiring lots of computations. However, a lot of operational use-cases consist of more constrained situations: they have a limited number of classes to be detected, less intra-class variance, less lighting and background variance, constrained or even fixed camera viewpoints, etc. In these cases, we hypothesize that smaller networks could be used without deteriorating the accuracy. However, there are multiple reasons why this does not happen in practice. Firstly, overparameterized networks tend to learn better, and secondly, transfer learning is usually used to reduce the necessary amount of training data. In this paper, we investigate how much we can reduce the computational complexity of a standard object detection network in such constrained object detection problems. As a case study, we focus on a well-known single-shot object detector, YoloV2, and combine three different techniques to reduce the computational complexity of the model without reducing its accuracy on our target dataset. To investigate the influence of the problem complexity, we compare two datasets: a prototypical academic (Pascal VOC) and a real-life operational (LWIR person detection) dataset. The three optimization steps we exploited are: swapping all the convolutions for depth-wise separable convolutions, perform pruning and use weight quantization. The results of our case study indeed substantiate our hypothesis that the more constrained a problem is, the more the network can be optimized. On the constrained operational dataset, combining these optimization techniques allowed us to reduce the computational complexity with a factor of 349, as compared to only a factor 9.8 on the academic dataset. When running a benchmark on an Nvidia Jetson AGX Xavier, our fastest model runs more than 15 times faster than the original YoloV2 model, whilst increasing the accuracy by 5% Average Precision (AP).

Download Full-text