Prediction of Periventricular Leukomalacia Occurrence in Neonates Using a Novel Unsupervised Learning Method

Author(s):  
Dieter Bender ◽  
Ali Jalali ◽  
C. Nataraj

Prior work has documented that Support Vector Machine (SVM) classifiers can be powerful tools in predicting clinical outcomes of complex diseases such as Periventricular Leukomalacia (PVL). A preceding study indicated that SVM performance can be improved significantly by optimizing the supervised training set used during the learning stage of the overall SVM algorithm. This preliminary work, together with the complex nature of the PVL data, suggested integrating an active learning algorithm into the overall SVM framework. The present study supports this initial hypothesis and shows that an active learning SVM classifier performs well and outperforms standard SVM classifiers when dealing with clinical data of high dimensionality.

Author(s):  
Dieter Bender ◽  
Ali Jalali ◽  
Daniel J. Licht ◽  
C. Nataraj

Prior work has documented that Support Vector Machine (SVM) classifiers can be powerful tools in predicting clinical outcomes of complex diseases such as Periventricular Leukomalacia (PVL). Our previous study showed that SVM performance can be improved significantly by optimizing the supervised training set used during the learning stage of the overall SVM algorithm. This study fully develops the initial idea using the reliable Leave-One-Out Cross-Validation (LOOCV) technique. The work presented in this paper confirms previous results and improves the performance of the SVM even further. In addition, using the LOOCV technique, the computational time is decreased and the structure of the algorithm is simplified, making this framework more feasible. Furthermore, we evaluate the performance of the resulting optimized SVM classifier on an unseen set of data. This demonstrates that the developed SVM algorithm outperforms standard SVM classifiers without any loss of generalization.
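As a rough illustration of the LOOCV idea mentioned above, the sketch below holds out each sample once, trains on the remainder, and averages the hits. It is not the authors' pipeline: the classifier is a nearest-centroid stand-in rather than an SVM, and the function names (`loocv_accuracy`, `nearest_centroid_fit`, `nearest_centroid_predict`) and the toy data are assumptions made only for this example.

```python
from typing import Dict, List


def nearest_centroid_fit(X: List[List[float]], y: List[int]) -> Dict[int, List[float]]:
    """Stand-in classifier (NOT an SVM): store the mean point of each class."""
    centroids = {}
    for label in set(y):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def nearest_centroid_predict(model: Dict[int, List[float]], x: List[float]) -> int:
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def loocv_accuracy(X, y, fit, predict) -> float:
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        hits += int(predict(model, X[i]) == y[i])
    return hits / len(X)


# Hypothetical toy data: two well-separated clusters in two dimensions.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
y = [0, 0, 0, 1, 1, 1]
accuracy = loocv_accuracy(X, y, nearest_centroid_fit, nearest_centroid_predict)
```

Any classifier exposing a fit/predict pair could be passed in place of the stand-in, which is the point of keeping `loocv_accuracy` generic.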


2021 ◽  
Vol 10 (5) ◽  
pp. 992
Author(s):  
Martina Barchitta ◽  
Andrea Maugeri ◽  
Giuliana Favara ◽  
Paolo Marco Riela ◽  
Giovanni Gallo ◽  
...  

Patients in intensive care units (ICUs) are at higher risk of worse prognosis and mortality. Here, we aimed to evaluate the ability of the Simplified Acute Physiology Score (SAPS II) to predict the risk of 7-day mortality, and to test a machine learning algorithm which combines the SAPS II with additional patients’ characteristics at ICU admission. We used data from the “Italian Nosocomial Infections Surveillance in Intensive Care Units” network. A Support Vector Machine (SVM) algorithm was used to classify 3782 patients according to sex, patient’s origin, type of ICU admission, non-surgical treatment for acute coronary disease, surgical intervention, SAPS II, presence of invasive devices, trauma, impaired immunity, antibiotic therapy and onset of HAI. The accuracy of SAPS II for distinguishing patients who died from those who did not was 69.3%, with an Area Under the Curve (AUC) of 0.678. Using the SVM algorithm, instead, we achieved an accuracy of 83.5% and an AUC of 0.896. Notably, SAPS II was the variable that carried the most weight in the model, and its removal resulted in an AUC of 0.653 and an accuracy of 68.4%. Overall, these findings suggest the present SVM model is a useful tool for early identification of patients at higher risk of death at ICU admission.
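For readers unfamiliar with the two metrics reported above, the following minimal sketch computes accuracy and a pairwise (rank-based) AUC. The labels and scores are invented for illustration and have no connection to the study's data; the helper names `accuracy` and `auc` and the 0.5 decision threshold are assumptions of this example.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = died, 0 = survived) and classifier scores.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

acc_value = accuracy(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why abstracts like the one above usually report both.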


2009 ◽  
Vol 15 (2) ◽  
pp. 241-271 ◽  
Author(s):  
YAOYONG LI ◽  
KALINA BONTCHEVA ◽  
HAMISH CUNNINGHAM

Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.


2013 ◽  
Vol 333-335 ◽  
pp. 1344-1348
Author(s):  
Yu Kai Yao ◽  
Yang Liu ◽  
Zhao Li ◽  
Xiao Yun Chen

Support Vector Machine (SVM) is one of the most popular and effective data mining algorithms. It can be used to solve classification or regression problems and has attracted much attention in recent years. SVM finds the optimal separating hyperplane between classes, which gives it outstanding generalization ability. Usually all the labeled records are used as the training set. However, the optimal separating hyperplane depends only on a few crucial samples (Support Vectors, SVs), so we need not train the SVM model on the whole training set. In this paper a novel SVM model based on K-means clustering is presented, in which only a small subset of the original training set is selected to constitute the final training set, and the SVM classifier is built through training on these selected samples. This greatly decreases the scale of the training set and effectively reduces the training and prediction cost of SVM, while preserving its generalization performance.


Author(s):  
Ade Nurhopipah ◽  
Uswatun Hasanah

The performance of classification models in machine learning is influenced by many factors, one of which is the dataset splitting method. To avoid overfitting, it is important to apply a suitable dataset splitting strategy. This study presents a comparison of four dataset splitting techniques, namely Random Sub-sampling Validation (RSV), k-Fold Cross Validation (k-FCV), Bootstrap Validation (BV) and Moralis Lima Martin Validation (MLMV). The comparison is done for face classification on CCTV images using a Convolutional Neural Network (CNN) algorithm and a Support Vector Machine (SVM) algorithm, and is carried out on two image datasets. The results are reviewed using model accuracy on the training set, validation set and test set, as well as the bias and variance of the model. The experiment shows that the k-FCV technique has more stable performance and provides high accuracy on the training set as well as good generalization on the validation set and test set. Meanwhile, data splitting using the MLMV technique performs worse than the other three techniques since it yields lower accuracy. This technique also shows higher bias and variance values and it builds overfitting models, especially when it is applied on the validation set.
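To make the k-fold idea concrete, here is a minimal sketch of how sample indices might be partitioned into k train/validation splits. It illustrates the general technique only, not the study's experimental setup; the function name `k_fold_indices`, the shuffling seed, and the sample count are assumptions of this example.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs; each sample is validated once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding spreads the remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


# Hypothetical usage: 10 samples split into 3 folds.
splits = k_fold_indices(n_samples=10, k=3)
```

Each model would be trained on `train` and scored on `validation`, and the k scores averaged; averaging over every sample is one plausible reason for the stability the abstract reports.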


2015 ◽  
Vol 27 (8) ◽  
pp. 1738-1765 ◽  
Author(s):  
Chun-Liang Li ◽  
Chun-Sung Ferng ◽  
Hsuan-Tien Lin

The abundance of real-world data and limited labeling budgets call for active learning, an important learning paradigm for reducing human labeling efforts. Many recently developed active learning algorithms consider both uncertainty and representativeness when making querying decisions. However, exploiting representativeness and uncertainty concurrently usually requires tackling sophisticated and challenging learning tasks, such as clustering. In this letter, we propose a new active learning framework, called hinted sampling, which takes both uncertainty and representativeness into account in a simpler way. We design a novel active learning algorithm within the hinted sampling framework with an extended support vector machine. Experimental results validate that the novel active learning algorithm can result in a better and more stable performance than that achieved by state-of-the-art algorithms. We also show that the hinted sampling framework allows improving another active learning algorithm designed from the transductive support vector machine.
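The uncertainty criterion mentioned above can be sketched for a linear classifier as follows. This shows plain uncertainty sampling only, not the hinted sampling framework the abstract proposes; the weight vector, bias, pool of unlabeled points, and the function names are all hypothetical.

```python
from typing import List


def decision_value(w: List[float], b: float, x: List[float]) -> float:
    """Signed score w.x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def most_uncertain(w: List[float], b: float, pool: List[List[float]]) -> int:
    """Index of the pool point closest to the hyperplane w.x + b = 0.

    The true distance is |w.x + b| / ||w||, but ||w|| is the same for every
    point, so ranking by |w.x + b| selects the same sample.
    """
    return min(range(len(pool)), key=lambda i: abs(decision_value(w, b, pool[i])))


# Hypothetical linear model and unlabeled pool.
w, b = [1.0, -1.0], 0.0
pool = [[2.0, 0.0], [0.5, 0.4], [-1.0, 1.5]]
query_index = most_uncertain(w, b, pool)  # the sample to send for labeling
```

In an active learning loop, the selected sample would be labeled, added to the training set, and the classifier retrained before the next query.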


2020 ◽  
Vol 10 (2) ◽  
Author(s):  
Daniel M Bittner ◽  
Alejandro E Brito ◽  
Mohsen Ghassemi ◽  
Shantanu Rane ◽  
Anand D Sarwate ◽  
...  

We consider privacy-preserving learning in the context of online learning. In settings where data instances arrive sequentially in streaming fashion, incremental training algorithms such as stochastic gradient descent (SGD) can be used to learn and update prediction models. When labels are costly to acquire, active learning methods can be used to select samples to be labeled from a stream of unlabeled data. These labeled data samples are then used to update the machine learning models. Privacy-preserving online learning can be used to update predictors on data streams containing sensitive information. The differential privacy framework quantifies the privacy risk in such settings. This work proposes a differentially private online active learning algorithm using stochastic gradient descent (SGD) to retrain the classifiers. We propose two methods for selecting informative samples. We incorporated this into a general-purpose web application that allows a non-expert user to evaluate the privacy-aware classifier and visualize key privacy-utility tradeoffs. Our application supports linear support vector machines and logistic regression and enables an analyst to configure and visualize the effect of using differentially private online active learning versus a non-private counterpart. The application is useful for comparing the privacy/utility tradeoff of different algorithms, which can be useful to decision makers in choosing which algorithms and parameters to use. Additionally, we use the application to evaluate our SGD-based solution and to show that it generates predictions with a superior privacy-utility tradeoff than earlier methods.


2021 ◽  
Vol 6 (1) ◽  
pp. 55-59
Author(s):  
Yahya Dwikarsa ◽  
Abdul Basith

The scale value is an important part of the segmentation stage of Object-Based Image Analysis (OBIA). The choice of scale value determines the size of the objects, which affects classification accuracy. In addition to the scale setting (multiscale), the choice of machine learning algorithm used to classify shallow water benthic habitat objects can also determine the success of the classification. Combinations of scale values and classification algorithms are evaluated to find optimal results by examining classification accuracies. This study uses orthophoto images processed from an Unmanned Aerial Vehicle (UAV) mission intended to capture benthic habitat in Karimunjawa waters. The classification algorithms used are Support Vector Machine (SVM), Bayes, and K-Nearest Neighbors (KNN). The classification results of each combination are then tested for accuracy based on the sample and Training Test Area (TTA) masks. The results show that the SVM algorithm with a scale of 300 produces the best accuracy, while the lowest accuracy is obtained with the SVM algorithm at a scale of 100. The results also show that the optimal scale settings for segmenting objects are, in order, 300, 200, and 100.


2020 ◽  
Vol 12 (11) ◽  
pp. 168781402097189
Author(s):  
Tsun-Kuo Lin

In this study, a dynamic weight-based method combined with principal component analysis (PCA) was developed for the first time for detecting measurement data in manufacturing. This weight-based learning technique can learn and train the measurement data sequence to isolate incorrect data sources for achieving high accuracy when detecting various types of data. Research has revealed that unsuitable image or data features might cause poor performance in industrial inspections. In contrast to previous inspection methods, the weight-based learning method proposed in this study employs a dynamic learning algorithm for effectively and adaptively selecting optimal principal components for the support vector machine (SVM) algorithm and then establishes indicators. Finally, these PCA-based indicators act as substitutes for massive amounts of data in data processing and can be applied to detect data in a timely manner when the data contain redundant and incorrect inputs in a sequence. The experimental results indicate that the proposed method, which combines dynamic weight-based feature extraction with PCA, can provide useful indicators for detecting various types of manufacturing data and exhibits satisfactory performance in data detection.


Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1314
Author(s):  
Mofei Song

Currently, deep learning has shown state-of-the-art performance in image classification with pre-defined taxonomy. However, in a more realistic scenario, different users usually have different classification intents given an image collection. To meet such personalized requirements, we propose an interactive image classification system with an offline representation learning stage and an online classification stage. During the offline stage, we learn a deep model to extract features with higher flexibility and scalability for different users’ preferences. Instead of training the model only with the inter-class discrimination, we also encode the similarity between the semantic-embedding vectors of the category labels into the model. This makes the extracted features adapt to multiple taxonomies with different granularities. During the online session, an annotation task iteratively alternates with a high-throughput verification task. When performing the verification task, the users are only required to indicate the incorrect predictions without giving the exact category label. For each iteration, our system chooses the images to be annotated or verified based on interactive efficiency optimization. To provide a high interactive rate, a unified active learning algorithm is used to search for the optimal annotation and verification set by minimizing the expected time cost. After interactive annotation and verification, the newly classified images are used to train a customized classifier online, which reflects the user-adaptive intent of categorization. The learned classifier is then used for subsequent annotation and verification tasks. Experimental results on several public image datasets show that our method outperforms existing methods.

