Automated selection of mid-height intervertebral disc slice in traverse lumbar spine MRI using a combination of deep learning feature and machine learning classifier

Abnormalities and defects that can cause lumbar spinal stenosis often occur in the Intervertebral Disc (IVD) of the patient’s lumbar spine. Their automatic detection and classification require applying an image analysis algorithm to suitable input images, such as mid-sagittal images or traverse mid-height intervertebral disc slices. Hence, these images must first be selected and separated from the other medical images in the patient’s set of scans. However, technological progress in automating this process still lags behind other areas of medical image classification research. In this paper, we report the results of our investigation into the suitability and performance of different machine learning approaches for automatically selecting the best traverse plane that cuts closest to the half-height of an IVD from a database of lumbar spine MRI images. This study considers image features extracted using eleven different pre-trained Deep Convolution Neural Network (DCNN) models. We investigate the effectiveness of three dimensionality-reduction techniques and three feature-selection techniques on the classification performance. We also investigate the performance of five different Machine Learning (ML) algorithms and three Fully Connected (FC) neural network learning optimizers, which are used to train an image classifier with hyperparameter optimization over a wide range of hyperparameter options and values. The different combinations of methods are tested on a publicly available lumbar spine MRI dataset consisting of MRI studies of 515 patients with symptomatic back pain. Our experiment shows that applying the Support Vector Machine algorithm with a short Gaussian kernel on full-length image features extracted using a pre-trained DenseNet201 model is the best approach to use. This approach gives a minimum per-class classification performance of around 0.88 when measured using the precision and recall metrics.
The median performance measured using the precision metric ranges from 0.95 to 0.99, whereas that measured using the recall metric ranges from 0.93 to 1.0. When only the L3/L4, L4/L5, and L5/S1 classes are considered, the minimum F1-Scores range from 0.93 to 0.95, whereas the median F1-Scores range from 0.97 to 0.99.
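The per-class precision, recall, and F1 figures quoted in this abstract can be illustrated with a minimal sketch. The label lists below are hypothetical and far smaller than the study's data; the function name `per_class_scores` is an assumption introduced here, not part of the paper.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical disc-level labels for six slices (illustration only).
y_true = ["L3/L4", "L3/L4", "L4/L5", "L4/L5", "L5/S1", "L5/S1"]
y_pred = ["L3/L4", "L4/L5", "L4/L5", "L4/L5", "L5/S1", "L5/S1"]

scores = per_class_scores(y_true, y_pred)
# The "minimum per-class" figure is the worst score over all classes.
min_f1 = min(f1 for _, _, f1 in scores.values())
```

Reporting the minimum over classes, as the abstract does, exposes the weakest class rather than hiding it in an average.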

Automated Discrimination of Dicentric and Monocentric Chromosomes by Machine Learning-based Image Processing

10.1101/037309 ◽

2016 ◽

Author(s):

Yanxin Li ◽

Joan H. Knoll ◽

Ruth Wilkins ◽

Farrah N. Flegal ◽

Peter K. Rogan

Keyword(s):

Machine Learning ◽

Peripheral Blood Lymphocytes ◽

Image Features ◽

Tuning Parameter ◽

Support Vector ◽

High Dose ◽

Predictive Values ◽

Wide Range ◽

Chromosome Separation

Dose from radiation exposure can be estimated from dicentric chromosome (DC) frequencies in metaphase cells of peripheral blood lymphocytes. We automated DC detection by extracting features in Giemsa-stained metaphase chromosome images and classifying objects by machine learning (ML). DC detection involves i) intensity-thresholded segmentation of metaphase objects, ii) chromosome separation by watershed transformation and elimination of inseparable chromosome clusters, fragments and staining debris using a morphological decision tree filter, iii) determination of chromosome width and centreline, iv) derivation of centromere candidates and v) distinction of DCs from monocentric chromosomes (MCs) by ML. Centromere candidates are inferred from 14 image features input to a Support Vector Machine (SVM). Sixteen features derived from these candidates are then supplied to a Boosting classifier and a second SVM, which determines whether a chromosome is a DC or an MC. The SVM was trained with 292 DCs and 3135 MCs, and then tested with cells exposed to either low (1 Gy) or high (2-4 Gy) radiation dose. Results were then compared with those of 3 experts. True positive rates (TPR) and positive predictive values (PPV) were determined for the tuning parameter, σ. At larger σ, PPV decreases and TPR increases. At high dose, for σ = 1.3, TPR = 0.52 and PPV = 0.83, while at σ = 1.6, TPR = 0.65 and PPV = 0.72. At low dose and σ = 1.3, TPR = 0.67 and PPV = 0.26. The algorithm differentiates DCs from MCs, overlapped chromosomes and other objects with acceptable accuracy over a wide range of radiation exposures.
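The trade-off between TPR and PPV described in this abstract follows from their definitions. The sketch below uses hypothetical detection counts (not the study's data) to show how a more permissive setting can raise TPR while lowering PPV; `tpr_and_ppv` is a name introduced here for illustration.

```python
def tpr_and_ppv(true_pos, false_neg, false_pos):
    """True positive rate and positive predictive value from raw counts."""
    tpr = true_pos / (true_pos + false_neg)  # share of real DCs that were found
    ppv = true_pos / (true_pos + false_pos)  # share of calls that were real DCs
    return tpr, ppv


# Hypothetical counts for 100 real DCs under two classifier settings.
strict = tpr_and_ppv(true_pos=60, false_neg=40, false_pos=15)
lenient = tpr_and_ppv(true_pos=70, false_neg=30, false_pos=30)
```

In this made-up case the lenient setting finds more DCs (TPR 0.7 versus 0.6) at the cost of more false calls (PPV 0.7 versus 0.8), the same direction of change the abstract reports for larger σ.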

Machine-learning algorithm for estimating oil-recovery factor using a combination of engineering and stratigraphic dependent parameters

Interpretation ◽

10.1190/int-2018-0211.1 ◽

2019 ◽

Vol 7 (3) ◽

pp. SE151-SE159 ◽

Cited By ~ 1

Author(s):

Kachalla Aliyuda ◽

John Howell

Keyword(s):

Machine Learning ◽

Mean Square Error ◽

North Sea ◽

Recovery Factor ◽

Gaussian Kernel ◽

Support Vector ◽

Mean Square ◽

Data Set ◽

Wide Range ◽

Testing Set

The methods used to estimate recovery factor change through the life cycle of a field. During appraisal, prior to development when there are no production data, we typically rely on analog fields and empirical methods. Given the absence of a perfect analog, these methods are typically associated with a wide range of uncertainty. During plateau, recovery factors are typically associated with simulation and dynamic modeling, whereas in later field life, once the field drops off the plateau, a decline curve analysis is also used. The use of different methods during different stages of the field life leads to uncertainty and potential inconsistencies in recovery estimates. A wide range of interacting, partially related, reservoir and production variables controls the production and recovery factor. Machine learning allows more complex multivariate analysis that can be used to investigate the roles of these variables using a training data set and then to ultimately predict future performance in fields. To investigate this approach, we used a data set consisting of producing reservoirs all of which are at plateau or in decline to train a series of machine-learning algorithms that can potentially predict the recovery factor with minimal percentage error. The database for this study consists of categorical and numerical properties for 93 reservoirs from the Norwegian Continental Shelf. Of these, 75 are from the Norwegian Sea, the Norwegian North Sea, and the Barents Sea, whereas the remaining 18 reservoirs are from the Viking Graben in the UK sector of the North Sea. The data set was divided into training and testing sets: The training set comprised approximately 80% of the total data, and the remaining 20% was the testing set. 
Linear regression models and support vector machine (SVM) models were trained first with all 30 parameters in the data set and then with the 16 most influential parameters; the performance of these models was compared using the results of fivefold cross-validation. The SVM model with a Gaussian kernel function, trained on a combination of 16 geologic/engineering parameters, has a root-mean-square error of 0.12, a mean square error of 0.01, and an R-squared of 0.76. This model was tested on 18 reservoirs from the testing set; the test results are very similar to the cross-validation results from the model training phase, suggesting that this method can potentially be used to predict the future recovery factor.
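The three error measures quoted above are related: a root-mean-square error of 0.12 squares to 0.0144, which rounds to the reported mean square error of 0.01. A minimal sketch of how the measures are computed follows; the recovery-factor values are hypothetical and `regression_metrics` is a name introduced here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Mean square error, root-mean-square error, and R-squared."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1.0 - ss_res / ss_tot}


# Hypothetical observed and predicted recovery factors (fractions).
observed = [0.30, 0.45, 0.50, 0.60]
predicted = [0.35, 0.40, 0.55, 0.55]
metrics = regression_metrics(observed, predicted)
```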

Automatic Recognition of Asphalt Pavement Cracks Based on Image Processing and Machine Learning Approaches: A Comparative Study on Classifier Performance

Mathematical Problems in Engineering ◽

10.1155/2018/6290498 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 8

Author(s):

Nhat-Duc Hoang ◽

Quoc-Lam Nguyen

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Extraction ◽

Comparative Study ◽

Asphalt Pavement ◽

Classification Performance ◽

Support Vector ◽

Learning Approaches ◽

Pavement Condition

Periodic surveys of asphalt pavement condition are crucial in road maintenance. This work carries out a comparative study of the performance of machine learning approaches used for automatic pavement crack recognition. Six machine learning approaches, Naïve Bayesian Classifier (NBC), Classification Tree (CT), Backpropagation Artificial Neural Network (BPANN), Radial Basis Function Neural Network (RBFNN), Support Vector Machine (SVM), and Least Squares Support Vector Machine (LSSVM), have been employed. Additionally, Median Filter (MF), Steerable Filter (SF), and Projective Integral (PI) have been used to extract useful features from pavement images. In the feature extraction phase, a performance comparison shows that an input pattern including the diagonal PIs significantly enhances classification performance by creating more informative features. A simple moving average method is also employed to reduce the size of the feature set, with positive effects on the model classification performance. Experimental results show that LSSVM achieves the highest classification accuracy rate. Therefore, this machine learning algorithm, used with the feature extraction process proposed in this study, can be a very promising tool to assist transportation agencies in the task of pavement condition survey.
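The feature extraction idea can be sketched as follows, under stated assumptions: a projective integral is taken here to be the sum of pixel values along each row, column, or diagonal, and the moving average is applied with a step equal to the window so that it shortens the feature vector. The paper's exact definitions are not given in the abstract, so both choices, along with the function names and the tiny binary image, are illustrative only.

```python
def projective_integrals(image):
    """Row, column, diagonal, and anti-diagonal sums of a 2D list of pixels."""
    rows, cols = len(image), len(image[0])
    horizontal = [sum(row) for row in image]
    vertical = [sum(image[r][c] for r in range(rows)) for c in range(cols)]
    diagonal = [0] * (rows + cols - 1)
    anti_diagonal = [0] * (rows + cols - 1)
    for r in range(rows):
        for c in range(cols):
            diagonal[c - r + rows - 1] += image[r][c]  # constant c - r
            anti_diagonal[r + c] += image[r][c]        # constant r + c
    return horizontal, vertical, diagonal, anti_diagonal


def moving_average(signal, window, step):
    """Average over windows of the signal; step == window gives no overlap."""
    return [sum(signal[i:i + window]) / window
            for i in range(0, len(signal) - window + 1, step)]


# Hypothetical 3x4 binary image with a diagonal crack (1 = crack pixel).
image = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
horizontal, vertical, diagonal, anti_diagonal = projective_integrals(image)
reduced = moving_average(anti_diagonal, window=2, step=2)
```

In this toy image the row and column sums are nearly flat, while the anti-diagonal sum has a single sharp peak, which is one way to see why diagonal integrals can carry extra information about slanted cracks.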

Performance of a semi-automatic machine learning method for discriminating HER2 2+ status of breast cancers based on DCE-MRI (Preprint)

10.2196/preprints.16226 ◽

2019 ◽

Author(s):

Lirong Song ◽

Zejun Jiang ◽

Hecheng Lu ◽

Jiandong Yin

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Low Cost ◽

Principal Component ◽

Texture Features ◽

Classification Performance ◽

Image Features ◽

Support Vector ◽

Breast Cancers ◽

DCE-MRI

BACKGROUND: The amplification status of human epidermal growth factor receptor 2 (HER2) 2+ is currently tested by fluorescence in situ hybridization (FISH). However, the FISH technique is expensive, time consuming, and requires off-site testing. Low-cost, accurate surrogate measures that can stand in for formal genetic analysis are urgently needed. In addition, machine learning is broadly accepted for its ability to decipher complicated connections between medical image features and gene expression status. OBJECTIVE: To investigate the potential association between texture features extracted from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) and the HER2 2+ amplification status of breast cancer. METHODS: A total of 92 patients with HER2 2+ breast cancer who underwent 3T MRI and FISH detection in 2018 were retrospectively selected, including 52 HER2 2+ positive and 40 negative cases. The lesion area was delineated semi-automatically with MATLAB, and a total of 307 texture features were extracted independently from precontrast, postcontrast, and subtraction images. The Student’s t-test or Mann-Whitney U test was performed to identify features that differed significantly between HER2 2+ amplification statuses. Principal component analysis was used to eliminate the feature correlations. Three machine learning classifiers (logistic regression analysis, quadratic discriminant analysis, and support vector machine [SVM]) were used with a leave-one-out cross-validation method to establish classification models of HER2 2+ amplification status. Classification performance was evaluated by receiver operating characteristic (ROC) analysis. RESULTS: Texture features calculated from subtraction images showed more promising results than those obtained from pre- and postcontrast images.
The SVM model based on features from subtraction images achieved the best performance, with an area under the ROC curve of 0.890, sensitivity of 80.77%, specificity of 85.00%, and accuracy of 82.61%. CONCLUSIONS: To a certain extent, texture features of breast cancer extracted from DCE-MRI are associated with HER2 2+ amplification status. Additional studies are necessary to confirm the present preliminary findings.
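The reported rates can be checked against the cohort sizes with a short calculation. The confusion-matrix counts below are not stated in the abstract; they are inferred here as the integer counts that reproduce the reported percentages for 52 positive and 40 negative cases, and should be read as an assumption.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts: 42 of 52 positives and 34 of 40 negatives classified correctly.
sensitivity, specificity, accuracy = diagnostic_rates(tp=42, fn=10, tn=34, fp=6)
rounded = (round(sensitivity, 2), round(specificity, 2), round(accuracy, 2))
```

With these counts the three values round to 80.77, 85.0, and 82.61, matching the figures in the abstract.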

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

SVM Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Use of Machine Learning to Investigate the Quantitative Checklist for Autism in Toddlers (Q-CHAT) towards Early Autism Screening

Diagnostics ◽

10.3390/diagnostics11030574 ◽

2021 ◽

Vol 11 (3) ◽

pp. 574

Author(s):

Gennaro Tartarisco ◽

Giovanni Cicceri ◽

Davide Di Pietro ◽

Elisa Leonardi ◽

Stefania Aiello ◽

...

Keyword(s):

Machine Learning ◽

High Performance ◽

Behavioral Science ◽

Autistic Traits ◽

Classification Performance ◽

Recursive Feature Elimination ◽

Diagnostic Tools ◽

Support Vector ◽

K Nearest Neighbors ◽

Autism Screening

In the past two decades, several screening instruments were developed to detect toddlers who may be autistic both in clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits that demonstrates good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioral science to improve the classification performance of autism screening and diagnostic tools, but mainly in children, adolescents, and adults. In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those without. Five different ML algorithms (random forest (RF), naïve Bayes (NB), support vector machine (SVM), logistic regression (LR), and K-nearest neighbors (KNN)) were applied to investigate the complete set of Q-CHAT items. Our results showed that ML achieved an overall accuracy of 90%, and the SVM was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the SVM–recursive feature elimination (RFE) approach, we selected a subset of 14 items ensuring 91% accuracy, while 83% accuracy was obtained from the 3 best discriminating items in common to ours and the previously reported Q-CHAT-10. This evidence confirms the high performance and cross-cultural validity of the Q-CHAT, and supports the application of ML to create shorter and faster versions of the instrument, maintaining high classification accuracy, to be used as a quick, easy, and high-performance tool in primary-care settings.
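The item-selection step can be illustrated with a generic recursive elimination loop: score the remaining items, drop the weakest, and repeat. This is a sketch only. The study ranked items using SVM weights, whereas `mean_gap_weights` below is a hypothetical stand-in (the gap between class means) chosen so that the example needs no external library, and the response data are made up.

```python
def mean_gap_weights(rows, labels):
    """Stand-in importance score (NOT an SVM): class-mean gap per column."""
    weights = []
    for col in range(len(rows[0])):
        pos = [r[col] for r, y in zip(rows, labels) if y == 1]
        neg = [r[col] for r, y in zip(rows, labels) if y == 0]
        weights.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return weights


def recursive_feature_elimination(rows, labels, n_keep, weight_fn):
    """Repeatedly re-score the remaining columns and drop the weakest one."""
    active = list(range(len(rows[0])))
    while len(active) > n_keep:
        subset = [[r[j] for j in active] for r in rows]
        weights = weight_fn(subset, labels)
        weakest = min(range(len(active)), key=lambda k: weights[k])
        del active[weakest]
    return active


# Hypothetical responses to three items for four children (1 = autistic).
rows = [[0, 1, 0], [0, 0, 1], [1, 1, 2], [1, 0, 3]]
labels = [0, 0, 1, 1]
kept = recursive_feature_elimination(rows, labels, n_keep=2,
                                     weight_fn=mean_gap_weights)
```

With a multivariate model such as an SVM, the weights change after each removal, which is why the loop re-scores the remaining items at every step instead of ranking them once.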

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. Malware exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. The cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls. In this paper, we therefore investigate the relevance of the features of unpacked malicious and benign executables, such as mnemonics, instruction opcodes, and API calls, to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning and deep learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, such as a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in F1-scores of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.
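ANOVA-based feature ranking can be sketched with the one-way F-statistic: a feature whose values differ strongly between benign and malicious samples, relative to the spread within each class, gets a high score. The feature names and counts below are hypothetical, and this is a plain textbook F-statistic, not necessarily the exact procedure the authors used.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups of numeric values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Assumes ss_within > 0, i.e. at least one class has some spread.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical per-sample counts: [benign values, malware values].
features = {
    "api_call_count": [[1, 2, 3], [7, 8, 9]],
    "opcode_count": [[4, 5, 6], [5, 6, 7]],
}
f_scores = {name: anova_f(groups) for name, groups in features.items()}
ranking = sorted(f_scores, key=f_scores.get, reverse=True)
```

Here the hypothetical `api_call_count` separates the two classes far better than `opcode_count`, so it ranks first.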

Value of radiomics in differential diagnosis of chromophobe renal cell carcinoma and renal oncocytoma

Abdominal Radiology ◽

10.1007/s00261-019-02269-9 ◽

2019 ◽

Vol 45 (10) ◽

pp. 3193-3201 ◽

Cited By ~ 3

Author(s):

Yajuan Li ◽

Xialing Huang ◽

Yuwei Xia ◽

Liling Long

Keyword(s):

Machine Learning ◽

Differential Diagnosis ◽

Cell Carcinoma ◽

Area Under The Curve ◽

Image Features ◽

Renal Tumors ◽

Support Vector ◽

SVM Classifier ◽

Renal Oncocytoma ◽

Lasso Regression

Purpose: To explore the value of CT-enhanced quantitative features combined with machine learning for differential diagnosis of renal chromophobe cell carcinoma (chRCC) and renal oncocytoma (RO). Methods: Sixty-one cases of renal tumors (chRCC = 44; RO = 17) that were pathologically confirmed at our hospital between 2008 and 2018 were retrospectively analyzed. All patients had undergone preoperative enhanced CT scans including the corticomedullary (CMP), nephrographic (NP), and excretory phases (EP) of contrast enhancement. Volumes of interest (VOIs), including lesions on the images, were manually delineated using the RadCloud platform. A LASSO regression algorithm was used to screen the image features extracted from all VOIs. Five machine learning classifiers were trained to distinguish chRCC from RO using a fivefold cross-validation strategy. The performance of each classifier was mainly evaluated by the area under the receiver operating characteristic (ROC) curve and accuracy. Results: In total, 1029 features were extracted from CMP, NP, and EP. The LASSO regression algorithm was used to screen out the four, four, and six best features, respectively, and eight features were selected when CMP and NP were combined. All five classifiers had good diagnostic performance, with area under the curve (AUC) values greater than 0.850, and the support vector machine (SVM) classifier performed best, with a diagnostic accuracy of 0.945 (AUC 0.964 ± 0.054; sensitivity 0.999; specificity 0.800). Conclusions: Accurate preoperative differential diagnosis of chRCC and RO can be facilitated by a combination of CT-enhanced quantitative features and machine learning.
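With only 17 RO cases among 61, how the cases are divided into five folds matters. The sketch below shows one way to build folds that keep the class ratio roughly constant (stratification). Whether the authors stratified is not stated in the abstract, so this is an assumption; the round-robin assignment is deterministic for clarity, whereas in practice the cases would be shuffled within each class first.

```python
def stratified_folds(labels, n_folds):
    """Assign sample indices to folds, dealing each class out round-robin."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Class sizes taken from the abstract: 44 chRCC and 17 RO cases.
labels = ["chRCC"] * 44 + ["RO"] * 17
folds = stratified_folds(labels, n_folds=5)
fold_sizes = [len(fold) for fold in folds]
ro_per_fold = [sum(labels[i] == "RO" for i in fold) for fold in folds]
```

Each fold then serves once as the held-out set while the other four are used for training, and every fold contains three or four RO cases.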

Analysis of the Nosema Cells Identification for Microscopic Images

Sensors ◽

10.3390/s21093068 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3068

Author(s):

Soumaya Dghim ◽

Carlos M. Travieso-González ◽

Radim Burget

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Processing ◽

Deep Learning ◽

The Other ◽

Support Vector ◽

Learning Approaches ◽

Microscopic Images ◽

Trained Neural Network ◽

Nosema Disease

The use of image processing tools, machine learning, and deep learning approaches has become very useful and robust in recent years. This paper introduces the detection of the Nosema disease, which is considered to be one of the most economically significant diseases today. This work presents a solution for recognizing and identifying Nosema cells among the other objects in a microscopic image. Two main strategies are examined. The first strategy uses image processing tools to extract the most valuable information and features from the dataset of microscopic images. Then, machine learning methods, such as an artificial neural network (ANN) and a support vector machine (SVM), are applied to detect and classify the Nosema disease cells. The second strategy explores deep learning and transfer learning. Several approaches were examined, including a convolutional neural network (CNN) classifier and several transfer learning models (AlexNet, VGG-16 and VGG-19), which were fine-tuned and applied to the object sub-images in order to distinguish the Nosema images from the other object images. The best accuracy, 96.25%, was reached by the pre-trained VGG-16 neural network.

Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is large or if the observation is conducted over a prolonged period. In this study, machine learning methods, namely hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to detect and estimate whether a goat is in estrus based on the goat’s behavior, and the adequacy of each method was verified. The goats’ tracking data were obtained using a video tracking system and used to estimate whether the goats, which were in “estrus” or “non-estrus”, were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of the random forest appears to be the highest. However, its PC for goats whose data were not used in the training data sets is relatively low, which suggests that the random forest tends to overfit the training data. Apart from the random forest, the PCs of the HMMs and SVMs are high. Considering the calculation time and the HMM’s advantage of being a time series model, the HMM is the better method. The PC of the neural network is low overall; however, if more goat data were acquired, the neural network could become an adequate method for estimation.
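The abstract does not define percentage concordance. A minimal sketch follows, assuming PC means the percentage of observations for which the estimated state matches the observed state; the label sequences are hypothetical.

```python
def percentage_concordance(observed, estimated):
    """Percentage of positions where the estimated label equals the observed one."""
    if len(observed) != len(estimated):
        raise ValueError("sequences must have the same length")
    matches = sum(o == e for o, e in zip(observed, estimated))
    return 100.0 * matches / len(observed)


# Hypothetical labels for five observation windows of one goat.
observed = ["estrus", "estrus", "non-estrus", "non-estrus", "estrus"]
estimated = ["estrus", "non-estrus", "non-estrus", "non-estrus", "estrus"]
pc = percentage_concordance(observed, estimated)
```

Computing this value separately for goats inside and outside the training set is what reveals the kind of overfitting gap the abstract describes for the random forest.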
