Feature Extraction and Analysis for Lung Nodule Classification using Random Forest

Background: In the past decades, handwritten character recognition has received considerable attention from researchers worldwide because of its wide range of everyday applications. The literature shows that handwritten Indian scripts have received limited study, and Odia is one of them. We reviewed some of the patents relating to handwritten character recognition. Methods: This paper develops an automatic recognition system for offline handwritten Odia characters. The character images are preprocessed before feature extraction. For feature extraction, the gray-level co-occurrence matrix (GLCM) is first computed from all the sub-bands of the two-dimensional discrete wavelet transform (2D DWT); feature descriptors such as energy, entropy, correlation, homogeneity, and contrast are then calculated from the GLCMs and together form the primary feature vector. Principal component analysis (PCA) is employed to further reduce the feature space and generate more relevant features. Random forest (RF) and K-nearest neighbor (K-NN) are widely used in pattern classification because of their several salient features, so both are applied separately in this study to classify the character images. Results: All experiments were performed on a system running 64-bit Windows 8 with an Intel(R) i7-4770 CPU @ 3.40 GHz. Simulations were conducted in MATLAB 2014a on a standard database, the NIT Rourkela Odia Database. Conclusion: The proposed system has been validated on a standard database. Simulation results based on 10-fold cross-validation demonstrate that the proposed system achieves better accuracy than existing methods while requiring the fewest features. The recognition rates using the RF and K-NN classifiers are 94.6% and 96.4%, respectively.
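The GLCM descriptors named in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: it builds a symmetric, normalized GLCM for a single horizontal offset on a tiny hand-made image and omits the 2D DWT, PCA, and classifier stages. The function names `glcm` and `glcm_features`, the `sample` image, and the choice of formulas are assumptions for illustration. Energy is taken as the sum of squared entries and homogeneity uses a squared-difference denominator; some toolkits define these two slightly differently.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Five texture descriptors computed from a normalized, symmetric GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu = sum(i * v for i, _, v in cells)  # row mean equals column mean when p is symmetric
    var = sum((i - mu) ** 2 * v for i, _, v in cells)
    cov = sum((i - mu) * (j - mu) * v for i, j, v in cells)
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        # A constant image has zero variance; 1.0 is an assumed fallback value.
        "correlation": cov / var if var > 0 else 1.0,
    }

# Hypothetical 4x4 image with gray levels 0..3.
sample = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(sample, levels=4))
```

In a pipeline like the one described, such a feature dictionary would be computed per sub-band and the values concatenated before dimensionality reduction.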

Document Preprocessing with TF-IDF to Improve the Polarity Classification Performance of Unstructured Sentiment Analysis

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽ 10.22219/kinetik.v5i3.1066 ◽ 2020 ◽ pp. 235-242

Author(s): Farrikh Alzami ◽ Erika Devi Udayanti ◽ Dwi Puji Prabowo ◽ Rama Aria Megantara

Keyword(s): Machine Learning ◽ Feature Extraction ◽ Random Forest ◽ Sentiment Analysis ◽ Classification Performance ◽ Document Preparation ◽ Learning Models ◽ Polarity Classification ◽ Negative Sentiment ◽ Machine Learning Models

Sentiment analysis in terms of polarity classification is very important in everyday life: with polarity, people can find out whether a given document carries positive or negative sentiment, which helps them choose and make decisions. Sentiment analysis is usually done manually, so an automatic sentiment classification process is needed. However, studies that discuss which feature extraction methods and learning models suit unstructured sentiment analysis, with Amazon food reviews as the case, are rare. This research explores feature extraction methods such as bag of words, TF-IDF, Word2Vec, and a combination of TF-IDF and Word2Vec, together with machine learning models such as Random Forest, SVM, KNN, and Naïve Bayes, to find a combination of feature extraction and learning model that can help add variety to polarity sentiment analysis. With document preparation covering HTML tags, punctuation, and special characters, and with Snowball stemming, TF-IDF combined with SVM proved suitable for polarity classification in unstructured sentiment analysis of Amazon food reviews, with a performance of 87.3 percent.
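As a rough illustration of the TF-IDF weighting mentioned in this abstract, the sketch below computes one common textbook variant: term frequency divided by document length, multiplied by the natural log of the inverse document frequency. The abstract does not state which variant or library the authors used, so the `tfidf` function and the toy `reviews` are assumptions, and library implementations often smooth and normalize differently.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical, already cleaned and tokenized review snippets.
reviews = [
    "great taste great price".split(),
    "bad taste".split(),
    "great coffee".split(),
]
vectors = tfidf(reviews)
```

Terms that occur in fewer documents receive larger weights, so in the second review "bad" outweighs "taste". The resulting sparse vectors are the kind of input a classifier such as an SVM would be trained on.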

LNCDS: A 2D-3D cascaded CNN approach for lung nodule classification, detection and segmentation

Biomedical Signal Processing and Control ◽ 10.1016/j.bspc.2021.102527 ◽ 2021 ◽ Vol 67 ◽ pp. 102527

Author(s): Prasad Dutande ◽ Ujjwal Baid ◽ Sanjay Talbar

Keyword(s): Lung Nodule ◽ Nodule Classification

Meta Ordinal Weighting Net For Improving Lung Nodule Classification

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9413622 ◽ 2021

Author(s): Yiming Lei ◽ Hongming Shan ◽ Junping Zhang

Keyword(s): Lung Nodule ◽ Nodule Classification

Attention Aware and Multiple Granularity 3D Convolutional Neural Networks for Lung Nodule Classification on CT Image

2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP) ◽ 10.1109/icsip49896.2020.9339331 ◽ 2020

Author(s): Wen Wu ◽ Yanfeng Li ◽ Yuhao You ◽ Minjun Wang ◽ Kuan Chen ◽ ...

Keyword(s): Neural Networks ◽ Convolutional Neural Networks ◽ Lung Nodule ◽ CT Image ◽ Multiple Granularity ◽ Nodule Classification

Lung nodule classification using combination of CNN, second and higher order texture features

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189847 ◽ 2021 ◽ pp. 1-9

Author(s): Amrita Naik ◽ Damodar Reddy Edla

Keyword(s): Malignant Tumors ◽ Early Stage ◽ Lung Nodule ◽ Texture Features ◽ SVM Classifier ◽ Softmax Classifier ◽ Spatial Features ◽ High Level ◽ Learning Architectures ◽ Nodule Classification

Lung cancer is the most common cancer worldwide, and identifying malignant tumors at an early stage is needed for diagnosis and treatment of the patient, thus avoiding progression to a later stage. In recent times, deep learning architectures such as CNNs have shown promising results in identifying malignant tumors in CT scans. In this paper, we combine CNN features with texture features, namely Haralick and gray-level run-length matrix features, to gain the benefits of both the high-level and the spatial features extracted from lung nodules and so improve classification accuracy. These features are then classified using an SVM classifier instead of a softmax classifier in order to reduce overfitting. Our model was validated on the LUNA dataset and achieved an accuracy of 93.53%, a sensitivity of 86.62%, a specificity of 96.55%, and a positive predictive value of 94.02%.

Categorisation of EEG suppression using enhanced feature extraction for SUDEP risk assessment

BMC Medical Informatics and Decision Making ◽ 10.1186/s12911-020-01309-5 ◽ 2020 ◽ Vol 20 (S12)

Author(s): Juan C. Mier ◽ Yejin Kim ◽ Xiaoqian Jiang ◽ Guo-Qiang Zhang ◽ Samden Lhatoo

Keyword(s): Sensitivity Analysis ◽ Feature Extraction ◽ Random Forest ◽ Window Size ◽ Extraction Process ◽ Unexpected Death ◽ Power Spectral ◽ EEG Data ◽ Boosted Decision Trees ◽ Future Work

Background: Awareness of Sudden Unexpected Death in Epilepsy (SUDEP) has increased considerably over the last two decades, and it is acknowledged as a serious problem in epilepsy. However, the scientific community remains unclear on the cause or on possible biomarkers that can distinguish potentially fatal seizures from non-fatal ones. The duration of postictal generalized EEG suppression (PGES) is a promising candidate to aid in identifying SUDEP risk. The length of time a patient experiences PGES after a seizure may be used to infer that patient's risk of SUDEP later in life. The problem then becomes identifying the duration, or marking the end, of PGES (Tomson et al. in Lancet Neurol 7(11):1021–1031, 2008; Nashef in Epilepsia 38:6–8, 1997). Methods: This work addresses the problem of marking the end of PGES in EEG data recorded from patients during a clinically supervised seizure. It proposes a sensitivity analysis over EEG window size/delay, feature extraction, and classifiers along with their associated hyperparameters. The sensitivity analysis includes Gradient Boosted Decision Trees and Random Forest classifiers trained on 10 extracted features rooted in fundamental EEG behavior, using an EEG-specific feature extraction process (pyEEG) and 5 different window sizes or delays (Bao et al. in Comput Intell Neurosci 2011:1687–5265, 2011). Results: The machine learning architecture described above reached a maximum AUC of 76.02% with the Random Forest classifier trained on all extracted features. The highest performing features included SVD Entropy, Petrosian Fractal Dimension, and Power Spectral Intensity. Conclusion: The methods described are effective in automatically marking the end of PGES. Future work should include integrating these methods into the clinical setting and using the results to predict a patient's SUDEP risk.
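One of the features named above, the Petrosian fractal dimension, can be sketched directly from its commonly stated formula, PFD = log10(N) / (log10(N) + log10(N / (N + 0.4 N_delta))), where N is the window length and N_delta is the number of sign changes in the first difference of the signal. This is a standalone illustration and not the pyEEG code used in the paper; the function name `petrosian_fd` and the sample windows are assumptions.

```python
import math

def petrosian_fd(signal):
    """Petrosian fractal dimension of a 1-D sequence of samples."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    # First difference of the signal.
    diff = [b - a for a, b in zip(signal, signal[1:])]
    # Number of sign changes between consecutive differences.
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))

# Hypothetical windows: a monotonic ramp and a rapidly alternating signal.
ramp = petrosian_fd([1, 2, 3, 4, 5])
zigzag = petrosian_fd([0, 1, 0, 1, 0, 1])
```

A monotonic window has no sign changes and yields exactly 1.0, while a window that alternates on every sample yields a larger value. A window-based classifier of the kind described would compute such a value per EEG window and channel.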

Analysis of Lung Nodule Classification with Feature Extraction

Random forest based lung nodule classification aided by clustering

Computer-Aided Diagnosis System for Lung Nodule Classification Using Computer Tomography Scan Images

Gray-Level Co-occurrence Matrix and Random Forest Based Off-line Odia Handwritten Character Recognition
