Optimization of a Computer-Aided Detection Scheme Using a Logistic Regression Model and Information Gain Feature Selection Method

One of the major challenges in current Computer-Aided Detection (CADe) of polyps in CT Colonography (CTC) is to improve specificity without sacrificing sensitivity. If a scheme produces a large number of False Positive (FP) detections of polyps, radiologists might lose confidence in CADe. In this chapter, the authors used a nonlinear regression model operating on image voxels and a nonlinear classification model with extracted image features based on Support Vector Machines (SVMs). They investigated the feasibility of Support Vector Regression (SVR) in the massive-training framework and developed a Massive-Training SVR (MTSVR) to reduce the long training time associated with the Massive-Training Artificial Neural Network (MTANN) for reduction of FPs in CADe of polyps in CTC. In addition, the authors proposed a feature selection method directly coupled with an SVM classifier to maximize the CADe system performance. They compared the proposed feature selection method with the conventional stepwise feature selection based on Wilks’ lambda with a linear discriminant analysis classifier. The FP reduction system based on the proposed feature selection method achieved a 96.0% by-polyp sensitivity with an FP rate of 4.1 per patient, which is better than the stepwise feature selection based on Wilks’ lambda (which yielded the same sensitivity with 18.0 FPs/patient). To test the performance of the proposed MTSVR, the authors compared it with the original MTANN in the distinction between actual polyps and various types of FPs, in terms of training time reduction and FP reduction performance. The authors’ CTC database consisted of 240 CTC datasets obtained from 120 patients in the supine and prone positions. With MTSVR, they reduced the training time by a factor of 190 while achieving a performance (by-polyp sensitivity of 94.7% with 2.5 FPs/patient) comparable to that of the original MTANN (which had the same sensitivity with 2.6 FPs/patient).


A ROC-based feature selection method for computer-aided detection and diagnosis

10.1117/12.2044003 ◽ 2014

Author(s): Songyuan Wang ◽ Guopeng Zhang ◽ Qimei Liao ◽ Junying Zhang ◽ Chun Jiao ◽ ...

Keyword(s): Feature Selection ◽ Feature Selection Method ◽ Selection Method ◽ Computer Aided Detection ◽ Computer Aided ◽ Detection And Diagnosis


A Robust Gene Selection Method for Microarray-based Cancer Classification

Cancer Informatics ◽ 10.4137/cin.s3794 ◽ 2010 ◽ Vol 9 ◽ pp. CIN.S3794 ◽ Cited By ~ 21

Author(s): Xiaosheng Wang ◽ Osamu Gotoh

Keyword(s): Gene Expression ◽ Feature Selection ◽ Gene Selection ◽ Information Gain ◽ Expression Profiles ◽ Feature Selection Method ◽ Gene Expression Profiles ◽ Molecular Classification ◽ Selection Method ◽ Chi Square

Gene selection is of vital importance in molecular classification of cancer using high-dimensional gene expression data. Because specific cancerous gene expression profiles have distinct inherent characteristics, developing flexible and robust feature selection methods is crucial. We investigated the properties of one feature selection approach proposed in our previous work, which generalizes the feature selection method based on the depended degree of attribute in rough sets. We compared this feature selection method with established methods (the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty) and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended degree of attribute based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can reveal the inherent classification difficulty of different gene expression datasets, indicating the inherent biology of specific cancers.
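Information gain, one of the baseline methods named in the abstract above, can be illustrated with a minimal Python sketch. This is the textbook entropy-based definition for a discrete feature, not the authors' implementation; the function names and the toy gene data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: expression of two genes discretized to "high"/"low".
labels = ["tumor", "tumor", "normal", "normal"]
gene_a = ["high", "high", "low", "low"]  # separates the classes perfectly
gene_b = ["high", "low", "high", "low"]  # carries no class information

# Rank the genes from most to least informative.
ranking = sorted(
    {"gene_a": gene_a, "gene_b": gene_b}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
```

In this toy case `gene_a` ranks first because knowing its value removes all uncertainty about the class, while `gene_b` leaves the class distribution unchanged. Real expression data are continuous and would need a discretization step first, which is outside this sketch.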


A hybrid feature selection method based on genetic algorithm and information gain

2016 5th International Conference on Computer Science and Network Technology (ICCSNT) ◽ 10.1109/iccsnt.2016.8070172 ◽ 2016 ◽ Cited By ~ 1

Author(s): Fei He ◽ Huamin Yang ◽ Yu Miao ◽ Rainbow Louis

Keyword(s): Genetic Algorithm ◽ Feature Selection ◽ Information Gain ◽ Feature Selection Method ◽ Selection Method


A New Feature Selection Method for Text Classification

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001407005466 ◽ 2007 ◽ Vol 21 (02) ◽ pp. 423-438 ◽ Cited By ~ 9

Author(s): Gulden Uchyigit ◽ Keith Clark

Keyword(s): Feature Selection ◽ Text Classification ◽ Information Gain ◽ Feature Selection Method ◽ Feature Space ◽ Selection Method ◽ Computational Time ◽ Small Subset ◽ Selection Methods ◽ New Feature

Text classification is the problem of classifying a set of documents into a pre-defined set of classes. A major difficulty in text classification is the high dimensionality of the feature space. Only a small subset of the words are feature words that can be used to determine a document's class, while the rest add noise, can make the results unreliable, and significantly increase computational time. A common approach to this problem is feature selection, in which the number of words in the feature space is significantly reduced. In this paper we present the experiments of a comparative study of feature selection methods used for text classification. Ten feature selection methods were evaluated in this study, including a new feature selection method called the GU metric. The other feature selection methods evaluated are: Chi-Squared (χ²) statistic, NGL coefficient, GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, Fisher Criterion, and BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best F1 and F2 scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Bayesian Probabilistic Classifier.
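Two of the quantities mentioned in the abstract above, the chi-squared term statistic and the F1/F2 scores, can be sketched in a few lines of Python. This assumes the standard 2x2 term/class contingency form of chi-squared and the usual F-beta definition; the function names and example counts are hypothetical, and the GU metric itself is not reproduced here because the abstract does not define it.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a term/class 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - c * b) ** 2 / denominator


def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives F1, beta=2 weights recall more heavily."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: a term that appears only in one class of 10 documents,
# and a term spread evenly across both classes.
strong_term = chi_squared(a=10, b=0, c=0, d=10)
weak_term = chi_squared(a=5, b=5, c=5, d=5)

f1 = f_beta(precision=0.5, recall=1.0, beta=1.0)
f2 = f_beta(precision=0.5, recall=1.0, beta=2.0)
```

A feature selector of this kind would score every vocabulary term against each class and keep only the highest-scoring terms; the evenly spread term scores zero and would be discarded.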


A developed feature selection method for classification based on united information gain

2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI) ◽ 10.1109/uic-atc.2017.8397477 ◽ 2017

Author(s): Kun Niu ◽ Haizhen Jiao ◽ Zhipeng Gao ◽ Guannan Jia ◽ Guangyu Yang ◽ ...

Keyword(s): Feature Selection ◽ Information Gain ◽ Feature Selection Method ◽ Selection Method


IG-C4.5: An Improved Feature Selection Method Based on Information Gain

Proceedings of the 2014 International Conference on Mechatronics, Electronic, Industrial and Control Engineering ◽ 10.2991/meic-14.2014.244 ◽ 2014

Author(s): Kai Luo ◽ JunYong Luo ◽ MeiJuan Yin ◽ JianLin Li

Keyword(s): Feature Selection ◽ Information Gain ◽ Feature Selection Method ◽ Selection Method


SVM-Based Credit Rating and Feature Selection

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.618.573 ◽ 2014 ◽ Vol 618 ◽ pp. 573-577 ◽ Cited By ~ 1

Author(s): Yu Qiang Qin ◽ Yu Dong Qi ◽ Hui Ying

Keyword(s): Logistic Regression ◽ Feature Selection ◽ Financial Institutions ◽ Credit Card ◽ Credit Rating ◽ Feature Selection Method ◽ Selection Method ◽ Support Vector ◽ Vector Machines ◽ Reference Agency

The assessment of the risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit rating to determine the likelihood of default based on consumer application and credit reference agency data. We test support vector machines (SVMs) against these traditional methods on a large credit card database. We find that SVMs are competitive and can be used as the basis of a feature selection method to discover the features that are most significant in determining the risk of default.


A new feature selection method for text categorization based on information gain and particle swarm optimization

2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems ◽ 10.1109/ccis.2014.7175792 ◽ 2014 ◽ Cited By ~ 5

Author(s): Ferruh Yigit ◽ Omer Kaan Baykan

Keyword(s): Feature Selection ◽ Particle Swarm Optimization ◽ Text Categorization ◽ Information Gain ◽ Particle Swarm ◽ Feature Selection Method ◽ Selection Method ◽ Swarm Optimization ◽ New Feature


Massive-Training Support Vector Regression With Feature Selection in Application of Computer-Aided Detection of Polyps in CT Colonography

Emerging Developments and Practices in Oncology - Advances in Medical Diagnosis, Treatment, and Care ◽ 10.4018/978-1-5225-3085-5.ch006 ◽ 2018 ◽ pp. 153-190

Author(s): Jianwu Xu ◽ Amin Zarshenas ◽ Yisong Chen ◽ Kenji Suzuki

Keyword(s): Feature Selection ◽ False Positive ◽ CT Colonography ◽ Feature Selection Method ◽ Selection Method ◽ Support Vector ◽ Training Time ◽ Computer Aided ◽ Wilks Lambda ◽ CADe System

A major challenge in current computer-aided detection (CADe) of polyps in CT colonography (CTC) is to reduce the false positive (FP) rate while maintaining detection sensitivity. Radiologists prefer a CADe system that produces a small number of false positive detections; otherwise they might not consider that the CADe system improves their workflow. To this end, in this study we applied a nonlinear regression model operating directly on CTC image voxels and a nonlinear classification model with extracted image features based on support vector machines (SVMs) in order to improve the specificity of CADe of polyps. We investigated the feasibility of support vector regression (SVR) in the massive-training framework and developed a massive-training SVR (MTSVR) to reduce the long training time associated with the massive-training artificial neural network (MTANN) for reduction of FPs in CADe of polyps in CTC. In addition, we proposed a feature selection method directly coupled with an SVM classifier to maximize the CADe system performance. We compared the proposed feature selection method with the conventional stepwise feature selection based on Wilks' lambda with a linear discriminant analysis classifier. The FP reduction system based on the proposed feature selection method achieved a 96.0% by-polyp sensitivity with an FP rate of 4.1 per patient, which is better than the stepwise feature selection based on Wilks' lambda (which yielded the same sensitivity with 18.0 FPs/patient). To test the performance of the proposed MTSVR, we compared it with the original MTANN in the distinction between actual polyps and various types of FPs, in terms of training time reduction and FP reduction performance. The CTC database used in this study consisted of 240 CTC datasets obtained from 120 patients in the supine and prone positions. With MTSVR, we reduced the training time by a factor of 190 while achieving a performance (by-polyp sensitivity of 94.7% with 2.5 FPs/patient) comparable to that of the original MTANN (which had the same sensitivity with 2.6 FPs/patient).
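The two figures of merit quoted in the abstract above, by-polyp sensitivity and FPs per patient, are simple ratios, and the gap between the two feature selection methods can be expressed as a relative reduction. The Python sketch below is only an illustration: the function names are hypothetical, and the raw polyp and FP counts are made-up values chosen to reproduce the reported rates, since the abstract does not give the underlying counts. Only the 120 patients and the 18.0 and 4.1 FPs/patient come from the abstract.

```python
def by_polyp_sensitivity(detected_polyps, total_polyps):
    """Fraction of true polyps detected by the scheme."""
    if total_polyps <= 0:
        raise ValueError("total_polyps must be positive")
    return detected_polyps / total_polyps


def fps_per_patient(false_positives, patients):
    """Average number of false positive detections per patient."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return false_positives / patients


def relative_reduction(baseline, improved):
    """Fractional decrease from a baseline rate to an improved rate."""
    return (baseline - improved) / baseline


# Hypothetical raw counts (not from the study), picked to match the
# reported 96.0% sensitivity and 4.1 FPs/patient over 120 patients.
sensitivity = by_polyp_sensitivity(detected_polyps=24, total_polyps=25)
fp_rate = fps_per_patient(false_positives=492, patients=120)

# FP rates reported in the abstract at the same 96.0% sensitivity:
# Wilks' lambda stepwise selection (18.0) vs. the SVM-coupled method (4.1).
fp_drop = relative_reduction(baseline=18.0, improved=4.1)
```

Under these reported rates, the SVM-coupled feature selection removes roughly three quarters of the false positives per patient relative to the Wilks' lambda baseline at equal sensitivity.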
