Optimization of a Computer-Aided Detection Scheme Using a Logistic Regression Model and Information Gain Feature Selection Method

Author(s):  
Zheng
Author(s):  
Jian-Wu Xu ◽  
Kenji Suzuki

One of the major challenges in current Computer-Aided Detection (CADe) of polyps in CT Colonography (CTC) is to improve the specificity without sacrificing the sensitivity. If a large number of False Positive (FP) detections of polyps are produced by the scheme, radiologists might lose their confidence in the use of CADe. In this chapter, the authors used a nonlinear regression model operating on image voxels and a nonlinear classification model with extracted image features based on Support Vector Machines (SVMs). They investigated the feasibility of a Support Vector Regression (SVR) in the massive-training framework, and the authors developed a Massive-Training SVR (MTSVR) in order to reduce the long training time associated with the Massive-Training Artificial Neural Network (MTANN) for reduction of FPs in CADe of polyps in CTC. In addition, the authors proposed a feature selection method directly coupled with an SVM classifier to maximize the CADe system performance. They compared the proposed feature selection method with the conventional stepwise feature selection based on Wilks’ lambda with a linear discriminant analysis classifier. The FP reduction system based on the proposed feature selection method was able to achieve a 96.0% by-polyp sensitivity with an FP rate of 4.1 per patient. The performance is better than that of the stepwise feature selection based on Wilks’ lambda (which yielded the same sensitivity with 18.0 FPs/patient). To test the performance of the proposed MTSVR, the authors compared it with the original MTANN in the distinction between actual polyps and various types of FPs in terms of the training time reduction and FP reduction performance. The authors’ CTC database consisted of 240 CTC datasets obtained from 120 patients in the supine and prone positions. With MTSVR, they reduced the training time by a factor of 190, while achieving a performance (by-polyp sensitivity of 94.7% with 2.5 FPs/patient) comparable to that of the original MTANN (which has the same sensitivity with 2.6 FPs/patient).


2014 ◽  
Author(s):  
Songyuan Wang ◽  
Guopeng Zhang ◽  
Qimei Liao ◽  
Junying Zhang ◽  
Chun Jiao ◽  
...  

2010 ◽  
Vol 9 ◽  
pp. CIN.S3794 ◽  
Author(s):  
Xiaosheng Wang ◽  
Osamu Gotoh

Gene selection is of vital importance in molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is extremely crucial. We investigated the properties of one feature selection approach proposed in our previous work, which was the generalization of the feature selection method based on the depended degree of attribute in rough sets. We compared the feature selection method with the established methods: the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty, and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended degree of attribute based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can exhibit the inherent classification difficulty with respect to different gene expression datasets, indicating the inherent biology of specific cancers.


Author(s):  
GULDEN UCHYIGIT ◽  
KEITH CLARK

Text classification is the problem of classifying a set of documents into a pre-defined set of classes. A major problem with text classification problems is the high dimensionality of the feature space. Only a small subset of these words are feature words which can be used in determining a document's class, while the rest adds noise and can make the results unreliable and significantly increase computational time. A common approach in dealing with this problem is feature selection where the number of words in the feature space are significantly reduced. In this paper we present the experiments of a comparative study of feature selection methods used for text classification. Ten feature selection methods were evaluated in this study including the new feature selection method, called the GU metric. The other feature selection methods evaluated in this study are: Chi-Squared (χ2) statistic, NGL coefficient, GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, Fisher Criterion, BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best F1 and F2 scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Bayesian Probabilistic Classifier.


2014 ◽  
Vol 618 ◽  
pp. 573-577 ◽  
Author(s):  
Yu Qiang Qin ◽  
Yu Dong Qi ◽  
Hui Ying

The assessment of risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit rating for determining likelihood to default based on consumer application and credit reference agency data. We test support vector machines (SVM) against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.


Author(s):  
Jianwu Xu ◽  
Amin Zarshenas ◽  
Yisong Chen ◽  
Kenji Suzuki

A major challenge in the latest computer-aided detection (CADe) of polyps in CT colonography (CTC) is to improve the false positive (FP) rate while maintaining detection sensitivity. Radiologists prefer CADe system produce small number of false positive detections, otherwise they might not consider CADe system improve their workflow. Towards this end, in this study, we applied a nonlinear regression model operating on CTC image voxels directly and a nonlinear classification model with extracted image features based on support vector machines (SVMs) in order to improve the specificity of CADe of polyps. We investigated the feasibility of a support vector regression (SVR) in the massive-training framework, and we developed a massive-training SVR (MTSVR) in order to reduce the long training time associated with the massive-training artificial neural network (MTANN) for reduction of FPs in CADe of polyps in CTC. In addition, we proposed a feature selection method directly coupled with an SVM classifier to maximize the CADe system performance. We compared the proposed feature selection method with the conventional stepwise feature selection based on Wilks' lambda with a linear discriminant analysis classifier. The FP reduction system based on the proposed feature selection method was able to achieve a 96.0% by-polyp sensitivity with an FP rate of 4.1 per patient. The performance is better than that of the stepwise feature selection based on Wilks' lambda (which yielded the same sensitivity with 18.0 FPs/patient). To test the performance of the proposed MTSVR, we compared it with the original MTANN in the distinction between actual polyps and various types of FPs in terms of the training time reduction and FP reduction performance. The CTC database used in this study consisted of 240 CTC datasets obtained from 120 patients in the supine and prone positions. With MTSVR, we reduced the training time by a factor of 190, while achieving a performance (by-polyp sensitivity of 94.7% with 2.5 FPs/patient) comparable to that of the original MTANN (which has the same sensitivity with 2.6 FPs/patient).


Sign in / Sign up

Export Citation Format

Share Document