candidate feature Latest Research Papers

Crossing compression of a retinal artery and vein is closely related to retinal vein occlusion, so detecting the contraction angle of the crossed vein can assist in diagnosing retinal vein occlusion. Binary edge images are extracted through preprocessing steps such as filtering, enhancement and edge extraction. Candidate feature points are then obtained with a corner detection method based on chord-to-point distance accumulation (CPDA). A self-adaptive rectangular filter is used to screen the candidate points and identify the crossing point, so that the edge curves can be fitted and the contraction degree of the vein calculated. The experimental results show that the algorithm can effectively detect the contraction degree of the crossed vein, with an average error remaining within ±1° under different resolutions.
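
The idea behind chord-to-point distance accumulation can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single chord span on an open curve, whereas published CPDA detectors typically combine several chord lengths and normalize the result. The function names and the toy L-shaped curve are assumptions made for illustration.

```python
import math

def point_to_chord_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    chord_len = math.hypot(bx - ax, by - ay)
    if chord_len == 0.0:
        return math.hypot(px - ax, py - ay)
    # |cross product| divided by the chord length
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / chord_len

def cpda_scores(curve, chord_span):
    """For every curve point, accumulate its distance to each chord spanning it."""
    n = len(curve)
    scores = [0.0] * n
    for k in range(n):
        # A chord joins indices j and j + chord_span with j < k < j + chord_span.
        for j in range(k - chord_span + 1, k):
            if j < 0 or j + chord_span >= n:
                continue  # chord would leave the (open) curve
            scores[k] += point_to_chord_distance(
                curve[k], curve[j], curve[j + chord_span]
            )
    return scores

# Toy edge curve (assumed data): a horizontal run, then a vertical run.
curve = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
scores = cpda_scores(curve, chord_span=4)
corner_index = max(range(len(scores)), key=scores.__getitem__)
print(corner_index, curve[corner_index])
```

On this toy curve the accumulated distance peaks at the bend, which is why high-scoring points serve as candidate feature points before any further screening.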

A novel hybrid feature selection and modified KNN prediction model for coal and gas outbursts

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200937 ◽

2020 ◽

Vol 39 (5) ◽

pp. 7671-7691

Author(s):

Xuning Liu ◽

Guoying Zhang ◽

Zixian Zhang

Keyword(s):

Feature Selection ◽

Prediction Models ◽

Gradient Boosting ◽

Feature Subset ◽

Maximum Information ◽

Coal And Gas Outbursts ◽

Optimal Feature Subset ◽

Prediction Problems ◽

Optimal Feature ◽

Candidate Feature

Selecting features among the factors that influence coal and gas outbursts is of great significance for presenting the most discriminative features and improving the prediction performance of a classifier. This paper presents an effective hybrid feature selection and modified outburst classifier framework that aims to solve existing coal and gas outburst prediction problems. First, a measure based on the maximum information coefficient (MIC) is employed to identify a wide range of correlations between two variables. Second, based on a ranking procedure using the non-dominated sorting genetic algorithm (NSGA-II), the maximum relevance minimum redundancy (MRMR) algorithm is performed to find a candidate feature set whose features are highly related to the class label and uncorrelated with each other. Third, random forest (RF) is employed to search the candidate feature set for the optimal feature subset that influences the classification performance for coal and gas outbursts. Finally, an improved classifier model is proposed that combines a gradient boosting decision tree (GBDT) and k-nearest neighbor (KNN) for outburst prediction. In the modified classifier model, the GBDT assigns different weights to the features, and the weighted features are then input into the KNN to verify the effectiveness of the proposed method on a coal and gas outbursts dataset. The experimental results indicate that the proposed scheme is effective in terms of both the number of features and prediction accuracy when compared with other related state-of-the-art prediction models based on feature selection for coal and gas outbursts.
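
The last step, feeding weighted features into KNN, can be sketched as follows. This is a hedged illustration only: the weights are hard-coded placeholders standing in for importances that a GBDT would supply, and the data, labels and function name are invented for the example rather than taken from the paper.

```python
import math
from collections import Counter

def weighted_knn_predict(train_x, train_y, query, weights, k=3):
    """Majority vote among the k nearest neighbours after scaling each
    feature by its weight (weights are assumed to be given)."""
    def distance(a, b):
        return math.sqrt(
            sum((w * (ai - bi)) ** 2 for w, ai, bi in zip(weights, a, b))
        )
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Assumed toy data: feature 0 separates the classes, feature 1 is noise.
train_x = [(1.0, 9.0), (1.2, 1.0), (0.9, 5.0), (3.0, 1.5), (3.2, 8.5), (2.9, 5.5)]
train_y = ["safe", "safe", "safe", "outburst", "outburst", "outburst"]
query = (1.1, 8.8)

with_weights = weighted_knn_predict(train_x, train_y, query, weights=(0.9, 0.1))
without_weights = weighted_knn_predict(train_x, train_y, query, weights=(1.0, 1.0))
print(with_weights, without_weights)
```

With the noisy feature down-weighted the query is assigned to "safe"; with equal weights the noise dominates the distance and the vote flips to "outburst", which is the effect feature weighting is meant to counter.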

Candidate Feature Extraction and Categorization for Unstructured Text Document

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit20639 ◽

2020 ◽

pp. 81-87

Author(s):

Prajakta P Shelke ◽

Aditya A Pardeshi

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Extraction Process ◽

Grammatical Structure ◽

Part Of Speech ◽

Text Document ◽

Unstructured Text ◽

Input Sentence ◽

Crucial Information ◽

Candidate Feature

The words in a phrase contain crucial information that helps the feature extraction process. Established techniques have serious limitations in feature extraction and ignore the grammatical structure of phrases, so poor features get extracted. To overcome this problem, a system is proposed that generates a parse tree for the input sentence and then cuts it down into sub-trees. The branches of the tree are extracted using part-of-speech (POS) labelling to obtain candidate phrases. Filtering is recommended to avoid redundant phrases. Finally, machine learning is used for feature categorization. The results illustrate the effectiveness of the approach.
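
A simplified sketch of POS-based candidate phrase extraction with redundancy filtering is shown below. It is a stand-in, not the proposed system: it works on a flat sequence of already tagged tokens instead of parse sub-trees, it assumes Penn Treebank style tags, and the function name and sample sentence are hypothetical.

```python
def candidate_phrases(tagged_tokens):
    """Collect adjective/noun runs that end in a noun, dropping repeats."""
    phrases, current = [], []

    def flush():
        # Trim trailing adjectives so that every phrase ends on a noun.
        while current and not current[-1][1].startswith("NN"):
            current.pop()
        if current:
            phrase = " ".join(word.lower() for word, _ in current)
            if phrase not in phrases:  # filter redundant phrases
                phrases.append(phrase)
        current.clear()

    for word, tag in tagged_tokens:
        if tag.startswith("JJ") or tag.startswith("NN"):
            current.append((word, tag))
        else:
            flush()
    flush()
    return phrases

# Assumed input: tokens already labelled by some POS tagger.
tagged = [
    ("The", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("great", "JJ"), ("but", "CC"), ("the", "DT"), ("small", "JJ"),
    ("screen", "NN"), ("hurts", "VBZ"), ("battery", "NN"), ("life", "NN"),
]
print(candidate_phrases(tagged))
```

The repeated "battery life" is kept once, and the lone adjective "great" is discarded because it never reaches a noun.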

Review on Candidate Feature Extraction and Categorization for Unstructured Text Document

2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC) ◽

10.1109/iccmc48092.2020.iccmc-00017 ◽

2020 ◽

Author(s):

Prajakta P Shelke ◽

Aditya A Pardeshi

Keyword(s):

Feature Extraction ◽

Text Document ◽

Unstructured Text ◽

Candidate Feature

PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts

Genes ◽

10.3390/genes10090672 ◽

2019 ◽

Vol 10 (9) ◽

pp. 672 ◽

Cited By ~ 2

Author(s):

Shuai Liu ◽

Xiaohan Zhao ◽

Guangyan Zhang ◽

Weiyang Li ◽

Feng Liu ◽

...

Keyword(s):

Ensemble Learning ◽

High Throughput Sequencing ◽

Learning Method ◽

Sequencing Technology ◽

Base Pairs ◽

Feature List ◽

Feature Based ◽

Novel Transcripts ◽

Non Coding Rnas ◽

Candidate Feature

Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bp) that do not encode proteins; nevertheless, they have many vital biological functions. A large number of novel transcripts have been discovered as a result of the development of high-throughput sequencing technology. Under these circumstances, computational methods for lncRNA prediction are in great demand. In this paper, we consider global sequence features and propose a stacked ensemble learning-based method to predict lncRNAs from transcripts, abbreviated as PredLnc-GFStack. We extract the critical features from the candidate feature list using the genetic algorithm (GA) and then employ the stacked ensemble learning method to construct the PredLnc-GFStack model. Computational experimental results show that PredLnc-GFStack outperforms several state-of-the-art methods for lncRNA prediction. Furthermore, PredLnc-GFStack demonstrates an outstanding ability for cross-species ncRNA prediction.

A Comprehensive Fault Diagnosis Method for Rolling Bearings Based on Refined Composite Multiscale Dispersion Entropy and Fast Ensemble Empirical Mode Decomposition

Entropy ◽

10.3390/e21070680 ◽

2019 ◽

Vol 21 (7) ◽

pp. 680 ◽

Cited By ~ 4

Author(s):

Zhang ◽

Zhou

Keyword(s):

Fault Diagnosis ◽

Fault Detection ◽

Empirical Mode Decomposition ◽

Ensemble Empirical Mode Decomposition ◽

Fault Classification ◽

Rolling Bearings ◽

Feature Vectors ◽

Mode Decomposition ◽

Diagnosis Method ◽

Candidate Feature

This study presents a comprehensive fault diagnosis method for rolling bearings. The method includes two parts: fault detection and fault classification. In the fault detection stage, a threshold based on refined composite multiscale dispersion entropy (RCMDE) at a local maximum scale is defined to judge the health state of rolling bearings. If the bearing is faulty, a generalized multi-scale feature extraction method is developed to fully extract fault information by combining fast ensemble empirical mode decomposition (FEEMD) and RCMDE. First, the fault vibration signals are decomposed into a set of intrinsic mode functions (IMFs) by FEEMD. Second, the RCMDE values of multiple IMFs are calculated to generate a candidate feature pool. Then, the maximum-relevance and minimum-redundancy (mRMR) approach is employed to select the sensitive features from the candidate feature pool to construct the final feature vectors, which are fed into a random forest (RF) classifier to identify different fault working conditions. Finally, experiments and comparative research are carried out to verify the performance of the proposed method. The results show that the proposed method can detect faults effectively. Moreover, it identifies different fault types and severities more robustly than other conventional approaches.
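
The mRMR selection step can be sketched as a greedy loop. The sketch makes a loud assumption: absolute Pearson correlation stands in for both relevance and redundancy, whereas mRMR is usually formulated with mutual information. The feature names and values are invented for illustration and are not RCMDE values from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def mrmr_select(features, target, n_select):
    """Greedily pick features with high relevance to the target and low
    average redundancy with the features already selected."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < n_select:
        def score(name):
            relevance = abs(pearson(features[name], target))
            if not selected:
                return relevance
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Assumed toy pool: f1_near_copy duplicates f1, f2 carries different information.
target = [0, 0, 0, 1, 1, 1]
pool = {
    "f1": [1, 2, 3, 7, 8, 9],
    "f1_near_copy": [1, 2, 4, 6, 8, 9],
    "f2": [3, 2, 1, 4, 3, 2],
}
selected = mrmr_select(pool, target, n_select=2)
print(selected)
```

Although `f1_near_copy` is more relevant to the target than `f2`, its redundancy with `f1` pushes its score below that of `f2`, so the second pick is `f2`.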

Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques

International Journal of Web Portals ◽

10.4018/ijwp.2019070103 ◽

2019 ◽

Vol 11 (2) ◽

pp. 41-52 ◽

Cited By ~ 4

Author(s):

Karthikeyan T. ◽

Karthik Sekaran ◽

Ranjith D. ◽

Vinoth Kumar V. ◽

Balajee J M

Keyword(s):

Text Classification ◽

Machine Learning Algorithms ◽

Recursive Feature Elimination ◽

Feature Subset ◽

Data Set ◽

Web Documents ◽

Web Scraping ◽

Proposed Model ◽

Structured Representation ◽

Candidate Feature

Web scraping is a technique for automatically extracting information from web documents. It retrieves related content based on a query, then aggregates and transforms the data from an unstructured format into a structured representation. Text classification is a vital phase for summarizing the data and categorizing the webpages adequately. In this article, the data is first extracted from websites using effective web scraping methodologies and then transformed into a structured form. Based on the keywords from the data, the documents are classified and labeled. A recursive feature elimination technique is applied to the data to select the best candidate feature subset. Standard machine learning algorithms are then trained on the final data set. The proposed model classifies the documents from the extracted data well, with a better accuracy rate.

Social Marketplace Monitoring and Sentiment Analysis

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952136 ◽

2019 ◽

pp. 127-133

Author(s):

P. Monisha ◽

R. Rubanya ◽

N. Malarvizhi

Keyword(s):

Natural Language ◽

Language Processing ◽

Natural Language Text ◽

Domain Specific ◽

Keyboard Input ◽

On Line ◽

The Given ◽

Domain Independent ◽

Language Text ◽

Candidate Feature

The overwhelming majority of existing approaches to opinion feature extraction rely on mining patterns from a single review corpus, ignoring the nontrivial disparities in word distribution characteristics of opinion features across different corpora. This research proposes a novel technique to identify opinion features from online reviews by exploiting the difference in opinion feature statistics across two corpora: one domain-specific corpus (i.e., the given review corpus) and one domain-independent corpus (i.e., the contrasting corpus). We capture this disparity with a measure called domain relevance (DR), which characterizes the relevance of a term to a text collection. We first extract a list of candidate opinion features from the domain review corpus by defining a set of syntactic dependence rules. For each extracted candidate feature, we then estimate its intrinsic-domain relevance (IDR) and extrinsic-domain relevance (EDR) scores on the domain-dependent and domain-independent corpora, respectively. Natural language processing (NLP) refers to computer systems that analyze, attempt to understand, or produce one or more human languages, such as English, Japanese, Italian, or Russian. Such systems process information contained in natural language text. The input might be text, spoken language, or keyboard input. The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages, and secondarily with helping us come to a better understanding of human language.

On the Optimal Size of Candidate Feature Set in Random forest

Applied Sciences ◽

10.3390/app9050898 ◽

2019 ◽

Vol 9 (5) ◽

pp. 898 ◽

Cited By ~ 3

Author(s):

Sunwoo Han ◽

Hyunjoong Kim

Keyword(s):

Random Forest ◽

Specific Pattern ◽

Search Method ◽

Optimal Size ◽

Grid Search ◽

Random Subset ◽

Typical Size ◽

Grid Search Method ◽

Candidate Feature ◽

Novel Algorithm

Random forest is an ensemble method that combines many decision trees. Each level of a tree is determined by the optimal rule found within a candidate feature set. The candidate feature set is a random subset of all features and is different at each level of the trees. In this article, we investigated whether the accuracy of random forest is affected by the size of the candidate feature set. We found that the optimal size differs from data set to data set without any specific pattern. To estimate the optimal size of the feature set, we proposed a novel algorithm that uses the out-of-bag error and the ‘SearchSize’ exploration. The proposed method is significantly faster than the standard grid search method while giving almost the same accuracy. Finally, we demonstrated that the accuracy of random forest using the proposed algorithm increased significantly compared with using a typical size of feature set.
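
How a candidate feature set is drawn can be shown with a short sketch. It covers only the random subset itself, not the 'SearchSize' exploration, whose details the abstract does not give. The square-root rule used for the size is a widely used default for classification and is assumed here as the "typical size"; the function name and the feature count are illustrative.

```python
import math
import random

def candidate_feature_set(n_features, size, rng):
    """Draw the random subset of feature indices considered for one split."""
    return sorted(rng.sample(range(n_features), size))

rng = random.Random(0)  # fixed seed so repeated runs draw the same subsets
n_features = 16
# Assumed "typical" size: the common square-root default for classification.
typical_size = max(1, round(math.sqrt(n_features)))

# A fresh subset is drawn each time a split rule has to be chosen.
subsets = [candidate_feature_set(n_features, typical_size, rng) for _ in range(3)]
for subset in subsets:
    print(subset)
```

Changing `typical_size` is the knob the article studies: a larger value lets each split see more features, while a smaller value increases the diversity among trees.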

Feature subset evaluation method for upper limb rehabilitation training based on joint feature discernibility

International Journal of Distributed Sensor Networks ◽

10.1177/1550147719838467 ◽

2019 ◽

Vol 15 (3) ◽

pp. 155014771983846

Author(s):

Guoyu Zuo ◽

Zhaokun Xu ◽

Jiahao Lu ◽

Daoxiong Gong

Keyword(s):

Support Vector Machine ◽

Upper Limb ◽

Evaluation Method ◽

Support Vector ◽

Feature Subset ◽

Fisher Score ◽

Upper Limb Rehabilitation ◽

Rehabilitation Training ◽

Limb Rehabilitation ◽

Candidate Feature

A hybrid method for evaluating feature subset discernibility, using a Fisher score based on joint features together with a support vector machine, is proposed for the feature selection problem in upper limb rehabilitation training motions of Brunnstrom stage 4–5 patients. In this method, the joint feature is introduced to evaluate the discernibility between classes arising from the joint effect of candidate and selected features. A feature subset search strategy is used to search a set of candidate feature subsets. The Fisher score based on the joint feature is used to evaluate the candidate feature subsets, and the best subset is chosen as the new selected feature subset. From the selected subsets obtained by this process, the subset with the best support vector machine classification performance is finally chosen as the optimal feature subset. Experiments were carried out on routine upper limb rehabilitation training samples from Brunnstrom stage 4–5 patients. Compared with both the F-score and the discernibility of feature subset methods, the experimental results show the effectiveness and feasibility of the proposed method, which obtains feature subsets with higher accuracy and smaller feature dimension.
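
For orientation, the classical single-feature Fisher score can be computed as below. This is the standard textbook form (between-class scatter divided by within-class scatter), not the joint-feature variant proposed in the paper, and the sample values are invented for illustration.

```python
from collections import defaultdict

def fisher_score(values, labels):
    """Between-class scatter over within-class scatter for one feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    overall_mean = sum(values) / len(values)
    between = within = 0.0
    for members in groups.values():
        class_mean = sum(members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        # Sum of squared deviations equals n_k times the class variance.
        within += sum((v - class_mean) ** 2 for v in members)
    return between / within if within else float("inf")

# Assumed toy data: two classes, one discriminative and one weak feature.
labels = [0, 0, 0, 1, 1, 1]
good = [1, 2, 3, 7, 8, 9]
poor = [1, 5, 9, 2, 6, 10]
print(fisher_score(good, labels), fisher_score(poor, labels))
```

A feature whose class means are far apart relative to the spread within each class scores high, so `good` ranks well above `poor` here.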

candidate feature
Recently Published Documents

Angle detection system for crossing compression of retinal artery and vein

A novel hybrid feature selection and modified KNN prediction model for coal and gas outbursts

Candidate Feature Extraction and Categorization for Unstructured Text Document

Review on Candidate Feature Extraction and Categorization for Unstructured Text Document

PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts

A Comprehensive Fault Diagnosis Method for Rolling Bearings Based on Refined Composite Multiscale Dispersion Entropy and Fast Ensemble Empirical Mode Decomposition

Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques

Social Marketplace Monitoring and Sentiment Analysis

On the Optimal Size of Candidate Feature Set in Random forest

Feature subset evaluation method for upper limb rehabilitation training based on joint feature discernibility
