candidate feature
Recently Published Documents



Author(s):  
Chunping Liao ◽  
Xuefang Chen ◽  
Bifu Li ◽  
Xiaofang Zhao ◽  
Li Yu

The crossing compression of the retinal artery and vein is closely related to retinal vein occlusion, so detecting the contraction angle of the crossed vein can assist in diagnosing retinal vein occlusion diseases. Binary edge images are extracted through preprocessing steps such as filtering, enhancement and edge extraction. Candidate feature points are then obtained with a corner detection method based on chord-to-point distance accumulation (CPDA). A self-adaptive rectangular filter screens the candidate points for the crossing point, after which the edge curves are fitted and the contraction degree of the vein is calculated. The experimental results show that the algorithm can detect the contraction degree of the crossed vein with an average error within ±1° at different resolutions.
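The chord-to-point distance accumulation idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a single chord length and an open curve given as a list of points, and it omits the smoothing, multiple chord lengths and normalization that a full CPDA detector would use. The function name `cpda_scores` and the toy curve are hypothetical.

```python
import math

def cpda_scores(curve, chord_len):
    """For each point, accumulate its perpendicular distance to every
    chord of `chord_len` steps that spans it (single-scale sketch)."""
    n = len(curve)
    scores = [0.0] * n
    # Only points that can be spanned by a full set of chords are scored.
    for k in range(chord_len - 1, n - chord_len + 1):
        px, py = curve[k]
        total = 0.0
        for j in range(k - chord_len + 1, k):
            (ax, ay), (bx, by) = curve[j], curve[j + chord_len]
            # Perpendicular distance from the point to the chord's line.
            cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
            total += abs(cross) / math.hypot(bx - ax, by - ay)
        scores[k] = total
    return scores

# Hypothetical L-shaped edge: horizontal run, then vertical run (corner at index 5).
curve = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
scores = cpda_scores(curve, chord_len=3)
corner_index = scores.index(max(scores))
```

Points on straight runs accumulate no distance, so the largest score marks the candidate corner.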


2020 ◽  
Vol 39 (5) ◽  
pp. 7671-7691
Author(s):  
Xuning Liu ◽  
Guoying Zhang ◽  
Zixian Zhang

Selecting features among the factors that influence coal and gas outbursts is important for identifying the most discriminative features and improving the prediction performance of a classifier. This paper presents a hybrid feature selection and modified outburst classifier framework aimed at existing coal and gas outburst prediction problems. First, a measure based on the maximum information coefficient (MIC) is used to identify wide-ranging correlations between two variables. Second, based on a ranking procedure using the non-dominated sorting genetic algorithm (NSGA-II), the maximum relevance minimum redundancy (MRMR) algorithm is applied to find a candidate feature set that is highly related to the class label and whose members are uncorrelated with each other. Third, random forest (RF) is used to search the candidate feature set for the optimal feature subset, that is, the subset that influences the classification performance for coal and gas outbursts. Finally, an improved classifier model is proposed that combines a gradient boosting decision tree (GBDT) and k-nearest neighbor (KNN) for outburst prediction. In the modified classifier model, the GBDT assigns different weights to the features, and the weighted features are then input into the KNN to verify the effectiveness of the proposed method on a coal and gas outbursts dataset. The experimental results indicate that the proposed scheme is effective in terms of the number of features and prediction accuracy when compared with other related state-of-the-art prediction models based on feature selection for coal and gas outbursts.
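The final stage, feeding weighted features into KNN, can be illustrated with a feature-weighted distance. This is only a sketch under stated assumptions: the weights below are hand-picked placeholders standing in for importances that a boosted-tree model would supply, and the data and function name are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

def weighted_knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k nearest training rows under a
    feature-weighted Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    ranked = sorted(zip(train_X, train_y), key=lambda row: dist(row[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: feature 0 is informative, feature 1 is noise.
X = [[0.0, 9.0], [0.1, 1.0], [0.2, 5.0], [1.0, 1.2], [0.9, 8.0], [1.1, 4.0]]
y = [0, 0, 0, 1, 1, 1]
query = [0.15, 1.1]

weights = [0.95, 0.05]  # placeholder importances, not learned here
weighted_label = weighted_knn_predict(X, y, query, weights)
unweighted_label = weighted_knn_predict(X, y, query, [1.0, 1.0])
```

On this toy data the weighted vote follows the informative feature, while the unweighted vote is swayed by the noisy one.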


Author(s):  
Prajakta P Shelke ◽  
Aditya A Pardeshi

The words in a phrase carry crucial information that helps the feature extraction process. Established techniques have significant limitations in feature extraction and ignore the grammatical structure of phrases, so poor features are extracted. To overcome this problem, a system is proposed that generates a parse tree for the input sentence and then cuts it down into sub-trees. The branches of the tree are extracted using part-of-speech (POS) labelling to obtain candidate phrases. Filtering is recommended to avoid redundant phrases. Finally, machine learning is used for the feature categorization process. The results illustrate the effectiveness of the approach.
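As a rough illustration of POS-based candidate phrase extraction with redundancy filtering, the sketch below chunks adjective/noun runs from an already tagged sentence. It is a flat stand-in for the parse-tree approach described above, not the authors' system: it assumes Penn Treebank style tags produced by some external tagger, and the function name and sample sentence are hypothetical.

```python
def candidate_phrases(tagged):
    """Return maximal adjective/noun runs that end in a noun,
    lower-cased and with duplicates filtered out."""
    phrases, run = [], []

    def flush():
        # Trim trailing adjectives so each phrase ends in a noun.
        while run and not run[-1][1].startswith("NN"):
            run.pop()
        if run:
            phrase = " ".join(word.lower() for word, _ in run)
            if phrase not in phrases:  # redundancy filter
                phrases.append(phrase)
        run.clear()

    for word, tag in tagged:
        if tag.startswith("NN") or tag.startswith("JJ"):
            run.append((word, tag))
        else:
            flush()
    flush()
    return phrases

# Hypothetical pre-tagged input.
tagged = [("The", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
          ("great", "JJ"), ("but", "CC"), ("the", "DT"), ("small", "JJ"),
          ("screen", "NN"), ("scratches", "VBZ"), ("and", "CC"),
          ("the", "DT"), ("battery", "NN"), ("life", "NN"), ("drops", "VBZ")]
phrases = candidate_phrases(tagged)
```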


Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 672 ◽  
Author(s):  
Shuai Liu ◽  
Xiaohan Zhao ◽  
Guangyan Zhang ◽  
Weiyang Li ◽  
Feng Liu ◽  
...  

Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bps) that do not encode proteins; nevertheless, lncRNAs have many vital biological functions. A large number of novel transcripts have been discovered as a result of the development of high-throughput sequencing technology. Under these circumstances, computational methods for lncRNA prediction are in great demand. In this paper, we consider global sequence features and propose a stacked ensemble learning-based method to predict lncRNAs from transcripts, abbreviated as PredLnc-GFStack. We extract the critical features from the candidate feature list using the genetic algorithm (GA) and then employ the stacked ensemble learning method to construct the PredLnc-GFStack model. Computational experimental results show that PredLnc-GFStack outperforms several state-of-the-art methods for lncRNA prediction. Furthermore, PredLnc-GFStack demonstrates an outstanding ability for cross-species ncRNA prediction.
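Selecting features from a candidate list with a genetic algorithm can be sketched as a search over boolean masks. The code below is a generic illustration, not the PredLnc-GFStack procedure: the truncation selection, one-point crossover, mutation rate and the toy fitness function are all assumptions chosen for brevity. In practice the fitness would be a cross-validated model score.

```python
import random

def ga_select(n_features, fitness, rng, pop_size=20, generations=30, p_mut=0.1):
    """Evolve boolean feature masks and return the fittest one found."""
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability p_mut.
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = parents + children
    return max(pop, key=fitness)

def toy_fitness(mask):
    # Hypothetical score: features 0 and 3 are useful, every feature has a cost.
    return 2 * (mask[0] + mask[3]) - 0.5 * sum(mask)

best_mask = ga_select(6, toy_fitness, random.Random(0))
```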


Entropy ◽  
2019 ◽  
Vol 21 (7) ◽  
pp. 680 ◽  
Author(s):  
Zhang ◽  
Zhou

This study presents a comprehensive fault diagnosis method for rolling bearings. The method includes two parts: fault detection and fault classification. In the fault detection stage, a threshold based on refined composite multiscale dispersion entropy (RCMDE) at a local maximum scale is defined to judge the health state of rolling bearings. If the bearing is faulty, a generalized multi-scale feature extraction method is used to fully extract fault information by combining fast ensemble empirical mode decomposition (FEEMD) and RCMDE. First, the fault vibration signals are decomposed into a set of intrinsic mode functions (IMFs) by FEEMD. Second, the RCMDE values of multiple IMFs are calculated to generate a candidate feature pool. Then, the maximum-relevance and minimum-redundancy (mRMR) approach is employed to select the sensitive features from the candidate feature pool to construct the final feature vectors, which are fed into a random forest (RF) classifier to identify different fault conditions. Finally, experiments and comparative research are carried out to verify the performance of the proposed method. The results show that the proposed method can detect faults effectively. It also identifies different fault types and severities more robustly than other conventional approaches.
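The mRMR selection step can be sketched as a greedy loop that trades relevance against redundancy. This is a simplified illustration rather than the paper's implementation: absolute Pearson correlation stands in for mutual information, and the feature names and values are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def mrmr_select(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(abs(pearson(features[name], features[s]))
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        remaining = [name for name in features if name not in selected]
        selected.append(max(remaining, key=score))
    return selected

# Hypothetical candidate pool: "imf1_dup" nearly duplicates "imf1".
target = [1, 2, 3, 4, 5, 6]
features = {
    "imf1": [1, 2, 3, 4, 5, 7],
    "imf1_dup": [1, 2, 3, 5, 5, 7],
    "imf3": [2, 1, 2, 1, 3, 2],
}
chosen = mrmr_select(features, target, 2)
```

The near-duplicate is highly relevant on its own but adds little once `imf1` is selected, so the less redundant `imf3` is preferred as the second feature.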


2019 ◽  
Vol 11 (2) ◽  
pp. 41-52 ◽  
Author(s):  
Karthikeyan T. ◽  
Karthik Sekaran ◽  
Ranjith D. ◽  
Vinoth Kumar V. ◽  
Balajee J M

Web scraping is a technique for automatically extracting information from web documents. It retrieves related content based on a query, then aggregates and transforms the data from an unstructured format into a structured representation. Text classification is a vital phase in summarizing the data and categorizing the webpages adequately. In this article, the data is first extracted from websites using web scraping methodologies and then transformed into a structured form. Based on the keywords in the data, the documents are classified and labeled. A recursive feature elimination technique is applied to the data to select the best candidate feature subset. The final dataset is then used to train standard machine learning algorithms. The proposed model performs well in classifying the documents from the extracted data.
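Recursive feature elimination repeatedly fits a model, drops the least important feature and refits on what remains. The sketch below shows that loop with a small perceptron as the model and absolute weight as the importance measure. Both choices, along with the toy data, are assumptions for illustration and not the article's setup.

```python
def train_perceptron(rows, labels, epochs=10):
    """Plain perceptron without bias; labels are -1 or +1."""
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

def rfe(rows, labels, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    active = list(range(len(rows[0])))
    while len(active) > n_keep:
        sub = [[r[i] for i in active] for r in rows]
        w = train_perceptron(sub, labels)  # refit on the surviving features
        weakest = min(range(len(active)), key=lambda j: abs(w[j]))
        del active[weakest]
    return active

# Hypothetical data: feature 0 carries the label, feature 1 is constant,
# feature 2 is weak noise.
rows = [[1.0, 0.0, 0.2], [0.9, 0.0, 0.1], [-1.0, 0.0, 0.1], [-0.8, 0.0, 0.2]]
labels = [1, 1, -1, -1]
kept_two = rfe(rows, labels, 2)
kept_one = rfe(rows, labels, 1)
```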


Author(s):  
P. Monisha ◽  
R. Rubanya ◽  
N. Malarvizhi

The overwhelming majority of existing approaches to opinion feature extraction rely on mining patterns from a single review corpus, ignoring the nontrivial disparities in the word distribution characteristics of opinion features across different corpora. This research proposes a novel technique to identify opinion features from online reviews by exploiting the difference in opinion feature statistics across two corpora: one domain-specific corpus (i.e., the given review corpus) and one domain-independent corpus (i.e., the contrasting corpus). We capture this disparity with a measure called domain relevance (DR), which characterizes the relevance of a term to a text collection. We first extract a list of candidate opinion features from the domain review corpus by defining a set of syntactic dependence rules. For each extracted candidate feature, we then estimate its intrinsic-domain relevance (IDR) and extrinsic-domain relevance (EDR) scores on the domain-dependent and domain-independent corpora, respectively. Natural language processing (NLP) refers to computer systems that analyze, attempt to understand, or produce one or more human languages, such as English, Japanese, Italian, or Russian. Such systems process information contained in natural language text. The input might be text, spoken language, or keyboard input. The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages. It is secondarily concerned with helping us come to a better understanding of human language.


2019 ◽  
Vol 9 (5) ◽  
pp. 898 ◽  
Author(s):  
Sunwoo Han ◽  
Hyunjoong Kim

Random forest is an ensemble method that combines many decision trees. Each level of trees is determined by an optimal rule among a candidate feature set. The candidate feature set is a random subset of all features, and is different at each level of trees. In this article, we investigated whether the accuracy of random forest is affected by the size of the candidate feature set. We found that the optimal size differs from dataset to dataset without any specific pattern. To estimate the optimal size of the feature set, we proposed a novel algorithm which uses the out-of-bag error and the ‘SearchSize’ exploration. The proposed method is significantly faster than the standard grid search method while giving almost the same accuracy. Finally, we demonstrated that the accuracy of random forest using the proposed algorithm increased significantly compared to using a typical feature set size.
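The role of the candidate feature set can be shown with a single split search. The sketch below draws `m` features at random and picks the best threshold among them by Gini impurity. It illustrates only the candidate-subset mechanism, under the common convention of redrawing the subset per split; it does not reproduce the out-of-bag or 'SearchSize' procedure, and the data and names are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels, m, rng):
    """Search only m randomly drawn candidate features for the best split.
    Returns (candidates, (weighted impurity, feature index, threshold))."""
    candidates = rng.sample(range(len(rows[0])), m)
    best = None
    for f in candidates:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or impurity < best[0]:
                best = (impurity, f, t)
    return candidates, best

# Hypothetical data: feature 0 separates the classes at threshold 2.
rows = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 2, 1]]
labels = [0, 0, 1, 1]
all_candidates, full_best = best_split(rows, labels, 3, random.Random(0))
one_candidate, single_best = best_split(rows, labels, 1, random.Random(0))
```

With `m` equal to the number of features the search always finds the separating feature; with a smaller `m` the result depends on which features happen to be drawn, which is why the size of the candidate set matters.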


2019 ◽  
Vol 15 (3) ◽  
pp. 155014771983846
Author(s):  
Guoyu Zuo ◽  
Zhaokun Xu ◽  
Jiahao Lu ◽  
Daoxiong Gong

A hybrid evaluation method for feature subset discernibility, using a Fisher score based on joint features together with a support vector machine, is proposed for the feature selection problem in upper limb rehabilitation training motions of Brunnstrom stage 4–5 patients. In this method, the joint feature is introduced to evaluate the discernibility between classes that results from the joint effect of candidate and selected features. A feature subset search strategy is used to search a set of candidate feature subsets. The joint-feature Fisher score is used to evaluate the candidate feature subsets, and the best subset is chosen as a new selected feature subset. From the selected subsets obtained by this process, the subset with the best support vector machine classification performance is finally chosen as the optimal feature subset. Experiments were carried out on routine upper limb rehabilitation training samples of Brunnstrom stage 4–5 patients. Compared with both the F-score and the feature subset discernibility methods, the experimental results show the effectiveness and feasibility of the proposed method, which obtains feature subsets with higher accuracy and smaller feature dimension.
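For reference, the classic single-feature Fisher score (between-class scatter divided by within-class scatter) can be computed as below. This is the standard per-feature form, shown only as background. It is not the joint-feature variant proposed in the paper, and the sample values are hypothetical.

```python
def fisher_score(values, labels):
    """Between-class scatter over within-class scatter for one feature."""
    n = len(values)
    overall = sum(values) / n
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mean_c = sum(group) / len(group)
        between += len(group) * (mean_c - overall) ** 2
        within += sum((v - mean_c) ** 2 for v in group)
    return between / within if within > 0 else float("inf")

# Hypothetical features measured on two classes of motion samples.
labels = [0, 0, 0, 1, 1, 1]
separating = [1, 2, 3, 7, 8, 9]   # class means far apart
overlapping = [1, 8, 3, 7, 2, 9]  # class means close, large spread
score_separating = fisher_score(separating, labels)
score_overlapping = fisher_score(overlapping, labels)
```

A feature whose class means are far apart relative to its spread within each class receives the higher score.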

