A Classification Model for Drug Addicts Based on Improved Random Forests Algorithm

Author(s):  
Tianyue Chen ◽  
Haiyan Gu
2018 ◽  
Vol 11 (4) ◽  
pp. 86 ◽  
Author(s):  
Lei Xu ◽  
Takuji Kinkyo ◽  
Shigeyuki Hamori

We propose a novel approach that combines random forests and the wavelet transform to predict currency crises. Our random forests classification model, built using both standard predictors and wavelet predictors obtained from the wavelet transform, achieves a demonstrably high level of predictive accuracy. Using variable importance measures, we also find that wavelet predictors are key predictors of crises. In particular, real exchange rate appreciation and overvaluation, measured over a horizon of 16–32 months, are the most important.
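
To make the idea of a scale-specific "wavelet predictor" concrete, the following is a minimal sketch that assumes a plain Haar pyramid decomposition. The abstract does not say which wavelet family or transform variant the authors used, so the Haar filter and the function name `haar_details` are illustrative assumptions only. Under the usual dyadic convention, the level-j detail of a monthly series is associated with fluctuations of roughly 2^j to 2^(j+1) months, so level 4 would correspond to the 16–32 month horizon mentioned above.

```python
import math


def haar_details(series, levels):
    """Haar pyramid decomposition (illustrative assumption, not the paper's method).

    Returns a dict mapping level -> detail coefficients, plus the final
    approximation coefficients.
    """
    if len(series) % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    approx = list(series)
    details = {}
    for level in range(1, levels + 1):
        # Pair up neighbouring approximation coefficients.
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Scaled differences are the details; scaled sums feed the next level.
        details[level] = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details, approx


if __name__ == "__main__":
    # Hypothetical 8-point series, not data from the paper.
    signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    details, approx = haar_details(signal, levels=3)
    for level, coeffs in details.items():
        print(level, coeffs)
```

The detail coefficients at a chosen level could then be summarised (for example by their most recent value) and supplied to a classifier alongside the standard predictors.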


2018 ◽  
Vol 10 (5) ◽  
pp. 1530 ◽  
Author(s):  
Katsuyuki Tanaka ◽  
Takuji Kinkyo ◽  
Shigeyuki Hamori

Entropy ◽  
2019 ◽  
Vol 21 (1) ◽  
pp. 96 ◽  
Author(s):  
Xiaoming Xue ◽  
Chaoshun Li ◽  
Suqun Cao ◽  
Jinchao Sun ◽  
Liyan Liu

This study presents a two-step fault diagnosis scheme for rolling element bearings that combines statistical classification with random forests-based classification. Because feature sensitivity differs between the two diagnosis steps, the proposed method uses permutation entropy and variational mode decomposition to describe vibration signals at a single scale and at multiple scales. In the first step, permutation entropy features are extracted from the original signals at a single scale, and a statistical classification model based on Chebyshev’s inequality is constructed to detect faults and give a preliminary assessment of the bearing condition. In the second step, vibration signals with fault conditions are first decomposed into a collection of intrinsic mode functions by variational mode decomposition, and multiscale permutation entropy features are then extracted from each mono-component to identify the specific fault type. To improve the classification ability of the characteristic data, the out-of-bag estimation of random forests is first employed to reselect and refine the original multiscale permutation entropy features. The refined features are then used as input data to train the random forests-based classification model. Finally, condition data from bearings with different fault conditions are used to evaluate the performance of the proposed method. The results indicate that the proposed method can effectively identify the working conditions and fault types of rolling element bearings.
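
The first diagnosis step can be sketched roughly as follows. The permutation entropy function uses the standard ordinal-pattern definition; the detection rule, however, is an assumption, since the abstract does not give the exact form of the Chebyshev-based model. Here a value is flagged when it lies more than k standard deviations from the mean of healthy-baseline entropies, which by Chebyshev's inequality happens with probability at most 1/k² for any distribution with finite variance. The names `chebyshev_interval` and `is_faulty` are hypothetical.

```python
import math
import statistics
from collections import Counter


def permutation_entropy(signal, order=3, delay=1):
    """Normalised permutation entropy in [0, 1] (standard ordinal-pattern form)."""
    n = len(signal) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    counts = Counter()
    for i in range(n):
        window = [signal[i + j * delay] for j in range(order)]
        # Ordinal pattern: positions sorted by value (ties broken by position).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(math.factorial(order))


def chebyshev_interval(baseline_values, k=3.0):
    """Mean +/- k population standard deviations of healthy-baseline features.

    Assumed detection rule, not necessarily the one used in the paper.
    """
    mu = statistics.mean(baseline_values)
    sigma = statistics.pstdev(baseline_values)
    return mu - k * sigma, mu + k * sigma


def is_faulty(value, interval):
    """Flag a feature value that falls outside the baseline interval."""
    low, high = interval
    return value < low or value > high


if __name__ == "__main__":
    # Hypothetical short series, not bearing data.
    print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2))
```

In practice the baseline values would be permutation entropies computed from vibration segments recorded under known healthy conditions.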


2018 ◽  
Vol 75 (11) ◽  
pp. 1811-1822 ◽  
Author(s):  
Jason Daniels ◽  
Gérald Chaput ◽  
Jonathan Carr

Differentiating detections of a telemetered fish from those of predators that may have consumed that telemetered fish presents problems and opportunities. Previous efforts to classify predation events quantitatively have had to rely on data from unknown states of fish (i.e., unsupervised learning techniques) with the consequence that model performance cannot be refined or compared with alternate models. We circumvent this limitation by analysing acoustic telemetry track data to differentiate movement patterns of tagged striped bass (Morone saxatilis) from those of Atlantic salmon (Salmo salar) smolts, which were known to not have been predated by striped bass over a 3-year period in the Miramichi River estuary. A random forests classification model (i.e., supervised learning technique) was used to differentiate the movement patterns of these two species and the model was applied to Atlantic salmon smolt movement characteristics to provide an index of striped bass predation-derived mortality. The optimized random forests model inferred that predation rates by striped bass were highly variable between years for two smolt stocks, ranging from 1.9% to 17.5%. Spatial and temporal overlap of the two species is a likely factor defining the between-stock and annual variation of predation rate estimates.


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3723 ◽  
Author(s):  
Jacob Thorson ◽  
Ashley Collier-Oxandale ◽  
Michael Hannigan

An array of low-cost sensors was assembled and tested in a chamber environment wherein several pollutant mixtures were generated. The four classes of sources that were simulated were mobile emissions, biomass burning, natural gas emissions, and gasoline vapors. A two-step regression and classification method was developed and applied to the sensor data from this array. We first applied regression models to estimate the concentrations of several compounds and then trained classification models to use those estimates to identify the presence of each of those sources. The regression models that were used included forms of multiple linear regression, random forests, Gaussian process regression, and neural networks. The regression models with human-interpretable outputs were investigated to understand the utility of each sensor signal. The classification models that were trained included logistic regression, random forests, support vector machines, and neural networks. The best combination of models was determined by maximizing the F1 score on ten-fold cross-validation data. The highest F1 score, as calculated on testing data, was 0.72 and was produced by the combination of a multiple linear regression model utilizing the full array of sensors and a random forest classification model.
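
The selection criterion described above (F1 score over ten-fold cross-validation) can be illustrated with a minimal sketch. It assumes binary labels per source and contiguous, unshuffled folds; the abstract does not state how the folds were built or how F1 was aggregated across sources, so both choices, along with the function names, are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # Convention: no true positives means F1 is zero.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Hypothetical labels, not data from the study.
    print(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    for train, test in k_fold_indices(23, k=10):
        print(len(train), len(test))
```

Each candidate regression and classification pair would be fitted on the training indices and scored on the test indices, with the pair giving the highest cross-validated F1 retained.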


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Bingbing Xia ◽  
Huiyan Jiang ◽  
Huiling Liu ◽  
Dehui Yi

This paper proposes a novel voting ranking random forests (VRRF) method for the hepatocellular carcinoma (HCC) image classification problem. First, in the preprocessing stage, bilateral filtering is applied to hematoxylin-eosin (HE) pathological images. Next, the filtered image is segmented to obtain three kinds of images: a single binary cell image, a single minimum exterior rectangle cell image, and a single cell image with a size of n × n. Atypia features are then defined, including auxiliary circularity, amendment circularity, and cell symmetry. In addition, shape features, fractal dimension features, and several gray features are extracted, such as the Local Binary Patterns (LBP) feature, the Gray Level Cooccurrence Matrix (GLCM) feature, and Tamura features. Finally, an HCC image classification model based on random forests is proposed and further optimized by the voting ranking method. The experimental results show that the proposed features combined with the VRRF method perform well on the HCC image classification problem.


Author(s):  
Narongsak Chayangkoon ◽  
Anongnart Srivihok

Methamphetamine addiction is a prominent problem in Southeast Asia. Drug addicts often discuss illegal activities on popular social networking services. These individuals spread messages on social media as a means of both buying and selling drugs online. This paper proposes a model, the “text classification model of methamphetamine tweets in Southeast Asia” (TMTA), to identify whether a tweet from Southeast Asia is related to methamphetamine abuse. The research addresses the weakness of bag of words (BoW) by introducing BoW and Word2Vec feature selection (BWF) techniques. A domain-based feature selection method was performed using the BoW dataset and Word2Vec. The BWF dataset provided a smaller number of features than the BoW and TF–IDF datasets. We experimented with three candidate classifiers: support vector machine (SVM), decision tree (J48) and naive Bayes (NB). We found that the J48 classifier with the BWF dataset provided the best performance for the TMTA in terms of accuracy (0.815), F-measure (0.818), Kappa (0.528), Matthews correlation coefficient (0.529) and area under the ROC curve (0.763). Moreover, TMTA provided the lowest runtime (3.480 seconds) using J48 with the BWF dataset.
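
For readers less familiar with two of the metrics reported above, the following minimal sketch computes Cohen's kappa and the Matthews correlation coefficient from a binary confusion matrix using their standard definitions. The counts in the example are hypothetical and are not taken from the paper, so the output does not reproduce the reported values.

```python
import math


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # Degenerate case: only one class present on both sides.
    return (observed - expected) / (1.0 - expected)


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # Common convention when a marginal total is zero.
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical confusion matrix, for illustration only.
    print(cohen_kappa(tp=40, fp=5, fn=10, tn=45))
    print(matthews_corrcoef(tp=40, fp=5, fn=10, tn=45))
```

Both metrics account for class imbalance, which is one plausible reason to report them alongside plain accuracy.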


2019 ◽  
Vol 26 (1) ◽  
pp. 49-71 ◽  
Author(s):  
Renkui Hou ◽  
Chu-Ren Huang

In this article, we propose an innovative and robust approach to stylometric analysis that requires no annotation and leverages lexical and sub-lexical information. In particular, we propose to leverage the phonological information of tones and rimes in Mandarin Chinese, automatically extracted from unannotated texts. The texts from different authors were represented by tones, tone motifs, and word length motifs as well as rimes and rime motifs. Support vector machines and random forests were used to establish the text classification model for authorship attribution. From the results of the experiments, we conclude that the combination of bigrams of rimes, word-final rimes, and segment-final rimes can discriminate the texts from different authors effectively when using random forests to establish the classification model. This robust approach can in principle be applied to other languages with an established phonological inventory of onsets and rimes.
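
As a rough illustration of the "bigrams of rimes" representation, the sketch below turns a sequence of rimes into relative bigram frequencies that could serve as classifier features. It assumes the rimes have already been extracted from the text; the extraction step, the example rime sequence, and the function name `bigram_frequencies` are hypothetical and not taken from the article.

```python
from collections import Counter


def bigram_frequencies(symbols):
    """Relative frequency of each adjacent pair in a symbol sequence."""
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return {}
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: count / total for pair, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical rime sequence for one text segment.
    rimes = ["ang", "i", "ao", "i", "ao"]
    for pair, freq in bigram_frequencies(rimes).items():
        print(pair, freq)
```

One frequency vector per text, aligned over a shared bigram vocabulary, would then be passed to the random forests or support vector machine classifier.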


1950 ◽  
Vol 15 (4) ◽  
pp. 642-646 ◽  
Author(s):  
Frederick Steigmann ◽  
Samuel Hyman ◽  
Robert Goldbloom
