random forest method
Recently Published Documents


Total documents: 200 (five years: 125)
H-index: 19 (five years: 5)

Author(s):  
Qiang Zhong ◽  
Xiaoming Liu

Students at contemporary higher vocational colleges face serious hidden mental health risks. To address this, an improved random forest method for mental health education is proposed. Based on the psychological characteristics of higher vocational college students, this paper briefly describes their current mental health status. Information entropy is a concept from information theory. As the leading institutions of vocational education, colleges and universities bear full responsibility for teaching cultural psychology. Corresponding effective methods for mental health education in higher vocational colleges are put forward.
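
The abstract mentions entropy without saying how it is used. As a hedged illustration only (not the authors' improved method), the sketch below computes Shannon entropy and the information gain of a candidate split, which is one common criterion for growing the decision trees inside a random forest. The function names and the toy labels are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Toy labels (hypothetical): a split that separates the two classes perfectly.
parent = ["at_risk", "at_risk", "healthy", "healthy"]
gain = information_gain(parent, ["at_risk", "at_risk"], ["healthy", "healthy"])
```

A tree-growing procedure would evaluate this gain for every candidate split and keep the split with the largest value.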


2021 ◽  
Vol 70 (12) ◽  
pp. 876-880
Author(s):  
Nobuo NAGASHIMA ◽  
Masa HAYAKAWA ◽  
Hiroyuki MASUDA ◽
Kotobu NAGAI

2021 ◽  
Author(s):  
Jiatong Liu ◽  
Changbin Pan ◽  
Dongdong Chen ◽  
WeiPing Lin ◽  
Shangyuan Feng ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Lu Liu ◽  
Shushan Zhang ◽  
Xiaofei Yao ◽  
Hongmei Gao ◽  
Zhihua Wang ◽  
...  

Evaluating earthquake-induced liquefaction of sands is important for engineers in seismic design. In this study, the random forest (RF) method is used to evaluate the seismic liquefaction potential of soils based on shear wave velocity. The RF model was developed using the Andrus database as the training dataset, which comprises 225 sets of liquefaction performance and shear wave velocity measurements. Five input parameters are selected for the RF model: seismic magnitude (M_w), peak horizontal ground surface acceleration (a_max), stress-corrected shear wave velocity of soil (V_s1), sandy-layer buried depth (d_s), and a newly introduced parameter, the stress ratio (k). In addition, the optimal hyperparameters of the random forest model, namely the number of classification trees, the maximum depth of trees, and the maximum number of features, are determined by minimizing the error rate on the out-of-bag dataset (ERROOB). The established random forest model was validated using the Kayen database as the testing dataset and compared with the Chinese code and the Andrus methods. The results indicated that the random forest model built on the training dataset was credible. The random forest method gave a success rate higher than 80% both for liquefied sites and for all cases in total, which is acceptable. By contrast, the Chinese code method and the Andrus methods gave a high success rate for liquefied sites but a very low one for nonliquefied sites, which leads to increased engineering cost. The developed RF model can provide a reference for engineers evaluating liquefaction potential.
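
The out-of-bag selection step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' model: it uses bagged depth-1 trees (stumps) instead of full trees, so maximum depth is not tuned, and it runs on synthetic data rather than the Andrus database. All function names, the grid values, and the data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y, feature_ids):
    """Fit a depth-1 tree: the (feature, threshold) split with the fewest errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def oob_error(X, y, n_trees, max_features, seed=0):
    """Error rate when each row is scored only by trees that never saw it."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(bag)
        feats = rng.sample(range(d), max_features)  # random feature subset
        stump = fit_stump([X[i] for i in bag], [y[i] for i in bag], feats)
        for i in range(n):
            if i not in in_bag:  # row i is out-of-bag for this tree
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    if not scored:
        raise ValueError("no out-of-bag samples; use more trees or more rows")
    return sum(pred != true for pred, true in scored) / len(scored)


# Synthetic data (not from the paper): the label depends on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(120)]
y = [int(row[0] > 0.5) for row in X]

# Pick the hyperparameter pair with the lowest out-of-bag error.
grid = [(n_trees, max_features)
        for n_trees in (5, 15, 25) for max_features in (1, 2, 3)]
errors = {params: oob_error(X, y, *params, seed=1) for params in grid}
best_params = min(errors, key=errors.get)
print(best_params, errors[best_params])
```

Because every row is scored only by trees whose bootstrap sample excluded it, the out-of-bag error acts as a built-in validation estimate, so no separate hold-out split is needed for tuning.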


2021 ◽  
Vol 6 (2(62)) ◽  
pp. 15-17
Author(s):  
Eduard Kinshakov ◽  
Yuliia Parfenenko ◽  
Vira Shendryk

The object of research is the process of choosing a method for predicting continuous numerical features on big datasets. The study is important because many subject areas now need to predict performance indicators from data collected from different sources and presented in different formats, which is a big data analysis task. To solve the problem, three statistical analysis methods were considered: multiple linear regression, decision trees, and random forest. A large dataset was built without specifying the subject area, then preprocessed and analyzed to establish the correlation between the features. The dataset was processed with parallel computing using the Dask library for Python. Although working with big data normally requires significant computing resources, this approach does not require powerful hardware. Prediction models were built using multiple linear regression, decision trees, and random forest; the prediction results were visualized and the reliability of the constructed models was analyzed. Based on the calculated prediction errors, the random forest method was found to give the highest prediction accuracy among the considered methods. With this method, the prediction accuracy for a dataset of numerical features was approximately 97%, which indicates high reliability of the constructed model. Thus, it is possible to conclude that the random forest method is suitable for prediction problems on large datasets, can be used for datasets with a large number of features, and is not sensitive to data scaling. The developed Python application can be used to predict numerical features from different subject areas; the prediction results are written to a text file.
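
The claim that the method is not sensitive to data scaling follows from how trees split: a threshold split depends only on the ordering of feature values, which a positive rescaling preserves. The sketch below illustrates this with a single depth-1 regression tree standing in for one tree of a forest. It is a toy under stated assumptions, not the authors' Dask-based application; the function names and data are hypothetical, and the helper assumes at least two distinct feature values.

```python
def fit_regression_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2  # split midway between neighbouring feature values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def predict(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x <= t else right_mean


# Toy data (hypothetical): the target jumps between x = 3 and x = 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 0.9, 1.0, 3.0, 3.2, 2.8]
scaled_xs = [1000.0 * x for x in xs]  # same feature in different units

raw_stump = fit_regression_stump(xs, ys)
scaled_stump = fit_regression_stump(scaled_xs, ys)
raw_pred = [predict(raw_stump, x) for x in xs]
scaled_pred = [predict(scaled_stump, x) for x in scaled_xs]
```

Only the threshold moves with the units; the rows on each side of the split, and therefore the predictions, stay the same. Multiple linear regression, by contrast, changes its coefficients when a feature is rescaled.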


Author(s):  
Meltem Eryılmaz ◽  
Önder Ertan ◽  
Furkan Yalçınkaya ◽  
Ekin Kara

The coronavirus pandemic has been going on since late 2019, and millions of people have died worldwide. Vaccination has recently started in many countries, and countries are seeking new strategies as they still struggle to defeat the virus. This research therefore aims to predict the possible end time of the coronavirus pandemic in Turkey using data mining and statistical studies. Data mining is a field of computer science that processes large amounts of data to produce new, useful information. It is widely used today to support decision making in companies, so this project could support authorities in developing an effective strategy against the ongoing pandemic. In this research we worked with Turkey's coronavirus and vaccination data between 13 January 2021 and 28 May 2021. We used RapidMiner and the Random Forest method for data mining. Across all the simulations we ran and observed, it was clearly seen that the vaccination parameters reduced new cases, while the stringency index did not affect them. In conclusion, we think that the government should vaccinate as many people as it can in order to relax restrictions for the last time.


Pharmaceutics ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2063
Author(s):  
Indu Muthancheri ◽  
Rohit Ramachandran

In this study, a hybrid modeling framework was developed for predicting size distribution and content uniformity of granules in a bi-component wet granulation system with components of differing hydrophobicities. Two bi-component formulations, (1) ibuprofen-USP and micro-crystalline cellulose and (2) micronized acetaminophen and micro-crystalline cellulose, were used in this study. First, a random forest method was used for predicting the probability of nucleation mechanism (immersion and solid spread), depending upon the formulation hydrophobicity. The predicted nucleation mechanism probability is used to determine the aggregation rate as well as the initial particle distribution in the population balance model. The aggregation process was modeled as Type-I: Sticking aggregation and Type-II: Deformation driven aggregation. In Type-I, the capillary force dominant aggregation mechanism is represented by the particles sticking together without deformation. In the case of Type-II, the particle deformation causes an increase in the contact area, representing a viscous force dominant aggregation mechanism. The choice between Type-I and II aggregation is determined based on the difference in nucleation mechanism that is predicted using the random forest method. The model was optimized and validated using the granule content uniformity data and size distribution data obtained from the experimental studies. The proposed framework predicted content non-uniform behavior for formulations that favored immersion nucleation and uniform behavior for formulations that favored solid-spreading nucleation.

