A Novel Differential Essential Genes Prediction Method Based on Random Forests Model

Genes, the nucleotide sequences that encode a polypeptide chain or functional RNA, are the basic genetic units controlling biological traits. They underpin the basic structures and functions of organisms, and they store information related to biological factors and processes such as blood type, gestation, growth, and apoptosis. The environment and genetics jointly affect important physiological processes such as reproduction, cell division, and protein synthesis. Genes are related to a wide range of phenomena including growth, decline, illness, aging, and death. During the evolution of organisms, a class of genes has been conserved across multiple species. These genes are often located on the dominant strand of DNA and tend to have higher expression levels. The proteins they encode usually either perform very important functions or are responsible for maintaining and repairing these essential functions. Such genes are called persistent genes. Among them, those that are irreplaceable for an organism's life activities are the essential genes. For example, when starch is the only source of energy, the genes related to starch digestion are essential genes. Without them, the organism will die because it cannot obtain enough energy to maintain basic functions. The functions of the proteins encoded by these genes are thought to be fundamental to life. Nowadays, DNA can be extracted from blood, saliva, or tissue cells for genetic testing, and detailed genetic information can be obtained using the most advanced scientific instruments and technologies. The information gained from genetic testing is useful for assessing the potential risks of disease and for helping to determine the prognosis and development of diseases. Such information is also useful for developing personalized medication and providing targeted health guidance to improve the quality of life. Therefore, identifying important and essential genes is of great theoretical and practical significance.
In this paper, the research status of essential genes and the essential genome database of bacteria are reviewed, the computational prediction method of essential genes based on communication coding theory is expounded, and the significance and practical application value of essential genes are discussed.

Can the Random Forests Model Improve the Power to Predict the Intention of the Elderly in a Community to Participate in a Cognitive Health Promotion Program?

Iranian Journal of Public Health ◽

10.18502/ijph.v50i2.5346 ◽

2021 ◽

Author(s):

Haewon BYEON

Keyword(s):

Health Promotion ◽

Random Forests ◽

Family Income ◽

Panel Study ◽

The Elderly ◽

Health Promotion Program ◽

Outcome Variable ◽

Cognitive Health ◽

Promotion Program ◽

Random Forests Model

Background: We aimed to develop a model that predicts the participation of the elderly in a cognitive health program using the random forests algorithm, and to present baseline information for enhancing cognitive health. Methods: This study analyzed raw data from the Seoul Welfare Panel Study (SWPS) (20), a survey of Seoul residents conducted by the Seoul Welfare Foundation from Jun 1st to Aug 31st, 2015. Subjects were 2,111 community-dwelling persons (879 men and 1,232 women) aged 60 years and older who had not been diagnosed with dementia. The outcome variable was the intention to participate in a cognitive health promotion program. A prediction model was developed using random forests, and its results were compared with those of a decision tree analysis based on classification and regression tree (CART). Results: The random forests model identified education level, subjective health, subjective friendship, subjective family bond, mean monthly family income, age, smoking, living with a spouse or not, depression history, drinking, and regular exercise as the major variables. On the test data, the accuracy of the random forests model was 72.3% and that of the CART model was 70.9%. Conclusion: To implement a program effectively, it is necessary to develop a customized health promotion program that considers the characteristics of subjects, based on the model developed to predict participation in a cognitive health promotion program.
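
The comparison described in this abstract, a single CART-style tree against a random forest scored on held-out test data, can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the SWPS data), the helper names (`grow_tree`, `grow_forest`, `make_data`) are hypothetical, and the settings (depth 5, 15 trees, 2 features per split) are arbitrary rather than the values used in the study.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in features:
        # Every observed value except the largest is a candidate threshold,
        # so both children are always non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, max_features=None, depth=0, max_depth=5):
    """CART-style tree; max_features=None considers every feature at each node."""
    n_feat = len(rows[0])
    feats = (list(range(n_feat)) if max_features is None
             else rng.sample(range(n_feat), max_features))
    split = best_split(rows, labels, feats) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    sides = ([i for i, r in enumerate(rows) if r[f] <= t],
             [i for i, r in enumerate(rows) if r[f] > t])
    kids = [grow_tree([rows[i] for i in s], [labels[i] for i in s],
                      rng, max_features, depth + 1, max_depth) for s in sides]
    return (f, t, kids[0], kids[1])

def predict_tree(tree, row):
    while isinstance(tree, tuple):  # internal nodes are tuples, leaves are labels
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def grow_forest(rows, labels, rng, n_trees=15, max_features=2):
    """Bagging plus a random feature subset at each split."""
    n, forest = len(rows), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], rng, max_features))
    return forest

def predict_forest(forest, row):
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def make_data(n, rng):
    """Synthetic stand-in: 4 features, 2 informative, 10% label noise."""
    rows, labels = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        y = int(x[0] + x[1] > 1.0)
        labels.append(1 - y if rng.random() < 0.1 else y)
        rows.append(x)
    return rows, labels

rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(200, rng)
cart = grow_tree(X_train, y_train, rng)
forest = grow_forest(X_train, y_train, rng)
cart_acc = accuracy(lambda r: predict_tree(cart, r), X_test, y_test)
rf_acc = accuracy(lambda r: predict_forest(forest, r), X_test, y_test)
print(f"CART test accuracy: {cart_acc:.3f}")
print(f"Random forest test accuracy: {rf_acc:.3f}")
```

Both models are scored on the same held-out rows, mirroring the test-data comparison reported in the abstract. In practice an established library implementation would normally be preferred over a hand-written one.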

Application of Unmanned Aerial Vehicle and Random Forests Model in Alpine Grassland Cover Estimation: A Case Study in the Xiahe County, China

Proceedings of the International Workshop on Environmental Management, Science and Engineering ◽

10.5220/0007556500130019 ◽

2018 ◽

Author(s):

B. P. Meng ◽

T. G. Liang ◽

Q. S. Feng ◽

J. L. Gao ◽

J. Ge ◽

...

Keyword(s):

Unmanned Aerial Vehicle ◽

Random Forests ◽

Alpine Grassland ◽

Aerial Vehicle ◽

Random Forests Model

Random Forests Model of Data Mining Classification Used to Identify Unknown Authorship of Tamil Articles

International Journal of Applied Research on Information Technology and Computing ◽

10.5958/0975-8089.2015.00022.6 ◽

2015 ◽

Vol 6 (3) ◽

pp. 168

Author(s):

R. Lakshmi Priya ◽

G. Manimannan

Keyword(s):

Data Mining ◽

Random Forests ◽

Random Forests Model

Two-layer random forests model for case reuse in case-based reasoning

Expert Systems with Applications ◽

10.1016/j.eswa.2015.08.005 ◽

2015 ◽

Vol 42 (24) ◽

pp. 9412-9425 ◽

Cited By ~ 9

Author(s):

Shisheng Zhong ◽

Xiaolong Xie ◽

Lin Lin

Keyword(s):

Random Forests ◽

Case Based Reasoning ◽

Case Based ◽

Random Forests Model

Super Learner

Statistical Applications in Genetics and Molecular Biology ◽

10.2202/1544-6115.1309 ◽

2007 ◽

Vol 6 (1) ◽

Cited By ~ 386

Author(s):

Mark J. van der Laan ◽

Eric C Polley ◽

Alan E. Hubbard

Keyword(s):

Random Forests ◽

Loss Function ◽

Cross Validation ◽

Prediction Method ◽

Spline Regression ◽

Least Angle Regression ◽

Super Learner ◽

Practical Demonstration ◽

Fold Cross Validation ◽

Weighted Combination

When trying to learn a model for the prediction of an outcome given a set of covariates, a statistician has many estimation procedures in their toolbox. A few examples of these candidate learners are least squares, least angle regression, random forests, and spline regression. Previous articles (van der Laan and Dudoit (2003); van der Laan et al. (2006); Sinisi et al. (2007)) theoretically validated the use of cross-validation to select an optimal learner among many candidate learners. Motivated by this use of cross-validation, we propose a new prediction method for creating a weighted combination of many candidate learners to build the super learner. This article proposes a fast algorithm for constructing a super learner in prediction which uses V-fold cross-validation to select weights to combine an initial set of candidate learners. In addition, this paper contains a practical demonstration of the adaptivity of this so-called super learner to various true data generating distributions. This approach for construction of a super learner generalizes to any parameter which can be defined as a minimizer of a loss function.
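
The core idea of this abstract, using V-fold cross-validation to choose weights that combine candidate learners, can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the authors' algorithm: the three candidate learners (`fit_mean`, `fit_line`, `fit_knn`), the restriction to convex weights found by grid search, the squared-error loss, and the synthetic data are all choices made here for the example.

```python
import math
import random

def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Simple least-squares line through the training points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour average (k=5 is an arbitrary choice)."""
    pairs = list(zip(xs, ys))
    def predict(x):
        near = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in near) / len(near)
    return predict

LEARNERS = [fit_mean, fit_line, fit_knn]

def cv_predictions(xs, ys, v=5):
    """z[i][j]: prediction for observation i by learner j, fit without i's fold."""
    n = len(xs)
    z = [None] * n
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        fits = [fit([xs[i] for i in train], [ys[i] for i in train])
                for fit in LEARNERS]
        for i in range(fold, n, v):  # observations held out in this fold
            z[i] = [f(xs[i]) for f in fits]
    return z

def cv_risk(weights, z, ys):
    """Cross-validated mean squared error of the weighted combination."""
    return sum((y - sum(w * p for w, p in zip(weights, row))) ** 2
               for row, y in zip(z, ys)) / len(ys)

def select_weights(z, ys, steps=20):
    """Grid search over convex weights for exactly three learners."""
    best = None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            w = (a / steps, b / steps, (steps - a - b) / steps)
            risk = cv_risk(w, z, ys)
            if best is None or risk < best[0]:
                best = (risk, w)
    return best[1]

# Synthetic data (an assumption for the demo): a noisy sine curve.
rng = random.Random(0)
xs = [rng.uniform(0, 2) for _ in range(120)]
ys = [math.sin(3 * x) + rng.gauss(0, 0.2) for x in xs]

z = cv_predictions(xs, ys)
weights = select_weights(z, ys)
final_fits = [fit(xs, ys) for fit in LEARNERS]  # refit each learner on all data

def super_learner(x):
    return sum(w * f(x) for w, f in zip(weights, final_fits))

print("weights (mean, line, knn):", weights)
print("prediction at x=1.0:", super_learner(1.0))
```

Because the weight grid includes the three corner points that put all weight on a single learner, the selected combination cannot have a higher cross-validated risk than the best individual candidate on this grid.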

The random forests model of detecting network-based buffer overflow attacks

Information Science and Electronic Engineering ◽

10.1201/9781315265278-97 ◽

2016 ◽

pp. 425-428

Keyword(s):

Random Forests ◽

Buffer Overflow ◽

Random Forests Model

Random Forests Model Based Flood Process Simulation in the Qiushui River Basin

Journal of Water Resources Research ◽

10.12677/jwrr.2018.75051 ◽

2018 ◽

Vol 07 (05) ◽

pp. 456-463

Author(s):

甜甜唐

Keyword(s):

River Basin ◽

Random Forests ◽

Process Simulation ◽

Model Based ◽

Random Forests Model

The Study of Classification Analysis on Oil Exploration Research Trends and Digging Technology Based on Properties of Biochemical Materials (2012-2013)

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.859.280 ◽

2013 ◽

Vol 859 ◽

pp. 280-283

Author(s):

Shiang Hau Wu ◽

Jiann Jong Guo

Keyword(s):

Text Mining ◽

Random Forests ◽

Research Trends ◽

Oil Exploration ◽

Mining Method ◽

Research Papers ◽

Classification Analysis ◽

Relationship Of ◽

The Relationship ◽

Random Forests Model

The study aimed to analyze the keywords in the abstracts of oil exploration research papers from 2012 and 2013, and to use the random forests model for classification analysis in order to find the importance and similarities of the 2012 and 2013 research trends. The study makes two contributions. First, it used the text mining method to explore the content of oil exploration research paper abstracts. Second, it applied the AdaBoost classification analysis to explore the relationship between the keywords of the two years.
