Prognosis of Cancer and Proposition of Therapeutics

Cancer is becoming one of the most common diseases in day-to-day life, yet identifying it at an early stage is still difficult. Identifying environmental and genetic factors is necessary to predict cancer. We developed a cancer prediction system that predicts lung and oral cancer based on symptoms. The gathered data is pre-processed, and data mining algorithms such as decision tree, logistic regression, Random Forest and Support Vector Machines are used to measure performance. Attribute selection algorithms are used to obtain the mandatory attributes. The main aim of this system is to predict the type of cancer and suggest a therapy using the Random Forest algorithm.

Author(s):  
Bhavani M ◽  
Pavithra V ◽  
Monesh R

Cancer is becoming one of the most common diseases in day-to-day life, yet detecting it at an early stage is still problematic. Identifying genetic and environmental factors is necessary to predict the type of cancer. The idea is to develop a cancer prediction system that predicts lung and oral cancer from symptoms. The gathered data is pre-processed, and data mining algorithms such as decision tree, logistic regression, Random Forest and Support Vector Machines are used to measure performance. Attribute selection algorithms are used to obtain the mandatory attributes. The main aim of this study is a comparative analysis of different algorithms for cancer prediction and therapy suggestion.
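The abstract does not say which attribute selection algorithm was used. As one possible illustration, the sketch below ranks categorical symptom attributes by information gain, a common filter-style criterion. The function names, attribute names and records are hypothetical and invented for this example; they are not study data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Reduction in label entropy after splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(rows, labels):
    """Return attribute names sorted from most to least informative."""
    return sorted(rows[0].keys(),
                  key=lambda a: information_gain(rows, labels, a),
                  reverse=True)

# Hypothetical toy records (assumption: symptoms are encoded as yes/no strings).
rows = [
    {"persistent_cough": "yes", "mouth_sores": "no", "smoker": "yes"},
    {"persistent_cough": "yes", "mouth_sores": "no", "smoker": "no"},
    {"persistent_cough": "no", "mouth_sores": "yes", "smoker": "yes"},
    {"persistent_cough": "no", "mouth_sores": "yes", "smoker": "no"},
]
labels = ["lung", "lung", "oral", "oral"]

ranking = rank_attributes(rows, labels)
print(ranking)
```

In this toy data the `smoker` attribute splits both classes evenly, so its gain is zero and it ranks last; a selection step could drop such attributes before training the classifiers.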


2020 ◽  
Vol 44 (4) ◽  
pp. 627-635
Author(s):  
A.M. Belov ◽  
A.Y. Denisova

Earth remote sensing data fusion is intended to produce images of higher quality than the original ones. However, the impact of fusion on further thematic processing remains an open question because fusion methods are mostly used to improve the visual data representation. This article addresses the effect of fusion that increases the spatial and spectral resolution of data on the thematic classification of images using various state-of-the-art classifiers and feature extraction methods. In this paper, we use our own algorithm to perform multi-frame image fusion over optical remote sensing images with different spatial and spectral resolutions. For classification, we applied the support vector machines and Random Forest algorithms. For features, we used spectral channels, extended attribute profiles and local feature attribute profiles. An experimental study was carried out using model images of four imaging systems. The resulting image had a spatial resolution 2, 3, 4 and 5 times better than that of the original images of each imaging system, respectively. Our studies revealed that for the support vector machines method, fusion was inexpedient since excessive spatial details had a negative effect on the classification. For the Random Forest algorithm, the classification results of a fused image were more accurate than for the original low-resolution images in 90% of cases. For example, for images with the smallest difference in spatial resolution (2 times) from the fusion result, the classification accuracy of the fused image was on average 4% higher. In addition, the results obtained for the Random Forest algorithm with fusion were better than the results for the support vector machines method without fusion. Additionally, it was shown that the classification accuracy of a fused image using the Random Forest method could be increased by an average of 9% by using extended attribute profiles as features.
Thus, when using data fusion, it is better to use the Random Forest classifier, whereas using fusion with the support vector machines method is not recommended.


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4523 ◽  
Author(s):  
Carlos Cabo ◽  
Celestino Ordóñez ◽  
Fernando Sánchez-Lasheras ◽  
Javier Roca-Pardiñas ◽  
Javier de Cos-Juez

We analyze the utility of multiscale supervised classification algorithms for object detection and extraction from laser scanning or photogrammetric point clouds. Only the geometric information (the point coordinates) was considered, thus making the method independent of the systems used to collect the data. A maximum of five features (input variables) was used, four of them related to the eigenvalues obtained from a principal component analysis (PCA). PCA was carried out at six scales, defined by the diameter of a sphere around each observation. Four multiclass supervised classification models were tested (linear discriminant analysis, logistic regression, support vector machines, and random forest) in two different scenarios, urban and forest, formed by artificial and natural objects, respectively. The results obtained were accurate (overall accuracy over 80% for the urban dataset, and over 93% for the forest dataset), in the range of the best results found in the literature, regardless of the classification method. For both datasets, the random forest algorithm provided the best solution/results when discrimination capacity, computing time, and the ability to estimate the relative importance of each variable are considered together.
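The abstract does not name the eigenvalue-related features it uses. As a rough sketch of the general idea, the code below computes three commonly used ones (linearity, planarity, sphericity) from the PCA eigenvalues of the points inside a sphere of a given diameter, repeated at several scales. The feature definitions, function names, scale values and toy point sets are assumptions made for illustration only.

```python
import math

def covariance_3d(points):
    """3x3 covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]

def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2 * p1
    p = math.sqrt(p2 / 6)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]

def eigen_features(points, center, diameter):
    """Eigenvalue-based shape features of the points inside a sphere around center."""
    radius = diameter / 2
    neighbours = [p for p in points if math.dist(p, center) <= radius]
    if len(neighbours) < 3:
        return None  # too few points to describe a shape
    l1, l2, l3 = (max(0.0, e) for e in symmetric_eigenvalues(covariance_3d(neighbours)))
    if l1 == 0:
        return None  # all neighbours coincide
    return {"linearity": (l1 - l2) / l1,
            "planarity": (l2 - l3) / l1,
            "sphericity": l3 / l1}

# Hypothetical toy point sets: a diagonal line (pole-like) and a flat grid (ground-like).
line = [(float(t), float(t), 0.0) for t in range(10)]
plane = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]

scales = [4.0, 8.0]  # hypothetical sphere diameters; the study uses six scales
line_features = {d: eigen_features(line, line[5], d) for d in scales}
plane_features = eigen_features(plane, (2.0, 2.0, 0.0), 10.0)
print(line_features)
print(plane_features)
```

Concatenating the per-scale dictionaries for each point would give the kind of multiscale input vector a supervised classifier could be trained on.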


2019 ◽  
Vol 11 (11) ◽  
pp. 3222 ◽  
Author(s):  
Pascal Schirmer ◽  
Iosif Mporas

In this paper we evaluate several well-known and widely used machine learning algorithms for regression in the energy disaggregation task. Specifically, the Non-Intrusive Load Monitoring approach was considered and the K-Nearest-Neighbours, Support Vector Machines, Deep Neural Networks and Random Forest algorithms were evaluated across five datasets using seven different sets of statistical and electrical features. The experimental results demonstrated the importance of selecting both appropriate features and regression algorithms. Analysis on device level showed that linear devices can be disaggregated using statistical features, while for non-linear devices the use of electrical features significantly improves the disaggregation accuracy, as non-linear appliances have non-sinusoidal current draw and thus cannot be well parametrized only by their active power consumption. The best performance in terms of energy disaggregation accuracy was achieved by the Random Forest regression algorithm.
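To make the regression setup concrete, here is a minimal sketch of K-Nearest-Neighbours regression on statistical features of aggregate power windows. It is chosen only because it is the shortest of the evaluated algorithms to write out, not because it performed best. The feature set, function names and all readings are hypothetical and do not come from the paper's datasets.

```python
import math
import statistics

def window_features(window):
    """Statistical features of one window of aggregate power samples (watts)."""
    return (statistics.fmean(window),
            statistics.pstdev(window),
            max(window) - min(window))

def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training rows closest to the query features."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    return statistics.fmean([y for _, y in ranked[:k]])

# Hypothetical training data: aggregate power windows and the power drawn
# by one target appliance during each window.
windows = [
    [100.0, 100.0, 100.0, 100.0],
    [102.0, 98.0, 101.0, 99.0],
    [1100.0, 1105.0, 1095.0, 1100.0],
    [1098.0, 1102.0, 1101.0, 1099.0],
]
appliance_power = [0.0, 0.0, 1000.0, 1000.0]

train_x = [window_features(w) for w in windows]
query = window_features([1101.0, 1099.0, 1100.0, 1100.0])
estimate = knn_predict(train_x, appliance_power, query, k=2)
print(estimate)
```

Swapping `window_features` for a different feature set is the kind of change the paper compares across algorithms and datasets.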


2021 ◽  
Vol 13 (18) ◽  
pp. 3573
Author(s):  
Chunfang Kong ◽  
Yiping Tian ◽  
Xiaogang Ma ◽  
Zhengping Weng ◽  
Zhiting Zhang ◽  
...  

In response to the increasingly frequent occurrence of serious landslide disasters in eastern Guangxi, the current study adopted support vector machines (SVM), particle swarm optimization support vector machines (PSO-SVM), random forest (RF), and particle swarm optimization random forest (PSO-RF) methods to assess landslide susceptibility in Zhaoping County. To this end, 10 landslide disaster-related variables were prepared, including digital elevation model (DEM)-derived, meteorology-derived, Landsat8-derived, geology-derived, and human activity factors. Of the 345 landslide disaster locations found, 70% were used to train the models, and the rest were used for model verification. The four models were run, and landslide susceptibility evaluation maps were produced. Then, receiver operating characteristic (ROC) curves, statistical analysis, and field investigation were used to test and verify the efficiency of these models. Analysis and comparison of the results showed that all four landslide models performed well for landslide susceptibility evaluation, as indicated by area under the curve (AUC) values of the ROC curves ranging from 0.863 to 0.934. Among them, the PSO-RF model had the highest accuracy, followed by the PSO-SVM model, the RF model, and the SVM model. Moreover, the results showed that the PSO algorithm has a beneficial effect on the SVM and RF models. Furthermore, the landslide models developed in the present study are promising methods that could be transferred to other regions for landslide susceptibility evaluation. In addition, the evaluation results can provide suggestions for disaster reduction and prevention in Zhaoping County of eastern Guangxi.
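The evaluation protocol described here (a 70/30 split followed by ROC AUC) can be sketched in a few lines. The snippet below is only an illustration: the function names, the random seed, the truncation rule for the 70% cut, and the toy labels and scores are assumptions, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the curve.

```python
import random

def split_70_30(samples, seed=0):
    """Shuffle a copy of the samples and split it 70% / 30%."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.7 * len(shuffled))  # assumption: truncate the 70% share
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 345 placeholder location ids, matching the number of landslide sites in the study.
locations = list(range(345))
train, test = split_70_30(locations)
print(len(train), len(test))

# Hypothetical susceptibility scores for four verification samples.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.863 to 0.934 values should be read.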


2019 ◽  
Vol 8 (4) ◽  
pp. 4879-4881

Breast cancer is one of the most dreaded diseases and a potential cause of death in women. Every year, the death rate due to breast cancer increases drastically. Data mining classification techniques are an effective way to classify data. They are especially useful in the medical field, where diagnosis and analysis are carried out with these techniques. The Wisconsin Breast Cancer dataset is used to compare SVM, Logistic Regression, Naïve Bayes and Random Forest. The main objective is to determine the efficiency of the algorithms by evaluating how correctly they classify data, based on accuracy and time consumption. Based on the results of the experiments, the Random Forest algorithm shows the highest accuracy (99.76%) with the lowest error rate. The Anaconda Data Science Platform is used to execute all the experiments in a simulated environment.
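The two comparison criteria named here, accuracy and time consumption, can be measured with a small harness like the one below. This is a sketch under stated assumptions: the `evaluate` helper, the two threshold rules standing in for trained classifiers, and the toy samples are all hypothetical, and none of the numbers relate to the Wisconsin dataset.

```python
import time

def evaluate(predict, features, labels):
    """Return accuracy, error rate and prediction time for one classifier."""
    start = time.perf_counter()
    predictions = [predict(x) for x in features]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    return {"accuracy": accuracy, "error_rate": 1 - accuracy, "seconds": elapsed}

# Hypothetical one-feature samples and labels (1 = malignant, 0 = benign).
features = [(1.0,), (2.0,), (3.0,), (4.0,)]
labels = [0, 0, 1, 1]

# Two stand-in "classifiers": simple threshold rules, not trained models.
candidates = {
    "rule_a": lambda x: int(x[0] > 2.5),
    "rule_b": lambda x: int(x[0] > 1.5),
}

results = {name: evaluate(fn, features, labels) for name, fn in candidates.items()}
for name, metrics in results.items():
    print(name, metrics)
```

In a real comparison each entry in `candidates` would wrap a fitted model's prediction function, and training time would be measured the same way.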


Author(s):  
Lin Li ◽  
Yixiang Huang ◽  
Jianfeng Tao ◽  
Chengliang Liu

Monitoring for internal leakage of hydraulic cylinders is vital to maintain the efficiency and safety of hydraulic systems. An intelligent classifier is proposed to automatically evaluate internal leakage levels based on newly extracted features and the random forest algorithm. The inlet and outlet pressures as well as the pressure differences of the two chambers are chosen as the monitoring parameters for leakage identification. The empirical mode decomposition method is used to decompose the raw pressure signals into a series of intrinsic mode functions that capture the essential content of the experimental signals. Then, statistical features extracted from the intrinsic mode functions form the input vector used to train the leakage detector. The classifier based on random forest is established to categorize internal leakage into proper levels. The accuracy of the internal leakage evaluator is verified with the experimental pressure signals. For comparison, an internal leakage evaluator is also established based on the support vector machine algorithm, in which the wavelet transform is applied for feature extraction. The accuracy and efficiency of the different classifiers are compared based on leakage experiments. The results show that the random forest classifier trained on the intrinsic mode function features may identify internal leakage levels of hydraulic cylinders more effectively and accurately. The leakage evaluator offers the possibility of online monitoring of the internal leakage of hydraulic cylinders based on the inherent sensors.
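The step from intrinsic mode functions to a classifier input vector can be sketched as follows. The abstract does not list which statistics were used, so the five chosen here (mean, standard deviation, RMS, peak-to-peak, kurtosis) are an assumption, as are the function names. The decomposition itself is not implemented; the two short sequences merely stand in for intrinsic mode functions that an empirical mode decomposition would produce.

```python
import math

def statistical_features(component):
    """Mean, standard deviation, RMS, peak-to-peak and kurtosis of one component."""
    n = len(component)
    mean = sum(component) / n
    variance = sum((v - mean) ** 2 for v in component) / n
    rms = math.sqrt(sum(v * v for v in component) / n)
    fourth_moment = sum((v - mean) ** 4 for v in component) / n
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else 0.0
    return [mean, math.sqrt(variance), rms, max(component) - min(component), kurtosis]

def feature_vector(imfs):
    """Concatenate the statistical features of every component into one input vector."""
    vector = []
    for imf in imfs:
        vector.extend(statistical_features(imf))
    return vector

# Hypothetical stand-ins for two intrinsic mode functions of a pressure signal.
imfs = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, 0.5, -0.5, -0.5],
]
vector = feature_vector(imfs)
print(vector)
```

One such vector per pressure recording, paired with its known leakage level, is the shape of training data a random forest classifier would consume.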

