Application of Support Vector Machine in Determination of Real Estate Price

2012 ◽  
Vol 461 ◽  
pp. 818-821
Author(s):  
Shi Hu Zhang

The problem of real estate prices is a current focus of public concern. The support vector machine (SVM) is a relatively recent machine learning algorithm whose strong generalization performance, ability to learn from small samples, and other unique advantages have led to its adoption in many areas. Determining real estate prices is a complicated problem owing to its non-linearity and the small quantity of training data. In this study, an SVM is proposed to forecast real estate prices in China. The experimental results indicate that the SVM method can achieve greater accuracy than a grey model or an artificial neural network when training data are limited. It was also found that the predictive ability of the SVM outperformed that of several traditional pattern recognition methods on the data set used here.
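As an illustration of this kind of small-sample price regression, here is a minimal sketch using scikit-learn's SVR in place of the authors' implementation; the features and figures are hypothetical:

```python
# Minimal sketch of SVM-based price regression on a small sample,
# in the spirit of the paper (scikit-learn's SVR stands in for the
# authors' implementation; feature names and data are hypothetical).
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical small training set: [floor area m^2, age, distance to CBD km]
X = np.array([[90, 5, 3.2], [120, 2, 8.0], [60, 20, 1.5],
              [100, 10, 5.0], [75, 15, 2.0], [140, 1, 12.0]])
y = np.array([180, 210, 150, 170, 160, 230])  # price, 10^4 CNY

# An RBF-kernel SVR handles the non-linearity; feature scaling matters for SVMs.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=1.0))
model.fit(X, y)
print(model.predict([[110, 8, 4.0]]))  # forecast for an unseen dwelling
```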

2020 ◽  
Vol 8 (5) ◽  
pp. 1557-1560

Support vector machine (SVM) is a well-known, efficient supervised learning algorithm for classification problems. However, the classification accuracy of an SVM classifier depends on both its training parameters and the training data set. The main objective of this paper is to optimize the SVM's parameters and feature weighting simultaneously in order to improve its strength. The Imperialist Competitive Algorithm based Support Vector Machine (ICA-SVM) classifier is proposed for efficient weed detection. The enhanced ICA-SVM classifier is able to select appropriate input features and to optimize the SVM parameters, thereby improving classification accuracy. Experimental results show that the ICA-SVM classification algorithm reduces computational complexity considerably while improving classification accuracy.
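The joint search over SVM parameters and per-feature weights can be sketched as follows; a plain random search stands in for the Imperialist Competitive Algorithm (which is not part of scikit-learn), and the iris data is a stand-in for weed-image features:

```python
# Sketch of jointly tuning SVM parameters and per-feature weights, as the
# ICA-SVM does; random search is a stand-in for the ICA metaheuristic.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  # stand-in for weed-image features
rng = np.random.default_rng(0)

best_score, best = -np.inf, None
for _ in range(50):  # each draw plays the role of one candidate solution
    C = 10 ** rng.uniform(-2, 3)
    gamma = 10 ** rng.uniform(-3, 1)
    w = rng.uniform(0, 1, X.shape[1])        # feature weights in [0, 1]
    score = cross_val_score(SVC(C=C, gamma=gamma), X * w, y, cv=5).mean()
    if score > best_score:
        best_score, best = score, (C, gamma, w)
print(best_score, best[0], best[1])
```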


2021 ◽  
pp. 52-66
Author(s):  
Huang-Mei He ◽  
Yi Chen ◽  
Jia-Ying Xiao ◽  
Xue-Qing Chen ◽  
Zne-Jung Lee

China has carried out a large number of real estate market reforms that have changed real estate market demand considerably. At the same time, real estate prices have soared in some cities and have surpassed the spending power of many ordinary people. As real estate prices have received widespread attention from society, it is important to understand what factors affect them. We therefore propose a data analysis method for identifying the factors that influence real estate prices. The method first performs data cleaning and conversion on the data. To discretize the real estate price, we use the mean ± one standard deviation (SD), mean ± 0.5 SD, or mean ± 2 SD of the price to divide it into three categories used as the output variable. We then build decision tree and random forest models for six different situations and compare them. The highest testing accuracy is obtained when the data set is divided into training data (70%) and testing data (30%). In addition, by examining the importance of each input variable, we find that the main factors influencing real estate price are cost, interior decoration, location, and status. The results suggest that both the real estate industry and buyers should pay attention to these factors when pricing or purchasing real estate.
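A compact sketch of this pipeline, assuming a hypothetical `real_estate.csv` with a `price` column (the paper's data are not public), could look like this:

```python
# Sketch of the paper's pipeline: discretize price into three classes
# around the mean, then fit a random forest and read off importances.
# The file name and column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("real_estate.csv")          # assumed file with a "price" column
mu, sd = df["price"].mean(), df["price"].std()
k = 1.0                                      # the paper also tries k = 0.5 and k = 2
df["price_class"] = pd.cut(df["price"],
                           bins=[-np.inf, mu - k * sd, mu + k * sd, np.inf],
                           labels=["low", "medium", "high"])

X = df.drop(columns=["price", "price_class"])
X_tr, X_te, y_tr, y_te = train_test_split(X, df["price_class"],
                                          test_size=0.30, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(rf.score(X_te, y_te))                  # testing accuracy on the 30% split
print(sorted(zip(rf.feature_importances_, X.columns), reverse=True)[:4])
```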


2011 ◽  
Vol 148-149 ◽  
pp. 369-373
Author(s):  
Wen Chao Li ◽  
Hong Sen Yan

Job-shop-like knowledgeable manufacturing cell scheduling is an NP-complete problem for which no completely valid algorithm has yet been found. An algorithm with self-learning ability is proposed by adding precedence constraints on operations to a directed-graph representation. A method based on the support vector machine is constructed to accurately choose between interchangeable operations by learning from small samples, so as to obtain better schedules. Classification accuracy can be improved by continuously adding new instances to the sample library. Simulation results show that the algorithm performs well for the job-shop-like knowledgeable manufacturing cell.
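A toy sketch of the core idea, with purely illustrative features and labels, might look like this: an SVM trained on a small, growing sample library decides which of two interchangeable operations to schedule first.

```python
# Toy sketch: an SVM chooses between interchangeable operations; the
# sample library grows as simulation reveals better choices.
# Features and labels here are illustrative only.
from sklearn.svm import SVC

# Each row: [proc. time op A, proc. time op B, slack A, slack B]
library_X = [[3, 5, 2, 6], [7, 2, 1, 4], [4, 4, 5, 1], [2, 6, 3, 3]]
library_y = [0, 1, 1, 0]   # 0 = schedule A first, 1 = schedule B first

clf = SVC(kernel="rbf").fit(library_X, library_y)
choice = clf.predict([[5, 3, 2, 2]])[0]

# Self-learning: when simulation confirms the better choice, the new
# instance is appended to the library and the SVM is refit.
library_X.append([5, 3, 2, 2]); library_y.append(int(choice))
clf = SVC(kernel="rbf").fit(library_X, library_y)
```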


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Hyungsik Shin ◽  
Jeongyeup Paek

Automatic task classification is a core part of personal assistant systems that are widely used in mobile devices such as smartphones and tablets. Even though many industry leaders are providing their own personal assistant services, their proprietary internals and implementations are not well known to the public. In this work, we show through real implementation and evaluation that automatic task classification can be implemented for mobile devices by using the support vector machine algorithm and crowdsourcing. To train our task classifier, we collected our training data set via crowdsourcing using the Amazon Mechanical Turk platform. Our classifier can classify a short English sentence into one of the thirty-two predefined tasks that are frequently requested while using personal mobile devices. Evaluation results show high prediction accuracy of our classifier ranging from 82% to 99%. By using large amount of crowdsourced data, we also illustrate the relationship between training data size and the prediction accuracy of our task classifier.
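A standard recipe for this kind of short-text task classification is TF-IDF features plus a linear SVM; the sketch below follows that recipe with hypothetical examples (the paper's exact feature pipeline and its 32 task labels are not reproduced here):

```python
# Sketch of an SVM task classifier for short sentences; the training
# examples and task labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Hypothetical crowdsourced examples: (sentence, task label)
train = [("wake me up at 7 am", "set_alarm"),
         ("what's the weather tomorrow", "weather"),
         ("remind me to call mom", "set_reminder"),
         ("play some jazz", "play_music")]
texts, labels = zip(*train)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["set an alarm for six"]))   # -> likely "set_alarm"
```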


2021 ◽  
Author(s):  
Qifei Zhao ◽  
Xiaojun Li ◽  
Yunning Cao ◽  
Zhikun Li ◽  
Jixin Fan

Abstract. Collapsibility of loess is a significant factor affecting engineering construction in loess areas, and testing the collapsibility of loess is costly. In this study, a total of 4,256 loess samples were collected from the northern, eastern, western, and central regions of Xining. 70% of the samples are used as the training data set and the rest as the verification data set, so as to construct and validate the machine learning models. The six most important factors are selected from thirteen by grey relational analysis and multicollinearity analysis: burial depth, water content, specific gravity of soil particles, void ratio, geostatic stress, and plasticity limit. To predict the collapsibility of loess, four machine learning methods are studied and compared: support vector machine (SVM), random subspace based support vector machine (RSSVM), random forest (RF), and naïve Bayes tree (NBTree). Receiver operating characteristic (ROC) curve indicators, the standard error (SD), and the 95% confidence interval (CI) are used to verify and compare the models across the study areas. The results show that the RF model is the most efficient at predicting the collapsibility of loess in Xining, with an average AUC above 80%, and can be used in engineering practice.
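A minimal sketch of the winning model in this comparison, assuming a hypothetical `loess_samples.csv` with the six selected factors and a binary `collapsible` label:

```python
# Sketch: random forest on the six selected factors, evaluated by ROC AUC.
# File name and column names are assumptions based on the abstract.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

cols = ["burial_depth", "water_content", "specific_gravity",
        "void_ratio", "geostatic_stress", "plasticity_limit"]
df = pd.read_csv("loess_samples.csv")            # assumed 4,256-sample table
X_tr, X_te, y_tr, y_te = train_test_split(df[cols], df["collapsible"],
                                          train_size=0.70, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))  # paper reports ~0.8+
```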


2019 ◽  
Vol 9 (18) ◽  
pp. 3800
Author(s):  
Rebekka Weixer ◽  
Jonas Koch ◽  
Patrick Plany ◽  
Simon Ohlendorf ◽  
Stephan Pachnicke

A support vector machine (SVM) based detection is applied to different equalization schemes for a data center interconnect link using coherent 64 GBd 64-QAM over 100 km standard single mode fiber (SSMF). Without any prior knowledge or heuristic assumptions, the SVM is able to learn and capture the transmission characteristics from only a short training data set. We show that, with the use of suitable kernel functions, the SVM can create nonlinear decision thresholds and reduce the errors caused by nonlinear phase noise (NLPN), laser phase noise, I/Q imbalances and so forth. In order to apply the SVM to 64-QAM we introduce a binary coding SVM, which provides a binary multiclass classification with reduced complexity. We investigate the performance of this SVM and show how it can improve the bit-error rate (BER) of the entire system. After 100 km the fiber-induced nonlinear penalty is reduced by 2 dB at a BER of 3.7 × 10⁻³. Furthermore, we apply a nonlinear Volterra equalizer (NLVE), which is based on the nonlinear Volterra theory, as another method for mitigating nonlinear effects. The combination of SVM and NLVE reduces the large computational complexity of the NLVE and allows more accurate compensation of nonlinear transmission impairments.
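One plausible reading of the "binary coding SVM" is one binary classifier per bit of the 6-bit symbol label, rather than a single 64-class machine; the sketch below demonstrates that structure on synthetic received symbols (the paper's link model is not reproduced):

```python
# Sketch of binary-coding SVM detection for 64-QAM: six binary SVMs,
# one per bit of the symbol label. Received symbols are synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
symbols = rng.integers(0, 64, 5000)                  # transmitted labels
levels = np.array([-7, -5, -3, -1, 1, 3, 5, 7]) / np.sqrt(42)
iq = np.c_[levels[symbols % 8], levels[symbols // 8]]
rx = iq + 0.05 * rng.standard_normal(iq.shape)       # noisy constellation

# One RBF-SVM per bit; kernels let the decision boundaries bend with
# nonlinear phase noise, which straight thresholds cannot follow.
bits = (symbols[:, None] >> np.arange(6)) & 1        # 6-bit binary coding
svms = [SVC(kernel="rbf").fit(rx[:4000], bits[:4000, b]) for b in range(6)]

pred = np.stack([s.predict(rx[4000:]) for s in svms], axis=1)
print(f"BER on held-out symbols: {np.mean(pred != bits[4000:]):.4f}")
```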


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Jianwei Liu ◽  
Shuang Cheng Li ◽  
Xionglin Luo

Support vector machine is an effective classification and regression method that uses machine learning theory to maximize predictive accuracy while avoiding overfitting. L2 regularization has been commonly used. If the training dataset contains many noise variables, L1-regularized SVM will provide better performance. However, neither L1 nor L2 is the optimal regularization method when handling a large number of redundant variables and only a small number of data points are useful for machine learning. We have therefore proposed an adaptive learning algorithm using an iteratively reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 produced a better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm achieved the optimal prediction error using a p value less than the L1 norm. Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.
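One standard way to realize iterative reweighting, sketched below, is to majorize the Lp penalty by a weighted L2 penalty at each round; rescaling the features by c_j^{-1/2} turns each round into an ordinary L2-regularized linear SVM fit. This is a sketch of the general technique, not necessarily the authors' exact algorithm:

```python
# Iteratively reweighted Lp regularization (0 < p <= 2) for a linear SVM.
# Each round minimizes loss + lambda * sum_j c_j w_j^2 with
# c_j = (|w_j| + eps)^(p-2), which upper-bounds the Lp penalty; the
# substitution w_j = v_j * scale_j with scale_j = c_j^(-1/2) reduces it
# to a standard L2 fit on rescaled features.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           random_state=0)
p, eps = 0.8, 1e-6                      # p = 0.8 worked best in the paper
w = np.ones(X.shape[1])

for _ in range(10):
    scale = (np.abs(w) + eps) ** ((2.0 - p) / 2.0)   # = c_j^(-1/2)
    clf = LinearSVC(C=1.0, dual=False).fit(X * scale, y)
    w = clf.coef_.ravel() * scale        # map back to original coordinates

print(np.sum(np.abs(w) > 1e-3), "features kept of", X.shape[1])
```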


Author(s):  
Dmitrii Dikii

Introduction: For the development of cyberphysical systems, new technologies and data transfer protocols are being developed in order to reduce the energy costs of communication devices. One modern approach to data transmission in cyberphysical systems is the publish-subscribe model, which is vulnerable to denial-of-service attacks. Purpose: Development of a model for detecting a DoS attack implemented at the application level of publish-subscribe networks, based on the analysis of their traffic using machine learning methods. Results: A model is developed for detecting a DoS attack, operating with three classifiers depending on the message type: connection, subscription, and publication. This approach makes it possible to identify the source of an attack, which can be a network node, a particular device, or a user account. A multi-layer perceptron, the random forest algorithm, and support vector machines of various configurations were considered as classifiers. Training and test data sets were generated for the proposed feature vector. The classification quality was evaluated by calculating the F1 score, the Matthews correlation coefficient, and accuracy. The multilayer perceptron model and the support vector machine with a polynomial kernel and the SMO optimization method showed the best values on all metrics. However, in the case of the support vector machine, a slight decrease in prediction quality was detected when the width of the traffic analysis window was close to the longest period of sending legitimate messages in the training data set. Practical relevance: The results of the research can be used in the development of intrusion detection tools for cyberphysical systems using the publish-subscribe model, or other systems based on the same approach.
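The three-classifier structure can be sketched as one model per message type, so that an alert is traceable to a node, device, or account; the feature vector below (per-window message rate, mean size, unique topics) is schematic, not the paper's:

```python
# Sketch of per-message-type DoS detectors; features and data are schematic.
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# One classifier per message type, mirroring the paper's structure; the
# MLP and the polynomial-kernel SVM were its two best performers.
detectors = {
    "connect":   MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
    "subscribe": SVC(kernel="poly", degree=3),
    "publish":   SVC(kernel="poly", degree=3),
}

# Toy usage: per-window features, e.g. [msg rate, mean size, unique topics]
X_conn = [[5, 60, 1], [400, 20, 1], [3, 64, 1], [350, 18, 1]]
y_conn = [0, 1, 0, 1]                     # 1 = attack window, 0 = legitimate
detectors["connect"].fit(X_conn, y_conn)
print(detectors["connect"].predict([[380, 19, 1]]))   # -> likely attack (1)
```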


2015 ◽  
Vol 7 (4) ◽  
pp. 3383-3408 ◽  
Author(s):  
F. Khan ◽  
F. Enzmann ◽  
M. Kersten

Abstract. In X-ray computed microtomography (μXCT), image processing is the most important operation prior to image analysis. Such processing mainly involves artefact reduction and image segmentation. We propose a new two-stage post-reconstruction procedure for an image of a geological rock core obtained by polychromatic cone-beam μXCT technology. In the first stage, the beam hardening (BH) is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. The final BH-corrected image is extracted from the residual data, i.e., the difference between the surface elevation values and the original grey-scale values. For the second stage, we propose using a least-squares support vector machine (a non-linear classifier algorithm) to segment the BH-corrected data as a pixel-based multi-classification task. A combination of the two approaches was used to classify a complex multi-mineral rock sample. The Matlab code for this approach is provided in the Appendix. A minor drawback is that the proposed segmentation algorithm may become computationally demanding in the case of a high-dimensional training data set.
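The first stage amounts to fitting a quadratic surface z = f(x, y) to a slice by least squares and keeping the residual; the paper's code is in Matlab, and the following is a NumPy transcription of that idea:

```python
# Sketch of stage one: fit a best-fit quadratic surface to a reconstructed
# slice and keep the residual as the BH-corrected image.
import numpy as np

def remove_beam_hardening(slice2d):
    ny, nx = slice2d.shape
    x, y = np.meshgrid(np.arange(nx), np.arange(ny))
    x, y, z = x.ravel(), y.ravel(), slice2d.ravel().astype(float)
    # Design matrix for z ~ 1 + x + y + x^2 + x*y + y^2
    A = np.c_[np.ones_like(x), x, y, x**2, x*y, y**2]
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    surface = (A @ coeffs).reshape(ny, nx)
    return slice2d - surface          # residual = BH-corrected grey values

corrected = remove_beam_hardening(np.random.rand(256, 256))  # demo input
```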


2020 ◽  
Vol 8 (6) ◽  
pp. 4684-4688

According to statistics from the BBC, the toll varies for every earthquake recorded to date: up to thousands dead, about 50,000 injured, around 1-3 million displaced, and a significant number missing or homeless, with structural damage approaching 100%. Economic losses range from 10 to 16 million dollars. A magnitude of 5 or above is classified as among the deadliest. The deadliest earthquake recorded to date took place in Indonesia, where about 3 million died, 1-2 million were injured, and structural damage reached 100%. The consequences of an earthquake are thus devastating and are not limited to the loss of life and property; they also bring significant change, from surroundings and lifestyle to the economy. All of these factors motivate earthquake forecasting: with even a couple of minutes' notice, individuals can act to shield themselves from injury and death, damage and monetary losses can be reduced, and property and natural assets can be protected. In this work, an accurate forecaster is designed and developed: a system that forecasts the catastrophe by detecting early signs of an earthquake using machine learning algorithms. The system follows the basic steps of building a learning system along with the data science life cycle. Data sets for the Indian subcontinent and the rest of the world are collected from government sources. Pre-processing of the data is followed by construction of a stacking model that combines the Random Forest and Support Vector Machine algorithms. These algorithms build the mathematical model from a training data set; the model looks for patterns that lead to catastrophe and adapts to them, so as to make decisions and forecasts without being explicitly programmed for the task. After a forecast, the message is broadcast to government officials and across various platforms. The key information to obtain is captured by three factors: time, locality, and magnitude.
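A minimal sketch of the stacking step, using scikit-learn's StackingClassifier with the two base learners named in the abstract; the inputs here are synthetic stand-ins for the catalogue features (time, locality, magnitude):

```python
# Sketch of a Random Forest + SVM stacking model; data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("svm", SVC(probability=True))],
    final_estimator=LogisticRegression(),   # meta-learner over base outputs
)
stack.fit(X, y)
print(stack.score(X, y))
```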

