Support vector machine based on low-rank tensor train decomposition for big data applications

Author(s):  
Yongkang Wang ◽  
Weicheng Zhang ◽  
Zhuliang Yu ◽  
Zhenghui Gu ◽  
Hao Liu ◽  
...  
2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Chia-Hui Huang ◽  
Keng-Chieh Yang ◽  
Han-Ying Kao

Big data is a current trend with a significant impact on information technologies. In big data applications, one of the most pressing issues is dealing with large-scale data sets, which often require computational resources provided by public cloud services. Analyzing big data efficiently has become a major challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. The SSVM was recently proposed as an alternative to the standard SVM and has been shown to be more efficient than the traditional SVM in processing large-scale data. In addition, the soft margin method is proposed to adjust the excursion of the separation margin and to remain effective in the gray zone, where the distribution of data and the separation margin between classes become hard to describe.
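The abstract does not spell out what makes the SSVM "smooth". In the formulation usually attributed to Lee and Mangasarian, the non-differentiable plus function max(x, 0) in the SVM objective is replaced by the smooth approximation p(x, alpha) = x + (1/alpha) log(1 + exp(-alpha x)), so that fast Newton-type solvers can be applied. The sketch below only illustrates that approximation; the function names and the default alpha are assumptions made here, not details taken from the paper.

```python
import math


def plus(x: float) -> float:
    """The plus function max(x, 0) used in the standard SVM loss."""
    return max(x, 0.0)


def smooth_plus(x: float, alpha: float = 5.0) -> float:
    """Smooth approximation p(x, alpha) = x + log(1 + exp(-alpha * x)) / alpha.

    `alpha` (an assumed default) controls the tightness of the
    approximation: the gap to max(x, 0) is at most log(2) / alpha.
    """
    z = -alpha * x
    # Numerically stable log(1 + exp(z)) for large |z|.
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return x + softplus / alpha


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, plus(x), smooth_plus(x, alpha=5.0))
```

Larger values of alpha bring the smooth function closer to the plus function, at the cost of a less well-conditioned objective.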


2019 ◽  
Vol 2 (2) ◽  
pp. 43
Author(s):  
Lalu Mutawalli ◽  
Mohammad Taufan Asri Zaen ◽  
Wire Bagye

In the era of technological disruption of mass communication, social media has become a reference point for gauging public opinion. Social media users produce digital data very rapidly as they express their feelings in the form of status posts and comments. This public data production yields a very large set of data, which can be referred to as big data. Big data is a collection of data sets that is very large, complex, and generated relatively quickly, which makes it difficult to handle. Big data can be analyzed with data mining methods to extract knowledge patterns from it. This study analyzes the sentiments of netizens on Twitter regarding the stabbing of Mr. Wiranto. The sentiment analysis showed that 41% of comments were positive, 29% were neutral, and 29% were negative. In addition, the data are modeled with a support vector machine algorithm to create a system capable of classifying positive, neutral, and negative connotations. The classification model is then tested using a confusion matrix, yielding a precision of 83%, a recall of 80%, and an accuracy of 80%.
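As a rough illustration of how precision, recall, and accuracy can be read off a three-class confusion matrix, consider the sketch below. The abstract does not say how the per-class values were averaged, so macro-averaging is an assumption here, and the labels in the usage example are toy data rather than the study's tweets.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def macro_metrics(matrix):
    """Macro-averaged precision and recall plus overall accuracy.

    Macro-averaging is an assumption; the source does not state
    which averaging scheme it used.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    precisions, recalls = [], []
    for i in range(n):
        predicted_i = sum(matrix[r][i] for r in range(n))  # column sum
        actual_i = sum(matrix[i])                          # row sum
        precisions.append(matrix[i][i] / predicted_i if predicted_i else 0.0)
        recalls.append(matrix[i][i] / actual_i if actual_i else 0.0)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "accuracy": correct / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = ["positive", "positive", "neutral", "negative"]
    y_pred = ["positive", "neutral", "neutral", "negative"]
    print(macro_metrics(confusion_matrix(y_true, y_pred)))
```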


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yao Huimin

With the development of cloud computing and distributed cluster technology, the concept of big data has expanded in terms of capacity and value, and machine learning has received unprecedented attention in recent years. Traditional machine learning algorithms cannot be parallelized effectively, so a parallelized support vector machine based on the Spark big data platform is proposed. First, the big data platform is designed with the Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Second, to improve the training efficiency of support vector machines on large-scale data, the merging of two support vector machines takes into account the “special points” other than support vectors, that is, the non-support vectors in one subset that violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and its parallelization process is implemented on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine performs well in speed-up ratio, training time, and prediction accuracy.
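The merging step can be sketched roughly as follows. The abstract does not define precisely what it means for a point to "violate the training results" of the other subset; the sketch assumes it means failing the other model's margin condition y * f(x) >= 1. The linear model, the names `LinearModel`, `split_by_margin`, `special_points`, and `merge_candidates`, and the tolerance are all hypothetical, and the retraining on the merged set (and the Spark distribution) is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[Tuple[float, ...], int]  # (features, label in {-1, +1})


@dataclass
class LinearModel:
    """Hypothetical stand-in for an SVM trained on one data subset."""
    w: Sequence[float]
    b: float

    def margin(self, x: Sequence[float], y: int) -> float:
        """Functional margin y * (w . x + b)."""
        return y * (sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)


def split_by_margin(model: LinearModel, points: List[Point], tol: float = 1e-9):
    """Split a subset into support vectors (margin <= 1) and the rest."""
    svs = [p for p in points if model.margin(*p) <= 1.0 + tol]
    rest = [p for p in points if model.margin(*p) > 1.0 + tol]
    return svs, rest


def special_points(non_svs: List[Point], other: LinearModel, tol: float = 1e-9):
    """Non-support vectors that violate the other model's margin (assumed reading)."""
    return [p for p in non_svs if other.margin(*p) < 1.0 - tol]


def merge_candidates(model_a, points_a, model_b, points_b) -> List[Point]:
    """Points to retrain on: both sets of support vectors plus special points."""
    sv_a, rest_a = split_by_margin(model_a, points_a)
    sv_b, rest_b = split_by_margin(model_b, points_b)
    return (sv_a + sv_b
            + special_points(rest_a, model_b)
            + special_points(rest_b, model_a))
```

Under this reading, a point that neither model needs as a support vector and that both models already classify with sufficient margin is dropped before retraining, which is where the reduction in training cost would come from.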


2018 ◽  
Vol 22 (S6) ◽  
pp. 14709-14720
Author(s):  
Nattar Kannan ◽  
S. Sivasubramanian ◽  
M. Kaliappan ◽  
S. Vimal ◽  
A. Suresh

Author(s):  
Yiqing Fan ◽  
Zhihui Sun

To improve the accuracy of Consumer Price Index (CPI) prediction, and thereby reflect the overall level of the country’s macroeconomic situation more faithfully, a CPI big data prediction method based on a wavelet twin support vector machine (SVM) is proposed. First, the historical CPI data are decomposed into a high-frequency part and a low-frequency part by wavelet transform. Then a more advanced twin SVM is used to build a prediction model for each part, yielding two kinds of prediction results. Finally, the wavelet reconstruction method is used to fuse the two kinds of prediction results into the final CPI prediction. The wavelet twin SVM model is used to fit and predict the CPI. Experimental results show that, compared with similar prediction methods, the proposed method has higher fitting accuracy and a smaller root mean square error.
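The decompose-then-reconstruct pattern can be illustrated with a single-level Haar transform. The abstract does not name the wavelet family or the number of decomposition levels, so both are assumptions here, and the series in the usage example is made up. The twin SVM predictors are omitted: in this sketch the coefficients are passed through unchanged, so reconstruction simply recovers the input.

```python
import math
from typing import List, Tuple

SQRT2 = math.sqrt(2.0)


def haar_decompose(series: List[float]) -> Tuple[List[float], List[float]]:
    """Single-level Haar transform (assumed wavelet; needs even length).

    Returns (low-frequency approximation, high-frequency detail).
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    pairs = list(zip(series[0::2], series[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_reconstruct(approx: List[float], detail: List[float]) -> List[float]:
    """Inverse of haar_decompose: fuse the two parts into one series."""
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


if __name__ == "__main__":
    # Made-up values for illustration only, not real CPI data.
    series = [2.0, 4.0, 6.0, 8.0]
    low, high = haar_decompose(series)
    # In the described method, each part would be forecast by its own
    # model at this point, before the parts are fused.
    print(haar_reconstruct(low, high))
```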


2022 ◽  
Vol 2022 ◽  
pp. 1-8
Author(s):  
Yiming Li

In China, universities are important centers for scientific research (SR) and innovation, and the quality of SR management has a significant impact on university innovation. The informatization of SR management is a critical component of university development in the big data environment, so it is crucial to figure out how to improve SR management. To that end, this paper builds a four-tier B/W/D/C (Browser/Web/Database/Client) university SR management innovation information system based on big data technology and thoroughly examines the system’s hardware and software configuration. The SVM-WNB (Support Vector Machine-Weighted NB) classification algorithm is proposed, and the improved algorithm runs in parallel on the Hadoop cloud computing platform, allowing it to process large amounts of data efficiently. A large number of simulation experiments and experiments in a real-world multi-data-center environment show that the optimization strategy proposed in this paper can effectively optimize the execution of scientific big data applications.

