Stock Selection Strategy Based on Support Vector Machine and eXtreme Gradient Boosting Methods

Abstract Objective We explored how a deep learning (DL) approach based on hierarchical attention networks (HANs) can improve model performance for multiple information extraction tasks from unstructured cancer pathology reports compared to conventional methods that do not sufﬁciently capture syntactic and semantic contexts from free-text documents. Materials and Methods Data for our analyses were obtained from 942 deidentiﬁed pathology reports collected by the National Cancer Institute Surveillance, Epidemiology, and End Results program. The HAN was implemented for 2 information extraction tasks: (1) primary site, matched to 12 International Classification of Diseases for Oncology topography codes (7 breast, 5 lung primary sites), and (2) histological grade classiﬁcation, matched to G1–G4. Model performance metrics were compared to conventional machine learning (ML) approaches including naive Bayes, logistic regression, support vector machine, random forest, and extreme gradient boosting, and other DL models, including a recurrent neural network (RNN), a recurrent neural network with attention (RNN w/A), and a convolutional neural network. Results Our results demonstrate that for both information tasks, HAN performed signiﬁcantly better compared to the conventional ML and DL techniques. In particular, across the 2 tasks, the mean micro and macroF-scores for the HAN with pretraining were (0.852,0.708), compared to naive Bayes (0.518, 0.213), logistic regression (0.682, 0.453), support vector machine (0.634, 0.434), random forest (0.698, 0.508), extreme gradient boosting (0.696, 0.522), RNN (0.505, 0.301), RNN w/A (0.637, 0.471), and convolutional neural network (0.714, 0.460). Conclusions HAN-based DL models show promise in information abstraction tasks within unstructured clinical pathology reports.

Download Full-text

Multi-factor Stock Selection Model Based on Kernel Support Vector Machine

Journal of Mathematics Research ◽

10.5539/jmr.v10n5p9 ◽

2018 ◽

Vol 10 (5) ◽

pp. 9 ◽

Cited By ~ 1

Author(s):

Ru Zhang ◽

Zi-ang Lin ◽

Shaozhen Chen ◽

Zhixuan Lin ◽

Xingwei Liang

Keyword(s):

Support Vector Machine ◽

Predictive Power ◽

Factor Model ◽

Selection Strategy ◽

Support Vector ◽

Stock Selection ◽

Financial Investment ◽

Model Based ◽

Kernel Support Vector Machine ◽

Selection Of

In recent years, the combination of machine learning method and traditional financial investment field has become a hotspot in academic and industry. This paper takes CSI 300 and CSI 500 stocks as the research objects. First, this paper carries out kernel function test and parameter optimization for the kernel support vector machine system, and then predict and optimize the combination of market-neutral stock selection strategy and stock right strategy. The results of the experiment show that the multi-factor model based on SVM has a strong predictive power for the selection of stock, and it has a difference in the predictive power of different nuclear functions.

Download Full-text

Modelos de machine learning para predição do sucesso de startups

Revista de Gestão e Projetos ◽

10.5585/gep.v12i2.18942 ◽

2021 ◽

Vol 12 (2) ◽

pp. 28-55

Author(s):

Fabiano Rodrigues ◽

Francisco Aparecido Rodrigues ◽

Thelma Valéria Rocha Rodrigues

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Decision Tree ◽

Initial Public Offering ◽

Gradient Boosting ◽

Support Vector ◽

Trade Offs ◽

Extreme Gradient Boosting ◽

Public Offering

Este estudo analisa resultados obtidos com modelos de machine learning para predição do sucesso de startups. Como proxy de sucesso considera-se a perspectiva do investidor, na qual a aquisição da startup ou realização de IPO (Initial Public Offering) são formas de recuperação do investimento. A revisão da literatura aborda startups e veículos de financiamento, estudos anteriores sobre predição do sucesso de startups via modelos de machine learning, e trade-offs entre técnicas de machine learning. Na parte empírica, foi realizada uma pesquisa quantitativa baseada em dados secundários oriundos da plataforma americana Crunchbase, com startups de 171 países. O design de pesquisa estabeleceu como filtro startups fundadas entre junho/2010 e junho/2015, e uma janela de predição entre junho/2015 e junho/2020 para prever o sucesso das startups. A amostra utilizada, após etapa de pré-processamento dos dados, foi de 18.571 startups. Foram utilizados seis modelos de classificação binária para a predição: Regressão Logística, Decision Tree, Random Forest, Extreme Gradiente Boosting, Support Vector Machine e Rede Neural. Ao final, os modelos Random Forest e Extreme Gradient Boosting apresentaram os melhores desempenhos na tarefa de classificação. Este artigo, envolvendo machine learning e startups, contribui para áreas de pesquisa híbridas ao mesclar os campos da Administração e Ciência de Dados. Além disso, contribui para investidores com uma ferramenta de mapeamento inicial de startups na busca de targets com maior probabilidade de sucesso.

Download Full-text

Adaboost-SVM Multi-Factor Stock Selection Model Based on Adaboost Enhancement

International Journal of Statistics and Probability ◽

10.5539/ijsp.v7n5p9 ◽

2018 ◽

Vol 7 (5) ◽

pp. 9

Author(s):

Ru Zhang ◽

Zi-ang Lin ◽

Shaozhen Chen ◽

Min Zhao ◽

Mingjie Yuan

Keyword(s):

Support Vector Machine ◽

Selection Model ◽

Selection Strategy ◽

Support Vector ◽

Linear Support Vector Machine ◽

Financial Industry ◽

Stock Selection ◽

Financial Investment ◽

Original Algorithm ◽

Model Based

In recent years, the applications of machine learning techniques to perfect traditional financial investment models has gained a widespread attention from the academic circle and the financial industry. This paper takes CSI300 stocks as the object of the research, uses Adaboost to enhance the classification ability of original linear support vector machine, and combines all major factors to build Adaboost-SVM multi-factor stock selection model based on Adaboost enhancement. In the backtesting analysis, the stock selection strategy of original linear support vector machine was compared with the Adaboost-SVM multi-factor stock selection strategy based on Adaboost enhancement. The result shows that the Adaboost-SVM multi-factor stock selection strategy based on Adaboost enhancement possesses stronger profitability and smaller income fluctuation than the original algorithm model.

Download Full-text

Ischemic heart disease detection using support vector Machine and extreme gradient boosting method

Materials Today Proceedings ◽

10.1016/j.matpr.2021.01.715 ◽

2021 ◽

Author(s):

Ladda Ashish ◽

Sravan Kumar V ◽

Sahithi Yeligeti

Keyword(s):

Support Vector Machine ◽

Heart Disease ◽

Ischemic Heart Disease ◽

Ischemic Heart ◽

Gradient Boosting ◽

Support Vector ◽

Disease Detection ◽

Extreme Gradient Boosting ◽

Boosting Method

Download Full-text

Sign language dactyl recognition based on machine learning algorithms

Eastern-European Journal of Enterprise Technologies ◽

10.15587/1729-4061.2021.239253 ◽

2021 ◽

Vol 4 (2(112)) ◽

pp. 58-72

Author(s):

Chingiz Kenshimov ◽

Zholdas Buribayev ◽

Yedilkhan Amirgaliyev ◽

Aisulyu Ataniyazova ◽

Askhat Aitimov

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Sign Language ◽

Gesture Recognition ◽

Research Work ◽

Gradient Boosting ◽

Support Vector ◽

Extreme Gradient Boosting

In the course of our research work, the American, Russian and Turkish sign languages were analyzed. The program of recognition of the Kazakh dactylic sign language with the use of machine learning methods is implemented. A dataset of 5000 images was formed for each gesture, gesture recognition algorithms were applied, such as Random Forest, Support Vector Machine, Extreme Gradient Boosting, while two data types were combined into one database, which caused a change in the architecture of the system as a whole. The quality of the algorithms was also evaluated. The research work was carried out due to the fact that scientific work in the field of developing a system for recognizing the Kazakh language of sign dactyls is currently insufficient for a complete representation of the language. There are specific letters in the Kazakh language, because of the peculiarities of the spelling of the language, problems arise when developing recognition systems for the Kazakh sign language. The results of the work showed that the Support Vector Machine and Extreme Gradient Boosting algorithms are superior in real-time performance, but the Random Forest algorithm has high recognition accuracy. As a result, the accuracy of the classification algorithms was 98.86 % for Random Forest, 98.68 % for Support Vector Machine and 98.54 % for Extreme Gradient Boosting. Also, the evaluation of the quality of the work of classical algorithms has high indicators. The practical significance of this work lies in the fact that scientific research in the field of gesture recognition with the updated alphabet of the Kazakh language has not yet been conducted and the results of this work can be used by other researchers to conduct further research related to the recognition of the Kazakh dactyl sign language, as well as by researchers, engaged in the development of the international sign language

Download Full-text