Integrated Long-Term Stock Selection Models Based on Feature Selection and Machine Learning Algorithms for China Stock Market

The stock market is one of the most complex ways of doing business. Small investors, brokerage firms, and banks all depend on it to generate revenue and spread risk. This paper proposes using existing machine learning algorithms to predict future stock prices, making this unpredictable form of business somewhat more predictable. Machine learning makes predictions from the current values of stock market indices after training on their previous values, and it offers a range of models that make those predictions easier and more reliable. The data must be cleaned before it can be used for prediction. The paper also categorizes the predictive analytics methods used in different domains to date, together with their shortcomings.
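The abstract does not say how past index values are turned into training data. One common way, shown here only as a minimal sketch, is a sliding window: each sample pairs a few consecutive past values with the value that follows them. The function name `make_lagged_samples`, the window length, and the closing values are all hypothetical and do not come from the paper.

```python
def make_lagged_samples(series, n_lags):
    """Turn a 1-D series into (features, target) pairs.

    Each feature row holds n_lags consecutive past values; the target is
    the value that immediately follows them.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return features, targets


# Hypothetical closing values of an index, oldest first (not real data).
closes = [100.0, 101.5, 101.0, 102.2, 103.0, 102.5]
X, y = make_lagged_samples(closes, n_lags=3)
```

Any regression model could then be fitted on `X` and `y`. Splitting the samples chronologically, rather than at random, keeps future values out of the training set.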

Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

International Journal for Research in Applied Science and Engineering Technology ◽ 10.22214/ijraset.2021.39088 ◽ 2021 ◽ Vol 9 (12) ◽ pp. 1-10

Author(s): Harsha A K

Keyword(s): Machine Learning ◽ Feature Selection ◽ Random Forest ◽ Learning Algorithms ◽ Malware Detection ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ Support Vector ◽ Steady Increase ◽ Extreme Gradient Boosting

Abstract: Since the advent of encryption, there has been a steady increase in malware transmitted over encrypted networks. Traditional malware detection approaches such as packet content analysis are ineffective against encrypted data. In the absence of the actual packet contents, other features such as packet size, arrival time, source and destination addresses, and similar metadata can be used to detect malware. This information can be used to train machine learning classifiers to distinguish malicious packets from benign ones. In this paper, we offer an efficient malware detection approach using machine learning classification algorithms: support vector machine, random forest, and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. The algorithms are trained on the training set, and the resulting models are evaluated against the testing set to assess their respective performance. We further tune the hyperparameters of the algorithms to achieve better results. Random forest and extreme gradient boosting performed exceptionally well in our experiments, with area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
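The abstract reports its results as area under the curve (AUC) values. Assuming this means the area under the ROC curve, the metric can be sketched in plain Python as the fraction of (malicious, benign) pairs in which the malicious sample receives the higher score. The function name `roc_auc`, the labels, and the scores below are hypothetical; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1) gets a
    higher score than a randomly chosen negative (label 0); ties count 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = malicious) and classifier scores.
y_test = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
auc = roc_auc(y_test, scores)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for illustration. A rank-based or library implementation would be preferable on a full traffic dataset.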

A Regression Analysis of Stock Market Prediction Using Machine Learning Algorithms

SSRN Electronic Journal ◽ 10.2139/ssrn.3880509 ◽ 2021

Author(s): Niraj Shukla ◽ Subham Sanoriya ◽ Narendra Yadav ◽ Sudhakar Mourya ◽ A S Mohammed Shariff

Keyword(s): Machine Learning ◽ Regression Analysis ◽ Stock Market ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Stock Market Prediction

Comparative Analysis of Machine Learning Algorithms for Stock Market Prediction During COVID-19 Outbreak

Artificial Intelligence Systems and the Internet of Things in the Digital Era - Lecture Notes in Networks and Systems ◽ 10.1007/978-3-030-77246-8_15 ◽ 2021 ◽ pp. 154-161

Author(s): Jolly Masih ◽ Rajkumar Rajasekaran ◽ Neha Saini ◽ Damandeep Kaur

Keyword(s): Machine Learning ◽ Comparative Analysis ◽ Stock Market ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Stock Market Prediction

Evaluating Variable Selection and Machine Learning Algorithms for Estimating Forest Heights by Combining Lidar and Hyperspectral Data

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi9090507 ◽ 2020 ◽ Vol 9 (9) ◽ pp. 507

Author(s): Sanjiwana Arjasakusuma ◽ Sandiaga Swahyu Kusuma ◽ Stuart Phinn

Keyword(s): Machine Learning ◽ Feature Selection ◽ Learning Algorithms ◽ Principal Component ◽ Hyperspectral Data ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ Support Vector ◽ Forest Height ◽ Extreme Gradient Boosting

Machine learning has been employed for various mapping and modeling tasks using input variables from different sources of remote sensing data. For feature selection on data with high spatial and spectral dimensionality, various methods have been developed and incorporated into the machine learning framework to ensure an efficient and optimal computational process. This research aims to assess the accuracy of various feature selection and machine learning methods for estimating forest height using AISA (airborne imaging spectrometer for applications) hyperspectral bands (479 bands) and airborne light detection and ranging (lidar) height metrics (36 metrics), alone and combined. Feature selection and dimensionality reduction using Boruta (BO), principal component analysis (PCA), simulated annealing (SA), and genetic algorithm (GA) were evaluated in combination with machine learning algorithms such as multivariate adaptive regression spline (MARS), extra trees (ET), support vector regression (SVR) with radial basis function, and extreme gradient boosting (XGB) with tree (XGBtree and XGBdart) and linear (XGBlin) classifiers. The results demonstrated that the combinations BO-XGBdart and BO-SVR delivered the best model performance for estimating tropical forest height from combined lidar and hyperspectral data, with R² = 0.53 and RMSE = 1.7 m (nRMSE of 18.4% and bias of 0.046 m) for BO-XGBdart and R² = 0.51 and RMSE = 1.8 m (nRMSE of 15.8% and bias of −0.244 m) for BO-SVR. Our study also demonstrated the effectiveness of BO for variable selection; it reduced the data by 95%, selecting the 29 most important variables from the initial 516 variables drawn from lidar metrics and hyperspectral data.
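The abstract reports R², RMSE, nRMSE, and bias without defining them. The sketch below computes all four under stated assumptions: R² is taken as the coefficient of determination, nRMSE as RMSE divided by the mean observed height, and bias as the mean of predicted minus observed. The paper may define them differently (for example, nRMSE normalized by the observed range). The function name and the sample heights are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, nRMSE (percent of the observed mean), and bias."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    # Assumption: residual = predicted - observed, so positive bias means
    # the model overestimates on average.
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or mean_obs == 0:
        raise ValueError("observed values must vary and have a non-zero mean")
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "nrmse_pct": 100.0 * rmse / mean_obs,
        "bias": sum(residuals) / n,
    }


# Hypothetical forest heights in metres (not data from the paper).
observed = [10.0, 12.0, 8.0, 10.0]
predicted = [11.0, 11.0, 9.0, 9.0]
metrics = regression_metrics(observed, predicted)
```

Reporting bias alongside RMSE separates systematic over- or underestimation from overall error magnitude, which is why the abstract lists both.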

Feature Selection and Polarity Classification Using Machine Learning Algorithms NB & SVM

SSRN Electronic Journal ◽ 10.2139/ssrn.3419763 ◽ 2019

Author(s): Smita Bhanap ◽ Seema Babrekar

Keyword(s): Machine Learning ◽ Feature Selection ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Polarity Classification
