Nvidia's Stock Returns Prediction Using Machine Learning Techniques for Time Series Forecasting Problem

2021 ◽  
Vol 8 (55) ◽  
pp. 44-62
Author(s):  
Marcin Chlebus ◽  
Michał Dyczko ◽  
Michał Woźniak

Abstract Statistical learning models have profoundly changed the rules of trading on the stock exchange. Quantitative analysts try to utilise them predict potential profits and risks in a better manner. However, the available studies are mostly focused on testing the increasingly complex machine learning models on a selected sample of stocks, indexes etc. without a thorough understanding and consideration of their economic environment. Therefore, the goal of the article is to create an effective forecasting machine learning model of daily stock returns for a preselected company characterised by a wide portfolio of strategic branches influencing its valuation. We use Nvidia Corporation stock covering the period from 07/2012 to 12/2018 and apply various econometric and machine learning models, considering a diverse group of exogenous features, to analyse the research problem. The results suggest that it is possible to develop predictive machine learning models of Nvidia stock returns (based on many independent environmental variables) which outperform both simple naïve and econometric models. Our contribution to literature is twofold. First, we provide an added value to the strand of literature on the choice of model class to the stock returns prediction problem. Second, our study contributes to the thread of selecting exogenous variables and the need for their stationarity in the case of time series models.

2020 ◽  
Vol 12 (16) ◽  
pp. 2655 ◽  
Author(s):  
Hugo Crisóstomo de Castro Filho ◽  
Osmar Abílio de Carvalho Júnior ◽  
Osmar Luiz Ferreira de Carvalho ◽  
Pablo Pozzobon de Bem ◽  
Rebeca dos Santos de Moura ◽  
...  

The Synthetic Aperture Radar (SAR) time series allows describing the rice phenological cycle by the backscattering time signature. Therefore, the advent of the Copernicus Sentinel-1 program expands studies of radar data (C-band) for rice monitoring at regional scales, due to the high temporal resolution and free data distribution. Recurrent Neural Network (RNN) model has reached state-of-the-art in the pattern recognition of time-sequenced data, obtaining a significant advantage at crop classification on the remote sensing images. One of the most used approaches in the RNN model is the Long Short-Term Memory (LSTM) model and its improvements, such as Bidirectional LSTM (Bi-LSTM). Bi-LSTM models are more effective as their output depends on the previous and the next segment, in contrast to the unidirectional LSTM models. The present research aims to map rice crops from Sentinel-1 time series (band C) using LSTM and Bi-LSTM models in West Rio Grande do Sul (Brazil). We compared the results with traditional Machine Learning techniques: Support Vector Machines (SVM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Normal Bayes (NB). The developed methodology can be subdivided into the following steps: (a) acquisition of the Sentinel time series over two years; (b) data pre-processing and minimizing noise from 3D spatial-temporal filters and smoothing with Savitzky-Golay filter; (c) time series classification procedures; (d) accuracy analysis and comparison among the methods. The results show high overall accuracy and Kappa (>97% for all methods and metrics). Bi-LSTM was the best model, presenting statistical differences in the McNemar test with a significance of 0.05. However, LSTM and Traditional Machine Learning models also achieved high accuracy values. The study establishes an adequate methodology for mapping the rice crops in West Rio Grande do Sul.


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed to identify new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cellcycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand- binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine- learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and Autodock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Moojung Kim ◽  
Young Jae Kim ◽  
Sung Jin Park ◽  
Kwang Gi Kim ◽  
Pyung Chun Oh ◽  
...  

Abstract Background Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination Methods Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) have the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM has the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions The machine leaning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.


2021 ◽  
Vol 11 (3) ◽  
pp. 1323
Author(s):  
Medard Edmund Mswahili ◽  
Min-Jeong Lee ◽  
Gati Lother Martin ◽  
Junghyun Kim ◽  
Paul Kim ◽  
...  

Cocrystals are of much interest in industrial application as well as academic research, and screening of suitable coformers for active pharmaceutical ingredients is the most crucial and challenging step in cocrystal development. Recently, machine learning techniques are attracting researchers in many fields including pharmaceutical research such as quantitative structure-activity/property relationship. In this paper, we develop machine learning models to predict cocrystal formation. We extract descriptor values from simplified molecular-input line-entry system (SMILES) of compounds and compare the machine learning models by experiments with our collected data of 1476 instances. As a result, we found that artificial neural network shows great potential as it has the best accuracy, sensitivity, and F1 score. We also found that the model achieved comparable performance with about half of the descriptors chosen by feature selection algorithms. We believe that this will contribute to faster and more accurate cocrystal development.


2021 ◽  
Author(s):  
Erik Otović ◽  
Marko Njirjak ◽  
Dario Jozinović ◽  
Goran Mauša ◽  
Alberto Michelini ◽  
...  

&lt;p&gt;In this study, we compared the performance of machine learning models trained using transfer learning and those that were trained from scratch - on time series data. Four machine learning models were used for the experiment. Two models were taken from the field of seismology, and the other two are general-purpose models for working with time series data. The accuracy of selected models was systematically observed and analyzed when switching within the same domain of application (seismology), as well as between mutually different domains of application (seismology, speech, medicine, finance). In seismology, we used two databases of local earthquakes (one in counts, and the other with the instrument response removed) and a database of global earthquakes for predicting earthquake magnitude; other datasets targeted classifying spoken words (speech), predicting stock prices (finance) and classifying muscle movement from EMG signals (medicine).&lt;br&gt;In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model. Therefore, in our experiment, we use reduced data sets of 1,500 and 9,000 data instances to mimic such conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that used transfer learning for training and those that were trained from scratch. We compared the performances between pairs of models in order to draw conclusions about the utility of transfer learning. In order to confirm the validity of the obtained results, we repeated the experiments several times and applied statistical tests to confirm the significance of the results. The study shows when, within the set experimental framework, the transfer of knowledge brought improvements in terms of model accuracy and in terms of model convergence rate.&lt;br&gt;&lt;br&gt;Our results show that it is possible to achieve better performance and faster convergence by transferring knowledge from the domain of global earthquakes to the domain of local earthquakes; sometimes also vice versa. However, improvements in seismology can sometimes also be achieved by transferring knowledge from medical and audio domains. The results show that the transfer of knowledge between other domains brought even more significant improvements, compared to those within the field of seismology. For example, it has been shown that models in the field of sound recognition have achieved much better performance compared to classical models and that the domain of sound recognition is very compatible with knowledge from other domains. We came to similar conclusions for the domains of medicine and finance. Ultimately, the paper offers suggestions when transfer learning is useful, and the explanations offered can provide a good starting point for knowledge transfer using time series data.&lt;/p&gt;


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6019
Author(s):  
José Manuel Lozano Domínguez ◽  
Faroq Al-Tam ◽  
Tomás de J. Mateo Sanguino ◽  
Noélia Correia

Improving road safety through artificial intelligence-based systems is now crucial turning smart cities into a reality. Under this highly relevant and extensive heading, an approach is proposed to improve vehicle detection in smart crosswalks using machine learning models. Contrarily to classic fuzzy classifiers, machine learning models do not require the readjustment of labels that depend on the location of the system and the road conditions. Several machine learning models were trained and tested using real traffic data taken from urban scenarios in both Portugal and Spain. These include random forest, time-series forecasting, multi-layer perceptron, support vector machine, and logistic regression models. A deep reinforcement learning agent, based on a state-of-the-art double-deep recurrent Q-network, is also designed and compared with the machine learning models just mentioned. Results show that the machine learning models can efficiently replace the classic fuzzy classifier.


2019 ◽  
Vol 175 ◽  
pp. 72-86 ◽  
Author(s):  
Domingos S. de O. Santos Júnior ◽  
João F.L. de Oliveira ◽  
Paulo S.G. de Mattos Neto

Sign in / Sign up

Export Citation Format

Share Document