Intelligent Portfolio Management via NLP Analysis of Financial 10-K Statements

Author(s):  
Purva Singh

The paper analyzes whether the sentiment stability of financial 10-K reports over time can predict a company's future mean returns. A diverse portfolio of stocks was selected to test this hypothesis. The proposed framework downloads the companies' 10-K reports from the SEC's EDGAR database and passes them through a preprocessing pipeline that extracts the critical sections of the filings for NLP analysis. Using the Loughran and McDonald sentiment word list, the framework generates sentiment TF-IDF vectors from the 10-K documents, calculates the cosine similarity between two consecutive 10-K reports, and proposes this cosine similarity as the alpha factor. To analyze how effectively the alpha factor predicts future returns, the framework uses the alphalens library to perform factor return analysis and turnover analysis, and to compare the Sharpe ratios of potential alpha factors. The results show a strong correlation between the sentiment stability of the portfolio's 10-K statements and its future mean returns. For the benefit of the research community, the code and Jupyter notebooks related to this paper have been open-sourced on GitHub.
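The similarity step described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the five-word list is a hypothetical stand-in for one Loughran and McDonald sentiment category, the three filing texts are invented, and the smoothed IDF weighting is an assumption, since the abstract does not state which TF-IDF variant is used.

```python
import math
from collections import Counter

# Hypothetical stand-in for one Loughran-McDonald sentiment category.
SENTIMENT_WORDS = ["loss", "decline", "litigation", "impairment", "adverse"]


def term_counts(text, vocabulary):
    """Count how often each vocabulary word occurs in a whitespace-tokenised text."""
    counts = Counter(token for token in text.lower().split() if token in vocabulary)
    return [counts[word] for word in vocabulary]


def tfidf_vectors(docs, vocabulary):
    """Build TF-IDF vectors restricted to the sentiment vocabulary."""
    count_rows = [term_counts(doc, vocabulary) for doc in docs]
    n_docs = len(docs)
    idf = []
    for j in range(len(vocabulary)):
        doc_freq = sum(1 for row in count_rows if row[j] > 0)
        # Smoothed inverse document frequency (an assumed weighting).
        idf.append(math.log((1 + n_docs) / (1 + doc_freq)) + 1.0)
    return [[count * weight for count, weight in zip(row, idf)] for row in count_rows]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented filing excerpts for three consecutive years.
filings = [
    "the company reported a loss and a decline in revenue",
    "we recorded a loss and a decline in margins",
    "litigation and impairment charges had an adverse effect",
]

vectors = tfidf_vectors(filings, SENTIMENT_WORDS)
# One similarity value per pair of consecutive filings; this is the candidate alpha factor.
similarities = [
    cosine_similarity(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)
]
print(similarities)
```

A value near 1 means the sentiment vocabulary of a filing barely changed from the previous year, while a value near 0 means it changed almost completely.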

2020 ◽  
Vol 17 (3) ◽  
pp. 67-81
Author(s):  
Sebastian Lahajnar ◽  
Alenka Rožanec

The article explores the correlation strength of the ten most important cryptocurrencies, emphasizing differences between periods of rising and falling prices. The daily and weekly returns of the selected cryptocurrencies are used to calculate the correlation strength with the Pearson correlation coefficient. The study covers the period from the beginning of 2017 to Bitcoin's last local bottom in mid-March 2020. Research findings are as follows: 1) the most important cryptocurrencies are mostly moderately positively correlated with each other over time; 2) correlation strength decreases slightly during the bull period, but mostly remains in the range of moderate correlation; 3) correlation strength increases significantly during the bear period, with most cryptocurrencies strongly correlated with each other. The results do not change significantly whether daily or weekly cryptocurrency returns are used as the basis. A strong correlation in the period of falling prices prevents the effective diversification of a cryptocurrency portfolio, which must be considered when investing funds in the cryptocurrency market.
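The calculation described above, Pearson correlation of returns computed separately for rising and falling periods, might look like the following sketch. The two price series and the split point between the "bull" and "bear" windows are invented for illustration and are not data from the article.

```python
import math


def simple_returns(prices):
    """Period-over-period simple returns from a list of prices."""
    return [(later - earlier) / earlier for earlier, later in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical prices for two coins: three rising periods, then three falling ones.
coin_a = [100, 104, 110, 113, 105, 96, 90]
coin_b = [50, 51, 55, 56, 52, 47, 44]

returns_a = simple_returns(coin_a)
returns_b = simple_returns(coin_b)

split = 3  # assumed boundary between the bull and bear windows
bull_corr = pearson(returns_a[:split], returns_b[:split])
bear_corr = pearson(returns_a[split:], returns_b[split:])
print(f"bull: {bull_corr:.2f}, bear: {bear_corr:.2f}")
```

With only three observations per window the coefficients here carry no statistical weight; the sketch shows the mechanics of splitting the sample by market regime.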


Author(s):  
Niddal Imam ◽  
Biju Issac ◽  
Seibu Mary Jacob

Twitter has changed the way people get information by allowing them to express their opinion and comments on the daily tweets. Unfortunately, due to the high popularity of Twitter, it has become very attractive to spammers. Unlike other types of spam, Twitter spam has become a serious issue in the last few years. The large number of users and the high amount of information being shared on Twitter play an important role in accelerating the spread of spam. In order to protect the users, Twitter and the research community have been developing different spam detection systems by applying different machine-learning techniques. However, a recent study showed that the current machine learning-based detection systems are not able to detect spam accurately because spam tweet characteristics vary over time. This issue is called “Twitter Spam Drift”. In this paper, a semi-supervised learning approach (SSLA) has been proposed to tackle this. The new approach uses the unlabeled data to learn the structure of the domain. Different experiments were performed on English and Arabic datasets to test and evaluate the proposed approach and the results show that the proposed SSLA can reduce the effect of Twitter spam drift and outperform the existing techniques.


Author(s):  
Jinho Lee ◽  
Raehyun Kim ◽  
Seok-Won Yi ◽  
Jaewoo Kang

Generating an investment strategy using advanced deep learning methods in stock markets has recently been a topic of interest. Most existing deep learning methods focus on proposing an optimal model or network architecture by maximizing return. However, these models often fail to consider and adapt to continuously changing market conditions. In this paper, we propose the Multi-Agent reinforcement learning-based Portfolio management System (MAPS). MAPS is a cooperative system in which each agent is an independent "investor" creating its own portfolio. In the training procedure, each agent is guided to act as diversely as possible while maximizing its own return with a carefully designed loss function. As a result, MAPS as a system ends up with a diversified portfolio. Experimental results with 12 years of US market data show that MAPS outperforms most of the baselines in terms of Sharpe ratio. Furthermore, our results show that adding more agents to our system would allow us to get a higher Sharpe ratio by lowering risk with a more diversified portfolio.
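The claim that diversification raises the Sharpe ratio by lowering risk can be illustrated with a small sketch. It does not implement MAPS or its loss function: the two return streams are invented, the "system" is assumed to be an equal-weighted average of the two agents, and the annualisation factor of 252 trading days is a common convention rather than something stated in the abstract.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualised Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - risk_free_rate for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


# Hypothetical per-period returns of two agents whose bets partly offset each other.
agent_a = [0.02, -0.01, 0.03, -0.02]
agent_b = [-0.01, 0.02, -0.01, 0.03]

# Assumed combination rule: equal weight on each agent's portfolio.
combined = [(a + b) / 2 for a, b in zip(agent_a, agent_b)]

sharpe_a = sharpe_ratio(agent_a)
sharpe_b = sharpe_ratio(agent_b)
sharpe_combined = sharpe_ratio(combined)
print(sharpe_a, sharpe_b, sharpe_combined)
```

Because the two invented streams move in opposite directions in most periods, the combined stream has a much smaller standard deviation than either agent alone, which lifts its Sharpe ratio.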


2017 ◽  
Author(s):  
Ji Zhou ◽  
Christopher Applegate ◽  
Albor Dobon Alonso ◽  
Daniel Reynolds ◽  
Simon Orford ◽  
...  

Background: Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch processing of image datasets, multiple trait analysis, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic.
Results: Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different platforms. To facilitate diverse scientific user communities, we provide three versions of the software, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and an interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat (Triticum aestivum) growth studies at the Norwich Research Park (NRP, UK). By quantifying growth phenotypes over time, we are able to identify diverse plant growth patterns based on a variety of key growth-related phenotypes under varied experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices and still produced reliable biologically relevant outputs, we believe that our automated analysis workflow and customised computer vision based feature extraction algorithms can facilitate a broader plant research community for their growth and development studies. Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, our software can not only contribute to biological research, but also exhibit how to utilise existing open numeric and scientific libraries (including Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, efficiently and effectively.
Conclusions: Leaf-GP is a comprehensive software application that provides three approaches to quantify multiple growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and high-performance computing clusters (HPC), with open Python-based scientific libraries preinstalled. We share our modulated source code and executables (.exe for Windows; .app for Mac) together with this paper to serve the plant research community. The software, source code and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.


2017 ◽  
Vol 41 (4) ◽  
pp. 387-409
Author(s):  
Jing Chen ◽  
Randall Jackson

The year 2015 marked the fiftieth anniversary of West Virginia University’s (WVU) Regional Research Institute (RRI), which has played an important role in many scientific collaboration networks. Through social network analysis (SNA) focusing on the RRI research community since its inception in 1965, this article illustrates the role that organizations and the networks they promote can play in scientific problem domains, promoting scholarly collaborations and coauthorship in the field of regional science. We analyzed an evolving WVU RRI coauthorship network that has grown and gained in complexity over time in terms of (1) global metrics, (2) components and cluster analysis, (3) centrality, and (4) PageRank and AuthorRank. The results of these analyses depict a well-developed and influential scientific collaboration structure within both WVU and the regional science research community.
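As a rough illustration of the PageRank part of this analysis, the sketch below runs power iteration on a tiny undirected coauthorship graph. The four author labels and their links are hypothetical, the damping factor of 0.85 is the conventional default rather than a value taken from the article, and AuthorRank (which weights edges by coauthorship intensity) is not covered.

```python
def pagerank(neighbors, damping=0.85, iterations=100):
    """Power-iteration PageRank on a graph given as {node: [linked nodes]}."""
    nodes = list(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same "teleport" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            links = neighbors[node]
            if not links:
                # An author with no coauthors spreads rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
            else:
                share = damping * rank[node] / len(links)
                for other in links:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical network: author A has coauthored with B, C and D,
# who have not written with one another. Each undirected tie is listed both ways.
coauthors = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
}

ranks = pagerank(coauthors)
most_central = max(ranks, key=ranks.get)
print(most_central, ranks)
```

In this toy network the hub author A collects the largest share of rank, which is the kind of signal such measures use to flag influential members of a collaboration network.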


1982 ◽  
Vol 2 (2) ◽  
pp. 103-113 ◽  
Author(s):  
Debora L. Dubreuil ◽  
Nicholas P. Spanos ◽  
Lorne D. Bertrand

Two interrelated experiments investigated the hypothesis that hypnotic amnesia dissipates “spontaneously” over time. After the administration of a hypnotic amnesia suggestion for a previously learned word list, Retest subjects in Experiment 1 received two successive recall challenges before cancellation of the suggestion. Delay subjects received only one challenge. It occurred at the same time that Retest subjects received their second challenge. No differences in amount of amnesia were found between the Delay trial and either of the Retest trials, thereby failing to provide support for the dissipation hypothesis. Experiment 2 manipulated subjects' expectations concerning the amount of amnesia typically shown on a second challenge. Subjects in the “Remember More” or “Remember Less” conditions were led to believe that they would recall either more or less critical material on the second amnesia challenge. Both Retest (no expectancy) and Remember More subjects recalled significantly more words on the second challenge than on the first. However, a significantly greater proportion of Remember More subjects than Retest subjects showed recall increments on the second challenge. Remember Less subjects showed no significant difference in the amount recalled on the two challenges. These results do not support the dissipation hypothesis of hypnotic amnesia. Instead, they are consistent with theoretical accounts of amnesia as a strategic enactment carried out by subjects in response to the unfolding social demands of the testing situation.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Tihana Škrinjarić ◽  
Zrinka Lovretin Golubić ◽  
Zrinka Orlović

Purpose: This paper aims to analyze how investors' sentiment, return and risk series affect one another for selected exchange rates. The empirical analysis examines the time-varying interdependence between the observed variables, with a focus on spillovers between them.
Design/methodology/approach: Monthly data on the Sentix index and the EUR–USD, EUR–CHF and EUR–JPY exchange rates are analyzed from February 2003 to December 2019. The applied methodology consists of vector autoregression (VAR) models with Diebold and Yilmaz (2009, 2011) spillover indices.
Findings: The results of the empirical research indicate that static analysis could lead to misleading conclusions. Dynamic analysis indicates that the financial crisis of 2007-2008 and specific negative events increase the spillovers of shocks between the observed variables for all three exchange rates. The sources of shocks in the model change over time because variables switch between being net emitters and net receivers of shocks.
Research limitations/implications: The main shortfall of this study is its monthly data frequency, the only frequency the authors could obtain at the time of writing. Investors are interested in new information on a weekly and daily basis, not only monthly.
Practical implications: As the obtained results are in line with previous literature and were found to be robust, there is potential to use such analysis when forecasting risk and return series for portfolio management purposes. A basic comparison was therefore made of investment strategies based on the estimation results. It showed that using information about shock spillovers could result in strategies that achieve better portfolio value over time than basic benchmark strategies.
Originality/value: First, this paper allows for the spillovers of shocks between variables within the VAR models in all directions. Second, a dynamic analysis is included in the study. Third, the mentioned spillover indices are included in the study as well.
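The spillover measures mentioned above can be sketched once a forecast error variance decomposition (FEVD) from an estimated VAR model is available. The sketch below skips the VAR estimation entirely and starts from a hypothetical 3x3 FEVD table for sentiment, return and risk. The numbers are invented, and the row-normalised layout (row i holds the shares of variable i's forecast error variance attributed to shocks in each variable) is the usual convention, assumed here.

```python
def spillover_measures(fevd):
    """Total, directional and net spillovers from a row-normalised FEVD table."""
    n = len(fevd)
    # Share of each variable's forecast error variance that comes from other variables.
    from_others = [sum(fevd[i][j] for j in range(n) if j != i) for i in range(n)]
    # Contribution of each variable's shocks to the variance of all other variables.
    to_others = [sum(fevd[i][j] for i in range(n) if i != j) for j in range(n)]
    # Positive values mark net emitters of shocks, negative values net receivers.
    net = [given - received for given, received in zip(to_others, from_others)]
    total_index = 100.0 * sum(from_others) / sum(sum(row) for row in fevd)
    return total_index, from_others, to_others, net


variables = ["sentiment", "return", "risk"]

# Hypothetical FEVD shares; each row sums to 1.
fevd = [
    [0.80, 0.15, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.10, 0.80],
]

total_index, from_others, to_others, net = spillover_measures(fevd)
print(f"total spillover index: {total_index:.2f}%")
for name, value in zip(variables, net):
    role = "net emitter" if value > 0 else "net receiver"
    print(f"{name}: {value:+.2f} ({role})")
```

Re-estimating the table over rolling windows and repeating this calculation is one way to obtain the kind of dynamic picture the abstract describes, in which variables switch between emitting and receiving shocks.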

