Simulating Multi-Asset Classes Prices Using Wasserstein Generative Adversarial Network: A Study of Stocks, Futures and Cryptocurrency

2022 ◽  
Vol 15 (1) ◽  
pp. 26
Author(s):  
Feng Han ◽  
Xiaojuan Ma ◽  
Jiheng Zhang

Financial data are expensive and highly sensitive with limited access. We aim to generate abundant datasets given the original prices while preserving the original statistical features. We introduce the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) into the stock, futures and cryptocurrency markets. We train our model on various datasets, including the Hong Kong stock market, Hang Seng Index Composite stocks, precious metal futures contracts listed on the Chicago Mercantile Exchange and Japan Exchange Group, and cryptocurrency spots and perpetual contracts on Binance at various minute-level intervals. We quantify the difference between the generated results (836,280 data points) and the original data by MAE, MSE, RMSE and K-S distances. Results show that WGAN-GP can simulate asset prices and has potential as a market simulator for trading analysis. We might be the first to look into multiple asset classes systematically at minute intervals across stocks, futures and cryptocurrency markets. We also contribute a quantitative methodology for comparing the quality of generated and original price data.
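The four distance measures named in this abstract can be illustrated with a minimal Python sketch. The function names, the toy price lists and the pure-Python two-sample K-S statistic are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math
from bisect import bisect_right


def mae(real, fake):
    """Mean absolute error between paired observations."""
    return sum(abs(r - f) for r, f in zip(real, fake)) / len(real)


def mse(real, fake):
    """Mean squared error between paired observations."""
    return sum((r - f) ** 2 for r, f in zip(real, fake)) / len(real)


def rmse(real, fake):
    """Root mean squared error."""
    return math.sqrt(mse(real, fake))


def ks_distance(real, fake):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(real), sorted(fake)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical toy prices, not data from the paper.
real_prices = [100.0, 101.0, 102.0, 103.0]
fake_prices = [100.5, 100.5, 102.5, 102.5]

print(mae(real_prices, fake_prices))
print(mse(real_prices, fake_prices))
print(rmse(real_prices, fake_prices))
print(ks_distance(real_prices, fake_prices))
```

MAE, MSE and RMSE compare the series point by point, whereas the K-S distance compares only the two distributions, so the two kinds of measure can disagree on how close a generated series is.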

2021 ◽  
Vol 11 (5) ◽  
pp. 2166
Author(s):  
Van Bui ◽  
Tung Lam Pham ◽  
Huy Nguyen ◽  
Yeong Min Jang

In the last decade, predictive maintenance has attracted a lot of attention in industrial factories because of its wide use of the Internet of Things and artificial intelligence algorithms for data management. However, in the early phases, when abnormal and faulty machines rarely appeared in factories, there were limited sets of machine fault samples. With limited fault samples, it is difficult to train a fault classification model because of the imbalance of input data. Therefore, data augmentation was required to increase the accuracy of the learning model. However, there were limited methods to generate and evaluate the data applied for data analysis. In this paper, we introduce a method of using the generative adversarial network as the fault signal augmentation method to enrich the dataset. The enhanced data set could increase the accuracy of the machine fault detection model in the training process. We also performed fault detection using a variety of preprocessing approaches and classification models to evaluate the similarities between the generated data and authentic data. The generated fault data has high similarity with the original data and significantly improves the accuracy of the model. The accuracy of faulty machine detection reaches 99.41% with 20% of the original fault machine data set and 93.1% with 0% of the original fault machine data set (using generated data only). Based on this, we concluded that the generated data could be mixed with original data to improve the model performance.


Author(s):  
Chi Seng Pun ◽  
Lei Wang ◽  
Hoi Ying Wong

Modern day trading practice resembles a thought experiment, where investors imagine various possible futures of the stock market and invest accordingly. The generative adversarial network (GAN) is highly relevant to this trading practice in two ways. First, GAN generates synthetic data by a neural network that is technically indistinguishable from reality, which guarantees the reasonableness of the experiment. Second, GAN generates multitudes of fake data, which implements half of the experiment. In this paper, we present a new architecture of GAN and adapt it to the portfolio risk minimization problem by adding a regression network to GAN (implementing the second half of the experiment). The new architecture is termed GANr. Battling against two distinct networks, a discriminator and a regressor, GANr's generator aims to simulate a stock market that is close to reality while allowing for all possible scenarios. The resulting portfolio resembles a robust portfolio with data-driven ambiguity. Our empirical studies show that the GANr portfolio is more resilient to bleak financial scenarios than CLSGAN and LASSO portfolios.


2019 ◽  
Vol 1 (2) ◽  
pp. 99-120 ◽  
Author(s):  
Tongtao Zhang ◽  
Heng Ji ◽  
Avirup Sil

We propose a new framework for entity and event extraction based on generative adversarial imitation learning, an inverse reinforcement learning method using a generative adversarial network (GAN). We assume that instances and labels vary in difficulty, so the gains and penalties (rewards) are expected to be diverse. We utilize discriminators to estimate proper rewards according to the difference between the labels assigned by the ground truth (expert) and by the extractor (agent). Our experiments demonstrate that the proposed framework outperforms state-of-the-art methods.


Symmetry ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 126
Author(s):  
Zhixian Yin ◽  
Kewen Xia ◽  
Ziping He ◽  
Jiangnan Zhang ◽  
Sijie Wang ◽  
...  

The use of low-dose computed tomography (LDCT) in medical practice can effectively reduce the radiation risk of patients, but it may increase noise and artefacts, which can compromise diagnostic information. Methods based on deep learning can effectively improve image quality, but most of them use a training set of aligned image pairs, which are difficult to obtain in practice. To solve this problem, on the basis of the Wasserstein generative adversarial network (GAN) framework, we propose a generative adversarial network combining multi-perceptual loss and fidelity loss. Multi-perceptual loss uses the high-level semantic features of the image to suppress noise by minimizing the difference between the LDCT image and the normal-dose computed tomography (NDCT) image in the feature space. In addition, L2 loss is computed between the generated image and the original image to constrain the difference between the denoised image and the original image, ensuring that the image generated by the network from unpaired images is not distorted. Experiments show that the proposed method performs comparably to current deep learning methods that use paired images for image denoising.


2020 ◽  
Author(s):  
Mi Jiaqi ◽  
Hao Xia ◽  
Yang Si ◽  
Gao Wanlin ◽  
Li Minzan ◽  
...  

Abstract Background: The manual identification of rare plants is always a challenging problem in plant taxonomy. Although the convolutional neural network (CNN), a deep learning method, can automatically classify plant samples after training, its accuracy struggles to match human visual discrimination because of the limited number of training samples. Thus, effective data enhancement is vital to improve the generalization ability and robustness of deep learning models, especially for small-scale plant data classification tasks. Unlike traditional methods, the generative adversarial network (GAN) mimics the original data distribution and produces new samples with similar features, which can help classifiers achieve strong generalization ability. Data enhancement tailored to the characteristics of plant samples with GAN has not yet been studied. Result: In this study, we present a novel GAN model named Residual Wasserstein GAN (Res-WGAN) for data enhancement. To further adapt to small-scale plant datasets, residual blocks were introduced into the classic WGAN-GP as the basic network unit. These blocks enrich the representational capacity while keeping the number of parameters unchanged. Moreover, we adopt the idea from SRGAN of adding content loss to the final loss function, which guarantees the similarity between generated samples and original samples in high-dimensional features. Benefiting from these improvements, the proposed Res-WGAN expanded the original datasets efficiently. We test it with ResNet, and the experimental results show that the new datasets, combined with transfer learning, significantly improved classification accuracy, especially on test data. To illustrate the generalization of the model, additional small datasets are used for expansion and classification in this paper.
Conclusions: Our work reports accuracy competitive with other data enhancement methods, and a user study confirms that it is an ideal alternative strategy for enhancing small-scale plant datasets. Developing robust and effective classification methods for small-scale plant datasets to replace expert testimony is highly relevant to the development of agricultural automation.


2020 ◽  
Author(s):  
Belén Vega-Márquez ◽  
Cristina Rubio-Escudero ◽  
Isabel Nepomuceno-Chamorro

Abstract The generation of synthetic data is becoming a fundamental task in the daily life of any organization due to the new data protection laws that are emerging. Because of the rise in the use of Artificial Intelligence, one of the most recent proposals to address this problem is the use of Generative Adversarial Networks (GANs). These types of networks have demonstrated a great capacity to create synthetic data with very good performance. The goal of synthetic data generation is to create data that will perform similarly to the original dataset for many analysis tasks, such as classification. The problem with GANs in a classification setting is that they do not take class labels into account when generating new data; the label is treated like any other attribute. This research work has focused on the creation of new synthetic data from datasets with different characteristics with a Conditional Generative Adversarial Network (CGAN). CGANs are an extension of GANs where the class label is taken into account when the new data is generated. We measured the performance of our results in two different ways: firstly, by comparing the results obtained with classification algorithms, both on the original datasets and on the generated data; secondly, by checking that the correlation between the original data and those generated is minimal.
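The second evaluation step, checking the correlation between original and generated data, could be sketched as follows. The abstract does not say which correlation measure was used; Pearson's coefficient, the function name `pearson` and the toy columns below are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns: one attribute from the original data and two
# candidate generated versions of it.
original = [1.0, 2.0, 3.0, 4.0]
generated_copy = [2.0, 4.0, 6.0, 8.0]      # a rescaled copy of the original
generated_other = [1.0, -1.0, -1.0, 1.0]   # no linear relation to the original

print(pearson(original, generated_copy))
print(pearson(original, generated_other))
```

Under this reading, a coefficient near 1 would flag generated rows that merely reproduce the original records, while a coefficient near 0 would indicate that the generated rows are not linear copies of them.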


2019 ◽  
Vol 11 (23) ◽  
pp. 6699
Author(s):  
Suyang Zhou ◽  
Zijian Hu ◽  
Zhi Zhong ◽  
Di He ◽  
Meng Jiang

The convergence of energy security and environmental protection has given birth to the development of integrated energy systems (IES). However, the different physical characteristics and complex coupling of different energy sources have deeply troubled researchers. With the rapid development of AI and big data, some attempts to apply data-driven methods to IES have been made. Data-driven technologies aim to abandon complex IES modeling, instead mining the mapping relationships between different parameters based on massive volumes of operating data. However, integrated energy system construction is still in the initial stage of development and operational data are difficult to obtain, or the operational scenarios contained in the data are not enough to support data-driven technologies. In this paper, we first propose an IES operating scenario generator, based on a Generative Adversarial Network (GAN), to produce high quality IES operational data, including energy price, load, and generator output. We estimate the quality of the generated data, in both visual and quantitative aspects. Secondly, we propose a control strategy based on the Q-learning algorithm for a renewable energy and storage system with high uncertainty. The agent can accurately map between the control strategy and the operating states. Furthermore, we use the original data set and the expanded data set to train an agent; the latter works better, confirming that the generated data complements the original data set and enriches the running scenarios.
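For readers unfamiliar with Q-learning, the standard tabular update rule is sketched below in Python. The state and action names, the learning rate `alpha` and the discount factor `gamma` are hypothetical, and the abstract does not state whether the authors use a tabular or an approximate variant.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step and return the updated Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


# Hypothetical two-state storage example: the state is the battery level,
# the action is whether to charge or discharge.
q_table = {
    "low": {"charge": 0.0, "discharge": 0.0},
    "high": {"charge": 0.0, "discharge": 1.0},
}

updated = q_update(q_table, "low", "charge", reward=1.0, next_state="high")
print(updated)
```

Each step moves the estimate for the chosen state-action pair toward the observed reward plus the discounted value of the best action in the next state, which is how an agent can come to map operating states to control actions.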


2021 ◽  
Vol 2 (4) ◽  
Author(s):  
Yuta Takahashi ◽  
Han-ten Chang ◽  
Akie Nakai ◽  
Rina Kagawa ◽  
Hiroyasu Ando ◽  
...  

Abstract Machine learning, applied to medical data, can uncover new knowledge and support medical practices. However, analyzing medical data by machine learning methods presents a trade-off between accuracy and privacy. To overcome the trade-off, we apply the data collaboration analysis method to medical data. This method uses artificial dummy data to enable analysis that compares distributed information without using the original data. The purpose of our experiment is to identify patients diagnosed with diabetes mellitus (DM), using 29,802 instances of real data obtained from the University of Tsukuba Hospital between 01/03/2013 and 30/09/2018. The whole data is divided into a number of datasets to simulate different hospitals. We propose the following improvements for the data collaboration analysis: (1) making the dummy data realistic and (2) using non-linear reconverting functions into the comparable space. These can be realized using the generative adversarial network (GAN) and Node2Vec, respectively. Dummy data generated with GAN scores more than 10% higher than dummy data generated from random numbers. Furthermore, re-conversion by Node2Vec with GAN anchor data scores about 20% higher than the linear method with random dummy data. Our results reveal that the data collaboration method with appropriate modifications, depending on data type, improves analysis performance.


GigaScience ◽  
2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Ruichen Rong ◽  
Shuang Jiang ◽  
Lin Xu ◽  
Guanghua Xiao ◽  
Yang Xie ◽  
...  

Abstract Background Trillions of microbes inhabit the human body and have a profound effect on human health. The recent development of metagenome-wide association studies and other quantitative analysis methods accelerates the discovery of associations between the human microbiome and diseases. To assess the strengths and limitations of these analytical tools, simulating realistic microbiome datasets is critically important. However, simulating real microbiome data is challenging because it is difficult to model their correlation structure using explicit statistical models. Results To address the challenge of simulating realistic microbiome data, we designed a novel simulation framework termed MB-GAN, by using a generative adversarial network (GAN) and utilizing methodology advancements from the deep learning community. MB-GAN can automatically learn from given microbial abundances and compute simulated abundances that are indistinguishable from them. In practice, MB-GAN showed the following advantages. First, MB-GAN avoids explicit statistical modeling assumptions, and it only requires real datasets as inputs. Second, unlike traditional GANs, MB-GAN is easily applicable and can converge efficiently. Conclusions By applying MB-GAN to a case-control gut microbiome study of 396 samples, we demonstrated that the simulated data and the original data had similar first-order and second-order properties, including sparsity, diversities, and taxa-taxa correlations. These advantages make MB-GAN suitable for further microbiome methodology development where high-fidelity microbiome data are needed.
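As an illustration of the kind of summary statistics compared in the conclusions, the sketch below computes per-sample sparsity and one diversity measure in Python. Choosing the Shannon index is an assumption (the abstract only says "diversities"), and the function names and toy abundances are hypothetical.

```python
import math


def sparsity(abundances):
    """Fraction of taxa with zero abundance in one sample."""
    return sum(1 for a in abundances if a == 0) / len(abundances)


def shannon_diversity(abundances):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical relative abundances of four taxa in one sample.
sample = [0.5, 0.5, 0.0, 0.0]

print(sparsity(sample))
print(shannon_diversity(sample))
```

Computing such statistics per sample for both the real and the simulated data, and then comparing their distributions, is one way a similarity claim of this kind could be checked.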

