Automatic Melody Composition Using Enhanced GAN

In traditional music composition, the composer draws on specialized knowledge of music and combines emotion and creative experience to create music. As computer technology has evolved, various music-related technologies have been developed. Creating new music takes a considerable amount of time, so a system that can automatically compose music from input music is needed. This study proposes a novel melody composition method that enhances the original generative adversarial network (GAN) model and operates on individual bars. The enhanced GAN model uses two discriminators: a long short-term memory (LSTM) model that ensures correlation between bars, and a convolutional neural network (CNN) model that ensures the rationality of the bar structure. Experiments were conducted using bar encoding and the enhanced GAN model to compose new melodies and to evaluate the quality of the composed melodies. For the evaluation, the TF-IDF algorithm was used to calculate the structural differences between four types of Musical Instrument Digital Interface (MIDI) file: randomly composed melodies, melodies composed by the original GAN, melodies composed by the proposed method, and real melodies. Using the TF-IDF algorithm, the structure of the melody composed by the proposed method was compared with that of the real melody, and the structure of the traditional melody was likewise compared with that of the real melody. The experimental results showed that the melody composed by the proposed method was closer in structure to the real melody, with a difference of only 8%, than the traditional melody was.
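
As a rough illustration of the TF-IDF comparison mentioned in this abstract, the sketch below treats each melody as a list of bar tokens, weights each token by TF-IDF, and compares melodies by cosine similarity. The bar-token strings, the smoothed IDF formula, and the choice of cosine similarity are all assumptions made for illustration; the abstract does not say how the paper encodes bars or turns TF-IDF weights into a structural difference.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {token: weight} dict per document (a list of tokens)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(token for doc in docs for token in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            # Smoothed IDF (an assumption) keeps every weight positive and finite.
            token: (count / total)
            * (math.log((1 + n_docs) / (1 + df[token])) + 1.0)
            for token, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical bar encodings: each string stands for one encoded bar.
real = ["C-E-G", "F-A-C", "G-B-D", "C-E-G"]
generated = ["C-E-G", "F-A-C", "G-B-D", "A-C-E"]
random_melody = ["X-1", "X-2", "X-3", "X-4"]

vec_real, vec_generated, vec_random = tfidf_vectors(
    [real, generated, random_melody]
)
sim_generated = cosine_similarity(vec_real, vec_generated)
sim_random = cosine_similarity(vec_real, vec_random)
```

In this toy setup the generated melody shares three bars with the real one, so its similarity is higher than that of the random melody, which shares none. A structural "difference" could then be taken as one minus the similarity, though that too is an assumption here.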

A generative adversarial network–based method for generating negative financial samples

International Journal of Distributed Sensor Networks ◽

10.1177/1550147720907053 ◽

2020 ◽

Vol 16 (2) ◽

pp. 155014772090705

Author(s):

Zhaohui Zhang ◽

Lijun Yang ◽

Ligong Chen ◽

Qiuwen Liu ◽

Ying Meng ◽

...

Keyword(s):

Evaluation Method ◽

Short Term Memory ◽

Data Distribution ◽

Real Data ◽

Generation Model ◽

Short Term ◽

Generative Adversarial Network ◽

Term Memory ◽

Adversarial Network ◽

Long Short Term Memory

In the financial anti-fraud field, negative samples are few and sparse, which creates a serious sample imbalance problem. Generating negative samples that are consistent with the original data, so that the imbalance is resolved naturally, is a difficult problem. This article proposes a new method to solve it. We introduce a new generation model that combines a Generative Adversarial Network with a Long Short-Term Memory network for one-dimensional negative financial samples. The long short-term memory layer learns the characteristic associations between transaction sequences, and the generator covers the real data distribution through an adversarial discriminator that operates on time sequences. Mapping the data distribution to a feature space is a common evaluation method for synthetic data; however, relationships between data attributes have been ignored in online transactions. We define a comprehensive evaluation method that assesses the validity of generated samples in terms of both data distribution and attribute characteristics. Experimental results on real bank B2B transaction data show that the proposed model achieves a higher overall rating, 10% higher than traditional generation models. Finally, the well-trained model is used to generate negative samples and form a new dataset. Classification results on the new datasets show that precision and recall are both higher than those of baseline models. Our work has practical value and provides a new approach to the imbalance problem in other fields.

Generative adversarial network for glioblastoma ensures morphologic variations and improves diagnostic model for isocitrate dehydrogenase mutant type

Scientific Reports ◽

10.1038/s41598-021-89477-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Ji Eun Park ◽

Dain Eun ◽

Ho Sung Kim ◽

Da Hyun Lee ◽

Ryoung Woo Jang ◽

...

Keyword(s):

Isocitrate Dehydrogenase ◽

Synthetic Data ◽

Diagnostic Model ◽

Generative Adversarial Network ◽

Classification Rate ◽

Mutant Type ◽

Clinical Model ◽

The Real ◽

Adversarial Network ◽

Synthetic Images

Generative adversarial network (GAN) creates synthetic images to increase data quantity, but whether GAN ensures meaningful morphologic variations is still unknown. We investigated whether GAN-based synthetic images provide sufficient morphologic variations to improve molecular-based prediction in a rare disease, isocitrate dehydrogenase (IDH)-mutant glioblastoma. GAN was initially trained on 500 normal brains and 110 IDH-mutant high-grade astrocytomas, and paired contrast-enhanced T1-weighted and FLAIR MRI data were generated. Diagnostic models were developed from real IDH-wild type (n = 80) with real IDH-mutant glioblastomas (n = 38), or with synthetic IDH-mutant glioblastomas, or augmented by adding both real and synthetic IDH-mutant glioblastomas. Turing tests showed that the synthetic data appeared realistic (classification rate of 55%). Both the real and synthetic data showed that a more frontal or insular location (odds ratio [OR] 1.34 vs. 1.52; P = 0.04) and distinct non-enhancing tumor margins (OR 2.68 vs. 3.88; P < 0.001) were significant predictors of IDH-mutation. In an independent validation set, diagnostic accuracy was higher for the augmented model (90.9% [40/44] and 93.2% [41/44] for each reader, respectively) than for the real model (84.1% [37/44] and 86.4% [38/44] for each reader, respectively). The GAN-based synthetic images yield morphologically variable, realistic-seeming IDH-mutant glioblastomas. GAN will be useful to create a realistic training set in terms of morphologic variations and quality, thereby improving diagnostic performance in a clinical model.

Convolutional Neural Network Audio Classifier for Alarm Sound Detection

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8866.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 4554-4557

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Short Term Memory ◽

Sound Recognition ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Differential Network ◽

Sound Detection ◽

Long Short Term Memory ◽

Lstm Network

Artificial neural networks (ANNs) have evolved through many stages over the last three decades, with many researchers contributing to this challenging field. With the power of mathematics, ANNs can solve complex problems. ANNs such as the Convolutional Neural Network (CNN), Deep Neural Network, Generative Adversarial Network (GAN), Long Short Term Memory (LSTM) network, Recurrent Neural Network (RNN), and Ordinary Differential Network are playing promising roles in many MNCs and IT industries because of their predictions and accuracy. In this paper, a Convolutional Neural Network is used to predict beep sounds at high noise levels. Based on supervised learning, the research develops the best CNN architecture for beep sound recognition in noisy situations. The proposed method gives better results, with an accuracy of 96%. The prototype was tested with a few architectures on the training and test data, of which a two-layer CNN classifier gave the best predictions.

An enhanced OCT image captioning system to assist ophthalmologists in detecting and classifying eye diseases

Journal of X-Ray Science and Technology ◽

10.3233/xst-200697 ◽

2020 ◽

Vol 28 (5) ◽

pp. 975-988

Author(s):

Sivamurugan Vellakani ◽

Indumathi Pushbam

Keyword(s):

Short Term Memory ◽

Clinical Decision ◽

True Positive Rate ◽

Eye Diseases ◽

Age Related Macular Degeneration ◽

Superior Performance ◽

Image Captioning ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Age Related

The human eye is affected by different eye diseases, including choroidal neovascularization (CNV), diabetic macular edema (DME) and age-related macular degeneration (AMD). This work aims to design an artificial intelligence (AI) based clinical decision support system for eye disease detection and classification that helps ophthalmologists detect and classify CNV, DME and drusen more effectively using Optical Coherence Tomography (OCT) images depicting different tissues. The methodology for designing this system involves different deep learning convolutional neural network (CNN) models and long short-term memory (LSTM) networks. The best image captioning model is selected after a performance analysis comparing nine different image captioning systems for assisting ophthalmologists in detecting and classifying eye diseases. The quantitative analysis shows that the image captioning models designed using DenseNet201 with LSTM achieve superior performance, with an overall accuracy of 0.969, a positive predictive value of 0.972 and a true-positive rate of 0.969 using OCT images enhanced by the generative adversarial network (GAN). The corresponding performance values for the Xception with LSTM image captioning models are 0.969, 0.969 and 0.938, respectively. Thus, these two models yield superior performance and have the potential to assist ophthalmologists in making optimal diagnostic decisions.
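
For readers less familiar with the three metrics reported in this abstract, the sketch below computes accuracy, positive predictive value and true-positive rate from confusion-matrix counts for a single class. The function name and the counts are hypothetical and are not taken from the paper; the abstract also does not say how per-class values were combined into the overall figures, so this shows only the standard binary definitions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives,
    true negatives and false negatives for one class.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must allow every ratio to be defined")
    return {
        "accuracy": (tp + tn) / total,  # share of all cases classified correctly
        "ppv": tp / (tp + fp),          # positive predictive value (precision)
        "tpr": tp / (tp + fn),          # true-positive rate (recall, sensitivity)
    }


# Hypothetical counts for one disease class; not results from the paper.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these made-up counts, accuracy is 185/200, positive predictive value is 90/95 and true-positive rate is 90/100, which shows how the three numbers can differ for the same classifier.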

Remaining Useful Life Estimation Using Deep Convolutional Generative Adversarial Networks Based on an Autoencoder Scheme

Computational Intelligence and Neuroscience ◽

10.1155/2020/9601389 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14

Author(s):

Guisheng Hou ◽

Shuo Xu ◽

Nan Zhou ◽

Lei Yang ◽

Quanhao Fu

Keyword(s):

Feature Extraction ◽

Short Term Memory ◽

Health Management ◽

Remaining Useful Life ◽

Fine Tuning ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Useful Life ◽

Prediction Approach

Accurate predictions of the remaining useful life (RUL) of important components play a crucial role in system reliability, which is the basis of prognostics and health management (PHM). This paper proposes an integrated deep learning approach for RUL prediction of a turbofan engine by integrating an autoencoder (AE) with a deep convolutional generative adversarial network (DCGAN). In the pretraining stage, the reconstructed data of the AE not only participate in its error reconstruction but also take part in the DCGAN parameter training as the generated data of the DCGAN. Through double-error reconstructions, the capability of feature extraction is enhanced, and high-level abstract information is obtained. In the fine-tuning stage, a long short-term memory (LSTM) network is used to extract the sequential information from the features to predict the RUL. The effectiveness of the proposed scheme is verified on the NASA commercial modular aero-propulsion system simulation (C-MAPSS) dataset. The superiority of the proposed method is demonstrated via excellent prediction performance and comparisons with other existing state-of-the-art prognostics. The results of this study suggest that the proposed data-driven prognostic method offers a new and promising prediction approach and an efficient feature extraction scheme.

Implementation of generative adversarial network-CLS combined with bidirectional long short-term memory for lithium-ion battery state prediction

Journal of Energy Storage ◽

10.1016/j.est.2020.101489 ◽

2020 ◽

Vol 31 ◽

pp. 101489 ◽

Cited By ~ 1

Author(s):

Haoliang Zhang ◽

Wei Tang ◽

Woonki Na ◽

Pyeong-Yeon Lee ◽

Jonghoon Kim

Keyword(s):

Lithium Ion Battery ◽

Short Term Memory ◽

Lithium Ion ◽

Short Term ◽

Generative Adversarial Network ◽

Term Memory ◽

Adversarial Network ◽

State Prediction ◽

Long Short Term Memory

Vehicle trajectory prediction and generation using LSTM models and GANs

PLoS ONE ◽

10.1371/journal.pone.0253868 ◽

2021 ◽

Vol 16 (7) ◽

pp. e0253868

Author(s):

Luca Rossi ◽

Andrea Ajmar ◽

Marina Paolanti ◽

Roberto Pierdicca

Keyword(s):

Traffic Congestion ◽

Short Term Memory ◽

Autonomous Driving ◽

Generative Models ◽

Trajectory Prediction ◽

Generative Adversarial Network ◽

Displacement Error ◽

Adversarial Network ◽

Vehicle Trajectories ◽

Vehicle Trajectory

Vehicles’ trajectory prediction is a topic with growing interest in recent years, as there are applications in several domains ranging from autonomous driving to traffic congestion prediction and urban planning. Predicting trajectories starting from Floating Car Data (FCD) is a complex task that comes with different challenges, namely Vehicle to Infrastructure (V2I) interaction, Vehicle to Vehicle (V2V) interaction, multimodality, and generalizability. These challenges, especially, have not been completely explored by state-of-the-art works. In particular, multimodality and generalizability have been neglected the most, and this work attempts to fill this gap by proposing and defining new datasets, metrics, and methods to help understand and predict vehicle trajectories. We propose and compare Deep Learning models based on Long Short-Term Memory and Generative Adversarial Network architectures; in particular, our GAN-3 model can be used to generate multiple predictions in multimodal scenarios. These approaches are evaluated with our newly proposed error metrics N-ADE and N-FDE, which normalize some biases in the standard Average Displacement Error (ADE) and Final Displacement Error (FDE) metrics. Experiments have been conducted using newly collected datasets in four large Italian cities (Rome, Milan, Naples, and Turin), considering different trajectory lengths to analyze error growth over a larger number of time-steps. The results prove that, although LSTM-based models are superior in unimodal scenarios, generative models perform best in those where the effects of multimodality are higher. Space-time and geographical analysis are performed, to prove the suitability of the proposed methodology for real cases and management services.
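
The standard Average Displacement Error (ADE) and Final Displacement Error (FDE) named in this abstract can be sketched as follows, assuming trajectories are equal-length lists of two-dimensional points and distances are Euclidean. The function name and the sample trajectories are hypothetical. The normalized N-ADE and N-FDE variants are not defined in the abstract, so they are not attempted here.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between predicted and true position at each time step.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)  # mean error over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde


# Hypothetical trajectories: the prediction drifts by one unit per step.
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
predicted = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, ground_truth)
```

Here the per-step errors are 0, 1 and 2, so ADE is 1.0 and FDE is 2.0. Because FDE looks only at the last step, it grows faster than ADE when error accumulates over longer trajectories.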

GAN-LSTM Joint Network Applied to Seismic Array Noise Signal Recognition

Applied Sciences ◽

10.3390/app11219987 ◽

2021 ◽

Vol 11 (21) ◽

pp. 9987

Author(s):

Jian Li ◽

Dongwei Hei ◽

Gaofeng Cui ◽

Mengmin He ◽

Juan Wang ◽

...

Keyword(s):

Data Processing ◽

Short Term Memory ◽

Noise Signal ◽

Seismic Exploration ◽

Phase Identification ◽

Time Data ◽

Generative Adversarial Network ◽

Accurate Identification ◽

Adversarial Network ◽

Explosion Monitoring

The purpose of seismic data processing in nuclear explosion monitoring is to accurately and reliably detect seismic or explosion events from complex ambient noises. Accurate detection and identification of seismic phases are of great significance to the detection and parameter estimation of seismic events. In seismic phase identification, discriminating between noise signals and real seismic signals is essential. Accurate identification of noise signals helps reduce false detections, improves the accuracy of automatic bulletins, and relieves the workload of analysts. At the same time, in seismic exploration, the prime objective in data processing is also to enhance the signal and suppress the noises. In this study, we combined a generative adversarial network (GAN) with a long short-term memory network (LSTM) to discriminate between noise and phases in seismic waveforms recorded by the International Monitoring System (IMS) array MKAR. First, using the beamforming data of the array as the input, we obtained the signal features of seismic phases through the learning of the GAN discriminator network. Then, we input these features and trained the joint network on mixed seismic phase and noise data, and successfully classified seismic phases and noise signals with a recall of 95.28% and 97.64%, respectively. Based on this model, we established a real-time data processing method, then validated the effectiveness of this method with real 2019 data of MKAR. We also verified whether improved noise signal identification improves the quality of phase association and event detection.

Detecting and Measuring Defects in Wafer Die Using GAN and YOLOv3

Applied Sciences ◽

10.3390/app10238725 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8725

Author(s):

Ssu-Han Chen ◽

Chih-Hsiang Kang ◽

Der-Baau Perng

Keyword(s):

Training Image ◽

Decimal Place ◽

Average Precision ◽

Generative Adversarial Network ◽

The Real ◽

Adversarial Network ◽

Image Set ◽

Bounding Boxes ◽

Realistic Images ◽

Quality Sorting

This research used deep learning methods to develop a set of algorithms to detect die particle defects. Generative adversarial network (GAN) generated natural and realistic images, which improved the ability of you only look once version 3 (YOLOv3) to detect die defects. Then defects were measured based on the bounding boxes predicted by YOLOv3, which potentially provided the criteria for die quality sorting. The pseudo defective images generated by GAN from the real defective images were used as the training image set. The results obtained after training with the combination of the real and pseudo defective images were 7.33% higher in testing average precision (AP) and more accurate by one decimal place in testing coordinate error than after training with the real images alone. The GAN can enhance the diversity of defects, which improves the versatility of YOLOv3 somewhat. In summary, the method of combining GAN and YOLOv3 employed in this study creates a feature-free algorithm that does not require a massive collection of defective samples and does not require additional annotation of pseudo defects. The proposed method is feasible and advantageous for cases that deal with various kinds of die patterns.

Speech Emotion Recognition on Small Sample Learning by Hybrid WGAN-LSTM Networks

Journal of Circuits System and Computers ◽

10.1142/s0218126622500736 ◽

2021 ◽

Author(s):

Cunwei Sun ◽

Luping Ji ◽

Hailing Zhong

Keyword(s):

Emotion Recognition ◽

Language Processing ◽

Short Term Memory ◽

Small Sample ◽

New Method ◽

Small Samples ◽

Speech Emotion Recognition ◽

Generative Adversarial Network ◽

Adversarial Network ◽

In Series

Speech emotion recognition based on deep networks with small samples is often a very challenging problem in natural language processing. The massive number of parameters in a deep network is very difficult to train reliably on a small quantity of speech samples. To address this problem, we propose a new method based on the systematic cooperation of a Generative Adversarial Network (GAN) and Long Short Term Memory (LSTM). The method uses adversarial training of the GAN’s generator and discriminator on speech spectrogram images to achieve sufficient sample augmentation. A six-layer convolutional neural network (CNN), followed in series by a two-layer LSTM, is designed to extract features from speech spectrograms. To accelerate network training, the parameters of the discriminator are transferred to our feature extractor. Through sample augmentation, a well-trained feature extraction network and an efficient classifier can be achieved. Tests and comparisons on two publicly available datasets, EMO-DB and IEMOCAP, show that our new method is effective and is often superior to some state-of-the-art methods.
