Guitar Chord Sensing and Recognition Using Multi-Task Learning and Physical Data Augmentation with Robotics

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6077
Author(s):  
Gerelmaa Byambatsogt ◽  
Lodoiravsal Choimaa ◽  
Gou Koutaki

In recent years, many researchers have shown increasing interest in music information retrieval (MIR) applications, with automatic chord recognition being one of the popular tasks. Many studies have demonstrated considerable improvement in automatic chord recognition using deep-learning-based models. However, most existing models have focused on simple chord recognition, which classifies the root note with the major, minor, and seventh chords. Furthermore, in learning-based recognition, it is critical to collect large amounts of high-quality training data to achieve the desired performance. In this paper, we present a multi-task learning (MTL) model for guitar chord recognition, trained on a relatively large-vocabulary guitar chord dataset. To address data scarcity, we employ a physical data augmentation method that records the chord dataset directly from a robotic performer. Deep-learning-based MTL is proposed to improve the performance of automatic chord recognition with the physically augmented dataset. The proposed MTL model is compared with four baseline models and its corresponding single-task learning model using two types of datasets: a human dataset and a human dataset combined with the augmented dataset. The proposed methods outperform the baseline models, and the results show that most scores of the proposed multi-task learning model are better than those of the corresponding single-task learning model. The experimental results demonstrate that physical data augmentation is an effective method for increasing the dataset size for guitar chord recognition tasks.
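To make the multi-task idea concrete, the sketch below combines the losses of two classification heads into one training objective. It is only an illustration: the abstract does not say which sub-tasks the model uses, so the two heads (root note and chord quality), the weighting parameter `quality_weight`, and all numbers are assumptions.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target_index])


def multitask_loss(root_probs, root_target, quality_probs, quality_target,
                   quality_weight=1.0):
    """Weighted sum of two per-task cross-entropy losses.

    The two-head setup and the weighting scheme are hypothetical; they are
    not taken from the paper.
    """
    root_loss = cross_entropy(root_probs, root_target)
    quality_loss = cross_entropy(quality_probs, quality_target)
    return root_loss + quality_weight * quality_loss


# Made-up predictions from two output heads for a single audio frame.
root_probs = [0.7, 0.2, 0.1]   # three candidate root notes
quality_probs = [0.5, 0.5]     # two candidate chord qualities
loss = multitask_loss(root_probs, 0, quality_probs, 1)
```

Because both heads feed one scalar loss, a shared feature extractor would receive gradients from both tasks, which is the usual motivation for multi-task training.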

2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly because of the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientations (viewing angles). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network; most network reduction methods use computationally expensive supervised learning and apply to either the convolutional or the fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.


2020 ◽  
Author(s):  
Yun Zhang ◽  
Ling Wang ◽  
Xinqiao Wang ◽  
Chengyun Zhang ◽  
Jiamin Ge ◽  
...  

<p><b>Abstract:</b> Effective and rapid deep learning methods for predicting chemical reactions contribute to research and development in organic chemistry and drug discovery. Despite the outstanding capability of deep learning in retrosynthesis and forward synthesis, predictions based on small chemical datasets generally have low accuracy because of insufficient reaction examples. Here, we introduce a new state-of-the-art method that integrates transfer learning with a transformer model to predict the outcomes of the Baeyer-Villiger reaction, a representative small-dataset reaction. The results demonstrate that introducing a transfer learning strategy markedly improves the top-1 accuracy of the transformer-transfer learning model (81.8%) over that of the transformer-baseline model (58.4%). Moreover, we introduce data augmentation of the input reaction SMILES, which further improves the accuracy of the transformer-transfer learning model (86.7%). In summary, both transfer learning and data augmentation significantly improve the predictive performance of the transformer model and are powerful methods for overcoming the restriction of limited training data in chemistry.</p>


Author(s):  
Tuấn Nguyên Hoài Đức ◽  
Trần Tiện Lợi Long Tứ ◽  
Lê Đình Việt Huy

We built a model for labelling the Predicate Argument Structure (PAS) of biomedical documents. PAS is important semantic information in any document because it reveals the main event mentioned in each sentence. Extracting the PAS of a sentence is an important premise for solving a series of other problems related to text semantics, such as event extraction, named entity extraction, and question answering. The predicate argument structure is domain dependent. Therefore, the biomedical field requires a completely new predicate argument frame compared with the general domain. For a machine learning model to work well with a new argument frame, a new feature set must be identified, which is difficult, manual, and requires a lot of expert labor. To address this challenge, we trained our model with a deep learning method using Bi-directional Long Short-Term Memory. Deep learning does not require feature sets to be defined manually. In addition, we integrate Highway Connections between hidden layers to minimize derivative loss. To overcome the problem of a small training corpus, we combine deep learning with multi-task learning, which lets the main task (PAS tagging) be complemented with knowledge learnt from a closely related task, named entity recognition (NER). Our model achieved F1 = 75.13% without any manually designed features, showing the promise of deep learning in this domain. The experimental results also show that multi-task learning is an appropriate technique for overcoming the shortage of training data in biomedical fields, as it improves the F1 score.


2020 ◽  
Vol 185 ◽  
pp. 03021
Author(s):  
Meng Zhou ◽  
Rui Wang ◽  
Peng Fu ◽  
Yang Bai ◽  
Ligang Cui

As the most common malignancy of the endocrine system, thyroid cancer is usually diagnosed by discriminating malignant nodules from benign ones using ultrasonography, whose interpretation depends primarily on the subjective judgement of the radiologist. In this study, we propose a novel cascade deep learning model that performs automatic, objective diagnosis during ultrasound examination to assist radiologists in distinguishing benign from malignant thyroid nodules. First, a simplified U-net is employed to automatically segment the region of interest (ROI) of the thyroid nodules in each frame of the ultrasound image. Then, to alleviate the limitation that medical training datasets are relatively small, an improved Conditional Variational Auto-Encoder (CVAE) that learns the probability distribution of the ROI images is trained to generate new images for data augmentation. Finally, ResNet50 is trained with both original and generated ROI images. As a consequence, the deep learning model formed by cascading the trained U-net and the trained ResNet50 achieves malignant thyroid nodule recognition with an accuracy of 87.4%, a sensitivity of 92%, and a specificity of 86.8%.
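For readers less familiar with the three reported measures, the following sketch shows how accuracy, sensitivity, and specificity are derived from confusion-matrix counts. The counts are made up for illustration and do not reproduce the study's data; the function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn count malignant cases classified correctly/incorrectly,
    tn/fp count benign cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of malignant cases detected
        "specificity": tn / (tn + fp),   # share of benign cases correctly cleared
    }


# Illustrative counts only (not from the paper): 50 malignant, 100 benign cases.
metrics = diagnostic_metrics(tp=40, fp=10, tn=90, fn=10)
```

With unbalanced classes, accuracy is a weighted mix of the other two measures, which is why all three are usually reported together.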


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1052
Author(s):  
Leang Sim Nguon ◽  
Kangwon Seo ◽  
Jung-Hyun Lim ◽  
Tae-Jun Song ◽  
Sung-Hyun Cho ◽  
...  

Mucinous cystic neoplasms (MCN) and serous cystic neoplasms (SCN) account for a large portion of solitary pancreatic cystic neoplasms (PCN). In this study we implemented a convolutional neural network (CNN) model using ResNet50 to differentiate between MCN and SCN. The training data were collected retrospectively from 59 MCN and 49 SCN patients from two different hospitals. Data augmentation was used to enhance the size and quality of training datasets. Fine-tuning training approaches were utilized by adopting the pre-trained model from transfer learning while training selected layers. Testing of the network was conducted by varying the endoscopic ultrasonography (EUS) image sizes and positions to evaluate the network performance for differentiation. The proposed network model achieved up to 82.75% accuracy and a 0.88 (95% CI: 0.817–0.930) area under curve (AUC) score. The performance of the implemented deep learning networks in decision-making using only EUS images is comparable to that of traditional manual decision-making using EUS images along with supporting clinical information. Gradient-weighted class activation mapping (Grad-CAM) confirmed that the network model learned the features from the cyst region accurately. This study proves the feasibility of diagnosing MCN and SCN using a deep learning network model. Further improvement using more datasets is needed.


2021 ◽  
Vol 13 (10) ◽  
pp. 2003
Author(s):  
Daeyong Jin ◽  
Eojin Lee ◽  
Kyonghwan Kwon ◽  
Taeyun Kim

In this study, we used convolutional neural networks (CNNs), well-known deep learning models suited to image data processing, to estimate the temporal and spatial distribution of chlorophyll-a in a bay. The training data required to construct the deep learning model were acquired from satellite ocean color data and a hydrodynamic model. Chlorophyll-a, total suspended sediment (TSS), visibility, and colored dissolved organic matter (CDOM) were extracted from the satellite ocean color data, and water level, currents, temperature, and salinity were generated from the hydrodynamic model. We developed CNN Model I, which estimates the concentration of chlorophyll-a using the overall 48 × 27 image, and CNN Model II, which uses 7 × 7 segmented images. Because CNN Model II conducts estimation using only data around the points of interest, its quantity of training data is more than 300 times larger than that of CNN Model I. Consequently, it was possible to extract and analyze the inherent patterns in the training data, improving the predictive ability of the deep learning model. The average root mean square error (RMSE), calculated by applying CNN Model II, was 0.191, and when the prediction was good, the coefficient of determination (R²) exceeded 0.91. Finally, we performed a sensitivity analysis, which revealed that CDOM is the most influential variable in estimating the spatiotemporal distribution of chlorophyll-a.
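The gain in training samples from switching to small windows can be sketched as follows. This is a minimal illustration that assumes stride-1 windows with no padding and no masking of invalid pixels; the abstract does not describe how the segments were actually cut, and the function names are hypothetical.

```python
def count_patches(height, width, patch):
    """Number of patch x patch windows that fit fully inside the grid (stride 1, no padding)."""
    return (height - patch + 1) * (width - patch + 1)


def extract_patches(image, patch):
    """Return every patch x patch window of a 2-D list, scanning row by row."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(height - patch + 1):
        for left in range(width - patch + 1):
            patches.append([row[left:left + patch] for row in image[top:top + patch]])
    return patches


# Placeholder grid with 27 rows and 48 columns; the values are just cell indices.
image = [[r * 48 + c for c in range(48)] for r in range(27)]
patches = extract_patches(image, 7)
print(len(patches))  # 882
```

Under these assumptions a single 48 × 27 grid yields 882 windows instead of one full-image sample. The exact ratio in the study depends on details the abstract does not give.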


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are widely used for various missions in both the civilian and military sectors. Many of these missions require UAVs to perceive and understand the environments they navigate. This perception can be realized by training a machine to classify objects in the environment. One well-known training approach is supervised deep learning. However, supervised deep learning comes at a high cost in time and computational resources: collecting large amounts of input data, pre-training steps such as labeling the training data, and the need for a high-performance computer for training are some of the challenges it poses. To address these challenges, this study proposes mission-specific input data augmentation techniques and a lightweight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes of 10,000 different images each were used as input data, of which 80% were used for training the network and the remaining 20% for validation. To optimize the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
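The 80/20 division of the ten classes can be sketched as below. The abstract does not say whether the split was made per class or over the pooled data, so the per-class shuffling, the fixed seed, and the integer image ids are all assumptions for illustration.

```python
import random


def split_per_class(items_by_class, train_fraction=0.8, seed=0):
    """Shuffle each class separately and split it into training and validation parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validation = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(item, label) for item in shuffled[:cut]]
        validation += [(item, label) for item in shuffled[cut:]]
    return train, validation


# Ten placeholder classes with 10,000 integer image ids each.
dataset = {label: list(range(10_000)) for label in range(10)}
train, validation = split_per_class(dataset)
print(len(train), len(validation))  # 80000 20000
```

Splitting within each class keeps the class balance identical in both parts, so every class contributes 8,000 training and 2,000 validation images.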


2021 ◽  
Author(s):  
J. Annrose ◽  
N. Herald Anantha Rufus ◽  
C. R. Edwin Selva Rex ◽  
D. Godwin Immanuel

Bean, botanically called Phaseolus vulgaris L., belongs to the Fabaceae family. In bean disease identification, unnecessary economic losses occur because of delayed treatment, incorrect treatment, and lack of knowledge. Existing deep learning and machine learning techniques face several issues, such as high computational complexity, high cost associated with the training data, long execution time, noise, feature dimensionality, low accuracy, and low speed. To tackle these problems, we propose a hybrid deep learning model with an Archimedes optimization algorithm (HDL-AOA) for bean disease classification. In this work, there are five bean classes: one healthy class and four classes indicating different diseases, namely bean halo blight, Pythium diseases, Rhizoctonia root rot, and anthracnose abnormalities, acquired from the Soybean (Large) Data Set. The hybrid deep learning technique combines wavelet packet decomposition (WPD) and long short-term memory (LSTM). Initially, the WPD decomposes the input images into four sub-series. For these sub-series, four LSTM networks were developed. During bean disease classification, an Archimedes optimization algorithm (AOA) enhances the classification accuracy of the multiple single LSTM networks. The HDL-AOA model is implemented in MATLAB. The proposed model achieves a lower MAPE than other existing methods. Finally, the proposed HDL-AOA model achieves excellent classification results on evaluation measures such as accuracy, specificity, sensitivity, precision, recall, and F-score.


2019 ◽  
Author(s):  
Mojtaba Haghighatlari ◽  
Gaurav Vishwakarma ◽  
Mohammad Atif Faiz Afzal ◽  
Johannes Hachmann

<div><div><div><p>We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.</p></div></div></div>

