Imitation Learning System Design with Small Training Data for Flexible Tool Manipulation

2021 ◽  
Vol 15 (5) ◽  
pp. 669-677
Author(s):  
Harumo Sasatake ◽  
Ryosuke Tasaki ◽  
Takahito Yamashita ◽  
Naoki Uchiyama ◽  
...  

Population aging has become a major problem in developed countries. As the labor force declines, robot arms are expected to replace human labor for simple tasks. A robotic arm is fitted with a tool specialized for a task and acquires the required movement through teaching by an engineer with expert knowledge. However, the number of such engineers is limited; therefore, a teaching method that non-technical personnel can use is needed. Deep learning can serve as such a teaching method by imitating human behavior and tool usage, but it requires a large amount of training data. In this study, the target task of the robot is to sweep multiple pieces of dirt using a broom. The proposed learning system can estimate the initial parameters for deep learning based on experience, as well as on the shape and physical properties of the tools, and can thereby reduce the amount of training data needed when learning a new tool. A virtual reality system is used to move the robot arm easily and safely, as well as to create training data for imitation. Cleaning experiments are conducted to evaluate the effectiveness of the proposed method. The experimental results confirm that the proposed method can accelerate the learning speed of deep learning and acquire cleaning ability using a small amount of training data.
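The core idea of estimating initial parameters from experience with previous tools can be sketched as a similarity-weighted blend of previously learned weights. This is a minimal illustrative sketch, not the paper's actual estimator; the tool feature vectors, the inverse-distance similarity, and the function name are all assumptions.

```python
import numpy as np

def warm_start_weights(prior_models, new_tool_features):
    """Estimate initial network weights for a new tool from past experience.

    prior_models: list of (tool_feature_vector, weight_matrix) pairs saved from
    previously learned tools; new_tool_features describes the new tool's shape
    and physical properties. Illustrative sketch only.
    """
    feats = np.array([f for f, _ in prior_models], dtype=float)
    new = np.asarray(new_tool_features, dtype=float)
    # Similarity = inverse Euclidean distance in tool-feature space
    dists = np.linalg.norm(feats - new, axis=1)
    sims = 1.0 / (dists + 1e-6)
    sims /= sims.sum()
    # Initial parameters = similarity-weighted average of prior weights
    return sum(s * w for s, (_, w) in zip(sims, prior_models))
```

A network warm-started this way should need fewer demonstrations than one initialized randomly, which is the effect the experiments report.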

Heart ◽  
2018 ◽  
Vol 104 (23) ◽  
pp. 1921-1928 ◽  
Author(s):  
Ming-Zher Poh ◽  
Yukkee Cheung Poh ◽  
Pak-Hei Chan ◽  
Chun-Ka Wong ◽  
Louise Pun ◽  
...  

Objective: To evaluate the diagnostic performance of a deep learning system for automated detection of atrial fibrillation (AF) in photoplethysmographic (PPG) pulse waveforms.

Methods: We trained a deep convolutional neural network (DCNN) to detect AF in 17 s PPG waveforms using a training data set of 149 048 PPG waveforms constructed from several publicly available PPG databases. The DCNN was validated using an independent test data set of 3039 smartphone-acquired PPG waveforms from adults at high risk of AF at a general outpatient clinic against ECG tracings reviewed by two cardiologists. Six established AF detectors based on handcrafted features were evaluated on the same test data set for performance comparison.

Results: In the validation data set (3039 PPG waveforms) consisting of three sequential PPG waveforms from 1013 participants (mean (SD) age, 68.4 (12.2) years; 46.8% men), the prevalence of AF was 2.8%. The area under the receiver operating characteristic curve (AUC) of the DCNN for AF detection was 0.997 (95% CI 0.996 to 0.999) and was significantly higher than that of all the other AF detectors (AUC range: 0.924–0.985). The sensitivity of the DCNN was 95.2% (95% CI 88.3% to 98.7%), specificity was 99.0% (95% CI 98.6% to 99.3%), positive predictive value (PPV) was 72.7% (95% CI 65.1% to 79.3%) and negative predictive value (NPV) was 99.9% (95% CI 99.7% to 100%) using a single 17 s PPG waveform. Using the three sequential PPG waveforms in combination (<1 min in total), the sensitivity was 100.0% (95% CI 87.7% to 100%), specificity was 99.6% (95% CI 99.0% to 99.9%), PPV was 87.5% (95% CI 72.5% to 94.9%) and NPV was 100% (95% CI 99.4% to 100%).

Conclusions: In this evaluation of PPG waveforms from adults screened for AF in a real-world primary care setting, the DCNN had high sensitivity, specificity, PPV and NPV for detecting AF, outperforming other state-of-the-art methods based on handcrafted features.
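The four reported metrics follow directly from the confusion-matrix counts. A small sketch of how sensitivity, specificity, PPV and NPV are computed from binary labels and predictions (the function name is ours; the formulas are standard):

```python
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary arrays
    (1 = AF present / detected, 0 = absent / not detected)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

Note how the low AF prevalence (2.8%) explains the gap between the near-perfect NPV and the lower PPV: even a few false positives weigh heavily when true positives are rare.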


2018 ◽  
Vol 8 (8) ◽  
pp. 1397 ◽  
Author(s):  
Veronica Morfi ◽  
Dan Stowell

In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good quality performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve the training performance when dealing with this kind of low-resource datasets. We evaluate three data-efficient approaches of training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that different methods of training have different advantages and disadvantages.
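One standard device for training from weak labels of this kind is to pool per-frame network outputs into a clip-level prediction, so only recording-level presence labels are needed. The max-pooling loss below is an illustrative assumption; it does not reproduce the paper's specific intermediate tasks.

```python
import numpy as np

def weak_label_loss(frame_probs, clip_labels, eps=1e-9):
    """Clip-level loss for weakly labelled audio (no temporal annotations).

    frame_probs: (n_frames, n_classes) per-frame event probabilities from the
    stacked CNN-RNN; clip_labels: (n_classes,) binary presence labels for the
    whole recording.
    """
    clip_probs = frame_probs.max(axis=0)  # event present if any frame fires
    bce = -(clip_labels * np.log(clip_probs + eps)
            + (1 - clip_labels) * np.log(1 - clip_probs + eps))
    return float(bce.mean())
```

Because the loss only constrains the per-clip maximum, the network is free to localize events in time, which is how temporal information can emerge without temporal labels.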


2019 ◽  
Vol 31 (3) ◽  
pp. 376-389 ◽  
Author(s):  
Congying Guan ◽  
Shengfeng Qin ◽  
Yang Long

Purpose: The big challenge in apparel recommendation system research is not the exploration of machine learning technologies in fashion, but to really understand clothes, fashion and people, and to know what to learn. The purpose of this paper is to explore an advanced apparel style learning and recommendation system that can recognise deep design-associated features of clothes and learn the connotative meanings conveyed by these features relating to style and the body, so that it can make recommendations like a skilled human expert.

Design/methodology/approach: This study first proposes a new type of clothes style training data. Second, it designs three intelligent apparel-learning models based on the newly proposed training data, including ATTRIBUTE, MEANING and the raw image data, and compares the models' performances in order to identify the best learning model. For deep learning, two models are introduced to train the prediction model: one is a convolutional neural network joined with the baseline classifier support vector machine, and the other uses a newly proposed classifier, later kernel fusion.

Findings: The results show that the most accurate model (with an average prediction rate of 88.1 per cent) is the third model, which is designed with two steps: one predicts apparel ATTRIBUTEs from the apparel images, and the other further predicts apparel MEANINGs based on the predicted ATTRIBUTEs. The results indicate that adding the proposed ATTRIBUTE data, which captures the deep features of clothes design, does improve model performance (e.g. from 73.5 per cent, Model B, to 86 per cent, Model C), and that the new concept of apparel recommendation based on style meanings is technically applicable.

Originality/value: The apparel data and the design of the three training models are originally introduced in this study. The proposed methodology can evaluate the pros and cons of different clothes feature extraction approaches, through either images or design attributes, and balance different machine learning technologies between the latest CNN and the traditional SVM.
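The two-step structure of the best model (images → ATTRIBUTEs → MEANINGs) can be sketched with two chained predictors. Both stages are plain linear maps here, standing in for the paper's CNN plus SVM / later-kernel-fusion classifiers; the class name, weights and dimensions are illustrative assumptions.

```python
import numpy as np

class TwoStageStyleModel:
    """Sketch of the two-step design: image features -> ATTRIBUTEs -> MEANINGs."""

    def __init__(self, w_img_to_attr, w_attr_to_meaning):
        self.w1 = w_img_to_attr        # (img_dim, n_attributes)
        self.w2 = w_attr_to_meaning    # (n_attributes, n_meanings)

    def predict_attributes(self, img_feats):
        # Step 1: predict binary design attributes from the image features
        return (img_feats @ self.w1 > 0).astype(float)

    def predict_meanings(self, img_feats):
        # Step 2: predict style-meaning scores from the predicted attributes
        return self.predict_attributes(img_feats) @ self.w2
```

Routing the meaning prediction through an explicit attribute layer is what lets the system explain its recommendations in terms of design features, mirroring how a human expert would justify a choice.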


2020 ◽  
Vol 34 (05) ◽  
pp. 9225-9232
Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

Information extraction (IE) aims to produce structured information from an input text, e.g., Named Entity Recognition and Relation Extraction. Various approaches to IE have been proposed via feature engineering or deep learning. However, most of them fail to model the complex relationships inherent in the task itself, which have proven to be especially crucial. For example, the relation between two entities is highly dependent on their entity types. These dependencies can be regarded as complex constraints that can be efficiently expressed as logical rules. To combine such logical reasoning capabilities with the learning capabilities of deep neural networks, we propose to integrate logical knowledge in the form of first-order logic into a deep learning system, which can be trained jointly in an end-to-end manner. The integrated framework is able to enhance neural outputs with knowledge regularization via logic rules and, at the same time, to update the weights of the logic rules to comply with the characteristics of the training data. We demonstrate the effectiveness and generalization of the proposed model on multiple IE tasks.
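The cited example, a relation depending on its entities' types, can be turned into a differentiable regularizer by penalizing relation probability mass that the type predictions say is forbidden. The soft-penalty formulation below is one simple way to do this, not the paper's exact formulation; the compatibility tensor and function name are assumptions.

```python
import numpy as np

def logic_rule_penalty(type_probs_a, type_probs_b, rel_probs, compatible):
    """Soft penalty for relations inconsistent with predicted entity types.

    compatible[t1, t2, r] = 1 if relation r is allowed between entity types
    t1 and t2 (encoding a first-order rule such as
    "works_for(x, y) requires PERSON(x) and ORG(y)").
    """
    # Probability that each relation is allowed, under the type distributions
    p_allowed = np.einsum('i,j,ijr->r', type_probs_a, type_probs_b, compatible)
    # Relation probability mass placed on likely-forbidden relations
    return float(np.sum(rel_probs * (1.0 - p_allowed)))
```

Adding such a term to the task loss lets gradients flow into both the neural predictions and, if the rule confidences are themselves parameters, the rule weights, matching the joint end-to-end training described above.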


2018 ◽  
Vol 1 (1) ◽  
pp. 192-204 ◽  
Author(s):  
Adrien CHAN-HON-TONG

Today, the two main security issues for deep learning are data poisoning and adversarial examples. Data poisoning consists of perverting a learning system by manipulating a small subset of the training data, while adversarial examples involve bypassing the system at testing time with a low-amplitude manipulation of the testing sample. Unfortunately, data poisoning that is invisible to the human eye can be generated by adding adversarial noise to the training data. The main contribution of this paper is a successful implementation of such invisible data poisoning on image classification datasets for a deep learning pipeline; this implementation leads to significant classification accuracy gaps.
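The mechanism, adversarial noise applied to training samples rather than test samples, can be sketched with an FGSM-style sign-of-gradient step. The linear scorer, squared-error objective and attacker-chosen target scores below are a minimal stand-in for the paper's deep-learning pipeline, not its actual attack.

```python
import numpy as np

def poison_training_set(x_train, w, target_scores, eps=0.01):
    """Add low-amplitude adversarial noise to training samples.

    Each pixel moves at most eps, so the poisoned images look unchanged to a
    human, yet the training signal is shifted toward the attacker's targets.
    """
    scores = x_train @ w                                         # (n,)
    grad = 2.0 * (scores - target_scores)[:, None] * w[None, :]  # d(loss)/dx
    return np.clip(x_train - eps * np.sign(grad), 0.0, 1.0)
```

The key property, mirrored in the test below, is that the perturbation is bounded by eps per pixel (invisible) while the model-facing scores of the samples move measurably.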



2012 ◽  
Vol 21 (4) ◽  
pp. 470-489 ◽  
Author(s):  
Amine Chellali ◽  
Cedric Dumas ◽  
Isabelle Milleville-Pennel

In interventional radiology, physicians require high haptic sensitivity and fine motor skills because of the limited real-time visual feedback of the surgical site. The transfer of this type of surgical skill to novices is a challenging issue. This paper presents a study on the design of a biopsy procedure learning system. Our methodology, based on a task-centered design approach, aims to bring out new design rules for virtual learning environments. A new collaborative haptic training paradigm is introduced to support human-haptic interaction in a virtual environment. The interaction paradigm supports haptic communication between two distant users to teach a surgical skill. In order to evaluate this paradigm, a user experiment was conducted. Sixty volunteer medical students participated in the study to assess the influence of the teaching method on their performance in a biopsy procedure task. The results show that, for skill transfer, combining haptic communication with verbal and visual communication improves the novices' performance compared to conventional teaching methods. Furthermore, the results show that, depending on the teaching method, participants developed different needle insertion profiles. We conclude that our interaction paradigm facilitates expert-novice haptic communication and improves skill transfer, and that the acquisition of new skills depends on the availability of different communication channels between experts and novices. Our findings indicate that the traditional fellowship methods in surgery should evolve to an off-patient collaborative environment that continues to support visual and verbal communication, but also haptic communication, in order to achieve better and more complete skills training.


2019 ◽  
Vol 9 (22) ◽  
pp. 4749
Author(s):  
Lingyun Jiang ◽  
Kai Qiao ◽  
Linyuan Wang ◽  
Chi Zhang ◽  
Jian Chen ◽  
...  

Decoding human brain activities, especially reconstructing human visual stimuli via functional magnetic resonance imaging (fMRI), has gained increasing attention in recent years. However, the high dimensionality and small quantity of fMRI data impose restrictions on satisfactory reconstruction, especially for reconstruction methods based on deep learning, which require huge amounts of labelled samples. Unlike deep learning methods, humans can recognize a new image because the human visual system is naturally capable of extracting features from any object and comparing them. Inspired by this visual mechanism, we introduced the mechanism of comparison into a deep learning method to realize better visual reconstruction, making full use of each sample and of the relationship within each sample pair by learning to compare. In this way, we propose a Siamese reconstruction network (SRN) method. Using the SRN, we obtained improved results on two fMRI recording datasets: 72.5% accuracy on the digit dataset and 44.6% accuracy on the character dataset. Essentially, this approach increases the training data from n samples to roughly 2n sample pairs, taking full advantage of the limited quantity of training samples. The SRN learns to draw together sample pairs of the same class and to disperse sample pairs of different classes in feature space.
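The "learning to compare" idea, pulling same-class pairs together and pushing different-class pairs apart, is commonly implemented with pair construction plus a contrastive loss. The all-pairs enumeration and the standard contrastive loss below are assumed stand-ins for the SRN's actual pairing scheme and objective.

```python
import numpy as np

def make_pairs(features, labels):
    """Expand n labelled fMRI samples into labelled sample pairs."""
    pairs, same = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            pairs.append((features[i], features[j]))
            same.append(int(labels[i] == labels[j]))
    return pairs, same

def contrastive_loss(f1, f2, same, margin=1.0):
    """Converge same-class embeddings; disperse different-class ones
    until they are at least `margin` apart in feature space."""
    d = float(np.linalg.norm(f1 - f2))
    return d ** 2 if same else max(0.0, margin - d) ** 2
```

Because every sample participates in many pairs, the effective training signal grows with the number of pairs even though the number of underlying samples stays fixed, which is the data-efficiency argument made above.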


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1579
Author(s):  
Dongqi Wang ◽  
Qinghua Meng ◽  
Dongming Chen ◽  
Hupo Zhang ◽  
Lisheng Xu

Automatic detection of arrhythmia is of great significance for the early prevention and diagnosis of cardiovascular disease. Traditional feature engineering methods based on expert knowledge lack the ability to abstract multidimensional, multi-view information and to represent data, so traditional pattern recognition research on arrhythmia detection has not achieved satisfactory results. Recently, with the rise of deep learning technology, automatic feature extraction from ECG data based on deep neural networks has been widely discussed. In order to exploit the complementary strengths of different schemes, in this paper we propose an arrhythmia detection method based on a multi-resolution representation (MRR) of ECG signals. This method uses four different up-to-date deep neural networks as four channel models for learning ECG vector representations. The deep learning based representations, together with hand-crafted ECG features, form the MRR, which is the input to the downstream classification strategy. Experimental results on multi-label classification of a large ECG dataset confirm that the F1 score of the proposed method is 0.9238, which is 1.31%, 0.62%, 1.18% and 0.6% higher than that of each channel model alone. From the perspective of architecture, the proposed method is highly scalable and can serve as an example for arrhythmia recognition.
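Structurally, the MRR amounts to concatenating the four channel models' learned embeddings with the hand-crafted features into one input vector, and the reported number is a micro-averaged F1 over multi-label predictions. The sketch below assumes illustrative dimensions and function names; it is not the paper's implementation.

```python
import numpy as np

def build_mrr(channel_embeddings, handcrafted_features):
    """Form the multi-resolution representation: the four channel models'
    learned ECG embeddings concatenated with hand-crafted ECG features."""
    return np.concatenate(list(channel_embeddings) + [handcrafted_features])

def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over multi-label predictions (the reported metric)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Keeping each channel model independent and fusing only at the representation level is what makes the scheme scalable: a fifth model just adds another block to the concatenation.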


2020 ◽  
pp. bjophthalmol-2020-317825
Author(s):  
Yonghao Li ◽  
Weibo Feng ◽  
Xiujuan Zhao ◽  
Bingqian Liu ◽  
Yan Zhang ◽  
...  

Background/aims: To apply deep learning technology to develop an artificial intelligence (AI) system that can identify vision-threatening conditions in high myopia patients based on optical coherence tomography (OCT) macular images.

Methods: In this cross-sectional, prospective study, a total of 5505 qualified OCT macular images obtained from 1048 high myopia patients admitted to Zhongshan Ophthalmic Centre (ZOC) from 2012 to 2017 were selected for the development of the AI system. The independent test dataset included 412 images obtained from 91 high myopia patients recruited at ZOC from January 2019 to May 2019. We adopted the InceptionResnetV2 architecture to train four independent convolutional neural network (CNN) models to identify the following four vision-threatening conditions in high myopia: retinoschisis, macular hole, retinal detachment and pathological myopic choroidal neovascularisation. Focal Loss was used to address class imbalance, and optimal operating thresholds were determined according to the Youden Index.

Results: In the independent test dataset, the areas under the receiver operating characteristic curves were high for all conditions (0.961 to 0.999). Our AI system achieved sensitivities equal to or even better than those of retina specialists as well as high specificities (greater than 90%). Moreover, our AI system provided a transparent and interpretable diagnosis with heatmaps.

Conclusions: We used OCT macular images for the development of CNN models to identify vision-threatening conditions in high myopia patients. Our models achieved reliable sensitivities and high specificities, comparable to those of retina specialists, and may be applied for large-scale high myopia screening and patient follow-up.
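The two training-time choices named in the methods, focal loss for class imbalance and Youden-index operating thresholds, are both standard and easy to sketch. The hyperparameters (gamma, alpha) and function names below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy examples so the rare positive
    class dominates the gradient under heavy class imbalance."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)          # prob assigned to the true class
    a = np.where(y == 1, alpha, 1 - alpha)   # class-balancing weight
    return float(np.mean(-a * (1 - pt) ** gamma * np.log(pt)))

def youden_threshold(probs, labels):
    """Operating threshold maximizing Youden's J = sensitivity + specificity - 1."""
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    best_t, best_j = 0.5, -1.0
    for t in np.unique(probs):
        pred = probs >= t
        sens = float(np.mean(pred[labels == 1]))
        spec = float(np.mean(~pred[labels == 0]))
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_t = j, float(t)
    return best_t
```

Choosing the threshold by Youden's J rather than a fixed 0.5 is what lets each of the four condition-specific CNNs trade sensitivity against specificity at its own optimal operating point.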

