Monitoring of Assembly Process Using Deep Learning Technology

Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4208
Author(s):  
Chengjun Chen ◽  
Chunlin Zhang ◽  
Tiannuo Wang ◽  
Dongnian Li ◽  
Yang Guo ◽  
...  

Monitoring the assembly process is a challenge in manual assembly for mass customization production, where the operator must change the assembly process according to the product. If an assembly error is not detected immediately, it may lead to further errors and a loss of time and money in subsequent assembly steps, and will affect product quality. To monitor the assembly process, this paper explores two methods: recognizing assembly actions and recognizing parts in complicated assembled products. For assembly action recognition, an improved three-dimensional convolutional neural network (3D CNN) model with batch normalization is proposed to detect a missing assembly action. For parts recognition, a fully convolutional network (FCN) is employed to segment and recognize the different parts of complicated assembled products, so that the assembly sequence can be checked for missing or misaligned parts. An assembly actions data set and an assembly segmentation data set are created. The experimental results for assembly action recognition show that the 3D CNN model with batch normalization reduces computational complexity, improves training speed and speeds up convergence while maintaining accuracy. The experimental results for the FCN show that FCN-2S provides a higher pixel recognition accuracy than the other FCNs.
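As a rough illustration of the batch normalization step mentioned in the abstract above, the sketch below standardizes one feature across a mini-batch and then applies a scale and shift. It is plain Python over scalar activations, not the paper's 3D CNN; the names `gamma`, `beta` and `eps` and their default values are assumptions chosen for the example.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    # Biased (population) variance over the mini-batch.
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

# Made-up activations for a single feature across a batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
```

After normalization the batch has approximately zero mean and unit variance, which is the property usually credited with faster and more stable training.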

2021 ◽  
Vol 11 (15) ◽  
pp. 7104
Author(s):  
Xu Yang ◽  
Ziyi Huan ◽  
Yisong Zhai ◽  
Ting Lin

Personalized recommendation based on knowledge graphs has become a research hot spot because of its good recommendation performance, and it is the subject of this paper. First, we study knowledge graph construction methods and build a movie knowledge graph, using the Neo4j graph database to store and visualize the movie data. We then study TransE, the classical translation model in knowledge graph representation learning, and improve it through a cross-training method that uses the neighboring feature structures of the entities in the knowledge graph. The negative sampling process of TransE is also improved. The experimental results show that the improved TransE model vectorizes entities and relations more accurately. Finally, we construct recommendation models by combining knowledge graphs with ranking learning and neural networks: the Bayesian personalized recommendation model based on knowledge graphs (KG-BPR) and the neural network recommendation model based on knowledge graphs (KG-NN). The semantic information of entities and relations in the knowledge graph is embedded into vector space using the improved TransE method, and we compare the results. The item entity vectors, which contain external knowledge information, are integrated into the BPR model and the neural network, respectively, compensating for the item's own lack of knowledge information. The experimental analysis is carried out on the MovieLens-1M data set. The results show that the two proposed recommendation models effectively improve the accuracy, recall, F1 value and MAP value of recommendation.
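For readers unfamiliar with TransE, the following minimal sketch shows the standard (unimproved) formulation: a relation is modeled as a translation so that head + relation ≈ tail, and training uses a margin-based ranking loss against a corrupted (negative) triple. The embeddings, entity names and margin value are invented for illustration, and the paper's cross-training and negative sampling improvements are not reproduced here.

```python
import math

def transe_distance(head, relation, tail):
    """L2 distance ||h + r - t||; smaller means the triple is more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Margin-based ranking loss: zero once the negative is farther by `margin`."""
    return max(0.0, margin + pos_dist - neg_dist)

# Toy 2-D embeddings (hypothetical values, not learned from any data set).
movie = [1.0, 0.0]
directed_by = [0.0, 1.0]
director = [1.0, 1.0]   # true tail: movie + directed_by lands exactly here
other = [4.0, 5.0]      # corrupted tail used as a negative sample

pos = transe_distance(movie, directed_by, director)
neg = transe_distance(movie, directed_by, other)
loss = margin_loss(pos, neg)
```

Here the true triple has distance 0 and the corrupted one is well beyond the margin, so the loss vanishes; a negative sample that lands close to the true tail would instead produce a positive loss and drive an update.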


Human activity recognition (HAR) is a prominent research topic in computer vision and image processing. It has enabled state-of-the-art applications in multiple sectors, including surveillance, digital entertainment and medical healthcare. Such movements are interesting to observe and intriguing to predict. Several sensor-based approaches, using accelerometers, gyroscopes and similar devices, have also been introduced to study and predict human activities, each with its own advantages and disadvantages [10]. In this paper, an intelligent human activity recognition system is developed. A convolutional neural network (CNN) with spatiotemporal three-dimensional (3D) kernels is trained on the Kinetics data set, which has 400 classes depicting human activities in everyday life and work, with 400 or more videos for each class. The 3D CNN model used is ResNet-34. The videos were temporally cut down and last around a tenth of a second. The trained model shows satisfactory performance in all stages of training and testing. Finally, the results show promising activity recognition across the 400 human action classes.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Shasha Sun ◽  
Chuanpeng Li ◽  
Ning Lv ◽  
Xiaoman Zhang ◽  
Zhaoyan Yu ◽  
...  

Sleep staging is an important basis for diagnosing sleep-related problems. In this paper, an attention-based convolutional network for automatic sleep staging is proposed. The network takes a time-frequency image as input and predicts the sleep stage for each 30-s epoch as output. For each CNN feature map, our model generates attention maps along two separate dimensions, time and filter, which are then multiplied to form the final attention map. A residual-like fusion structure is used to append the attention map to the input feature map for adaptive feature refinement. In addition, generalized mean pooling is introduced to obtain a global feature representation with less information loss. To demonstrate the efficacy of the proposed method, we compare it with two baseline methods on the sleep-EDF data set under different framework settings and input channel types. The experimental results show that the proposed model achieves significant improvements in overall accuracy, Cohen's kappa, MF1, sensitivity and specificity. Compared with state-of-the-art algorithms, the proposed network achieves an overall accuracy of 83.4%, a macro F1-score of 77.3%, κ = 0.77, a sensitivity of 77.1% and a specificity of 95.4%. The experimental results demonstrate the superiority of the proposed network.
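The generalized mean (GeM) pooling mentioned above can be sketched as follows. It computes ((1/n) Σ x_i^p)^(1/p) over non-negative activations, reducing to average pooling at p = 1 and approaching max pooling as p grows. The exponent values and the toy feature map are assumptions for illustration; the abstract does not state which p the authors use or whether it is learned.

```python
def gem_pool(values, p=3.0):
    """Generalized mean pooling over non-negative activations."""
    if p <= 0:
        raise ValueError("p must be positive")
    n = len(values)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)

# Made-up activations, e.g. one channel of a feature map after a ReLU.
feature_map = [1.0, 2.0, 3.0, 4.0]

avg_like = gem_pool(feature_map, p=1.0)   # equals the arithmetic mean
gem = gem_pool(feature_map, p=3.0)        # between the mean and the maximum
max_like = gem_pool(feature_map, p=50.0)  # approaches the maximum
```

Choosing p between the two extremes lets the pooled value emphasize strong activations without discarding the rest, which is one plausible reading of "less information loss" in the abstract.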


Author(s):  
Junyu Gao ◽  
Tianzhu Zhang ◽  
Changsheng Xu

Recently, with the ever-growing number of action categories, zero-shot action recognition (ZSAR) has been achieved by automatically mining the underlying concepts (e.g., actions, attributes) in videos. However, most existing methods only exploit the visual cues of these concepts and ignore external knowledge for modeling explicit relationships between them. In fact, humans have a remarkable ability to transfer knowledge learned from familiar classes to recognize unfamiliar classes. To narrow the knowledge gap between existing methods and humans, we propose an end-to-end ZSAR framework based on a structured knowledge graph, which can jointly model action-attribute, action-action, and attribute-attribute relationships. To effectively leverage the knowledge graph, we design a novel Two-Stream Graph Convolutional Network (TS-GCN) consisting of a classifier branch and an instance branch. Specifically, the classifier branch takes the semantic-embedding vectors of all the concepts as input, then generates the classifiers for action categories. The instance branch maps the attribute embeddings and scores of each video instance into an attribute-feature space. Finally, the generated classifiers are evaluated on the attribute features of each video, and a classification loss is adopted for optimizing the whole network. In addition, a self-attention module is utilized to model the temporal information of videos. Extensive experimental results on three realistic action benchmarks (Olympic Sports, HMDB51 and UCF101) demonstrate the favorable performance of our proposed framework.


2021 ◽  
Vol 7 ◽  
Author(s):  
Xiaoling Liang ◽  
Yuexin Zhang ◽  
Jiahong Wang ◽  
Qing Ye ◽  
Yanhong Liu ◽  
...  

A three-dimensional (3D) deep learning method is proposed, which enables the rapid diagnosis of coronavirus disease 2019 (COVID-19) and thus significantly reduces the burden on radiologists and physicians. Inspired by the fact that the current chest computed tomography (CT) datasets are diversified in equipment types, we propose a COVID-19 graph in a graph convolutional network (GCN) to incorporate multiple datasets that differentiate the COVID-19 infected cases from normal controls. Specifically, we first apply a 3D convolutional neural network (3D-CNN) to extract image features from the initial 3D-CT images. In this part, a transfer learning method is proposed to improve the performance, which uses the task of predicting equipment type to initialize the parameters of the 3D-CNN structure. Second, we design a COVID-19 graph in GCN based on the extracted features. The graph divides all samples into several clusters, and samples with the same equipment type compose a cluster. Then we establish edge connections between samples in the same cluster. To compute accurate edge weights, we propose to combine the correlation distance of the extracted features and the score differences of subjects from the 3D-CNN structure. Lastly, by inputting the COVID-19 graph into GCN, we obtain the final diagnosis results. In experiments, the dataset contains 399 COVID-19 infected cases, and 400 normal controls from six equipment types. Experimental results show that the accuracy, sensitivity, and specificity of our method reach 98.5%, 99.9%, and 97%, respectively.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 950
Author(s):  
Xu Yang ◽  
Dongjingdian Liu ◽  
Jing Liu ◽  
Faren Yan ◽  
Pengpeng Chen ◽  
...  

Deep learning technology has improved the performance of vision-based action recognition algorithms, but such methods require a large amount of labeled training data, resulting in weak universality. To address this issue, this paper proposes FOLLOWER, a novel self-deployable ubiquitous action recognition framework that enables a self-motivated user to bootstrap and deploy action recognition services. Our main idea is to build a "fingerprint" library of actions based on a small number of user-defined sample action data and then use a matching method to complete action recognition. The key step is constructing a suitable "fingerprint". To this end, a pose action normalized feature extraction method based on a three-dimensional pose sequence is designed. FOLLOWER is mainly composed of the guide process and the follow process. The guide process extracts the pose action normalized feature and selects the inner-class central feature to build a "fingerprint" library of actions. The follow process extracts the pose action normalized feature from the target video and uses motion detection, action filtering, and an adaptive weight offset template to identify the action in the video sequence. Finally, we collect an action video dataset with human pose annotation for research on self-deployable action recognition and action recognition based on pose estimation. Experiments on this dataset show that FOLLOWER can effectively recognize the actions in a video sequence, with recognition accuracy reaching 96.74%.


Author(s):  
J. K. Samarabandu ◽  
R. Acharya ◽  
D. R. Pareddy ◽  
P. C. Cheng

In the study of cell organization in a maize meristem, direct viewing of confocal optical sections in 3D (by means of 3D projection of the volumetric data set, Figure 1) becomes very difficult and confusing because of the large number of nuclei involved. Numerical description of the cellular organization (e.g. position, size and orientation of each structure) and computer graphic presentation are some of the solutions for effectively studying the structure of such a complex system. An attempt at data reduction by manually contouring cell nuclei in 3D was reported (Summers et al., 1990). Apart from being labour intensive, this 3D digitization technique suffers from the inaccuracies of manual 3D tracing related to the depth perception of the operator. However, it does demonstrate that reducing a stack of confocal images to a 3D graphic representation helps to visualize and analyze complex tissues (Figure 2). This procedure also significantly reduces the computational burden in an interactive operation.


Author(s):  
Weiping Liu ◽  
John W. Sedat ◽  
David A. Agard

Any real-world object is three-dimensional. The principle of tomography, which reconstructs the 3-D structure of an object from its 2-D projections at different view angles, has found application in many disciplines. Electron microscopic (EM) tomography on non-ordered structures (e.g., subcellular structures in biology and non-crystalline structures in material science) has been exercised sporadically in the last twenty years or so. Vital as the 3-D structural information is, and with no existing alternative 3-D imaging technique to compete in its high resolution range, the technique to date remains the kingdom of a brave few. Its tedious tasks have prevented it from becoming a routine tool. One key to promoting its popularity is automation: the data collection has been automated in our lab and can routinely yield a data set of over 100 projections in a matter of a few hours. Now the image processing part is also automated. Such automation makes the job easier, faster and better.


2020 ◽  
Vol 27 (4) ◽  
pp. 329-336 ◽  
Author(s):  
Lei Xu ◽  
Guangmin Liang ◽  
Baowen Chen ◽  
Xu Tan ◽  
Huaikun Xiang ◽  
...  

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause serious drug resistance in pathogenic bacteria, whereas drug resistance is becoming more serious as antibiotics continue to be used. Cell lytic enzymes are therefore a good choice for curing bacterial infections, and the study of cell wall lytic enzymes aims at finding an efficient way to do so. Cell lytic enzymes include endolysins and autolysins, which differ in the purpose for which the cell wall is broken. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: Our aim is to predict the type of a cell lytic enzyme. Because these enzymes help kill bacteria, studying their type is meaningful; however, detecting the type by experimental methods is time consuming. An efficient computational method for predicting the type of cell lytic enzyme is therefore proposed in our work. Method: We propose a computational method for the prediction of endolysins and autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Each protein is then represented by its tripeptide composition, and the features with larger confidence degree are selected. Finally, a support vector machine classifier is trained on the labeled vectors and used to predict the type of cell lytic enzyme. Results: The experimental results show that the overall accuracy can attain 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% in relative terms ((97.06 - 92.9)/92.9). The performance of the proposed method is stable when the number of selected features is between 40 and 70. The overall accuracy of the tripeptide optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%. The experimental results also demonstrate that the overall accuracy is improved by nearly 18% when using the tripeptide optimal feature set. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins, using a support vector machine to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy for identifying the type of cell lytic enzyme. The support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
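To make the feature representation in the abstract above concrete, the sketch below computes a tripeptide composition vector: the normalized frequency of each of the 20^3 = 8000 possible three-residue windows in a protein sequence. The example sequence is made up, and skipping windows with non-standard letters is an assumption; the paper's confidence-based feature selection and the support vector machine are not reproduced. The last line simply re-derives the relative improvement quoted in the Results.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TRIPEPTIDES = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]  # 8000 features
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}

def tripeptide_composition(sequence):
    """Return the 8000-dim vector of tripeptide frequencies for a sequence."""
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for i in range(len(sequence) - 2):
        tri = sequence[i:i + 3]
        if tri in INDEX:  # assumption: skip windows with non-standard letters
            counts[INDEX[tri]] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)

# Hypothetical 10-residue sequence, giving 8 overlapping tripeptide windows.
features = tripeptide_composition("MKTAYIAKQR")

# Relative improvement over Ding's method, using the figures in the abstract.
relative_gain = (97.06 - 92.9) / 92.9 * 100
```

With only 68 proteins and 8000 raw features, some form of feature selection is needed before training a classifier, which is consistent with the abstract reporting its best accuracy at 44 selected features.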

