convolution process
Recently Published Documents


TOTAL DOCUMENTS

37
(FIVE YEARS 21)

H-INDEX

5
(FIVE YEARS 1)

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yuanyao Lu ◽  
Qi Xiao ◽  
Haiyang Jiang

In recent years, deep learning has been applied to English lip-reading. However, Chinese lip-reading research started later, relevant datasets are scarce, and recognition accuracy is not yet ideal. Therefore, this paper proposes a new hybrid neural network model to build a Chinese lip-reading system. We integrate the attention mechanism into both the CNN and the RNN. Specifically, we add the convolutional block attention module (CBAM) to the ResNet50 network, enhancing its ability to capture the small differences among the mouth patterns of similarly pronounced Chinese words and improving feature extraction in the convolution process. We also add a temporal attention mechanism to the GRU network, which helps extract features from consecutive lip-motion images. Because the moments before and after a frame influence the current moment in lip-reading, we assign larger weights to key frames, making the extracted features more representative. We validate the model through experiments on our self-built dataset: with CBAM, the Chinese lip-reading model accurately recognizes the Chinese numbers 0–9 and some frequently used Chinese words. Compared with other lip-reading systems, ours achieves better performance and higher recognition accuracy.
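As a rough illustration of what CBAM adds on top of a backbone such as ResNet50, here is a minimal NumPy sketch of its two attention steps, channel then spatial. The shapes, the tiny shared MLP, and the replacement of CBAM's 7×7 convolution with a simple sum are simplifications for brevity, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """CBAM channel attention: a shared two-layer MLP scores avg- and
    max-pooled channel descriptors; feat: (C, H, W), w1: (C//r, C), w2: (C, C//r)."""
    avg = feat.mean(axis=(1, 2))                       # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))                         # (C,) max-pooled descriptor
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return feat * att[:, None, None]                   # reweight channels

def spatial_attention(feat):
    """CBAM spatial attention: sigmoid over channel-wise avg/max maps
    (CBAM's 7x7 conv is replaced by a plain sum in this sketch)."""
    avg = feat.mean(axis=0)                            # (H, W)
    mx = feat.max(axis=0)                              # (H, W)
    att = sigmoid(avg + mx)
    return feat * att[None, :, :]                      # reweight spatial positions

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))                  # toy feature map
w1 = rng.standard_normal((2, 8)) * 0.1                 # reduction ratio r = 4
w2 = rng.standard_normal((8, 2)) * 0.1
out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 4, 4)
```

In the paper's setting, such a module would be inserted after ResNet50 convolution blocks so that subtle mouth-shape differences receive larger weights.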


2021 ◽  
Vol 13 (16) ◽  
pp. 3239
Author(s):  
Zhihao Shen ◽  
Huawei Liang ◽  
Linglong Lin ◽  
Zhiling Wang ◽  
Weixin Huang ◽  
...  

LiDAR occupies a vital position in self-driving, as this advanced detection technology enables autonomous vehicles (AVs) to obtain rich environmental information. Ground segmentation of the LiDAR point cloud is a crucial procedure for ensuring AVs' driving safety. However, some current algorithms suffer from drawbacks such as failure on complex terrain, excessive time and memory usage, and additional pre-training requirements. The Jump-Convolution-Process (JCP) is proposed to solve these issues. JCP converts the segmentation problem of the 3D point cloud into a smoothing problem on a 2D image and significantly improves segmentation at little time cost. First, the point cloud, labeled by an improved local feature extraction algorithm, is projected onto an RGB image. Then, each pixel value is initialized with its point's label and continuously updated by image convolution. Finally, a jump operation is introduced into the convolution process so that calculations are performed only on the low-confidence points filtered by a credibility propagation algorithm, reducing the time cost. Experiments on three datasets show that our approach achieves better segmentation accuracy and terrain adaptability than three existing methods. Meanwhile, the proposed method takes on average only 8.61 ms and 15.62 ms to process one scan from 64-beam and 128-beam LiDAR, respectively, fully meeting AVs' real-time requirements.
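The jump idea can be sketched in a few lines of NumPy: smooth a projected label image with a box filter, but write the result back only at low-confidence pixels, skipping ("jumping over") the rest. The function names, the 3×3 kernel, and the confidence threshold below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def jump_convolution(labels, confidence, threshold=0.8, iters=3):
    """Iteratively smooth a (H, W) image of ground/obstacle scores in [0, 1],
    updating only the pixels whose credibility score is below `threshold`."""
    out = labels.astype(float).copy()
    low = confidence < threshold          # pixels that still need smoothing
    H, W = out.shape
    for _ in range(iters):
        padded = np.pad(out, 1, mode='edge')
        # 3x3 box filter computed over the whole image...
        smoothed = sum(padded[dy:dy + H, dx:dx + W]
                       for dy in range(3) for dx in range(3)) / 9.0
        # ...but applied only where confidence is low (the "jump")
        out[low] = smoothed[low]
    return out

labels = np.array([[1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [1, 1, 1, 0],
                   [1, 1, 1, 1]], dtype=float)
conf = np.ones((4, 4)); conf[1, 1] = 0.2  # one uncertain pixel
print(jump_convolution(labels, conf)[1, 1])
```

The uncertain pixel is pulled toward its mostly-ground neighborhood, while all high-confidence pixels are left untouched, which is what keeps the per-scan cost low.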


Geophysics ◽  
2021 ◽  
pp. 1-97
Author(s):  
Haorui Peng ◽  
Ivan Vasconcelos ◽  
Yanadet Sripanich ◽  
Lele Zhang

Marchenko methods can retrieve Green's functions and focusing functions, essential components of a redatuming process, from single-sided reflection data and a smooth velocity model. Recent studies also indicate that a modified Marchenko scheme can reconstruct primary-only reflection responses directly from reflection data without requiring a priori model information. To provide insight into the artifacts that arise when input data are not ideally sampled, we study the effects of subsampling in both types of Marchenko methods in 2D earth models and data, by analyzing the behavior of Marchenko-based results on synthetic data subsampled in sources or receivers. With a layered model, we find that for Marchenko redatuming, subsampling effects depend jointly on the choice of integration variable and the subsampled dimension; this originates from the integrand gather in the multidimensional convolution process. When reflection data are subsampled in a single dimension, integrating over the other dimension yields spatial gaps together with artifacts, whereas integrating over the subsampled dimension produces aliasing artifacts but no spatial gaps. A complex subsalt model shows that subsampling may lead to very strong artifacts, which can be further complicated by limited apertures. For Marchenko-based primary estimation (MPE), subsampling below a certain fraction of the fully sampled data can cause the MPE iterations to diverge, which can be mitigated to some extent by using more robust iterative solvers such as least-squares QR.
Our results, covering redatuming and primary estimation in a range of subsampling scenarios, provide insights that can inform acquisition sampling choices as well as processing parameterization and quality control, e.g., setting up appropriate data filters and scaling to accommodate the effects of dipole fields, or helping ensure that data interpolation achieves the level of reconstruction quality that minimizes subsampling artifacts in Marchenko-derived fields and images.
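The multidimensional convolution whose integrand gather is analyzed above can be written schematically as follows; the notation follows common Marchenko literature and is illustrative, not necessarily the authors' exact formulation:

```latex
% Schematic multidimensional convolution (MDC) of reflection data R with a
% downgoing focusing function f_1^+; x_S is the integration variable over the
% acquisition surface \partial\mathbb{D}_0.
\left(R * f_1^{+}\right)(\mathbf{x}_R,\mathbf{x}_F,t)
  = \int_{\partial\mathbb{D}_0} \mathrm{d}\mathbf{x}_S
    \int \mathrm{d}t'\,
    R(\mathbf{x}_R,\mathbf{x}_S,t-t')\; f_1^{+}(\mathbf{x}_S,\mathbf{x}_F,t')
```

In this form, subsampling the integration variable x_S turns the surface integral into an aliased sum over the integrand gather, while subsampling the output coordinate x_R leaves spatial gaps in the resulting gather, consistent with the two behaviors described above.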


2021 ◽  
Vol 11 (7) ◽  
pp. 1845-1851
Author(s):  
Xi Cai

Disease diagnosis methods based on deep learning have shortcomings in the auxiliary diagnosis process, such as relying heavily on labeled data and lacking doctors' or experts' knowledge. Against this background, this study proposes a disease diagnosis method combining a medical knowledge graph and deep learning (CKGDL). The core of the method is a knowledge-driven convolutional neural network (CNN) model. Structured disease knowledge is obtained from the medical knowledge graph through entity linking and disambiguation, knowledge-graph embedding, and extraction. The disease feature word vectors and the corresponding knowledge entity vectors from the disease description text are used as multi-channel input to the CNN, so that different types of diseases are represented at both the semantic and knowledge levels in the convolution process. Training and testing on multiple disease-description text datasets show that the diagnostic performance of this method is better than that of a single CNN model and other disease diagnosis methods. The results further verify that this joint training on knowledge and data is well suited to the initial diagnosis of disease.
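The multi-channel input described above can be sketched as stacking two aligned embedding matrices, one of word vectors and one of linked knowledge-entity vectors, along a channel axis. The shapes and the zero-vector fallback for tokens without a linked entity are assumptions for illustration, not details from the paper:

```python
import numpy as np

seq_len, dim = 6, 8
rng = np.random.default_rng(1)
word_vecs = rng.standard_normal((seq_len, dim))    # rows from a word-embedding table
entity_vecs = rng.standard_normal((seq_len, dim))  # rows from a knowledge-graph embedding
entity_vecs[2] = 0.0                               # token with no linked entity

# Two aligned channels: semantic level and knowledge level, as in a 2-channel image
x = np.stack([word_vecs, entity_vecs], axis=0)     # (channels, seq_len, dim)
print(x.shape)  # (2, 6, 8)
```

A CNN can then convolve over both channels at once, so each filter sees the semantic and knowledge views of the same token span together.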


Author(s):  
Ms. K. N. Rode, et al.

Sclerosis detection using brain magnetic resonance imaging (MRI) is a challenging task. Given the promising classification accuracy that deep neural network models achieve across a variety of applications, such models can be used for sclerosis detection. The features associated with sclerosis are an important factor, highlighted by contrast lesions in brain MRI images. Sclerosis classification can initially be treated as a binary task, in which sclerosis segmentation is omitted to reduce model complexity. Sclerosis lesions have a considerable impact on the features extracted by the convolution process in convolutional neural network models. The images are used to train a convolutional neural network composed of 35 layers to classify brain MRI images as sclerosis or normal; the 35 layers combine convolutional, max-pooling, and upscaling layers. The results are compared with the VGG16 model and found satisfactory, with about 92% accuracy on the validation set.
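For a stack that mixes convolutional, max-pooling, and upscaling layers as above, the spatial bookkeeping is simple to sketch. The layer order, counts, and input size below are illustrative, not the paper's 35-layer configuration:

```python
# Track how the spatial size evolves through a conv / max-pool / upscale stack,
# assuming 'same'-padded convolutions and non-overlapping pooling.
def out_size(size, layer):
    kind, param = layer
    if kind == 'conv':      # 'same'-padded KxK conv: spatial size unchanged
        return size
    if kind == 'maxpool':   # pooling with stride `param` divides the size
        return size // param
    if kind == 'upscale':   # upscaling by factor `param` multiplies it back
        return size * param
    raise ValueError(kind)

layers = [('conv', 3), ('maxpool', 2), ('conv', 3), ('maxpool', 2), ('upscale', 2)]
size = 224
for layer in layers:
    size = out_size(size, layer)   # 224 -> 224 -> 112 -> 112 -> 56 -> 112
print(size)  # 112
```

Upscaling layers recover resolution lost to pooling, which is one plausible reason for mixing the three layer types even in a pure classification network.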


Work ◽  
2021 ◽  
Vol 68 (3) ◽  
pp. 935-943
Author(s):  
Guangnan Zhang ◽  
Wang Jing ◽  
Hai Tao ◽  
Md Arafatur Rahman ◽  
Sinan Q. Salih ◽  
...  

BACKGROUND: Human-Robot Interaction (HRI) has become a prominent solution for improving the robustness of real-time service provisioning through assisted functions for day-to-day activities. Applying robotic systems to security services helps improve the precision of event detection and environmental monitoring with ease. OBJECTIVES: This paper discusses activity detection and analysis (ADA) using security robots in workplaces. The application scenario relies on processing image and sensor data for event and activity detection. Detected events are classified as abnormal or not based on analysis of the sensor and image data using a convolutional neural network. The method aims to improve detection accuracy by mitigating the deviations classified at different levels of the convolution process. RESULTS: The differences are identified based on independent data correlation and information processing. The performance of the proposed method is verified on three human activities (standing, walking, and running) detected using the image and sensor dataset. CONCLUSION: The results are compared with existing methods on the metrics of accuracy, classification time, and recall.


2021 ◽  
Vol 15 ◽  
Author(s):  
Yanxian He ◽  
Jun Wu ◽  
Li Zhou ◽  
Yi Chen ◽  
Fang Li ◽  
...  

Alzheimer's disease (AD) mainly manifests as insidious onset, chronic progressive cognitive decline, and non-cognitive neuropsychiatric symptoms, which seriously affect the quality of life of the elderly and place a very large burden on society and families. This paper uses graph theory to analyze the constructed brain networks and extracts the node degree, node efficiency, and node betweenness centrality parameters of the two modal brain networks. A t-test is used to analyze the differences in graph-theoretic parameters between normal subjects and AD patients, and brain regions with significantly different parameters are selected as brain network features. By analyzing the computational principles of the conventional convolutional layer and the depthwise separable convolution unit, their computational complexity is compared. The depthwise separable convolution unit decomposes the traditional convolution process into a spatial convolution for feature extraction and a pointwise convolution for feature combination, which greatly reduces the number of multiplication and addition operations in the convolution process while still obtaining comparable performance. For the special convolution structure of the depthwise separable convolution unit, this paper proposes a channel pruning method based on that structure and explains its pruning process. Multimodal neuroimaging can provide complete information for quantifying Alzheimer's disease. This paper proposes a cascaded three-dimensional neural network framework based on single-modal and multi-modal images, using MRI and PET images to distinguish AD and MCI from normal samples. Multiple three-dimensional CNNs extract discriminative information from local image blocks, and a high-level two-dimensional CNN fuses the multi-modal features and selects features from discriminative regions to make quantitative predictions on samples.
The proposed algorithm automatically extracts and fuses multi-modal, multi-region features layer by layer, and the visual analysis results show that the abnormally changed regions affected by Alzheimer's disease provide important information for clinical quantification.
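The complexity claim about depthwise separable convolution can be checked with a multiply-count comparison: a K×K standard convolution is split into a per-channel K×K spatial convolution plus a 1×1 pointwise convolution. The layer dimensions below are illustrative, not taken from the paper:

```python
def standard_conv_mults(h, w, k, c_in, c_out):
    """Multiplies for a KxK standard conv over an HxW map ('same' padding)."""
    return h * w * k * k * c_in * c_out

def separable_conv_mults(h, w, k, c_in, c_out):
    """Same layer factored into depthwise (spatial) + pointwise (1x1) parts."""
    depthwise = h * w * k * k * c_in      # spatial convolution, one filter per channel
    pointwise = h * w * c_in * c_out      # 1x1 feature combination across channels
    return depthwise + pointwise

h = w = 32; k = 3; c_in = 64; c_out = 128
ratio = separable_conv_mults(h, w, k, c_in, c_out) / standard_conv_mults(h, w, k, c_in, c_out)
print(round(ratio, 3))  # 1/c_out + 1/k^2 ≈ 0.119
```

The ratio reduces algebraically to 1/c_out + 1/k², so for 3×3 kernels the separable unit needs roughly an order of magnitude fewer multiplications, which is what makes the channel pruning discussed above worthwhile on this structure.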


Author(s):  
Hyung-Hwa Ko ◽  
GilHee Choi ◽  
KyoungHak Lee

Recently, many studies on image completion methods have made it possible to erase obstacles and fill the resulting hole realistically, but placing a new object in that location cannot be solved by existing image completion. To solve this problem, this paper proposes an image completion method that fills the hole with a new object generated from a sketch image. The proposed network uses the pix2pix image-translation model to generate an object image from the sketch. The image completion network uses gated convolution to reduce the weight of meaningless pixels in the convolution process, and the WGAN-GP loss to reduce mode dropping. In addition, a contextual attention layer added in the middle of the network allows image completion to refer to feature values at distant pixels. To train the models, the Places2 dataset was used as background training data for image completion and the Standard Dog dataset as training data for pix2pix. In experiments, a dog image is generated well from a sketch image, and using this image as input to the image completion network produces a realistic result.
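Gated convolution's down-weighting of meaningless pixels can be sketched as two parallel convolutions, one producing candidate features and one producing a per-location sigmoid gate. For brevity this sketch uses 1×1 kernels via einsum; the weights and shapes are illustrative, not the paper's network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_conv_1x1(feat, w_f, w_g):
    """Gated convolution with 1x1 kernels: a learned soft validity mask in (0, 1)
    scales the candidate features at every spatial location.
    feat: (C_in, H, W); w_f, w_g: (C_out, C_in)."""
    features = np.einsum('oc,chw->ohw', w_f, feat)        # candidate features
    gate = sigmoid(np.einsum('oc,chw->ohw', w_g, feat))   # per-pixel soft gate
    return features * gate                                # hole pixels get gated down

rng = np.random.default_rng(2)
feat = rng.standard_normal((4, 5, 5))
out = gated_conv_1x1(feat,
                     rng.standard_normal((6, 4)),         # feature weights
                     rng.standard_normal((6, 4)))         # gating weights
print(out.shape)  # (6, 5, 5)
```

Unlike a fixed binary hole mask, the gate is learned, so the network itself decides how much each pixel (valid, hole, or sketch) should contribute at every layer.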

