An Automatic Scale-Adaptive Approach With Attention Mechanism-Based Crowd Spatial Information for Crowd Counting

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 66215-66225 ◽  
Author(s):  
Weihang Kong ◽  
He Li ◽  
Guanglong Xing ◽  
Fengda Zhao
2018 ◽  
Vol 32 (7) ◽  
pp. 2897-2908 ◽  
Author(s):  
Bisheng Wang ◽  
Guo Cao ◽  
Yanfeng Shang ◽  
Licun Zhou ◽  
Youqiang Zhang ◽  
...  

Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6493
Author(s):  
Song-Kyu Park ◽  
Joon-Hyuk Chang

In this paper, we propose a multi-channel cross-tower with attention mechanisms in latent domain network (Multi-TALK) that suppresses both acoustic echo and background noise. The proposed approach consists of a cross-tower network, a parallel encoder with an auxiliary encoder, and a decoder. For multi-channel processing, the parallel encoder extracts the latent features of each microphone, and these latent features, which include the spatial information, are compressed by a 1D convolution operation. In addition, the latent features of the far-end signal are extracted by the auxiliary encoder and provided to the cross-tower network through an attention mechanism. The cross-tower network iteratively estimates the latent features of the acoustic echo and the background noise in each tower. To improve the performance at each iteration, the output of each tower is passed as input to the next iteration of the neighboring tower. Before the decoder estimates the near-end speech, attention mechanisms are further applied to remove the estimated acoustic echo and background noise from the compressed mixture, which prevents speech distortion caused by over-suppression. Compared with conventional algorithms, the proposed algorithm effectively suppresses the acoustic echo and background noise and significantly lowers the speech distortion.


2020 ◽  
Vol 528 ◽  
pp. 79-91 ◽  
Author(s):  
Li Dong ◽  
Haijun Zhang ◽  
Yuzhu Ji ◽  
Yuxin Ding

Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4182
Author(s):  
Haijing Sun ◽  
Anna Wang ◽  
Wenhui Wang ◽  
Chen Liu

The early diagnosis of Alzheimer’s disease (AD) can allow patients to take preventive measures before irreversible brain damage occurs. Cross-sectional imaging studies of AD show that the features of the lesion areas in AD patients, as observed by magnetic resonance imaging (MRI), vary significantly and are distributed throughout the image space. Since the convolutional layer of a general convolutional neural network (CNN) cannot satisfactorily extract long-distance correlations in the feature space, a deep residual network (ResNet) model, based on spatial transformer networks (STN) and the non-local attention mechanism, is proposed in this study for the early diagnosis of AD. In this ResNet model, the Mish activation function replaces the ReLU function in the ResNet-50 backbone, STN is introduced between the input layer and the improved ResNet-50 backbone, and a non-local attention mechanism is introduced between the fourth and the fifth stages of the improved ResNet-50 backbone. By deepening the network structure, the ResNet model can extract more information from the layers. The introduced STN can transform the spatial information in MRI images of Alzheimer’s patients into another space while retaining the key information. The introduced non-local attention mechanism can find the relationship between the lesion areas and normal areas in the feature space. This model can solve the problem of local information loss in traditional CNNs and can extract long-distance correlations in the feature space. The proposed method was validated using the ADNI (Alzheimer’s Disease Neuroimaging Initiative) experimental dataset and compared with several models. The experimental results show that the classification accuracy of the proposed algorithm can reach 97.1%, the macro precision 95.5%, the macro recall 95.3%, and the macro F1 value 95.4%. The proposed model is more effective than the compared algorithms.
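The macro precision, macro recall, and macro F1 values quoted above are per-class scores averaged with equal weight for every class. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes a confusion matrix whose rows are true classes and whose columns are predicted classes, and it assumes macro F1 is the mean of the per-class F1 scores (one common convention; the abstract does not say which one was used). The function name `macro_metrics` and the example counts are hypothetical.

```python
def macro_metrics(confusion):
    """Return (macro precision, macro recall, macro F1) for a square confusion matrix.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    # Macro averaging: every class contributes equally, regardless of its size.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical 3-class confusion matrix (illustrative counts, not ADNI results).
example = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
precision, recall, f1 = macro_metrics(example)
```

Because each class is weighted equally, a small class with poor recall lowers the macro scores more than it lowers overall accuracy, which is one reason the two kinds of figures can differ.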


2021 ◽  
Vol 11 (7) ◽  
pp. 3111
Author(s):  
Enjie Ding ◽  
Yuhao Cheng ◽  
Chengcheng Xiao ◽  
Zhongyu Liu ◽  
Wanli Yu

Light-weight convolutional neural networks (CNNs) suffer from limited feature representation capabilities due to low computational budgets, resulting in degraded performance. To make CNNs more efficient, dynamic neural networks (DyNet) have been proposed to increase the complexity of the model by using the Squeeze-and-Excitation (SE) module to adaptively obtain the importance of each convolution kernel through the attention mechanism. However, the attention mechanism in the SE network (SENet) uses all channel information in its calculations, which raises two challenges: (a) interference caused by internal redundant information; and (b) an increased number of network calculations. To address these problems, this work proposes a dynamic convolutional network (termed EAM-DyNet) that reduces the number of channels in feature maps by extracting only the useful spatial information. EAM-DyNet first uses random channel reduction and channel grouping reduction methods to remove redundancy in the information. As downsampling can lead to the loss of useful information, it then applies an adaptive average pooling method to maintain the information integrity. Extensive experimental results on the baseline demonstrate that EAM-DyNet outperforms the existing approaches, achieving higher test accuracy with fewer network parameters.
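For readers unfamiliar with the Squeeze-and-Excitation module mentioned above, here is a minimal sketch of the standard SE channel-attention step in plain Python. It is not EAM-DyNet and not the DyNet kernel-weighting scheme; it only illustrates the squeeze (global average pooling), excitation (two fully connected layers with ReLU and sigmoid), and scale stages. Biases are omitted, and the function name `se_block`, the weight matrices `w1` and `w2`, and the example values are all hypothetical.

```python
import math


def se_block(feature_maps, w1, w2):
    """Apply one squeeze-and-excitation step to C feature maps (each H x W).

    w1 has shape (C_reduced x C) and w2 has shape (C x C_reduced).
    Returns the rescaled feature maps and the per-channel attention weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]
    # Excitation, part 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, part 2: expand back to C channels and squash with a sigmoid.
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [
        [[value * a for value in row] for row in fm]
        for fm, a in zip(feature_maps, weights)
    ]
    return scaled, weights


# Two hypothetical 2x2 channels and hand-picked (untrained) weights.
maps = [
    [[1.0, 3.0], [1.0, 3.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
w1 = [[1.0, -1.0]]       # 2 channels -> 1 hidden unit
w2 = [[2.0], [-2.0]]     # 1 hidden unit -> 2 channels
scaled_maps, channel_weights = se_block(maps, w1, w2)
```

Note that the squeeze stage reads every channel, which is the cost the abstract attributes to SENet-style attention and the motivation it gives for reducing channels first.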


2022 ◽  
Vol 2022 ◽  
pp. 1-14
Author(s):  
Mengxing Huang ◽  
Shi Liu ◽  
Zhenfeng Li ◽  
Siling Feng ◽  
Di Wu ◽  
...  

In this paper, a two-stream remote sensing image fusion network (RCAMTFNet) based on the residual channel attention mechanism (RCAM) is proposed. In the RCAMTFNet, the spatial features of PAN and the spectral features of MS are extracted, respectively, by a two-channel feature extraction layer. Multiresidual connections allow the network to adopt a deeper structure without degradation. The residual channel attention mechanism is introduced to learn the interdependence between channels, and the correlation features among channels are then adapted on the basis of this dependency. In this way, image spatial information and spectral information are extracted separately. Moreover, pansharpened images are reconstructed across the board. Experiments are conducted on two satellite datasets, GaoFen-2 and WorldView-2. The experimental results show that the proposed algorithm is superior to several algorithms from the existing literature in terms of both reference and nonreference evaluation indicators.


2021 ◽  
Vol 11 (23) ◽  
pp. 11136
Author(s):  
Zenebe Markos Lonseko ◽  
Prince Ebenezer Adjei ◽  
Wenju Du ◽  
Chengsi Luo ◽  
Dingcan Hu ◽  
...  

Gastrointestinal (GI) diseases constitute a leading problem in the human digestive system. Consequently, several studies have explored automatic classification of GI diseases as a means of minimizing the burden on clinicians and improving patient outcomes, for both diagnostic and treatment purposes. The challenge in using deep learning-based (DL) approaches, specifically a convolutional neural network (CNN), is that spatial information is not fully utilized due to the inherent mechanism of CNNs. This paper proposes the application of spatial factors in improving classification performance. Specifically, we propose a deep CNN-based spatial attention mechanism for the classification of GI diseases, implemented with encoder–decoder layers. To overcome the data imbalance problem, we adapt data-augmentation techniques. A total of 12,147 multi-sited, multi-diseased GI images, drawn from publicly available and private sources, were used to validate the proposed approach. Furthermore, a five-fold cross-validation approach was adopted to minimize inconsistencies in intra- and inter-class variability and to ensure that results were robustly assessed. Our results, compared with other state-of-the-art models in terms of mean accuracy (ResNet50 = 90.28, GoogLeNet = 91.38, DenseNets = 91.60, and baseline = 92.84), demonstrated better outcomes (Precision = 92.8, Recall = 92.7, F1-score = 92.8, and Accuracy = 93.19). We also implemented t-distributed stochastic neighbor embedding (t-SNE) and confusion matrix analysis techniques for better visualization and performance validation. Overall, the results showed that the attention mechanism improved the automatic classification of multi-sited GI disease images. We validated clinical tests based on the proposed method by overcoming previous limitations, with the goal of improving automatic classification accuracy in future work.
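The five-fold cross-validation mentioned above can be illustrated with a short index-splitting sketch. This is an assumption-laden illustration rather than the authors' protocol: it splits sample indices into contiguous folds without shuffling or class stratification, both of which a real experiment on imbalanced data would likely add. The function name `k_fold_indices` is hypothetical; the sample count of 12,147 is the image total reported in the abstract.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0)
        for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Every image serves as validation data exactly once across the five folds.
folds = list(k_fold_indices(12147, k=5))
validation_sizes = [len(validation) for _, validation in folds]
```

The reported mean accuracy would then be the average of the five per-fold validation accuracies, each obtained from a model trained on the other four folds.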

