Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling

2021 ◽  
pp. 406-415
Author(s):  
Reza Azad ◽  
Lucas Rouhier ◽  
Julien Cohen-Adad
Author(s):  
Chu-Xiong Qin ◽  
Wen-Lin Zhang ◽  
Dan Qu

Joint connectionist temporal classification (CTC)-attention-based speech recognition has recently received increasing attention and has achieved impressive performance. A hybrid end-to-end architecture that adds an extra CTC loss to the attention-based model can impose additional constraints on alignments. To better explore end-to-end models, we propose improvements to the feature extraction and attention mechanism. First, we introduce a joint model trained with nonnegative matrix factorization (NMF)-based high-level features. Then, we put forward a hybrid attention mechanism that incorporates multi-head attention and calculates attention scores over multi-level outputs. Experiments on TIMIT indicate that our best model achieves state-of-the-art performance. Experiments on WSJ show that our method yields a word error rate (WER) that is only 0.2% worse in absolute value than the best referenced method, which is trained on a much larger dataset, and that it outperforms all existing end-to-end methods. Further experiments on LibriSpeech show that our method is also comparable in WER to the state-of-the-art end-to-end system.
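The idea of adding a CTC loss to an attention-based model is commonly realized as a weighted interpolation of the two losses. The sketch below illustrates that general form only; the function name and the `ctc_weight` parameter (and its default) are assumptions for illustration, since the abstract does not state how the two terms are weighted.

```python
def joint_ctc_attention_loss(ctc_loss: float,
                             attention_loss: float,
                             ctc_weight: float = 0.3) -> float:
    """Interpolate a CTC loss and an attention (decoder) loss.

    ctc_weight is a hypothetical hyperparameter in [0, 1]:
    0 gives a pure attention model, 1 gives a pure CTC model.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


if __name__ == "__main__":
    # Toy per-utterance loss values, chosen only for illustration.
    print(joint_ctc_attention_loss(ctc_loss=2.0, attention_loss=1.0, ctc_weight=0.5))
```

In this form the CTC term acts as a regularizer on the attention decoder, which matches the abstract's description of imposing extra restrictions on alignments.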


2022 ◽  
Vol 9 (1) ◽  
pp. 18
Author(s):  
Hiroaki Kamishina ◽  
Yukiko Nakano ◽  
Kohei Nakata ◽  
Shintaro Kimura ◽  
Yuta Nozue ◽  
...  

The objective of this study was to evaluate the feasibility and clinical outcomes of microendoscopic dorsal laminectomy for multi-level cervical intervertebral disc protrusions in dogs. Eight client-owned dogs diagnosed with multi-level cervical intervertebral disc protrusions using computed tomography (CT) and magnetic resonance imaging (MRI) were included in this retrospective case series. Microendoscopic dorsal laminectomies (MEL) were performed with an integrated endoscopic system on the cranial and caudal vertebrae of the affected intervertebral joints. Pre- and postoperative neurological status, operation time, intraoperative complications, and postoperative complications were reviewed. Postoperative CT images were obtained to measure the dimensions of the laminectomy, which were compared with those of the planned laminectomy. Full endoscopic procedures were feasible in 7 dogs (87.5%), and the laminectomy dimensions were in agreement with preoperative planning. No major intra- or postoperative complications occurred in any dog. Conversion to open surgery was required in one case. Short-term postoperative clinical deterioration was found in two dogs. Long-term clinical outcomes were good and comparable to those reported in previous studies of open dorsal laminectomies. MEL is a promising minimally invasive approach to multi-level cervical dorsal laminectomy for intervertebral disc protrusions. This technique may reduce postoperative discomfort compared to the open approach. Further studies are needed to directly compare outcomes between these two approaches.


Author(s):  
Chaoqun Duan ◽  
Lei Cui ◽  
Xinchi Chen ◽  
Furu Wei ◽  
Conghui Zhu ◽  
...  

Natural language inference aims to predict whether a hypothesis sentence can be inferred from a premise sentence. Recent progress on this task relies only on a shallow interaction between sentence pairs, which is insufficient for modeling complex relations. In this paper, we present an attention-fused deep matching network (AF-DMN) for natural language inference. Unlike existing models, AF-DMN takes two sentences as input and iteratively learns the attention-aware representations for each side through multi-level interactions. Moreover, we add a self-attention mechanism to fully exploit local context information within each sentence. Experimental results show that AF-DMN achieves state-of-the-art performance and outperforms strong baselines on the Stanford natural language inference (SNLI), multi-genre natural language inference (MultiNLI), and Quora duplicate questions datasets.
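As a rough illustration of what an attention-aware representation of one sentence with respect to the other can look like, the sketch below computes a single dot-product cross-attention step between two lists of token vectors. This is a generic soft-alignment example, not AF-DMN itself; the function names and the dot-product scoring are assumptions, and the abstract does not specify the scoring function or how the levels are stacked.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(premise, hypothesis):
    """For each premise token vector, return an attention-weighted
    average of the hypothesis token vectors (dot-product scores)."""
    dim = len(hypothesis[0])
    attended = []
    for p in premise:
        scores = [sum(pi * hi for pi, hi in zip(p, h)) for h in hypothesis]
        weights = softmax(scores)
        attended.append(
            [sum(w * h[d] for w, h in zip(weights, hypothesis)) for d in range(dim)]
        )
    return attended


if __name__ == "__main__":
    # Toy 2-dimensional token vectors, chosen only for illustration.
    premise = [[1.0, 0.0], [0.0, 1.0]]
    hypothesis = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    for row in cross_attend(premise, hypothesis):
        print(row)
```

Applying the same function with the arguments swapped gives the hypothesis-side representation; repeating such steps over successive layers is one plausible reading of "multi-level interactions".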


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 143
Author(s):  
Zhipeng Lin ◽  
Yuhua Tang ◽  
Yongjun Zhang

The Recommender System (RS) plays a pivotal role in e-commerce. To improve the performance of RS, review text information has been extensively utilized. However, it is still a challenge for RS to extract the most informative features from a large number of reviews. Another significant issue is the modeling of user–item interaction, which rarely captures high- and low-order interactions simultaneously. In this paper, we design a multi-level attention mechanism to learn the usefulness of reviews and the significance of words using Deep Neural Networks (DNN). In addition, we develop a hybrid prediction structure that integrates Factorization Machine (FM) and DNN to model low-order user–item interactions as in FM and capture the high-order interactions as in DNN. Based on these two designs, we build a Multi-level Attentional and Hybrid-prediction-based Recommender (MAHR) model for recommendation. Extensive experiments on Amazon and Yelp datasets showed that our approach provides more accurate recommendations than the state-of-the-art recommendation approaches. Furthermore, the verification experiments and explainability study, including the visualization of attention modules and the review-usefulness prediction test, also validated the soundness of our multi-level attention mechanism and hybrid prediction.
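The low-order part of such a hybrid predictor can be illustrated with a standard second-order Factorization Machine. The sketch below shows only that generic FM component, using the usual linear-time rewrite of the pairwise term; the function name and toy parameters are assumptions, and how MAHR feeds attention outputs into the FM or combines it with the DNN is not described in the abstract.

```python
def fm_predict(x, w0, w, v):
    """Second-order Factorization Machine score.

    x  : feature values, length n
    w0 : global bias
    w  : linear weights, length n
    v  : latent factors, n rows of k floats

    Pairwise term uses the identity
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    """
    n, k = len(x), len(v[0])
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(k):
        total = sum(v[i][f] * x[i] for i in range(n))
        total_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (total * total - total_sq)
    return linear + pairwise


if __name__ == "__main__":
    # Toy parameters, chosen only for illustration.
    x = [1.0, 2.0, 3.0]
    v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(fm_predict(x, w0=0.5, w=[0.1, 0.2, 0.3], v=v))
```

A DNN branch would take the same inputs (or their embeddings) and model higher-order interactions; the final score would then combine both branches.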


2020 ◽  
Vol 12 (3) ◽  
pp. 560
Author(s):  
Lifu Chen ◽  
Siyu Tan ◽  
Zhouhao Pan ◽  
Jin Xing ◽  
Zhihui Yuan ◽  
...  

The detection of airports from Synthetic Aperture Radar (SAR) images is of great significance in various research fields. However, it is challenging to distinguish the airport from surrounding objects in SAR images. In this paper, a new framework, the multi-level and densely dual attention (MDDA) network, is proposed to extract airport runway areas (runways, taxiways, and parking lots) in SAR images to achieve automatic airport detection. The framework consists of three parts: down-sampling of original SAR images, the MDDA network for feature extraction and classification, and up-sampling of airport extraction results. First, down-sampling is employed to obtain a medium-resolution SAR image from the high-resolution SAR images to ensure the samples (500 × 500) can contain adequate information about airports. The dataset is then input to the MDDA network, which contains an encoder and a decoder. The encoder uses ResNet_101 to extract four-level features with different resolutions, and the decoder performs fusion and further feature extraction on these features. The decoder integrates the chained residual pooling network (CRP_Net) and the dual attention fusion and extraction (DAFE) module. The CRP_Net module mainly uses chained residual pooling and multi-feature fusion to extract advanced semantic features. In the DAFE module, the position attention module (PAM) and channel attention mechanism (CAM) are combined with weighted filtering. The entire decoding network is constructed in a densely connected manner to enhance the gradient transmission among features and take full advantage of them. Finally, the airport results extracted by the decoding network are up-sampled by bilinear interpolation to accomplish airport extraction from high-resolution SAR images. To verify the proposed framework, experiments were performed using Gaofen-3 SAR images with 1 m resolution, and three different airports were selected for accuracy evaluation. The results showed that the mean pixel accuracy (MPA) and mean intersection over union (MIoU) of the MDDA network were 0.98 and 0.97, respectively, which are much higher than those of RefineNet and DeepLabV3. Therefore, MDDA can achieve automatic airport extraction from high-resolution SAR images with satisfactory accuracy.
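For readers unfamiliar with the two reported metrics, the sketch below computes MPA and MIoU from a confusion matrix using their usual segmentation definitions. The function names and toy labels are assumptions for illustration; the abstract does not describe the authors' evaluation code or how classes absent from an image are handled.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mpa_and_miou(cm):
    """Return (mean pixel accuracy, mean intersection over union).

    Classes with no true pixels are skipped for MPA, and classes with an
    empty union are skipped for MIoU (a convention assumed here).
    """
    n = len(cm)
    accuracies, ious = [], []
    for c in range(n):
        tp = cm[c][c]
        actual = sum(cm[c])                       # pixels truly in class c
        predicted = sum(cm[r][c] for r in range(n))  # pixels predicted as c
        union = actual + predicted - tp
        if actual:
            accuracies.append(tp / actual)
        if union:
            ious.append(tp / union)
    return sum(accuracies) / len(accuracies), sum(ious) / len(ious)


if __name__ == "__main__":
    # Flattened toy masks: 0 = background, 1 = airport (illustrative only).
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(mpa_and_miou(confusion_matrix(truth, pred, num_classes=2)))
```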

