Limitations of Deep Learning Attention Mechanisms in Clinical Research: Empirical Case Study Based on the Korean Diabetic Disease Setting (Preprint)

2020 ◽  
Author(s):  
Junetae Kim ◽  
Sangwon Lee ◽  
Eugene Hwang ◽  
Kwang Sun Ryu ◽  
Hanseok Jeong ◽  
...  

BACKGROUND Despite excellent prediction performance, noninterpretability has undermined the value of applying deep-learning algorithms in clinical practice. To overcome this limitation, the attention mechanism has been introduced to clinical research as an explanatory modeling method. However, the potential limitations of this attractive method have not been clarified for clinical researchers. Furthermore, there has been a lack of introductory information explaining attention mechanisms to clinical researchers.

OBJECTIVE The aim of this study was to introduce the basic concepts and design approaches of attention mechanisms. In addition, we aimed to empirically assess the potential limitations of current attention mechanisms in terms of prediction and interpretability performance.

METHODS First, the basic concepts and several key considerations regarding attention mechanisms were identified. Second, four approaches to attention mechanisms were suggested according to a two-dimensional framework based on the degrees of freedom and uncertainty awareness. Third, the prediction performance, probability reliability, concentration of variable importance, consistency of attention results, and generalizability of attention results to conventional statistics were assessed in the diabetic classification modeling setting. Fourth, the potential limitations of attention mechanisms were considered.

RESULTS Prediction performance was very high for all models. Probability reliability was high in models with uncertainty awareness. Variable importance was concentrated in several variables when uncertainty awareness was not considered. The consistency of attention results was high when uncertainty awareness was considered. The generalizability of attention results to conventional statistics was poor regardless of the modeling approach.

CONCLUSIONS The attention mechanism is an attractive and potentially very promising technique. However, it may not yet be desirable to rely on this method to assess variable importance in clinical settings. Therefore, along with theoretical studies enhancing attention mechanisms, more empirical studies investigating potential limitations should be encouraged.
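To make "concentration of variable importance" concrete, the following is a minimal sketch of how attention weights over input variables can be computed and how their concentration might be summarized. It is not the authors' model: the variable names, the raw scores, and the entropy-based concentration measure are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_concentration(weights):
    """Return 1 minus the normalized entropy of the weights.

    0 means the weights are uniform; 1 means a single variable
    receives all of the attention. This measure is an assumption
    for illustration, not the metric used in the study.
    """
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return 1.0 - entropy / math.log(len(weights))


# Hypothetical variables and raw attention scores (made-up values).
variables = ["age", "bmi", "fasting_glucose", "systolic_bp"]
scores = [0.2, 0.1, 3.0, 0.3]

weights = softmax(scores)
for name, w in zip(variables, weights):
    print(f"{name}: {w:.3f}")
print(f"concentration: {attention_concentration(weights):.3f}")
```

A high concentration value means that a few variables dominate the attention weights, which is the pattern the abstract reports for models without uncertainty awareness.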

10.2196/18418 ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. e18418
Author(s):  
Junetae Kim ◽  
Sangwon Lee ◽  
Eugene Hwang ◽  
Kwang Sun Ryu ◽  
Hanseok Jeong ◽  
...  



2021 ◽  
Vol 11 (11) ◽  
pp. 4793
Author(s):  
Cong Pan ◽  
Minyan Lu ◽  
Biao Xu

Deep learning-based software defect prediction has recently become popular. The release of the CodeBERT model has made it possible to perform many software engineering tasks. We propose several CodeBERT models targeting software defect prediction, including CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies using these models in cross-version and cross-project software defect prediction to investigate whether a neural language model like CodeBERT can improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.


2020 ◽  
Vol 2020 ◽  
pp. 1-15 ◽  
Author(s):  
Pan Wu ◽  
Zilin Huang ◽  
Yuzhuang Pian ◽  
Lunhui Xu ◽  
Jinlong Li ◽  
...  

Short-term traffic speed prediction is a promising research topic in intelligent transportation systems (ITSs) and plays an important role in the real-time decision-making of traffic control and guidance systems. However, urban traffic speed has strong temporal and spatial correlation and is characterized by complex nonlinearity and randomness, which makes it challenging to forecast short-term traffic speeds accurately and efficiently. We reviewed the relevant literature and found that although most methods achieve good prediction performance with complete sample data, they have difficulty maintaining accuracy when some of the data are missing. Recent studies have shown that deep learning methods, especially long short-term memory (LSTM) models, perform well in short-term traffic flow prediction. Furthermore, the attention mechanism can assign weights that distinguish the importance of different points in a traffic time sequence, thereby further improving the computational efficiency of the prediction model. Therefore, we propose a framework for short-term traffic speed prediction that consists of a data preprocessing module and a short-term traffic prediction module. In the data preprocessing module, the missing traffic data are repaired to provide a complete dataset for subsequent prediction. In the prediction module, a combined deep learning method, an attention-based LSTM (ATT-LSTM) model, is proposed for predicting short-term traffic speed on urban roads. The proposed framework was applied to the urban road network in Nanshan District, Shenzhen, Guangdong Province, China, with a 30-day traffic speed dataset (floating car data) used as the experimental sample. Results show that the proposed method outperforms other deep learning algorithms (such as the recurrent neural network (RNN) and the convolutional neural network (CNN)) in terms of both computational efficiency and prediction accuracy. The attention mechanism can significantly reduce the error of the LSTM model (by up to 12.4%) and improve the prediction performance.
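The abstract does not specify how attention is applied to the LSTM outputs. As a rough sketch of one common pattern, the code below pools a sequence of hidden-state vectors into a single context vector, using the last hidden state as the query and dot-product scores. The scoring scheme and the hidden-state values are assumptions for illustration, not the ATT-LSTM design itself.

```python
import math


def attention_pool(hidden_states):
    """Pool a sequence of hidden-state vectors into one context vector.

    Assumed scheme: the last hidden state is the query, each time step
    is scored by its dot product with the query, and the scores are
    normalized with a softmax.
    """
    query = hidden_states[-1]
    scores = [sum(q * h for q, h in zip(query, state)) for state in hidden_states]

    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the hidden states, per dimension.
    dim = len(query)
    context = [
        sum(w * state[i] for w, state in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context


# Made-up hidden states for three time steps (two dimensions each).
states = [[0.1, 0.0], [0.5, 0.2], [0.9, 0.4]]
weights, context = attention_pool(states)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

In a full model the context vector would feed a final regression layer that outputs the predicted speed; that layer is omitted here.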


2017 ◽  
Vol 1 (1) ◽  
pp. 9-21
Author(s):  
Milton Raul Licona Luna ◽  
Elizabeth Alvarado Martínez

Institutions in Mexico, from basic to higher education, that offer courses in English as a Foreign Language rely heavily on assessment, usually formal assessment. However, the literature shows how important it is to involve other types of assessment in the classroom for effective language learning to take place. One example is assessment for learning, which consists of continuous assessment in which learners receive feedback so that greater learning occurs; moreover, it enables teachers to modify their teaching as they reflect on the learners’ progress. To show how assessment is carried out in our context, this research project focuses on a case study within the CAADI of the FOD at the UANL.


2018 ◽  
Vol 12 (3) ◽  
pp. 181-187
Author(s):  
M. Erkan Kütük ◽  
L. Canan Dülger

An optimization study with kinetostatic analysis is performed on a hybrid seven-bar press mechanism. This study is based on previous studies of the planar hybrid seven-bar linkage. Dimensional synthesis is performed, and optimum link lengths for the mechanism are found. The optimization is performed using a genetic algorithm (GA). The Genetic Algorithm Toolbox is used with the Optimization Toolbox in MATLAB®. The design variables and the constraints are used during design optimization. The objective function is determined, and eight precision points are used. A seven-bar linkage system with two degrees of freedom is chosen as an example. A metal stamping operation with a dwell is taken as the case study. After the optimization is completed, the kinetostatic analysis is performed. All forces on the links and the crank torques are calculated for the hybrid system with the optimized link lengths.
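The study itself used MATLAB toolboxes; the Python sketch below only illustrates the general shape of a genetic algorithm searching for bounded design variables. The objective function, the target values, the bounds, and all parameter names are hypothetical stand-ins, not the seven-bar linkage model or the authors' settings.

```python
import random


def genetic_minimize(objective, bounds, pop_size=30, generations=60,
                     mutation_scale=0.1, seed=0):
    """Minimize `objective` over box-bounded variables with a simple GA.

    Uses elitism (the better half survives unchanged), uniform
    crossover, and Gaussian mutation clipped to the bounds.
    """
    rng = random.Random(seed)

    def clip(x, lo, hi):
        return max(lo, min(hi, x))

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best objective value seen in each generation

    for _ in range(generations):
        population.sort(key=objective)
        history.append(objective(population[0]))
        elite = population[: pop_size // 2]

        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one parent at random.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, scaled to each variable's range.
            child = [clip(g + rng.gauss(0, mutation_scale * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = elite + children

    best = min(population, key=objective)
    return best, objective(best), history


# Hypothetical toy problem: recover three "link lengths" from a target.
target = [2.0, 5.0, 3.5]
bounds = [(0.5, 10.0)] * 3


def toy_objective(lengths):
    return sum((x - t) ** 2 for x, t in zip(lengths, target))


best, best_value, history = genetic_minimize(toy_objective, bounds)
print("best lengths:", [round(x, 3) for x in best])
print("objective:", best_value)
```

A real dimensional synthesis would replace `toy_objective` with the error between the mechanism's output and the precision points, and would add the design constraints, for example as penalty terms.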


2021 ◽  
Author(s):  
Zhaoyang Niu ◽  
Guoqiang Zhong ◽  
Hui Yu

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Youngbin Na ◽  
Do-Kyeong Ko

Abstract Structured light with spatial degrees of freedom (DoF) is considered a potential solution to address the unprecedented demand for data traffic, but there is a limit to effectively improving the communication capacity by its integer quantization. We propose a data transmission system using fractional mode encoding and deep-learning decoding. Spatial modes of Bessel-Gaussian beams separated by fractional intervals are employed to represent 8-bit symbols. Data encoded by switching phase holograms is efficiently decoded by a deep-learning classifier that only requires the intensity profile of transmitted modes. Our results show that the trained model can simultaneously recognize two independent DoF without any mode sorter and precisely detect small differences between fractional modes. Moreover, the proposed scheme successfully achieves image transmission despite its densely packed mode space. This research will present a new approach to realizing higher data rates for advanced optical communication systems.

