transformer model
Recently Published Documents


TOTAL DOCUMENTS

455
(FIVE YEARS 223)

H-INDEX

26
(FIVE YEARS 5)

2022 ◽  
Author(s):  
Hariharan Nagasubramaniam ◽  
Rabih Younes

The bokeh effect is becoming an important feature in photography: an object of interest is kept in focus while the rest of the background is blurred. While rendering this effect naturally requires a DSLR with a large aperture, current advances in deep learning allow it to be produced in mobile cameras as well. Most existing methods use convolutional neural networks, while some rely on a depth map to render the effect. In this paper, we propose an end-to-end Vision Transformer model for bokeh rendering of images from a monocular camera. The architecture uses vision transformers as its backbone, learning from the entire image rather than only the local regions seen by the filters of a CNN. This retention of global information, coupled with first training the model for image restoration before training it to render the background blur, allows our method to produce clearer images and outperform the current state-of-the-art models on the EBB! dataset. The code for our proposed method can be found at: https://github.com/Soester10/Bokeh-Rendering-with-Vision-Transformers.
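To make the described architecture concrete, below is a minimal sketch (not the authors' released code, which lives at the GitHub link above) of a ViT-style bokeh renderer: image patches are embedded as tokens, a transformer encoder attends over the whole image, and a convolutional decoder upsamples back to full resolution. All layer names and sizes are illustrative assumptions.

```python
# Minimal sketch, assuming a 256x256 input and 16x16 patches; not the paper's code.
import torch
import torch.nn as nn

class BokehViTSketch(nn.Module):
    def __init__(self, img_size=256, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        self.grid = img_size // patch
        # Patch embedding: one conv stride = patch size turns the image into tokens.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # Decoder: upsample 16x back to the input resolution, emitting an RGB image.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(dim, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # B, N, dim
        tokens = self.encoder(tokens + self.pos)  # global attention over all patches
        feat = tokens.transpose(1, 2).reshape(x.size(0), -1, self.grid, self.grid)
        return self.decoder(feat)  # B, 3, H, W bokeh-rendered image

out = BokehViTSketch()(torch.randn(1, 3, 256, 256))  # -> torch.Size([1, 3, 256, 256])
```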


2022 ◽  
Vol 2022 ◽  
pp. 1-8
Author(s):  
Zong-Yu Peng ◽  
Pei-Chang Guo

The accurate prediction of stock prices is not an easy task. The long short-term memory (LSTM) neural network and the transformer are effective machine learning models for time series forecasting. In this paper, we use LSTM and transformer models to predict the prices of banking stocks in China’s A-share market and show that organizing the input data helps the models produce more accurate results. We first introduce some basic knowledge about LSTM and present prediction results using a standard LSTM model. Then, we show how to organize the input data during the training period and give comparison results for both the LSTM and the transformer model. The numerical results show that the predictions of both models improve after the input data are organized for training.
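As an illustration of what organizing the input data typically means for such forecasters, here is a small sketch (an assumption, not the paper's code) that converts a raw price series into sliding-window training pairs suitable for an LSTM or transformer model.

```python
# Hedged sketch: sliding-window organization of a 1-D price series into
# (window -> next value) training pairs; window length is an assumed choice.
import numpy as np

def make_windows(prices: np.ndarray, window: int = 30):
    """Return (X, y): X[i] holds `window` consecutive prices, y[i] the next one."""
    X = np.stack([prices[i : i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X, y

prices = np.cumsum(np.random.randn(500)) + 100.0  # synthetic price series
X, y = make_windows(prices, window=30)
print(X.shape, y.shape)  # (470, 30) (470,)
```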


2022 ◽  
Vol 12 (1) ◽  
pp. 468
Author(s):  
Yeonghyeon Gu ◽  
Zhegao Piao ◽  
Seong Joon Yoo

In magnetic resonance imaging (MRI) segmentation, conventional approaches use U-Net models with encoder–decoder structures, segmentation models based on vision transformers, or models that combine a vision transformer with an encoder–decoder structure. However, conventional models are large and slow, and in vision transformer models the computational cost increases sharply with image size. To overcome these problems, this paper proposes a model that combines Swin transformer blocks with a lightweight U-Net-type model whose encoder–decoder structure is built from HarDNet blocks. To preserve the hierarchical, shifted-window behavior of the Swin transformer, the Swin transformer is placed in the first skip-connection layer of the encoder instead of in the encoder–decoder bottleneck. The proposed model, called STHarDNet, was evaluated on the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset, which comprises 229 T1-weighted MRI images, split into training and validation sets. It achieved Dice, IoU, precision, and recall values of 0.5547, 0.4185, 0.6764, and 0.5286, respectively, outperforming the state-of-the-art models U-Net, SegNet, PSPNet, FCHarDNet, TransHarDNet, Swin Transformer, Swin UNet, X-Net, and D-UNet. Thus, STHarDNet improves the accuracy and speed of MRI-based stroke diagnosis.
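A simplified sketch of the placement idea follows, with a standard transformer layer standing in for the actual Swin (shifted-window) block and all shapes assumed: encoder features from the first skip connection pass through the transformer block before being fused into the decoder.

```python
# Hedged sketch of a transformer-augmented skip connection; a plain
# TransformerEncoderLayer substitutes for the real Swin block here.
import torch
import torch.nn as nn

class TransformerSkip(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads,
            dim_feedforward=channels * 2, batch_first=True)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)  # B, H*W, C
        tokens = self.block(tokens)               # token mixing on the skip path
        return tokens.transpose(1, 2).reshape(b, c, h, w)

skip = TransformerSkip(channels=32)
enc_feat = torch.randn(2, 32, 56, 56)                  # first-level encoder features
dec_in = torch.cat([skip(enc_feat), enc_feat], dim=1)  # fused skip for the decoder
```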


Author(s):  
Lingxiao Meng ◽  
Wenjun Tan ◽  
Jiangang Ma ◽  
Ruofei Wang ◽  
Xiaoxia Yin ◽  
...  

2021 ◽  
Author(s):  
Wenqing Chang ◽  
Xiang Li ◽  
Huomin Dong ◽  
Chunxiao Wang ◽  
Zhigang Zhao ◽  
...  

Author(s):  
N. Habbat ◽  
H. Anoun ◽  
L. Hassouni

Abstract. Topic models extract meaningful words from text collections, allowing for a better understanding of the data. However, the results are often not coherent enough and thus harder to interpret; adding more contextual knowledge to the model can enhance coherence. In recent years, neural network-based topic models have become available, and they have advanced considerably thanks to BERT-based representations. In this study, we propose a model to extract topics from news posts on the Aljazeera Facebook page. Our approach combines a neural topic model (ProdLDA) with the Arabic pre-trained BERT transformer model (AraBERT). The proposed model produces more expressive and coherent topics than ELMo across different topic model algorithms (ProdLDA and LDA), achieving a topic coherence of 0.883.
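To illustrate the BERT-augmented pipeline, here is a hedged sketch of the embedding step: documents are encoded with AraBERT, and the resulting vectors accompany the bag-of-words input to a neural topic model such as ProdLDA. The checkpoint name and mean-pooling choice are assumptions, not the authors' exact setup.

```python
# Hedged sketch: AraBERT document embeddings as contextual input for a
# neural topic model. Checkpoint name is an assumed AraBERT release.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "aubmindlab/bert-base-arabertv2"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
bert = AutoModel.from_pretrained(MODEL).eval()

def embed(docs):
    """Mean-pooled AraBERT embeddings, one vector per document."""
    batch = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**batch).last_hidden_state  # B, T, 768
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)   # B, 768

vectors = embed(["خبر عاجل من الجزيرة", "تحليل اقتصادي جديد"])
print(vectors.shape)  # torch.Size([2, 768]) -> contextual input for ProdLDA
```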

