Bokeh Effect Rendering with Vision Transformers

10.36227/techrxiv.17714849.v1 ◽

2022 ◽

Author(s):

Hariharan Nagasubramaniam ◽

Rabih Younes

Keyword(s):

Neural Networks ◽

State Of The Art ◽

Large Diameter ◽

Depth Map ◽

Global Information ◽

Initial Training ◽

Data Set ◽

Current State ◽

Mobile Cameras ◽

Transformer Model

Bokeh effect is growing to be an important feature in photography, essentially to choose an object of interest to be in focus with the rest of the background being blurred. While naturally rendering this effect requires a DSLR with large diameter of aperture, with the current advancements in Deep Learning, this effect can also be produced in mobile cameras. Most of the existing methods use Convolutional Neural Networks while some relying on the depth map to render this effect. In this paper, we propose an end-to-end Vision Transformer model for Bokeh rendering of images from monocular camera. This architecture uses vision transformers as backbone, thus learning from the entire image rather than just the parts from the filters in a CNN. This property of retaining global information coupled with initial training of the model for image restoration before training to render the blur effect for the background, allows our method to produce clearer images and outperform the current state-of-the-art models on the EBB! Data set. The code to our proposed method can be found at: https://github.com/Soester10/ Bokeh-Rendering-with-Vision-Transformers.

Download Full-text

Soft Threshold Ternary Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/318 ◽

2020 ◽

Author(s):

Weixiang Xu ◽

Xiangyu He ◽

Tianli Zhao ◽

Qinghao Hu ◽

Peisong Wang ◽

...

Keyword(s):

Neural Networks ◽

Mobile Devices ◽

State Of The Art ◽

Performance Gap ◽

Training Time ◽

Current State ◽

The Arts ◽

Soft Threshold ◽

And Storage ◽

Selection Of

Large neural networks are difficult to deploy on mobile devices because of intensive computation and storage. To alleviate it, we study ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. In previous ternarized neural networks, a hard threshold Δ is introduced to determine quantization intervals. Although the selection of Δ greatly affects the training results, previous works estimate Δ via an approximation or treat it as a hyper-parameter, which is suboptimal. In this paper, we present the Soft Threshold Ternary Networks (STTN), which enables the model to automatically determine quantization intervals instead of depending on a hard threshold. Concretely, we replace the original ternary kernel with the addition of two binary kernels at training time, where ternary values are determined by the combination of two corresponding binary values. At inference time, we add up the two binary kernels to obtain a single ternary kernel. Our method dramatically outperforms current state-of-the-arts, lowering the performance gap between full-precision networks and extreme low bit networks. Experiments on ImageNet with AlexNet (Top-1 55.6%), ResNet-18 (Top-1 66.2%) achieves new state-of-the-art.

Download Full-text

AI-driven deep CNN approach for multi-label pathology classification using chest X-Rays

PeerJ Computer Science ◽

10.7717/peerj-cs.495 ◽

2021 ◽

Vol 7 ◽

pp. e495

Author(s):

Saleh Albahli ◽

Hafiz Tayyab Rauf ◽

Abdulelah Algosaibi ◽

Valentina Emilia Balas

Keyword(s):

Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Synthetic Data ◽

X Rays ◽

Deep Convolutional Neural Networks ◽

Current State ◽

Pathology Classification ◽

Wide Range ◽

Multi Class Classification

Artificial intelligence (AI) has played a significant role in image analysis and feature extraction, applied to detect and diagnose a wide range of chest-related diseases. Although several researchers have used current state-of-the-art approaches and have produced impressive chest-related clinical outcomes, specific techniques may not contribute many advantages if one type of disease is detected without the rest being identified. Those who tried to identify multiple chest-related diseases were ineffective due to insufficient data and the available data not being balanced. This research provides a significant contribution to the healthcare industry and the research community by proposing a synthetic data augmentation in three deep Convolutional Neural Networks (CNNs) architectures for the detection of 14 chest-related diseases. The employed models are DenseNet121, InceptionResNetV2, and ResNet152V2; after training and validation, an average ROC-AUC score of 0.80 was obtained competitive as compared to the previous models that were trained for multi-class classification to detect anomalies in x-ray images. This research illustrates how the proposed model practices state-of-the-art deep neural networks to classify 14 chest-related diseases with better accuracy.

Download Full-text

Detecting Emotions in English and Arabic Tweets

Information ◽

10.3390/info10030098 ◽

2019 ◽

Vol 10 (3) ◽

pp. 98 ◽

Cited By ~ 4

Author(s):

Tariq Ahmad ◽

Allan Ramsay ◽

Hanady Ahmed

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Algorithms ◽

General Purpose ◽

Machine Learning Algorithms ◽

Current State ◽

Optimal Thresholds ◽

Alternative Approach

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general purpose machine learning algorithms.

Download Full-text

PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979–2018

Scientific Data ◽

10.1038/s41597-020-00631-x ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Hylke E. Beck ◽

Seth Westra ◽

Jackson Tan ◽

Florian Pappenberger ◽

George J. Huffman ◽

...

Keyword(s):

Neural Networks ◽

Probability Distribution ◽

State Of The Art ◽

Intertropical Convergence Zone ◽

Coefficient Of Determination ◽

Current State ◽

Peak Intensity ◽

Global Land ◽

The Neural Networks ◽

Better Than

Abstract We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979–2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained with daily P observations from 93,138 gauges and hourly P observations (resampled to 3-hourly) from 11,881 gauges worldwide. Mean validation coefficient of determination (R2) values ranged from 0.76 to 0.80 for the daily P occurrence indices, and from 0.44 to 0.84 for the daily peak P intensity indices. The neural networks performed significantly better than current state-of-the-art reanalysis (ERA5) and satellite (IMERG) products for all P indices. Using a 0.1 mm 3 h−1 threshold, P was estimated to occur 12.2%, 7.4%, and 14.3% of the time, on average, over the global, land, and ocean domains, respectively. The highest P intensities were found over parts of Central America, India, and Southeast Asia, along the western equatorial coast of Africa, and in the intertropical convergence zone. The PPDIST dataset is available via www.gloh2o.org/ppdist.

Download Full-text

The secret is at the crossways: Hodotopic organization and nonlinear dynamics of brain neural networks

Behavioral and Brain Sciences ◽

10.1017/s0140525x13001386 ◽

2013 ◽

Vol 36 (6) ◽

pp. 623-624 ◽

Cited By ~ 2

Author(s):

Tobias A. Mattei

Keyword(s):

Nonlinear Dynamics ◽

Neural Networks ◽

Cognitive Neuroscience ◽

State Of The Art ◽

Current State

AbstractBy integrating the classic psychological principles of ancient art of memory (AAOM) with the most recent paradigms in cognitive neuroscience (i.e., the concepts of hodotopic organization and nonlinear dynamics of brain neural networks), Llewellyn provides an up-to-date model of the complex psychological relationships between memory, imagination, and dreams in accordance with current state-of-the-art principles in neuroscience.

Download Full-text

Modified Deep Neural Networks for Dog Breeds Identification

10.20944/preprints201812.0232.v1 ◽

2018 ◽

Cited By ~ 1

Author(s):

Aydin Ayanzadeh ◽

Sahand Vahidnia

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

The State ◽

Fine Tuning ◽

Test Accuracy ◽

Data Sets ◽

Data Set

In this paper, we leverage state of the art models on Imagenet data-sets. We use the pre-trained model and learned weighs to extract the feature from the Dog breeds identification data-set. Afterwards, we applied fine-tuning and dataaugmentation to increase the performance of our test accuracy in classification of dog breeds datasets. The performance of the proposed approaches are compared with the state of the art models of Image-Net datasets such as ResNet-50, DenseNet-121, DenseNet-169 and GoogleNet. we achieved 89.66% , 85.37% 84.01% and 82.08% test accuracy respectively which shows thesuperior performance of proposed method to the previous works on Stanford dog breeds datasets.

Download Full-text

Training a neural network to learn other dimensionality reduction removes data size restrictions in bioinformatics and provides a new route to exploring data representations

10.1101/2020.09.03.269555 ◽

2020 ◽

Cited By ~ 1

Author(s):

Alex Dexter ◽

Spencer A. Thomas ◽

Rory T. Steven ◽

Kenneth N. Robinson ◽

Adam J. Taylor ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Dimensionality Reduction ◽

Computational Analysis ◽

New Technologies ◽

State Of The Art ◽

Current State ◽

Data Representations ◽

Non Linear ◽

Linear Dimensionality Reduction

AbstractHigh dimensionality omics and hyperspectral imaging datasets present difficult challenges for feature extraction and data mining due to huge numbers of features that cannot be simultaneously examined. The sample numbers and variables of these methods are constantly growing as new technologies are developed, and computational analysis needs to evolve to keep up with growing demand. Current state of the art algorithms can handle some routine datasets but struggle when datasets grow above a certain size. We present a training deep learning via neural networks on non-linear dimensionality reduction, in particular t-distributed stochastic neighbour embedding (t-SNE), to overcome prior limitations of these methods.One Sentence SummaryAnalysis of prohibitively large datasets by combining deep learning via neural networks with non-linear dimensionality reduction.

Download Full-text

Using 3D Convolutional Neural Networks for Real-time Detection of Soccer Events

International Journal of Semantic Computing ◽

10.1142/s1793351x2140002x ◽

2021 ◽

Vol 15 (02) ◽

pp. 161-187

Author(s):

Olav A. Nergård Rongved ◽

Steven A. Hicks ◽

Vajira Thambawita ◽

Håkon K. Stensland ◽

Evi Zouganeli ◽

...

Keyword(s):

Neural Networks ◽

Real Time ◽

Convolutional Neural Networks ◽

Event Detection ◽

State Of The Art ◽

Time Estimation ◽

High Recall ◽

Current State ◽

Ablation Study ◽

Different Parts

Developing systems for the automatic detection of events in video is a task which has gained attention in many areas including sports. More specifically, event detection for soccer videos has been studied widely in the literature. However, there are still a number of shortcomings in the state-of-the-art such as high latency, making it challenging to operate at the live edge. In this paper, we present an algorithm to detect events in soccer videos in real time, using 3D convolutional neural networks. We test our algorithm on three different datasets from SoccerNet, the Swedish Allsvenskan, and the Norwegian Eliteserien. Overall, the results show that we can detect events with high recall, low latency, and accurate time estimation. The trade-off is a slightly lower precision compared to the current state-of-the-art, which has higher latency and performs better when a less accurate time estimation can be accepted. In addition to the presented algorithm, we perform an extensive ablation study on how the different parts of the training pipeline affect the final results.

Download Full-text

A logic programming approach to predict effective compiler settings for embedded software

Theory and Practice of Logic Programming ◽

10.1017/s1471068415000174 ◽

2015 ◽

Vol 15 (4-5) ◽

pp. 481-494 ◽

Cited By ~ 3

Author(s):

CRAIG BLACKMORE ◽

OLIVER RAY ◽

KERSTIN EDER

Keyword(s):

Embedded System ◽

Logic Programming ◽

State Of The Art ◽

System Development ◽

Hybrid Approach ◽

Inductive Logic ◽

Programming Approach ◽

Data Set ◽

Current State ◽

Execution Times

AbstractThis paper introduces a new logic-based method for optimising the selection of compiler flags on embedded architectures. In particular, we use Inductive Logic Programming (ILP) to learn logical rules that relate effective compiler flags to specific program features. Unlike earlier work, we aim to infer human-readable rules and we seek to develop a relational first-order approach which automatically discovers relevant features rather than relying on a vector of predetermined attributes. To this end we generated a data set by measuring execution times of 60 benchmarks on an embedded system development board and we developed an ILP prototype which outperforms the current state-of-the-art learning approach in 34 of the 60 benchmarks. Finally, we combined the strengths of the current state of the art and our ILP method in a hybrid approach which reduced execution times by an average of 8% and up to 50% in some cases.

Download Full-text