ANN softmax

2021 ◽  
Vol 15 (1) ◽  
pp. 1-10
Author(s):  
Kang Zhao ◽  
Liuyihan Song ◽  
Yingya Zhang ◽  
Pan Pan ◽  
Yinghui Xu ◽  
...  

Thanks to the popularity of GPUs and the growth of their computational power, more and more deep learning tasks, such as face recognition, image retrieval and word embedding, can take advantage of extreme classification to improve accuracy. However, it remains a big challenge to train a deep model with millions of classes efficiently due to the huge memory and computation consumption in the last layer. By sampling a small set of classes to avoid computing over all classes, sampling-based approaches have proven to be an effective solution. But most of them suffer from the following two issues: i) important classes are ignored or only partially sampled, as in methods using a random sampling scheme or retrieval techniques with low recall (e.g., locality-sensitive hashing), resulting in degraded accuracy; ii) inefficient implementation owing to incompatibility with GPUs, as in selective softmax, which uses a hashing forest to select classes but runs the search on the CPU. To address these issues, we propose a new sampling-based softmax called ANN Softmax in this paper. Specifically, we employ binary quantization with an inverted file system to improve the recall of important classes. With the help of dedicated kernel design, it can be fully parallelized in mainstream training frameworks. We further find that the number of important classes recalled for each training sample has a great impact on the final accuracy, so we introduce sample grouping optimization to better approximate full-class training. Experimental evaluations on two tasks (embedding learning and classification) and ten datasets (e.g., MegaFace, ImageNet, SKU datasets) demonstrate that our proposed method maintains the same precision as full softmax for different loss objectives, including cross-entropy loss, ArcFace, CosFace and D-Softmax loss, with only 1/10 of the classes sampled, outperforming the state-of-the-art techniques. Moreover, we implement ANN Softmax in a complete GPU pipeline that accelerates training by more than 4.3X. With a 256-GPU cluster, our method reduces the time to train a classifier with 300 million classes on our SKU-300M dataset to ten days.
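
The abstract does not include code; as an illustration of the core sampled-softmax idea (not the authors' implementation), the following PyTorch-style sketch computes cross-entropy over only a subset of class weights, where `sampled_ids` would come from an ANN index such as binary quantization with an inverted file. All names are hypothetical.

```python
import torch
import torch.nn.functional as F

def sampled_softmax_loss(features, class_weights, labels, sampled_ids):
    """Cross-entropy over a sampled subset of classes.

    features:      (B, D) batch of embeddings
    class_weights: (C, D) full classification weight matrix
    labels:        (B,)   ground-truth class ids
    sampled_ids:   (S,)   class ids retrieved by the ANN index;
                          must contain every label in the batch
    """
    sub_weights = class_weights[sampled_ids]      # (S, D) only sampled classes
    logits = features @ sub_weights.t()           # (B, S)

    # Remap the original labels to positions inside the sampled subset.
    pos = {int(c): i for i, c in enumerate(sampled_ids.tolist())}
    sub_labels = torch.tensor([pos[int(l)] for l in labels], device=features.device)

    return F.cross_entropy(logits, sub_labels)
```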

Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations in the inputs is achieved by enforcing smooth variation of the geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
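
The abstract does not show how an LGG is built; here is a minimal sketch, assuming a cosine-similarity k-nearest-neighbour graph over one batch of intermediate representations (the exact graph construction may differ in the paper):

```python
import torch
import torch.nn.functional as F

def latent_geometry_graph(feats, k=8):
    """Build a k-NN cosine-similarity graph for one batch of representations.

    feats: (B, D) intermediate representations of a batch of B inputs.
    Returns a (B, B) weighted adjacency matrix describing the batch geometry.
    """
    normed = F.normalize(feats, dim=1)
    sim = normed @ normed.t()                       # pairwise cosine similarities
    sim.fill_diagonal_(0.0)                         # ignore self-similarity
    topk = sim.topk(k, dim=1).indices               # keep the k strongest edges per node
    mask = torch.zeros_like(sim).scatter_(1, topk, 1.0)
    return mask * sim
```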


2022 ◽  
Vol 29 (2) ◽  
pp. 1-33
Author(s):  
Nigel Bosch ◽  
Sidney K. D'Mello

The ability to identify whether a user is “zoning out” (mind wandering) from video has many HCI applications (e.g., distance learning, high-stakes vigilance tasks). However, it remains unknown how well humans can perform this task, how they compare to automatic computerized approaches, and how a fusion of the two might improve accuracy. We analyzed videos of users’ faces and upper bodies recorded 10 s prior to self-reported mind wandering (i.e., ground truth) while they engaged in a computerized reading task. We found that a state-of-the-art machine learning model had comparable accuracy to the aggregated judgments of nine untrained human observers (area under the receiver operating characteristic curve [AUC] = .598 versus .589). A fusion of the two (AUC = .644) outperformed each, presumably because each focused on complementary cues. Furthermore, adding more humans beyond 3–4 observers yielded diminishing returns. We discuss implications of human–computer fusion as a means to improve accuracy in complex tasks.
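
The abstract does not describe the fusion mechanism; a minimal sketch, assuming simple score averaging and the standard AUC metric (all data below are made up for illustration):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical per-clip scores: model probabilities and the mean judgment
# of several human observers that the user was mind wandering.
model_scores = np.array([0.71, 0.30, 0.55, 0.62])
human_scores = np.array([0.60, 0.25, 0.70, 0.40])
labels = np.array([1, 0, 1, 0])                 # self-reported mind wandering

fused = (model_scores + human_scores) / 2       # simple average fusion
print(roc_auc_score(labels, fused))
```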


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ali Farki ◽  
Zahra Salekshahrezaee ◽  
Arash Mohammadi Tofigh ◽  
Reza Ghanavati ◽  
Behdad Arandian ◽  
...  

The COVID-19 epidemic is spreading day by day. Early diagnosis of this disease is essential to provide effective preventive and therapeutic measures, and a computer-aided methodology can improve the accuracy of this process. In this study, a new and optimal method has been utilized for the diagnosis of COVID-19: a method based on fuzzy C-ordered means (FCOM) along with an improved version of the enhanced capsule network (ECN) is proposed, where the ECN is improved using the mayfly optimization (MFO) algorithm. The suggested technique is then applied to chest X-ray COVID-19 images from publicly available datasets. Simulation results are assessed through a comparison with some state-of-the-art methods, including FOMPA, MID, and 4S-DT. The results show that the proposed method, with 97.08% accuracy and 97.29% precision, provides the highest accuracy and reliability compared with the other studied methods. Moreover, the proposed method achieves the highest sensitivity, at 97.1%. Finally, the proposed method attains the highest F1-score, at 97.47%.
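
The abstract does not give the FCOM formulation; for background, here is a minimal sketch of one iteration of standard fuzzy C-means, the algorithm that FCOM extends with robust ordering of distances (illustrative only, not the authors' code):

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    """One membership/centroid update of standard fuzzy C-means.

    X:       (N, D) data points
    centers: (C, D) current cluster centers
    m:       fuzzifier (> 1)
    """
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12  # (N, C)
    # Membership of point i in cluster j: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    # Centroid update weighted by memberships raised to the fuzzifier.
    w = u ** m
    new_centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return u, new_centers
```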


Author(s):  
Usman Ahmed ◽  
Jerry Chun-Wei Lin ◽  
Gautam Srivastava

Deep learning methods have led to state-of-the-art medical applications, such as image classification and segmentation. Data-driven deep learning applications can help stakeholders collaborate. However, the limited amount of labelled data prevents deep learning algorithms from generalizing from one domain to another. To handle this problem, meta-learning helps models learn from a small set of data. We propose a meta-learning-based image segmentation model that combines the learning of state-of-the-art models and then uses it to achieve domain adaptation and high accuracy. We also propose a preprocessing algorithm to increase the usability of the segmented parts and remove noise from new test images. The proposed model achieves 0.94 precision and 0.92 recall, an improvement of 3.3% over state-of-the-art algorithms.
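
As a small illustration of the reported metrics (not the authors' pipeline), pixel-wise precision and recall for a predicted binary segmentation mask can be computed as follows, using made-up arrays:

```python
import numpy as np

def mask_precision_recall(pred, gt):
    """Pixel-wise precision and recall for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    return precision, recall

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(mask_precision_recall(pred, gt))   # (0.666..., 0.666...)
```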


2020 ◽  
pp. 1-24
Author(s):  
Dequan Jin ◽  
Ziyan Qin ◽  
Murong Yang ◽  
Penghe Chen

We propose a novel neural model with lateral interaction for learning tasks. The model consists of two functional fields: an elementary field to extract features and a high-level field to store and recognize patterns. Each field is composed of neurons with lateral interaction, and the neurons in different fields are connected according to rules of synaptic plasticity. The model is grounded in current research in cognition and neuroscience, making it more transparent and biologically explainable. Our proposed model is applied to data classification and clustering. The corresponding algorithms share similar processes without requiring any parameter tuning or optimization. Numerical experiments validate that the proposed model is feasible for different learning tasks and superior to some state-of-the-art methods, especially in small-sample learning, one-shot learning, and clustering.
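
The abstract does not specify the plasticity rule; a generic Hebbian-style update between two fields might look like the sketch below (the function, names, and decay term are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def hebbian_update(W, pre, post, lr=0.01, decay=0.001):
    """Hebbian-style synaptic update between two neural fields.

    W:    (H, E) weights from the elementary field to the high-level field
    pre:  (E,)   elementary-field activity (feature neurons)
    post: (H,)   high-level-field activity (pattern neurons)
    """
    # Strengthen co-active connections, with mild weight decay for stability.
    return W + lr * np.outer(post, pre) - decay * W
```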


2018 ◽  
Vol 6 ◽  
pp. 421-435 ◽  
Author(s):  
Yan Shao ◽  
Christian Hardmeier ◽  
Joakim Nivre

Word segmentation is a low-level NLP task that is non-trivial for a considerable number of languages. In this paper, we present a sequence tagging framework and apply it to word segmentation for a wide range of languages with different writing systems and typological characteristics. Additionally, we investigate the correlations between various typological factors and word segmentation accuracy. The experimental results indicate that segmentation accuracy is positively related to word boundary markers and negatively to the number of unique non-segmental terms. Based on the analysis, we design a small set of language-specific settings and extensively evaluate the segmentation system on the Universal Dependencies datasets. Our model obtains state-of-the-art accuracies on all the UD languages. It performs substantially better on languages that are non-trivial to segment, such as Chinese, Japanese, Arabic and Hebrew, when compared to previous work.
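
Casting segmentation as character-level sequence tagging is commonly done with boundary tags; a minimal sketch of the standard BIES encoding (an illustration of the general idea, not necessarily the authors' exact tag set):

```python
def words_to_bies(words):
    """Encode a segmented sentence as character-level B/I/E/S tags."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")                                   # single-character word
        else:
            tags.extend(["B"] + ["I"] * (len(w) - 2) + ["E"])  # word-internal boundaries
    return list("".join(words)), tags

chars, tags = words_to_bies(["我", "喜欢", "自然语言"])
print(list(zip(chars, tags)))
```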


2020 ◽  
Vol 2020 ◽  
pp. 1-24
Author(s):  
Jose M. Lanza-Gutierrez ◽  
N. C. Caballe ◽  
Broderick Crawford ◽  
Ricardo Soto ◽  
Juan A. Gomez-Pulido ◽  
...  

The set covering problem (SCP) is an NP-complete optimization problem that fits many problems in engineering. The traditional SCP formulation does not directly address either solution unsatisfiability or set redundancy. As a result, solving methods have to control these aspects to avoid producing infeasible or cost-suboptimal solutions. In recent years, an alternative SCP formulation was proposed that directly covers both aspects. This alternative formulation has received limited attention because managing both aspects is nowadays considered straightforward. This paper questions whether there is some advantage in the alternative formulation beyond addressing the two issues. Thus, two studies based on a metaheuristic approach are proposed to identify whether any concept in the alternative formulation could be considered for enhancing a solving method based on the traditional SCP formulation. As a result, the authors conclude that there are concepts from the alternative formulation that could be applied to guide the search process and to design heuristic feasibility operators. Such concepts could therefore be recommended for designing state-of-the-art algorithms addressing the traditional SCP formulation.
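
For reference, a minimal greedy heuristic for the traditional cost-weighted SCP, choosing at each step the subset with the best cost per newly covered element (an illustration only, not the paper's metaheuristic):

```python
def greedy_set_cover(universe, sets, costs):
    """Greedy heuristic for the weighted set covering problem.

    universe: set of elements to cover
    sets:     list of candidate subsets (Python sets)
    costs:    list of positive costs, one per subset
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset with minimal cost per newly covered element.
        best = min(
            (i for i in range(len(sets)) if sets[i] & uncovered),
            key=lambda i: costs[i] / len(sets[i] & uncovered),
        )
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

print(greedy_set_cover({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}, {1, 4}], [1, 1, 1, 2]))
```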


2015 ◽  
Vol 3 ◽  
pp. 449-460 ◽  
Author(s):  
Michael Roth ◽  
Mirella Lapata

Frame-semantic representations have been useful in several applications ranging from text-to-scene generation to question answering and social network analysis. Predicting such representations from raw text is, however, a challenging task, and the corresponding models are typically only trained on a small set of sentence-level annotations. In this paper, we present a semantic role labeling system that takes into account sentence and discourse context. We introduce several new features which we motivate based on linguistic insights and experimentally demonstrate that they lead to significant improvements over the current state of the art in FrameNet-based semantic role labeling.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
M. A. Balafar ◽  
R. Hazratgholizadeh ◽  
M. R. F. Derakhshi

Constrained clustering is intended to improve accuracy and personalization based on constraints expressed by an oracle. In this paper, a new constrained clustering algorithm is proposed in which some of the most informative data pairs are selected during an iterative process. These pairs are presented to the oracle, which answers whether their relation is “must-link (ML)” or “cannot-link (CL)”. In each iteration, a support vector machine (SVM) is first trained on the labels produced by the current clustering. A distance matrix is then built from each document's distance to the separating hyperplane, and a similarity matrix is built from the cosine similarity of each document's word2vec representation. Two types of probability (similarity and degree of similarity) are calculated and smoothed over neighborhoods, where neighborhoods are formed from the samples the oracle has labeled as belonging to the same cluster. Finally, at the end of each iteration, the data pairs with the greatest uncertainty (in terms of probability) are selected for querying the oracle. For evaluation, the proposed method is compared with well-known state-of-the-art methods on two criteria over a standard dataset. The results demonstrate increased accuracy and stability with fewer questions.
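
A minimal sketch of the pair-selection idea: compute cosine similarities between document vectors and query the oracle about the pair whose similarity is most ambiguous. The uncertainty criterion and names below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def most_uncertain_pair(doc_vecs):
    """Pick the document pair whose cosine similarity is most ambiguous.

    doc_vecs: (N, D) document embeddings (e.g., averaged word2vec vectors).
    Returns the (i, j) pair to present to the oracle for an ML/CL answer.
    """
    normed = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = normed @ normed.T
    # Uncertainty peaks when similarity sits near the midpoint of [0, 1].
    uncertainty = 1.0 - np.abs(sim - 0.5)
    np.fill_diagonal(uncertainty, -np.inf)
    i, j = np.unravel_index(np.argmax(uncertainty), uncertainty.shape)
    return i, j
```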


2020 ◽  
Vol 34 (07) ◽  
pp. 12095-12103
Author(s):  
Yu-Ju Tsai ◽  
Yu-Lun Liu ◽  
Ming Ouhyoung ◽  
Yung-Yu Chuang

This paper introduces a novel deep network for estimating depth maps from a light field image. To utilize the views more effectively and reduce redundancy among views, we propose a view selection module that generates an attention map indicating the importance of each view and its potential for contributing to accurate depth estimation. By exploiting the symmetric property of light field views, we enforce symmetry in the attention map and further improve accuracy. With the attention map, our architecture utilizes all views more effectively and efficiently. Experiments show that the proposed method achieves state-of-the-art performance in terms of accuracy and ranks first on a popular benchmark for disparity estimation from light field images.
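
A minimal sketch of the view-attention idea, assuming a softmax-weighted combination of per-view feature maps (the module and tensor names are hypothetical, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ViewAttention(nn.Module):
    """Fuse per-view features with a learned attention weight per view."""

    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)   # one score map per view

    def forward(self, view_feats):
        # view_feats: (B, V, C, H, W) features extracted from V light-field views.
        b, v, c, h, w = view_feats.shape
        scores = self.score(view_feats.flatten(0, 1)).view(b, v, 1, h, w)
        attn = torch.softmax(scores, dim=1)           # attention over the view axis
        return (attn * view_feats).sum(dim=1)         # (B, C, H, W) fused feature map
```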

