alignment task Latest Research Papers

Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.
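The self-training idea in this abstract (train on synthetic data, then adapt on confident predictions for the real data) can be illustrated with a toy sketch. This is not the paper's method: a nearest-centroid classifier on 2-D points stands in for the YOLOv5m detector, and the function name `self_train`, the `margin` confidence rule, and the data layout are all assumptions made for illustration.

```python
import numpy as np

def self_train(x_synth, y_synth, x_real, rounds=5, margin=0.5):
    """Toy self-training: fit class centroids on labelled synthetic points,
    then repeatedly re-estimate them from confidently pseudo-labelled real points."""
    classes = np.unique(y_synth)
    # Initial model: one centroid per class, learned from synthetic data only.
    centroids = np.stack([x_synth[y_synth == c].mean(axis=0) for c in classes])
    for _ in range(rounds):
        # Distance from every real sample to every class centroid.
        dists = np.linalg.norm(x_real[:, None, :] - centroids[None, :, :], axis=-1)
        ordered = np.sort(dists, axis=1)
        # Hypothetical confidence rule: the best class must beat the runner-up by `margin`.
        confident = (ordered[:, 1] - ordered[:, 0]) > margin
        pseudo = classes[dists.argmin(axis=1)]
        for k, c in enumerate(classes):
            mask = confident & (pseudo == c)
            if mask.any():
                # Adapt the model using only confident pseudo-labels; no human labels.
                centroids[k] = x_real[mask].mean(axis=0)
    return classes, centroids
```

In this sketch the unlabelled real samples pull the centroids away from their synthetic starting positions, which is the general mechanism the abstract refers to; a real detector would be retrained on pseudo-labelled bounding boxes instead.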

Microsaccades and attention in a high-acuity visual alignment task

Journal of Vision ◽

10.1167/jov.21.2.6 ◽

2021 ◽

Vol 21 (2) ◽

pp. 6

Author(s):

Rakesh Nanjappa ◽

Robert M. McPeek

Keyword(s):

Alignment Task ◽

High Acuity

Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline

10.18653/v1/2021.conll-1.25 ◽

2021 ◽

Author(s):

Ori Ernst ◽

Ori Shapira ◽

Ramakanth Pasunuru ◽

Michael Lepioshkin ◽

Jacob Goldberger ◽

...

Keyword(s):

Alignment Task

When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)

Applied Sciences ◽

10.3390/app10217524 ◽

2020 ◽

Vol 10 (21) ◽

pp. 7524

Author(s):

Victor Villena-Martinez ◽

Sergiu Oprea ◽

Marcelo Saval-Calvo ◽

Jorge Azorin-Lopez ◽

Andres Fuster-Guillo ◽

...

Keyword(s):

Deep Learning ◽

Feature Matching ◽

Learning Algorithm ◽

Target Selection ◽

New Paradigm ◽

Multiple Parameters ◽

Data Alignment ◽

Starting Point ◽

Alignment Task ◽

Time Requirements

This paper reviews recent deep learning-based registration methods. Registration is the process that computes the transformation that aligns datasets, and the accuracy of the result depends on multiple factors. The most significant factors are the size of input data; the presence of noise, outliers and occlusions; the quality of the extracted features; real-time requirements; and the type of transformation, especially those defined by multiple parameters, such as non-rigid deformations. Deep Registration Networks (DRNs) are those architectures trying to solve the alignment task using a learning algorithm. In this review, we classify these methods according to a proposed framework based on the traditional registration pipeline. This pipeline consists of four steps: target selection, feature extraction, feature matching, and transform computation for the alignment. This new paradigm introduces a higher-level understanding of registration, which makes explicit the challenging problems of traditional approaches. The main contribution of this work is to provide a comprehensive starting point to address registration problems from a learning-based perspective and to understand the new range of possibilities.
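As a point of reference for the last step of the traditional pipeline (transform computation), the sketch below shows the classical closed-form rigid alignment of already-matched point sets via SVD, often called the Kabsch algorithm. It is not a Deep Registration Network and is not taken from the review; the function name `rigid_align` and the assumption that correspondences are already known are illustrative.

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rotation R and translation t such that
    R @ source[i] + t approximates target[i] for matched rows."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # Cross-covariance of the centred, already-matched point pairs.
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0] * (source.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = tgt_mean - R @ src_mean
    return R, t
```

Learning-based methods typically replace or augment the earlier steps (feature extraction and matching), which is where noise, outliers, and occlusions make the hand-crafted pipeline fragile.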

Logic Constrained Pointer Networks for Interpretable Textual Similarity

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/333 ◽

2020 ◽

Author(s):

Subhadeep Maji ◽

Rohan Kumar ◽

Manish Bansal ◽

Kalyani Roy ◽

Pawan Goyal

Keyword(s):

Language Processing ◽

Source Code ◽

Order Logic ◽

First Order Logic ◽

Semantic Relationships ◽

First Order ◽

Syntactic Knowledge ◽

Sentence Level ◽

Alignment Task ◽

Interpretable Model

Systematically discovering semantic relationships in text is an important and extensively studied area in Natural Language Processing, with various tasks such as entailment and semantic similarity. Decomposability of sentence-level scores via subsequence alignments has been proposed as a way to make models more interpretable. We study the problem of aligning components of sentences leading to an interpretable model for semantic textual similarity. In this paper, we introduce a novel pointer-network-based model with a sentinel gating function to align constituent chunks, which are represented using BERT. We improve this base model with a loss function to equally penalize misalignments in both sentences, ensuring the alignments are bidirectional. Finally, to guide the network with structured external knowledge, we introduce first-order logic constraints based on ConceptNet and syntactic knowledge. The model achieves F1 scores of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task, showing large improvements over the existing solutions. Source code is available at https://github.com/manishb89/interpretable_sentence_similarity

Vehicle Identity Recovery for Automatic Number Plate Recognition Data via Heterogeneous Network Embedding

Sustainability ◽

10.3390/su12083074 ◽

2020 ◽

Vol 12 (8) ◽

pp. 3074

Author(s):

Yixian Chen ◽

Zhaocheng He

Keyword(s):

Sustainable Transportation ◽

Vehicle Group ◽

Graph Structure ◽

Traffic Planning ◽

Camera Network ◽

Heterogeneous Information ◽

Individual Level ◽

Alignment Task ◽

The Common ◽

Embedding Methods

Automatic number plate recognition (ANPR) systems, which have been widely deployed in many cities, produce large volumes of travel data for intelligent and sustainable transportation. ANPR data operate at an individual level and carry the unique identities of vehicles, which can support personalized traffic planning. However, these systems also suffer from the common problem of missing data. Unlike traditional missing-data cases, we focus on the loss of vehicle identities in ANPR records due to recognition failure or other environmental factors. To address the issue, we propose a heterogeneous graph embedding framework that constructs a travel heterogeneous information network (THIN) and learns the embeddings of the entities to find the best-matched vehicles for the unknown records. As a result, the recovery of vehicle identities is cast as an entity alignment task on a THIN. The proposed method integrates the vehicle group entities and context relations into the THIN to capture the spatiotemporal relationships in vehicle travel and adopts a holographic embeddings model to better fit the network structure. Empirically, we test it with a real ANPR dataset collected from Xuancheng, China, which has a densely distributed camera network. The experiments demonstrate the effectiveness of the proposed graph structure under different missing rates. Further, we compare it with other embedding methods, and the results support the superiority of holographic embeddings.
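For readers unfamiliar with holographic embeddings, the sketch below shows the generic scoring function of that model family: a triple (head, relation, tail) is scored by the sigmoid of the relation vector dotted with the circular correlation of the two entity vectors. How the paper applies this on a THIN is not specified in the abstract, so the function names and the plain NumPy formulation here are assumptions.

```python
import numpy as np

def circular_correlation(a, b):
    """[a * b]_k = sum_i a[i] * b[(i + k) mod d], computed with the FFT."""
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

def hole_score(head, relation, tail):
    """Plausibility in (0, 1) of the triple (head, relation, tail)."""
    logit = relation @ circular_correlation(head, tail)
    return 1.0 / (1.0 + np.exp(-logit))
```

Circular correlation keeps the composed vector at the same dimension as the entity embeddings and is not commutative, so the score can distinguish the direction of a relation.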

Image-based visual servoing using a set for multiple pin-in-hole assembly

Assembly Automation ◽

10.1108/aa-08-2018-110 ◽

2019 ◽

Vol 40 (6) ◽

pp. 819-831

Author(s):

Chicheng Liu ◽

Libin Song ◽

Ken Chen ◽

Jing Xu

Keyword(s):

Adaptive Algorithm ◽

Visual Servoing ◽

Degrees Of Freedom ◽

Image Features ◽

Interaction Matrix ◽

Content Type ◽

Static Interaction ◽

Alignment Task ◽

Time Variations ◽

Assembly Tasks

Purpose: This paper aims to present an image-based visual servoing algorithm for a multiple pin-in-hole assembly. It also aims to avoid the matching and tracking of image features while remaining robust against image defects.

Design/methodology/approach: The authors derive a novel model in the set space and design three image errors to control the 3 degrees of freedom (DOF) of a single-lug workpiece in the alignment task. Analytic computations of the interaction matrix that links the time variations of the image errors to the single-lug workpiece motions are performed. The authors introduce two approximate hypotheses so that the interaction matrix has a decoupled form, and an auto-adaptive algorithm is designed to estimate the interaction matrix.

Findings: Image-based visual servoing in the set space avoids the matching and tracking of image features, and these methods are not sensitive to image effects. The control law using the auto-adaptive algorithm is more efficient than that using a static interaction matrix. Simulations and real-world experiments are performed to demonstrate the effectiveness of the proposed algorithm.

Originality/value: This paper proposes a new visual servoing method to achieve pin-in-hole assembly tasks. The main advantage of this new approach is that it does not require tracking or matching of the image features, and its supplementary advantage is that it is not sensitive to image defects.
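To make the role of the interaction matrix concrete, the sketch below uses the textbook image-based visual servoing control law, in which the commanded velocity is the negative gain times the pseudo-inverse of the interaction matrix applied to the image error. It is not the set-space model or the auto-adaptive estimator from the paper; the fixed diagonal (decoupled) matrix, the gain, the time step, and the function names are assumptions.

```python
import numpy as np

def servo_velocity(interaction_matrix, image_error, gain=0.5):
    """Classical IBVS law: v = -gain * pinv(L) @ e."""
    return -gain * np.linalg.pinv(interaction_matrix) @ image_error

def simulate(interaction_matrix, error0, gain=0.5, dt=0.1, steps=50):
    """Integrate the error dynamics de/dt = L @ v with explicit Euler steps."""
    error = np.asarray(error0, dtype=float)
    for _ in range(steps):
        velocity = servo_velocity(interaction_matrix, error, gain)
        error = error + dt * interaction_matrix @ velocity
    return error
```

With an exact, invertible interaction matrix each error component shrinks by the factor (1 - gain * dt) per step; an inaccurate static estimate slows or distorts this decay, which is the motivation for estimating the matrix adaptively.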

Influence of Gaze Direction on Hand Location and Orientation in a Memory-Guided Alignment Task

Journal of Vision ◽

10.1167/19.10.219b ◽

2019 ◽

Vol 19 (10) ◽

pp. 219b

Author(s):

Gaelle N. Luabeya ◽

Xiaogang Yan ◽

J. D. Crawford

Keyword(s):

Gaze Direction ◽

Alignment Task

Aligning Learning Outcomes to Learning Resources: A Lexico-Semantic Spatial Approach

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/718 ◽

2019 ◽

Cited By ~ 1

Author(s):

Swarnadeep Saha ◽

Malolan Chetlur ◽

Tejas Indulal Dhamecha ◽

W M Gayathri K Wijayarathna ◽

Red Mendoza ◽

...

Keyword(s):

Learning Outcomes ◽

Spatial Models ◽

Training Data ◽

Learning Material ◽

Learning Resources ◽

Semantic Features ◽

The Novel ◽

Lexical Semantic ◽

Spatial Approach ◽

Alignment Task

Aligning Learning Outcomes (LO) to relevant portions of Learning Resources (LR) is necessary to help students quickly navigate within the recommended learning material. In general, the problem can be viewed as finding the sections of a document (LR) that are pertinent to a broad question (LO). In this paper, we introduce the novel problem of aligning LOs (an LO is usually a sentence-long text) to relevant pages of LRs (LRs are in the form of slide decks). We observe that the set of relevant pages can be composed of multiple chunks (a chunk is a contiguous set of pages) and the same page of an LR might be relevant to multiple LOs. To this end, we develop a novel Lexico-Semantic Spatial approach that captures the lexical, semantic, and spatial aspects of the task, and also alleviates the limited availability of training data. Our approach first identifies the relevancy of a page to an LO by using lexical and semantic features from each page independently. The spatial model at a later stage exploits the dependencies between the sequence of pages in the LR to further improve the alignment task. We empirically establish the importance of the lexical, semantic, and spatial models within the proposed approach. We show that, on average, a student can navigate to a relevant page from the first predicted page in about four clicks within a 38-page slide deck, as compared to two clicks by human experts.

A Vectorized Relational Graph Convolutional Network for Multi-Relational Network Alignment

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/574 ◽

2019 ◽

Cited By ~ 5

Author(s):

Rui Ye ◽

Xin Li ◽

Yujie Fang ◽

Hongyu Zang ◽

Mingzhong Wang

Keyword(s):

Network Alignment ◽

Network Embedding ◽

Convolutional Network ◽

Relational Networks ◽

Alignment Task ◽

Translation Property ◽

Robust Network ◽

Knowledge Graphs ◽

Real World Datasets ◽

Relational Network

Alignment of multiple multi-relational networks, such as knowledge graphs, is vital for AI applications. Different from the conventional alignment models, we apply the graph convolutional network (GCN) to achieve more robust network embedding for the alignment task. In comparison with existing GCNs which cannot fully utilize multi-relation information, we propose a vectorized relational graph convolutional network (VR-GCN) to learn the embeddings of both graph entities and relations simultaneously for multi-relational networks. The role discrimination and translation property of knowledge graphs are adopted in the convolutional process. Thereafter, AVR-GCN, the alignment framework based on VR-GCN, is developed for multi-relational network alignment tasks. Anchors are used to supervise the objective function which aims at minimizing the distances between anchors, and to generate new cross-network triplets to build a bridge between different knowledge graphs at the level of triplet to improve the performance of alignment. Experiments on real-world datasets show that the proposed solutions outperform the state-of-the-art methods in terms of network embedding, entity alignment, and relation alignment.
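Two ideas named in this abstract, the translation property of knowledge graphs and an anchor-supervised objective that minimizes distances between anchors, can be sketched generically. The code below is not VR-GCN or AVR-GCN: it assumes a TransE-style reading of the translation property (head + relation is close to tail), plain Euclidean distances, and precomputed embedding matrices; all function names are hypothetical.

```python
import numpy as np

def translation_residual(head, relation, tail):
    """Small when the triple satisfies head + relation = tail approximately."""
    return np.linalg.norm(head + relation - tail)

def anchor_loss(emb_a, emb_b, anchors):
    """Sum of distances between known-equivalent entities (anchors).
    `anchors` is a list of (index_in_a, index_in_b) pairs."""
    return sum(np.linalg.norm(emb_a[i] - emb_b[j]) for i, j in anchors)

def align_entities(emb_a, emb_b):
    """For each entity of network A, index of the nearest entity in network B."""
    dists = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=-1)
    return dists.argmin(axis=1)
```

In a trained system the embeddings would come from the graph convolutional encoder, with the anchor loss pulling the two embedding spaces together so that nearest-neighbour search recovers the remaining alignments.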
