alignment task
Recently Published Documents


2021, Vol 11 (11), pp. 4894
Author(s): Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems that reduce the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.
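The self-training idea in this abstract (train on synthetic data, then adapt with the model's own confident predictions on unlabeled real data) can be illustrated with a toy sketch. This is not the authors' detection pipeline: the one-dimensional nearest-centroid classifier, the function names, and the `min_margin` confidence threshold are all assumptions chosen only to keep the example short.

```python
def fit_centroids(samples):
    """Fit one centroid per label from (value, label) pairs."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Return the nearest label and a confidence margin (gap to the runner-up).

    Assumes at least two classes are present.
    """
    ranked = sorted((abs(x - c), label) for label, c in centroids.items())
    (best_dist, best_label), (second_dist, _) = ranked[0], ranked[1]
    return best_label, second_dist - best_dist


def self_train(synthetic, unlabeled, rounds=3, min_margin=2.0):
    """Hypothetical self-training loop: pseudo-label confident points, then refit."""
    centroids = fit_centroids(synthetic)  # initial model from synthetic data only
    for _ in range(rounds):
        pseudo = []
        for x in unlabeled:
            label, margin = predict(centroids, x)
            if margin >= min_margin:  # keep only confident pseudo-labels
                pseudo.append((x, label))
        centroids = fit_centroids(synthetic + pseudo)
    return centroids


synthetic = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
real_unlabeled = [2.0, 3.0, 7.0, 8.0, 5.2]
adapted = self_train(synthetic, real_unlabeled)
```

In this toy run the ambiguous point 5.2 never passes the margin threshold, so it is never used as a pseudo-label, while the centroids move toward the confidently labeled real points.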


2021, Vol 21 (2), pp. 6
Author(s): Rakesh Nanjappa, Robert M. McPeek

2021
Author(s): Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, ...

2020, Vol 10 (21), pp. 7524
Author(s): Victor Villena-Martinez, Sergiu Oprea, Marcelo Saval-Calvo, Jorge Azorin-Lopez, Andres Fuster-Guillo, ...

This paper reviews recent deep learning-based registration methods. Registration is the process of computing the transformation that aligns datasets, and the accuracy of the result depends on multiple factors. The most significant factors are the size of the input data; the presence of noise, outliers and occlusions; the quality of the extracted features; real-time requirements; and the type of transformation, especially those defined by multiple parameters, such as non-rigid deformations. Deep Registration Networks (DRNs) are architectures that attempt to solve the alignment task using a learning algorithm. In this review, we classify these methods according to a proposed framework based on the traditional registration pipeline. This pipeline consists of four steps: target selection, feature extraction, feature matching, and transform computation for the alignment. This new paradigm introduces a higher-level understanding of registration, which makes explicit the challenging problems of traditional approaches. The main contribution of this work is to provide a comprehensive starting point to address registration problems from a learning-based perspective and to understand the new range of possibilities.
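As an illustration of the last pipeline step named in this abstract (transform computation), the sketch below estimates a 2D rigid transform from point pairs in closed form. It is a classical least-squares solution, not a Deep Registration Network, and it assumes the feature-matching step has already produced correspondences (`source[i]` pairs with `target[i]`). The function names are hypothetical.

```python
import math


def estimate_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Assumes source[i] corresponds to target[i] (matching is already done).
    """
    n = len(source)
    cx_s = sum(p[0] for p in source) / n
    cy_s = sum(p[1] for p in source) / n
    cx_t = sum(q[0] for q in target) / n
    cy_t = sum(q[1] for q in target) / n

    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        sx, sy = px - cx_s, py - cy_s  # centered source point
        tx, ty = qx - cx_t, qy - cy_t  # centered target point
        dot += sx * tx + sy * ty
        cross += sx * ty - sy * tx

    theta = math.atan2(cross, dot)  # optimal rotation angle
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    shift_x = cx_t - (cos_t * cx_s - sin_t * cy_s)
    shift_y = cy_t - (sin_t * cx_s + cos_t * cy_s)
    return theta, (shift_x, shift_y)


def apply_rigid_2d(points, theta, shift):
    """Rotate points by theta (radians) and then translate by shift."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(cos_t * x - sin_t * y + shift[0], sin_t * x + cos_t * y + shift[1])
            for x, y in points]
```

For noise-free correspondences the estimate recovers the transform that generated the target points; with noisy matches it returns the least-squares fit, which is where outliers and occlusions start to matter.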


Author(s): Subhadeep Maji, Rohan Kumar, Manish Bansal, Kalyani Roy, Pawan Goyal

Systematically discovering semantic relationships in text is an important and extensively studied area in Natural Language Processing, with various tasks such as entailment and semantic similarity. Decomposability of sentence-level scores via subsequence alignments has been proposed as a way to make models more interpretable. We study the problem of aligning components of sentences, leading to an interpretable model for semantic textual similarity. In this paper, we introduce a novel pointer-network-based model with a sentinel gating function to align constituent chunks, which are represented using BERT. We improve this base model with a loss function that equally penalizes misalignments in both sentences, ensuring the alignments are bidirectional. Finally, to guide the network with structured external knowledge, we introduce first-order logic constraints based on ConceptNet and syntactic knowledge. The model achieves F1 scores of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task, showing large improvements over the existing solutions. Source code is available at https://github.com/manishb89/interpretable_sentence_similarity


2020, Vol 12 (8), pp. 3074
Author(s): Yixian Chen, Zhaocheng He

Automatic number plate recognition (ANPR) systems, which have been widely deployed in many cities, produce large volumes of travel data for intelligent and sustainable transportation. ANPR data operate at an individual level and carry the unique identities of vehicles, which can support personalized traffic planning. However, these systems also suffer from the common problem of missing data. Unlike traditional missing-data cases, we focus on the loss of vehicle identities in ANPR records due to recognition failure or other environmental factors. To address the issue, we propose a heterogeneous graph embedding framework that constructs a travel heterogeneous information network (THIN) and learns the embeddings of the entities to find the best-matched vehicles for the unknown records. As a result, the recovery of vehicle identities is cast as an entity alignment task on a THIN. The proposed method integrates the vehicle group entities and context relations into the THIN to capture the spatiotemporal relationships in vehicle travel, and adopts a holographic embeddings model to better fit the network structure. Empirically, we test it with a real ANPR dataset collected from Xuancheng, China, which has a densely distributed camera network. The experiments demonstrate the effectiveness of the proposed graph structure under different missing rates. Further, we compare it with other embedding methods, and the results support the superiority of holographic embeddings.
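For readers unfamiliar with holographic embeddings, the sketch below shows the generic scoring function of that model family: the head and tail embeddings are combined by circular correlation and compared with the relation embedding. It is a minimal illustration in plain Python, not the authors' THIN framework; the function names and the toy vectors are assumptions.

```python
import math


def circular_correlation(a, b):
    """Circular correlation: result[k] = sum_i a[i] * b[(i + k) % d]."""
    d = len(a)
    return [sum(a[i] * b[(i + k) % d] for i in range(d)) for k in range(d)]


def hole_score(head, relation, tail):
    """Plausibility of (head, relation, tail): sigmoid(relation . (head * tail))."""
    corr = circular_correlation(head, tail)
    logit = sum(r * c for r, c in zip(relation, corr))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 3-dimensional embeddings (hypothetical values).
head = [0.0, 1.0, 0.0]
tail = [1.0, 2.0, 3.0]
relation = [0.5, -0.5, 0.0]
score = hole_score(head, relation, tail)
```

Circular correlation is not commutative, so swapping head and tail generally changes the score, which lets the model represent directed relations.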


2019, Vol 40 (6), pp. 819-831
Author(s): Chicheng Liu, Libin Song, Ken Chen, Jing Xu

Purpose: This paper aims to present an image-based visual servoing algorithm for a multiple pin-in-hole assembly. It also aims to avoid the matching and tracking of image features and to remain robust against image defects.
Design/methodology/approach: The authors derive a novel model in the set space and design three image errors to control the 3 degrees of freedom (DOF) of a single-lug workpiece in the alignment task. The interaction matrix that links the time variations of the image errors to the single-lug workpiece motions is computed analytically. The authors introduce two approximate hypotheses so that the interaction matrix has a decoupled form, and an auto-adaptive algorithm is designed to estimate the interaction matrix.
Findings: Image-based visual servoing in the set space avoids the matching and tracking of image features, and these methods are not sensitive to image effects. The control law using the auto-adaptive algorithm is more efficient than that using a static interaction matrix. Simulations and real-world experiments are performed to demonstrate the effectiveness of the proposed algorithm.
Originality/value: This paper proposes a new visual servoing method to achieve pin-in-hole assembly tasks. The main advantage of this new approach is that it does not require tracking or matching of the image features, and its supplementary advantage is that it is not sensitive to image defects.


2019, Vol 19 (10), pp. 219b
Author(s): Gaelle N. Luabeya, Xiaogang Yan, J. D. Crawford

Author(s): Swarnadeep Saha, Malolan Chetlur, Tejas Indulal Dhamecha, W M Gayathri K Wijayarathna, Red Mendoza, ...

Aligning Learning Outcomes (LO) to relevant portions of Learning Resources (LR) is necessary to help students quickly navigate within the recommended learning material. In general, the problem can be viewed as finding the sections of a document (LR) that are pertinent to a broad question (LO). In this paper, we introduce the novel problem of aligning LOs (an LO is usually a sentence-long text) to relevant pages of LRs (LRs are in the form of slide decks). We observe that the set of relevant pages can be composed of multiple chunks (a chunk is a contiguous set of pages) and that the same page of an LR might be relevant to multiple LOs. To this end, we develop a novel Lexico-Semantic Spatial approach that captures the lexical, semantic, and spatial aspects of the task, and also alleviates the limited availability of training data. Our approach first identifies the relevancy of a page to an LO by using lexical and semantic features from each page independently. The spatial model at a later stage exploits the dependencies between the sequence of pages in the LR to further improve the alignment. We empirically establish the importance of the lexical, semantic, and spatial models within the proposed approach. We show that, on average, a student can navigate to a relevant page from the first predicted page in about four clicks within a 38-page slide deck, as compared to two clicks for human experts.


Author(s): Rui Ye, Xin Li, Yujie Fang, Hongyu Zang, Mingzhong Wang

Alignment of multiple multi-relational networks, such as knowledge graphs, is vital for AI applications. Unlike conventional alignment models, we apply the graph convolutional network (GCN) to achieve more robust network embedding for the alignment task. In contrast to existing GCNs, which cannot fully utilize multi-relation information, we propose a vectorized relational graph convolutional network (VR-GCN) to learn the embeddings of both graph entities and relations simultaneously for multi-relational networks. The role discrimination and translation property of knowledge graphs are adopted in the convolutional process. Thereafter, AVR-GCN, the alignment framework based on VR-GCN, is developed for multi-relational network alignment tasks. Anchors are used to supervise the objective function, which aims at minimizing the distances between anchors, and to generate new cross-network triplets that build a bridge between different knowledge graphs at the triplet level to improve alignment performance. Experiments on real-world datasets show that the proposed solutions outperform the state-of-the-art methods in terms of network embedding, entity alignment, and relation alignment.
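The anchor-based objective mentioned in this abstract can be pictured with a small sketch: known anchor pairs contribute their embedding distance to a loss, and unaligned entities are then matched to their nearest neighbour in the other graph. This does not implement VR-GCN or AVR-GCN; the embeddings are given as fixed toy vectors, and the function names and the choice of Euclidean distance are assumptions.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def anchor_loss(emb_a, emb_b, anchors):
    """Sum of distances between known aligned pairs; training would minimize this."""
    return sum(l2_distance(emb_a[x], emb_b[y]) for x, y in anchors)


def align_entities(emb_a, emb_b):
    """Match each entity of graph A to its nearest entity of graph B."""
    return {
        x: min(emb_b, key=lambda y: l2_distance(vec, emb_b[y]))
        for x, vec in emb_a.items()
    }


# Toy embeddings for two small graphs (hypothetical values).
graph_a = {"a1": [0.0, 0.0], "a2": [5.0, 5.0]}
graph_b = {"b1": [0.1, 0.0], "b2": [5.0, 4.9]}
loss = anchor_loss(graph_a, graph_b, [("a1", "b1")])
matches = align_entities(graph_a, graph_b)
```

In a learned system the embeddings would be updated to reduce `anchor_loss`, so that the nearest-neighbour matching generalizes from the anchors to entities without known counterparts.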

