Multi-user Feedback for Large-scale Cross-lingual Ontology Matching

A sensor ontology models sensor information and knowledge in a machine-understandable way, with the aim of addressing the data heterogeneity problem on the Internet of Things (IoT). However, existing sensor ontologies are maintained independently for different requirements and may define the same concept with different terms or contexts, which creates a heterogeneity issue of its own. Because matching must deal with complex semantic relationships between sensor concepts and with large numbers of entities, finding identical entity correspondences is an error-prone task. To determine sensor entity correspondences effectively, this work proposes a semi-supervised learning-based sensor ontology matching technique. First, we borrow the idea of “centrality” from social networks to construct the training examples; then, we present an evolutionary algorithm- (EA-) based metamatching technique to train a model that aggregates different similarity measures; finally, we use the trained model to match the remaining entities. The experiments use the benchmark as well as three real sensor ontologies to test the proposal’s performance. The experimental results show that our approach is able to determine high-quality sensor entity correspondences in all matching tasks.

Light-Weight Cross-Lingual Ontology Matching with LYAM++

Lecture Notes in Computer Science - On the Move to Meaningful Internet Systems: OTM 2015 Conferences ◽

10.1007/978-3-319-26148-5_36 ◽

2015 ◽

pp. 527-544 ◽

Cited By ~ 7

Author(s):

Abdel Nasser Tigrine ◽

Zohra Bellahsene ◽

Konstantin Todorov

Keyword(s):

Ontology Matching ◽

Light Weight ◽

Cross Lingual

Semantic Synchronization in B2B Transactions

Business Information Systems ◽

10.4018/978-1-61520-969-9.ch094 ◽

2010 ◽

pp. 1518-1542

Author(s):

Janina Fengel ◽

Heiko Paulheim ◽

Michael Rebstock

Keyword(s):

Large Scale ◽

Business Processes ◽

User Participation ◽

Ontology Matching ◽

Ontological Engineering ◽

Business Information Systems ◽

Business Partners ◽

Business Information ◽

Mapping Technology

Despite the development of e-business standards, the integration of business processes and business information systems is still a non-trivial issue if business partners use different e-business standards for formatting and describing information to be processed. Since those standards can be understood as ontologies, ontological engineering technologies can be applied for processing, especially ontology matching for reconciling them. However, as e-business standards tend to be rather large-scale ontologies, scalability is a crucial requirement. To serve this demand, we present our ORBI Ontology Mediator. It is linked with our Malasco system for partition-based ontology matching with currently available matching systems, which so far do not scale well, if at all. In our case study we show how to provide dynamic semantic synchronization between business partners using different e-business standards without initial ramp-up effort, based on ontological mapping technology combined with interactive user participation.

iFeedback: Exploiting User Feedback for Real-Time Issue Detection in Large-Scale Online Service Systems

2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) ◽

10.1109/ase.2019.00041 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wujie Zheng ◽

Haochuan Lu ◽

Yangfan Zhou ◽

Jianming Liang ◽

Haibing Zheng ◽

...

Keyword(s):

Real Time ◽

Large Scale ◽

User Feedback ◽

Service Systems ◽

Online Service ◽

Time Issue

Fuzzy and Cross-Lingual Ontology Matching Mediated by Background Knowledge

Uncertainty Reasoning for the Semantic Web III - Lecture Notes in Computer Science ◽

10.1007/978-3-319-13413-0_8 ◽

2014 ◽

pp. 142-162 ◽

Cited By ~ 2

Author(s):

Konstantin Todorov ◽

Céline Hudelot ◽

Peter Geibel

Keyword(s):

Background Knowledge ◽

Ontology Matching ◽

Cross Lingual

A Machine Learning Approach to Multilingual and Cross-Lingual Ontology Matching

The Semantic Web – ISWC 2011 - Lecture Notes in Computer Science ◽

10.1007/978-3-642-25073-6_42 ◽

2011 ◽

pp. 665-680 ◽

Cited By ~ 35

Author(s):

Dennis Spohr ◽

Laura Hollink ◽

Philipp Cimiano

Keyword(s):

Machine Learning ◽

Ontology Matching ◽

Learning Approach ◽

Machine Learning Approach ◽

Cross Lingual

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

Computational Linguistics ◽

10.1162/coli_a_00391 ◽

2020 ◽

pp. 1-51

Author(s):

Ivan Vulić ◽

Simon Baker ◽

Edoardo Maria Ponti ◽

Ulla Petti ◽

Ira Leviant ◽

...

Keyword(s):

Semantic Similarity ◽

Large Scale ◽

Representation Learning ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lexical Representations ◽

Language Data ◽

Weakly Supervised ◽

Cross Lingual

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language data set is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing a representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 crosslingual semantic similarity data sets. Because of its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and crosslingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and crosslingual representation models, including static and contextualized word embeddings (such as fastText, monolingual and multilingual BERT, XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised crosslingual word embeddings. We also present a step-by-step data set creation protocol for creating consistent, Multi-SimLex-style resources for additional languages. We make these contributions (the public release of Multi-SimLex data sets, their creation protocol, strong baseline results, and in-depth analyses that can help guide future developments in multilingual lexical semantics and representation learning) available via a Web site that will encourage community effort in further expansion of Multi-SimLex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.
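
The figure of 66 crosslingual data sets is consistent with pairing each of the 12 languages with every other language exactly once: 12 × 11 / 2 = 66. A minimal Python sketch of that arithmetic follows. The language labels are placeholders introduced here for illustration, since the abstract names only some of the twelve languages.

```python
from itertools import combinations

# Placeholder labels: the abstract names only some of the 12 languages,
# so generic identifiers are used here instead of guessing the rest.
languages = [f"lang_{i}" for i in range(1, 13)]

# Each unordered pair of distinct languages corresponds to one
# crosslingual data set.
language_pairs = list(combinations(languages, 2))

print(len(language_pairs))  # 66
```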

Hybrid large-scale ontology matching strategy on big data environment

Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services - iiWAS '16 ◽

10.1145/3011141.3011185 ◽

2016 ◽

Cited By ~ 3

Author(s):

Imadeddine Mountasser ◽

Brahim Ouhbi ◽

Bouchra Frikh

Keyword(s):

Big Data ◽

Large Scale ◽

Ontology Matching ◽

Data Environment ◽

Matching Strategy

Large-scale biomedical ontology matching with ServOMap

IRBM ◽

10.1016/j.irbm.2012.12.011 ◽

2013 ◽

Vol 34 (1) ◽

pp. 56-59 ◽

Cited By ~ 12

Author(s):

M. Ba ◽

G. Diallo

Keyword(s):

Large Scale ◽

Biomedical Ontology ◽

Ontology Matching

Cross-lingual citations in English papers: a large-scale analysis of prevalence, usage, and impact

International Journal on Digital Libraries ◽

10.1007/s00799-021-00312-z ◽

2021 ◽

Author(s):

Tarek Saier ◽

Michael Färber ◽

Tornike Tsereteli

Keyword(s):

Large Scale ◽

Data Sets ◽

Learning Approaches ◽

Large Scale Analysis ◽

Scientific Disciplines ◽

Limited Degree ◽

Trends Over Time ◽

Scholarly Data ◽

Cross Lingual ◽

Scholarly Discourse

Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.
