A New Approach for Semi-Automatic Building and Extending a Multilingual Terminology Thesaurus

2019
Vol 28 (02)
pp. 1950008
Author(s):  
Aleš Horák ◽  
Vít Baisa ◽  
Adam Rambousek ◽  
Vít Suchomel

This paper describes a new system for semi-automatically building, extending and managing a terminological thesaurus, that is, a multilingual terminology dictionary enriched with relationships between the terms themselves so as to form a thesaurus. The system makes it possible to radically enhance the workflow of current terminology expert groups, where most editing decisions still come from introspection. The presented system supplements the lexicographic process with natural language processing techniques, which are seamlessly integrated into the thesaurus editing environment. The system's methodology and the resulting thesaurus are closely connected to new domain corpora in the six languages involved. These corpora are used for term usage examples as well as for the automatic extraction of new candidate terms. The terminological thesaurus is accessible via a web-based application, which (a) presents rich detailed information on each term, (b) visualizes term relations, and (c) displays real-life usage examples of the term in domain-related documents and in context-based similar terms. Furthermore, the specialized corpora are used to detect candidate translations of terms from the central language (Czech) into the other languages (English, French, German, Russian and Slovak), as well as to detect broader Czech terms, which help to place new terms in the existing thesaurus hierarchy. The project has been realized as a terminological thesaurus of land surveying, but the presented tools and methodology are reusable for other terminology domains.
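To make the corpus-based candidate term extraction step concrete, the following is a minimal sketch of frequency-based term detection: words that are markedly more frequent in a domain corpus than in a general reference corpus are proposed as candidates. This is an illustration only; the corpora, threshold and scoring below are placeholders, not the authors' actual extraction pipeline.

```python
# A minimal sketch of frequency-based candidate term extraction, assuming two
# plain-text corpora are available; this is not the authors' actual pipeline.
from collections import Counter
import re

def tokenize(text):
    """Crude lowercase word tokenizer (stand-in for proper lemmatization)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def candidate_terms(domain_text, reference_text, min_count=5, top_n=20):
    """Rank words that are unusually frequent in the domain corpus relative
    to a general reference corpus (a simple 'keyness' ratio)."""
    dom = Counter(tokenize(domain_text))
    ref = Counter(tokenize(reference_text))
    dom_total = sum(dom.values()) or 1
    ref_total = sum(ref.values()) or 1
    scores = {
        word: (count / dom_total) / ((ref.get(word, 0) + 1) / ref_total)
        for word, count in dom.items() if count >= min_count
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Usage (with hypothetical file names):
# print(candidate_terms(open("land_surveying.txt").read(), open("general.txt").read()))
```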

2011
pp. 503-521
Author(s):  
Flavius Frasincar ◽  
Jethro Borsje ◽  
Leonard Levering

This article proposes Hermes, a Semantic Webbased framework for building personalized news services. It makes use of ontologies for knowledge representation, natural language processing techniques for semantic text analysis, and semantic query languages for specifying wanted information. Hermes is supported by an implementation of the framework, the Hermes News Portal, a tool which allows users to have a personalized online access to news items. The Hermes framework and its associated implementation aim at advancing the state-of-the-art of semantic approaches for personalized news services by employing Semantic Web standards, exploiting domain information, using a word sense disambiguation procedure, and being able to express temporal constraints for the desired news items.
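A minimal sketch of what a semantic query for personalized news with a temporal constraint might look like, using rdflib; the namespace, property names and knowledge-base file below are assumptions for illustration, not the actual Hermes vocabulary or implementation.

```python
# Hypothetical semantic news query with a temporal constraint (not Hermes itself).
from rdflib import Graph

g = Graph()
g.parse("news_knowledge_base.ttl", format="turtle")  # hypothetical KB file

query = """
PREFIX news: <http://example.org/news#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?item ?title ?date WHERE {
    ?item news:mentionsConcept news:InterestRateDecision ;
          news:title ?title ;
          news:publishedAt ?date .
    FILTER (?date >= "2024-01-01T00:00:00"^^xsd:dateTime)   # temporal constraint
}
ORDER BY DESC(?date)
"""

for row in g.query(query):
    print(row.item, row.title, row.date)
```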


Author(s):  
Pedro Tavares ◽  
José Lima ◽  
Pedro Costa ◽  
A. Paulo Moreira

Purpose: Streamlining automated processes is currently undertaken by developing optimization methods and algorithms for robotic manipulators. This paper aims to present a new approach to streamlining automated processes. The approach allows multiple robotic manipulators commonly found in industrial environments to handle different scenarios, thus providing a highly flexible solution for automated processes.
Design/methodology/approach: The developed system is based on a spatial discretization methodology capable of describing the surrounding environment of the robot, followed by a novel path-planning algorithm. Gazebo was the simulation engine chosen, and the robotic manipulator used was the Universal Robot 5 (UR5). The proposed system was tested using the premises of two robotic challenges: EuRoC and the Amazon Picking Challenge.
Findings: The developed system was able to identify and describe the influence of each joint in the Cartesian space, and it was possible to control multiple robotic manipulators safely regardless of any obstacles in a given scene.
Practical implications: The new system was tested in both real and simulated environments, and the data collected showed that it performed well in real-life scenarios such as EuRoC and the Amazon Picking Challenge.
Originality/value: The proposed approach can be valuable in the robotics field, with applications in various industrial scenarios, as it provides a flexible solution for path and motion planning of multiple robotic manipulators.
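As a toy illustration of planning over a discretized space (not the paper's own discretization methodology or planner), the sketch below runs A* over a small 2D occupancy grid; the grid, start and goal are made up.

```python
# Toy A* path planning over a discretized workspace (illustrative only).
from heapq import heappush, heappop

def astar(grid, start, goal):
    """A* over a 2D occupancy grid (0 = free cell, 1 = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, pos, path = heappop(frontier)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heappush(frontier, (cost + 1 + h((nr, nc)), cost + 1,
                                    (nr, nc), path + [(nr, nc)]))
    return None  # no collision-free path found

grid = [[0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 3)))
```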


Author(s):  
Snezhana Sulova ◽  
Boris Bankov

The impact of social networks on our lives keeps increasing because they provide content, generated and controlled by users, that is constantly evolving. They help us spread news, statements, ideas and comments very quickly. Social platforms are currently one of the richest sources of customer feedback on a variety of topics. One frequently discussed topic is resort and holiday villages and the tourist services offered there. Customer comments are valuable to both travel planners and tour operators. The accumulation of opinions in the web space is a prerequisite for applying appropriate tools for their computer processing and for extracting useful knowledge from them. When working with unstructured data such as social media messages, there is no universal text processing algorithm, because each social network and its resources have their own characteristics. In this article, we propose a new approach for the automated analysis of a static set of historical user messages about holiday and vacation resorts published on Twitter. The approach is based on natural language processing techniques and the application of machine learning methods. The experiments are conducted using the software product RapidMiner.
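The experiments in the article use RapidMiner; purely as an illustration of the same kind of pipeline (text preprocessing followed by a machine learning classifier), here is a rough scikit-learn sketch with a tiny, invented labelled sample.

```python
# Rough scikit-learn analogue of a tweet-classification pipeline (illustrative
# only; the labelled examples below are made up).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

tweets = [
    "Loved the beach resort, amazing staff and clean rooms",
    "Terrible service at the holiday village, never again",
    "Great pool and food, will come back next summer",
    "Overpriced rooms and rude reception",
]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2)),
    MultinomialNB(),
)
model.fit(tweets, labels)
print(model.predict(["The resort staff were friendly and helpful"]))
```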


2019
Author(s):  
Joël Legrand ◽  
Romain Gogdemir ◽  
Cédric Bousquet ◽  
Kevin Dalleau ◽  
Marie-Dominique Devignes ◽  
...  

Pharmacogenomics (PGx) studies how individual gene variations impact drug response phenotypes, which makes PGx knowledge a key component of precision medicine. A significant part of the state-of-the-art knowledge in PGx is accumulated in scientific publications, where it is hardly usable by humans or software. Natural language processing techniques have been developed, and are indeed employed, to guide experts curating this body of knowledge. However, existing works are limited by the absence of high-quality annotated corpora focusing on the domain. This absence restricts in particular the use of supervised machine learning approaches. This article introduces PGxCorpus, a manually annotated corpus designed for the automatic extraction of PGx relationships from text. It comprises 945 sentences from 911 PubMed abstracts, annotated with PGx entities of interest (mainly gene variations, genes, drugs and phenotypes) and relationships between them. We present the method used to annotate the texts consistently, and a baseline experiment that illustrates how this resource may be leveraged to synthesize and summarize PGx knowledge.
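To illustrate what such annotations enable, the sketch below shows one plausible in-memory representation of an annotated sentence with entity offsets and a relation; the field names, labels and example values are assumptions, not the corpus's actual file format.

```python
# Hypothetical in-memory representation of a PGx-style annotated sentence.
from dataclasses import dataclass

@dataclass
class Entity:
    label: str      # e.g. "GeneVariant", "Drug", "Phenotype"
    start: int      # character offset in the sentence
    end: int

@dataclass
class Relation:
    rel_type: str   # e.g. "influences"
    head: Entity
    tail: Entity

sentence = "CYP2D6 poor metabolizers show reduced response to tamoxifen."
variant = Entity("GeneVariant", 0, 24)   # "CYP2D6 poor metabolizers"
drug = Entity("Drug", 50, 59)            # "tamoxifen"
rel = Relation("influences", head=variant, tail=drug)

print(sentence[variant.start:variant.end], "->",
      sentence[drug.start:drug.end], ":", rel.rel_type)
```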


2021
Vol 20 (No. 4)
pp. 629-649
Author(s):  
Maha Thabet ◽  
Mehdi Ellouze ◽  
Mourad Zaied

Video concept detection means describing a video with semantic concepts that correspond to its content. The concepts help to retrieve the video quickly. These semantic concepts describe high-level elements that capture the key information present in the content. In recent years, many efforts have been made to automate this task because the manual solution is time-consuming. Nowadays, videos come with comments. Therefore, in addition to the content of the videos, the comments should be analyzed because they contain valuable data that help to retrieve videos. This paper focused especially on videos shared on social media. The specificity of these videos was the presence of a massive number of comments. This paper attempted to exploit the comments by extracting concepts from them, which would complement the research efforts that work only on the visual content. Natural language processing techniques were used to analyze the comments and to filter words so as to retain only those that could be considered concepts. The proposed approach was tested on YouTube videos. The results demonstrated that the proposed approach was able to extract accurate data and concepts from the comments that could be used to ease the retrieval of videos. The findings supported the research effort of working on the visual and audio contents of the videos.
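A much-simplified sketch of mining candidate concepts from comments (tokenize, drop stop words, keep frequent remaining terms); the comments and stop-word list are placeholders, and the paper's own filtering is considerably richer.

```python
# Simplified candidate-concept mining from video comments (illustrative only).
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "is", "this", "so", "and", "i", "it", "of", "to"}

comments = [
    "The drone footage of the volcano eruption is stunning",
    "Never seen a volcano eruption filmed so close",
    "Amazing drone shots, the lava flow looks unreal",
]

tokens = [w for c in comments for w in re.findall(r"[a-z]+", c.lower())
          if w not in STOP_WORDS]
# Keep terms mentioned in at least two comments as candidate concepts.
concepts = [w for w, n in Counter(tokens).most_common() if n >= 2]
print(concepts)   # e.g. ['volcano', 'eruption', 'drone']
```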


Author(s):  
Xinfeng Ye ◽  
Yuqian Lu

Manufacturers use cloud manufacturing platforms to offer their services. The literature has suggested a semantic web-based cloud manufacturing framework in which engineering knowledge is modeled using structured syntax. Translating engineering rules into semantic rules by hand is a painstaking and error-prone task. We present a scheme that treats converting engineering knowledge into semantic rules as a machine translation task and uses neural machine translation techniques to carry out the conversion.
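To make the translation framing concrete, the sketch below shows how a parallel training example might pair an engineering rule in natural language with a semantic-rule serialization; the example pair, the SWRL-like target syntax and the whitespace tokenization are illustrative placeholders, not the paper's dataset or model.

```python
# Hypothetical parallel example for treating rule conversion as translation.
parallel_examples = [
    (
        "If the hole diameter is below 3 mm, the part requires micro-drilling.",
        "Part(?p) ^ hasHole(?p, ?h) ^ diameter(?h, ?d) ^ swrlb:lessThan(?d, 3.0) "
        "-> requiresProcess(?p, MicroDrilling)",
    ),
]

def to_training_record(src, tgt):
    """Whitespace-tokenize both sides, as a stand-in for the subword
    tokenization a real neural machine translation model would use."""
    return {"source_tokens": src.split(), "target_tokens": tgt.split()}

records = [to_training_record(s, t) for s, t in parallel_examples]
print(records[0]["source_tokens"][:5], "->", records[0]["target_tokens"][:3])
```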


2014
Vol 39 (3)
pp. 315-330
Author(s):  
Thomas Scheff

This note links three hitherto separate subjects: role-taking, meditation, and theories of emotion, in order to conceptualize the makeup of the self. The idea of role-taking plays a central part in sociological theories of the self. Meditation implies the same process in terms of a deep self able to witness itself. Drama theories also depend upon a deep self that establishes a safe zone for resolving intense emotions. All three approaches imply both a creative deep self and the everyday self (ego), which is largely automated. The creativity of the deep self is illustrated with a real-life example: an extraordinary psychotherapy experiment that appears to have succeeded because it was based entirely on the intuitions of the therapist. At the other end from intuition, Virginia Woolf, in one of her novels, suggested three crucial points about automated thought: its incredible speed, role-taking, and, by implication, the presence of a deep self. This essay goes on to explain how the ego is repetitive to the extent that it becomes mostly, and in unusual cases completely, automated (as in most dreams and all hallucinations). The rapidity of ordinary discourse and thought usually means that it is superficial, leading to greater and greater dysfunction, and less and less emotion. This idea suggests a new approach to the basis of ‘mental illness’ and of modern alienation.


2018
Author(s):  
Gerardo Lagunes-García ◽  
Alejandro Rodríguez-González ◽  
Lucía Prieto-Santamaría ◽  
Eduardo P. García del Valle ◽  
Massimiliano Zanin ◽  
...  

Within the global endeavour of improving population health, one major challenge is the increasingly high cost associated with drug development. Drug repositioning, i.e. finding new uses for existing drugs, is a promising alternative; yet its effectiveness has hitherto been hindered by our limited knowledge about diseases and their relationships. In this paper, we present DISNET (disnet.ctb.upm.es), a web-based system designed to extract knowledge from signs and symptoms retrieved from medical databases, and to enable the creation of customisable disease networks. We here present the main features of the DISNET system. We describe how information on diseases and their phenotypic manifestations is extracted from Wikipedia, PubMed and Mayo Clinic; specifically, texts from these sources are processed through a combination of text mining and natural language processing techniques. We further present a validation of the processing performed by the system, and describe, with some simple use cases, how a user can interact with it and extract information that could be used for subsequent analyses.
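Purely to illustrate the flavour of symptom extraction from source texts, here is a toy dictionary-matching sketch; the symptom vocabulary and the input passage are invented, and the actual DISNET pipeline relies on proper text mining and NLP tooling rather than substring search.

```python
# Toy dictionary-based symptom extraction (illustrative only, not DISNET's pipeline).
SYMPTOM_VOCAB = {"fever", "cough", "fatigue", "headache", "shortness of breath"}

def extract_symptoms(passage, vocab=SYMPTOM_VOCAB):
    """Return vocabulary terms mentioned in a disease description."""
    text = passage.lower()
    return sorted(term for term in vocab if term in text)

passage = ("Common signs include persistent cough, low-grade fever and "
           "marked fatigue; some patients also report shortness of breath.")
print(extract_symptoms(passage))
# ['cough', 'fatigue', 'fever', 'shortness of breath']
```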


Author(s):  
Olli Mäkinen

This chapter deals with the influence of mediation in different kinds of virtual environments, e.g. virtual conferences, e-learning platforms, distance learning environments and surroundings, Internet Relay Chat (IRC), and various other user interfaces. Mediation is a process by which messages, discussion and behaviour become more and more conceptual and abstract, and it has an effect on our social being. As a result of mediation there is no first-hand experience of reality, everything is constructed, and in virtual reality we have receded a long way from real life. Mediation affects our capability to make independent ethical decisions. The same process can be discerned in all social and commercial practice that is rationalized by processing techniques or made virtual. Here mediation is studied from a phenomenological perspective. Quantification, modelling and regulation also describe aspects of mediation. This chapter is a review article and an opening in mediational ethics based on classical philosophy.


Author(s):  
Ramin Sabbagh ◽  
Farhad Ameri

Descriptions of the capabilities of manufacturing companies can be found in multiple locations, including company websites, legacy system databases, and ad hoc documents and spreadsheets. The capability descriptions are often represented in natural language. To unlock the value of unstructured capability information and learn from it, there is a need for advanced quantitative methods supported by machine learning and natural language processing techniques. This research proposes a multi-step unsupervised learning methodology using K-means clustering and topic modeling techniques in order to build clusters of suppliers based on their capabilities, extract and organize the manufacturing capability terminology, and discover nontrivial patterns in manufacturing capability corpora. The capability data is extracted either directly from the websites of manufacturing firms or from their profiles in e-sourcing portals and directories. The feature extraction and dimensionality reduction process in this work is supported by n-gram extraction and Latent Semantic Analysis (LSA). The proposed clustering method is validated experimentally on a dataset composed of 150 capability descriptions collected from web-based sourcing directories such as the Thomas Net directory for manufacturing companies. The results of the experiment show that the proposed method creates supplier clusters with high accuracy.
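A condensed scikit-learn sketch of the described pipeline (n-gram features, LSA via truncated SVD, then K-means clustering); the four capability descriptions below are invented stand-ins for the 150-document dataset, and the parameter choices are illustrative rather than those used in the paper.

```python
# Condensed sketch of the capability-clustering pipeline (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline

capabilities = [
    "CNC machining of aluminum and titanium parts, 5-axis milling",
    "Precision 5-axis CNC milling and turning for aerospace components",
    "Injection molding of thermoplastic enclosures, tooling design",
    "Custom plastic injection molding and mold tooling services",
]

pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), stop_words="english"),  # n-gram features
    TruncatedSVD(n_components=2),                               # LSA
    KMeans(n_clusters=2, n_init=10, random_state=0),            # supplier clusters
)
labels = pipeline.fit_predict(capabilities)
print(labels)   # e.g. [0 0 1 1]: machining vs. molding suppliers
```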

