METHOD OF DOMAIN ONTOLOGY AUTOMATED REPLENISHMENT FOR THE SUPPORT OF NEW TECHNICAL SOLUTIONS SYNTHESIS. PART I

Author(s):  
S. S. Vasiliev ◽  
D. M. Korobkin ◽  
S. A. Fomenkov

To solve the problem of information support for the synthesis of new technical solutions, a method of extracting structured data from an array of Russian-language patents is presented. The key features of an invention, such as the structural elements of the technical object and the relationships between them, serve as the information support. The data source is the main claim of a device patent. The unit of extraction is the Subject-Action-Object (SAO) semantic structure, which describes the constructive elements. The extraction method is based on shallow parsing and claim segmentation that take into account the specifics of patent writing: the often excessive length of the claim sentence and the specificity of patent language make it difficult to apply off-the-shelf tools for data extraction. The processing steps are: segmentation of the claim sentences; extraction of primary SAO structures; construction of the graph of the constructive elements of the invention; and integration of the data into the domain ontology. This article deals with the first two stages. Segmentation is carried out according to a number of heuristic rules, and several natural language processing tools are used to reduce analysis errors. The primary SAO elements are extracted using the valences of a predefined semantic group of verbs, as well as information about the type of the processed segment. The result of the work is a domain ontology that can be used to find alternative designs for nodes of a technical object. The second part of the article covers the algorithm for constructing a graph of the structural elements of an individual technical object, an assessment of the system's effectiveness, and the organization of the ontology.
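The SAO extraction step can be illustrated with a minimal sketch. The segment representation, the part-of-speech tags, and the verb list below are hypothetical stand-ins for the paper's predefined semantic verb groups, not the authors' actual pipeline:

```python
# Minimal sketch of Subject-Action-Object (SAO) extraction from a
# pre-segmented patent claim fragment. A segment is modeled as a list
# of (token, POS-tag) pairs; tags and the verb group are illustrative.

SEGMENT = [("housing", "NOUN"), ("contains", "VERB"), ("rotor", "NOUN")]

# Predefined semantic group of verbs linking structural elements.
LINK_VERBS = {"contains", "connects", "comprises", "holds"}

def extract_sao(segment):
    """Return (subject, action, object) triples around known link verbs."""
    triples = []
    for i, (tok, tag) in enumerate(segment):
        if tag == "VERB" and tok in LINK_VERBS:
            # Nearest noun to the left is the subject, to the right the object.
            subj = next((t for t, g in reversed(segment[:i]) if g == "NOUN"), None)
            obj = next((t for t, g in segment[i + 1:] if g == "NOUN"), None)
            if subj and obj:
                triples.append((subj, tok, obj))
    return triples

print(extract_sao(SEGMENT))  # [('housing', 'contains', 'rotor')]
```

In the paper's pipeline, such triples would then feed the graph of constructive elements built in the later stages.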

Author(s):  
Timnit Gebru

This chapter discusses the role of race and gender in artificial intelligence (AI). The rapid permeation of AI into society has not been accompanied by a thorough investigation of the sociopolitical issues that cause certain groups of people to be harmed rather than advantaged by it. For instance, recent studies have shown that commercial automated facial analysis systems have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. Moreover, a 2016 ProPublica investigation uncovered that machine learning–based tools that assess crime recidivism rates in the United States are biased against African Americans. Other studies show that natural language–processing tools trained on news articles exhibit societal biases. While many technical solutions have been proposed to alleviate bias in machine learning systems, a holistic and multifaceted approach must be taken. This includes standardization bodies determining what types of systems can be used in which scenarios, making sure that automated decision tools are created by people from diverse backgrounds, and understanding the historical and political factors that disadvantage certain groups who are subjected to these tools.


2020 ◽  
Vol 45 (4) ◽  
pp. 106-114
Author(s):  
Y. Kim ◽  
◽  
A. Yermekbayeva ◽

This work is devoted to the speech impact of advertising texts, in other words, the language of advertising, whose purpose is to attract the attention of a potential consumer by making the message as memorable and unusual as possible: lively, catchy, colorful, and attractive to a potential listener or buyer. The significance of the work lies in the fact that, while analyzing the basic structural elements of the advertising message (the slogan and the main body), the author identifies the main speech techniques by which advertising texts exert influence: expressive means, including metaphors, epithets, metonymy, figures of speech, and tropes; various grammatical forms and other means of influence: nominative, one-part, and verbal sentences, comparative and superlative adjectives, rhymes, imperative verbs, adverbs, and lexical repetition. Using specific examples of advertising slogans, the author shows that these speech means contribute to increased demand for the advertised product or service. The study confirms the hypothesis put forward at its outset: if speech influence is used skillfully, that is, if words are chosen whose harmonious combination implants in a person's subconscious the information conveyed by the manufacturer through high-quality advertising, then such an advertising text can become the key to commercial success. The work is of great practical importance: the material presented in it can be used by students to improve the culture of speech and stylistically differentiated speech, as well as by school teachers as methodological material for the Russian language when studying the sections «Vocabulary» and «Stylistics».


2018 ◽  
Vol 25 (6) ◽  
pp. 726-733
Author(s):  
Maria S. Karyaeva ◽  
Pavel I. Braslavski ◽  
Valery A. Sokolov

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. The idea of word2vec is based on a simple rule: two words are more similar if they occur in similar contexts. Each word is represented as a vector, so vectors with close coordinates can be interpreted as similar words. This makes it possible to establish semantic relations (synonymy, hypernymy, hyponymy, and others) by automatic extraction. Manual extraction of semantic relations is a time-consuming and biased task, requiring a large amount of time and the help of experts. Unfortunately, the word2vec model produces an associative list of words that does not consist of related words only. In this paper, we propose additional criteria that may help to solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, can improve results on the task of extracting semantic relations for the Russian language using word embeddings. In the experiments, a word2vec model trained on the Flibusta collection is used, with pairs from Wiktionary as examples of semantic relations. Semantically related words are applicable to thesauri, ontologies, and intelligent systems for natural language processing.
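The filtering idea can be sketched as follows. The toy vectors, frequencies, and thresholds below are illustrative assumptions; a real setup would rank neighbors from a trained word2vec model and tune the cutoffs:

```python
import math

# Toy word vectors and corpus frequencies standing in for a trained model.
VECS = {
    "cat":    [0.90, 0.10, 0.00],
    "feline": [0.85, 0.15, 0.05],
    "table":  [0.10, 0.90, 0.20],
}
FREQ = {"cat": 1200, "feline": 900, "table": 3000}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def related(word, min_sim=0.9, max_freq_ratio=5.0):
    """Rank candidates by cosine similarity (list position), then filter
    by a word-frequency ratio; both thresholds are illustrative."""
    cands = sorted(
        ((cosine(VECS[word], v), w) for w, v in VECS.items() if w != word),
        reverse=True)
    out = []
    for sim, w in cands:
        ratio = max(FREQ[word], FREQ[w]) / min(FREQ[word], FREQ[w])
        if sim >= min_sim and ratio <= max_freq_ratio:
            out.append(w)
    return out

print(related("cat"))  # ['feline'] - 'table' is filtered out by low similarity
```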


Author(s):  
Berit I. Helgheim ◽  
Rui Maia ◽  
Joao C. Ferreira ◽  
Ana Lucia Martins

Medicine is a knowledge area continuously experiencing change. Every day, discoveries and procedures are tested with the goal of providing improved service and quality of life to patients. With the evolution of computer science, multiple areas have experienced an increase in productivity through the implementation of new technical solutions, and medicine is no exception. Providing healthcare services in the future will involve the storage and manipulation of large volumes of data (big data) from medical records, requiring the integration of different data sources for a multitude of purposes, such as prediction, prevention, personalization, participation, and digitalization. Data integration and data sharing will be essential to achieve these goals. Our work focuses on the development of a framework process for integrating data from different sources to increase its usability potential. We integrated data from an internal hospital database, external data, and structured data resulting from natural language processing (NLP) applied to electronic medical records. An extract-transform-load (ETL) process was used to merge the different data sources into a single one, allowing more effective use of these data and, eventually, contributing to more efficient use of the available resources.
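The merge step of such an ETL process can be sketched minimally. The record schema, field names, and data below are invented for illustration; a real pipeline would also validate and transform each source into a common schema before loading:

```python
# Sketch of the ETL "merge" idea: records from an internal hospital
# database, an external source, and NLP output on medical notes are
# combined into one record per patient. All fields are illustrative.

hospital_db = [{"patient_id": 1, "age": 64}]
external    = [{"patient_id": 1, "ward": "cardiology"}]
nlp_output  = [{"patient_id": 1, "extracted_diagnosis": "hypertension"}]

def etl_merge(*sources):
    """Load all sources into a single store keyed by patient_id,
    merging fields from each source into one record."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["patient_id"], {}).update(record)
    return merged

print(etl_merge(hospital_db, external, nlp_output)[1])
```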


Author(s):  
Lauri Karttunen

The article introduces the basic concepts of finite-state language processing: regular languages and relations, finite-state automata, and regular expressions. Many basic steps in language processing, ranging from tokenization to phonological and morphological analysis, disambiguation, spelling correction, and shallow parsing, can be performed efficiently by means of finite-state transducers. The article discusses examples of finite-state languages and relations. Finite-state networks can represent only a subset of all possible languages and relations; that is, only some languages are finite-state languages. Furthermore, the article introduces two types of complex regular expressions that have many linguistic applications: restriction and replacement. Finally, it discusses the properties of finite-state automata. Three important properties of a network are being epsilon-free, deterministic, and minimal. If a network encodes a regular language and is epsilon-free, deterministic, and minimal, it is guaranteed to be the best encoding for that language.
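The notions of a regular language and a deterministic automaton can be made concrete with a tiny example, here the classic regular language of strings over {a, b} containing an even number of a's (a two-state machine that is epsilon-free, deterministic, and minimal):

```python
# A deterministic finite-state automaton (DFA) as a transition table:
# accepts strings over {a, b} with an even number of a's.
DFA = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd",  "a"): "even",
    ("odd",  "b"): "odd",
}

def accepts(word, start="even", final=("even",)):
    """Run the DFA over `word`; accept if it ends in a final state."""
    state = start
    for ch in word:
        state = DFA[(state, ch)]
    return state in final

print(accepts("abba"))  # True: two a's
print(accepts("ab"))    # False: one a
```

Transducers generalize this picture from recognizing a language to mapping between two regular languages, which is what makes them useful for morphology and replacement rules.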


2020 ◽  
Vol 38 (02) ◽  
Author(s):  
TẠ DUY CÔNG CHIẾN

Question answering systems have been applied to many different fields in recent years, such as education, business, and surveys. The purpose of these systems is to automatically answer users' questions or queries about some problem. This paper introduces a question answering system built on a domain-specific ontology. This ontology, which contains the data and vocabularies related to the computing domain, is built from text documents of the ACM Digital Library. Consequently, the system only answers questions pertaining to information technology domains such as databases, networks, machine learning, etc. We use natural language processing methodologies and the domain ontology to build this system. To increase performance, we use a graph database to store the computing ontology and apply a NoSQL database for querying its data.
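The lookup step of such a system can be sketched as follows. The described system stores the ontology in a graph database; here a plain list of (subject, relation, object) triples stands in, and the entities and relations are illustrative, not taken from the paper:

```python
# Sketch of the answer-lookup step in an ontology-backed QA system.
# The triples below are an illustrative fragment of a computing ontology.
ONTOLOGY = [
    ("B-tree",          "is_a",    "index structure"),
    ("B-tree",          "used_in", "database"),
    ("backpropagation", "is_a",    "training algorithm"),
    ("backpropagation", "used_in", "machine learning"),
]

def answer(entity, relation):
    """Return all objects linked to `entity` by `relation`."""
    return [o for s, r, o in ONTOLOGY if s == entity and r == relation]

# A parsed question like "What is a B-tree?" maps to (entity, is_a).
print(answer("B-tree", "is_a"))  # ['index structure']
```

In the full system, the NLP front end would map a free-text question to such an (entity, relation) pair, and the graph database would answer the traversal query.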


2013 ◽  
Vol 778 ◽  
pp. 903-910 ◽  
Author(s):  
Krzysztof Ałykow ◽  
Magdalena Napiórkowska-Ałykow

In this article the authors present the monumental rafter framing of a baroque church in Nowy Kościół, Lower Silesia, Poland. The rafter framing was built in the 18th century and was repaired in the middle of the 19th century by adding new structural elements. The authors analyzed the original construction and the reinforced construction from the 19th century and found extensive deterioration of particular elements. In the presented example, the rafter framing required immediate renovation on account of its very bad technical state. This condition resulted from damage to structural elements during ineffective repair attempts made in the middle of the 19th century, as well as from natural ageing and the deterioration of materials. The authors suggest renovating the structural elements by adding new supporting elements to strengthen them, which will change how the reinforced elements carry load. They suggest renovation in such a form that the monumental character of the rafter framing is preserved.


Author(s):  
V.N. Skosyrev ◽  
R.O. Stepanov ◽  
N.A. Golov ◽  
V.P. Savchenko ◽  
V.A. Usachev

The existing radar and radio navigation facilities sometimes do not satisfy the increased requirements for the accuracy, efficiency, and reliability of information support for navigation when organizing movement in the rough waters of the Arctic region under difficult climatic and meteorological conditions. This article offers a comprehensive approach to the information support of navigation based on technical solutions that significantly increase the capabilities of navigation tools in the rough waters of the Arctic zone. The proposed approach to creating fundamentally new high-precision information tools for a wide range of new tasks in the Arctic zone, based on radar, provides a higher class of accuracy and functionality than the tools currently used. The technical requirements for radar facilities are defined, and the use of a new generation of highly informative multifunctional coastal radars in combination with new mobile pilotage terminals is proposed. Applying the proposed technical solutions and principles of building navigation systems will significantly increase the safety and efficiency of navigation in waters where organizing navigation is especially complex. The proposed approaches will support tasks such as: monitoring of air, surface, and ground space; local navigation systems for the safe movement of ships; helicopter flights, including landing on offshore drilling platforms and ground airfields; and mooring of vessels to drilling platforms and terminal berths. They also support monitoring and dispatching of ship traffic in ports and in the area of responsibility of terminals, including monitoring the position of ships at anchorage. In addition, they will simplify high-precision operational assessment of the ice and weather situation in the radar's area of responsibility, the protection of offshore drilling platforms and onshore terminal territories, and information support for protection against potential terrorist threats.


2020 ◽  
Vol 27 (1) ◽  
Author(s):  
MK Aregbesola ◽  
RA Ganiyu ◽  
SO Olabiyisi ◽  
EO Omidiora

The concept of automated grammar evaluation of natural language texts has attracted significant interest in the natural language processing community. It is the examination of natural language text for grammatical accuracy using computer software. The current work is a comparative study of the deep and shallow parsing techniques that have been applied to lexical analysis and grammaticality evaluation of natural language texts. The comparative analysis was based on data gathered from numerous related works. Shallow parsing using induced grammars was examined first, along with its two main sub-categories: probabilistic statistical parsers and the connectionist approach using neural networks. Deep parsing using handcrafted grammar was examined subsequently, along with several of its subcategories, including Transformational Grammars, Feature Based Grammars, Lexical Functional Grammar (LFG), Definite Clause Grammar (DCG), Property Grammar (PG), Categorial Grammar (CG), Generalized Phrase Structure Grammar (GPSG), and Head-driven Phrase Structure Grammar (HPSG). Based on facts gathered from the literature on these formalisms, a comparative analysis of the deep and shallow parsing techniques was performed. It showed, among other things, that the shallow parsing approach is usually domain dependent, is influenced by sentence length and lexical frequency, and employs machine learning to induce grammar rules, whereas deep parsing approaches are not domain dependent, are not influenced by sentence length or lexical frequency, and rely on precisely spelled-out linguistic rules. Deep parsing proved to be the more labour-intensive approach, while induced grammar rules were usually faster, with reliability increasing with the size, accuracy, and coverage of the training data. The shallow parsing approach has gained immense popularity owing to the availability of large corpora for different languages, and has therefore become the most widely adopted approach in recent times.

Keywords: Grammaticality, Natural language processing, Deep parsing, Shallow parsing, Handcrafted grammar, Precision grammar, Induced grammar, Automated scoring, Computational linguistics, Comparative study.
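The contrast between the two families can be illustrated with a minimal shallow parse. The chunking pattern and tags below are a simplified illustration, not any of the surveyed formalisms: a shallow parser groups token runs into flat chunks, whereas a deep parser would build a full syntactic tree from handcrafted grammar rules:

```python
# Illustrative shallow (chunk) parse: group determiner/adjective/noun
# runs into flat noun-phrase chunks using a fixed POS pattern.
TAGGED = [("the", "DET"), ("tall", "ADJ"), ("boy", "NOUN"),
          ("runs", "VERB"), ("home", "NOUN")]

def np_chunks(tagged):
    """Collect maximal runs of DET/ADJ/NOUN tokens as NP chunks."""
    chunks, current = [], []
    for tok, tag in tagged:
        if tag in {"DET", "ADJ", "NOUN"}:
            current.append(tok)
        elif current:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

print(np_chunks(TAGGED))  # ['the tall boy', 'home']
```

A statistical shallow parser would learn such patterns from a corpus, which is exactly the data dependence (domain, sentence length, lexical frequency) the comparative analysis highlights.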

