A New Statistical and Verbal-Semantic Approach to Pattern Extraction in Text Mining Applications

2019 ◽  
Vol 22 (3) ◽  
Author(s):  
Dildre Georgiana Vasques ◽  
Paulo Sérgio Martins ◽  
Solange Oliveira Rezende

The discovery of knowledge in textual databases is an approach that seeks implicit relationships between different concepts in different documents written in natural language, in order to identify new useful knowledge. To assist in this process, the approach can draw on Text Mining techniques. Despite all the progress made, researchers in this area must still deal with the large number of false relationships generated by most of the available processes. A statistical and verbal-semantic approach that supports the understanding of the logic behind relationships may bridge this gap. Thus, the objective of this work is to support the user in identifying implicit relationships between concepts present in different texts, considering the causal relationships between concepts in the texts. To this end, this work proposes a hybrid approach for the discovery of implicit knowledge present in a text corpus, using analysis based on association rules together with metrics from complex networks and verbal semantics. Through a case study, a set of texts from alternative medicine was selected, and the different extractions showed that the proposed approach facilitates the user's identification of implicit knowledge.
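The association-rule component of such a hybrid approach can be sketched in a few lines: each document is reduced to its set of extracted concepts, and rules between concept pairs are kept only when their support and confidence clear minimum thresholds. The corpus, concepts, and thresholds below are invented for illustration and are not from the study.

```python
from itertools import combinations

# Toy corpus: each "document" reduced to its set of extracted concepts
# (illustrative alternative-medicine concepts, not the study's data).
docs = [
    {"ginger", "nausea", "pregnancy"},
    {"ginger", "nausea", "chemotherapy"},
    {"chamomile", "sleep", "anxiety"},
    {"ginger", "inflammation"},
    {"chamomile", "anxiety"},
]

min_support = 2 / len(docs)   # a pair must appear in at least 2 documents
min_confidence = 0.6

# Count single concepts and concept pairs across documents.
item_count = {}
pair_count = {}
for d in docs:
    for c in d:
        item_count[c] = item_count.get(c, 0) + 1
    for a, b in combinations(sorted(d), 2):
        pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

# Emit association rules a -> b whose support and confidence pass the thresholds.
rules = []
for (a, b), n in pair_count.items():
    support = n / len(docs)
    if support < min_support:
        continue
    for ante, cons in ((a, b), (b, a)):
        confidence = n / item_count[ante]
        if confidence >= min_confidence:
            rules.append((ante, cons, round(support, 2), round(confidence, 2)))

for r in sorted(rules):
    print(r)
```

In the full approach, rules like these would then be filtered and ranked using complex-network metrics and verbal semantics; this sketch covers only the statistical extraction step.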

2021 ◽  
Author(s):  
Edgar Bernier ◽  
Sebastien Perrier

Maximizing operational efficiency is a critical challenge in oil and gas production, particularly for mature assets in the North Sea. The causes of production shortfalls are numerous and distributed across a wide range of disciplines, both technical and non-technical. The primary reason to apply Natural Language Processing (NLP) and text mining to several years of shortfall history was the need to efficiently support digital-transformation use-case screenings and value-mapping exercises through a proper mapping of the issues faced. This mapping also contributed to reflection on operational surveillance and maintenance strategies to reduce production shortfalls. This paper presents a methodology in which the historical records of descriptions, comments, and investigation results regarding production shortfalls are revisited, adding to existing shortfall classifications and statistics in two domains in particular: richer first root-cause mapping, and a series of advanced visualizations and analytics. The methodology uses natural-language pre-processing techniques combined with keyword-based text-mining and classification techniques. The limitations associated with the size and quality of these language datasets are described and the results discussed, highlighting the value of reaching a high level of data granularity while defeating the 'more information, less attention' bias. At the same time, visual designs are introduced to display efficiently the different dimensions of this data (impact, frequency evolution through time, location in terms of field and affected systems, root causes, and other cause-related categories).
The ambition in the domain of visualization is to create User Experience-friendly shortfall analytics that can be displayed in smart rooms and collaborative rooms, where display efficiency is highest when user interactions are kept minimal, the number of charts is limited, and multiple dimensions do not collide. The paper is based on several applications across the North Sea. This case study and the associated lessons learned regarding natural language processing and text mining applied to similarly concise technical data answer several frequently asked questions about the value of the textual data records gathered over years.
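The keyword-based classification step described above can be illustrated with a minimal sketch: a lexicon maps root-cause categories to trigger terms, and each free-text shortfall record is assigned every category whose keywords it contains. The categories, keywords, and records here are invented for illustration, not the operator's actual taxonomy or data.

```python
from collections import Counter

# Illustrative keyword lexicon mapping root-cause categories to trigger terms;
# both the categories and the shortfall records below are invented examples.
lexicon = {
    "rotating equipment": ["compressor", "pump", "turbine", "vibration"],
    "instrumentation": ["transmitter", "sensor", "trip signal"],
    "well performance": ["scale", "slugging", "water cut", "sand"],
}

records = [
    "Export compressor trip due to high vibration on bearing",
    "False trip signal from level transmitter on separator",
    "Production choked back, increasing water cut in well A-12",
]

def classify(text):
    """Return every category whose keywords appear in the record."""
    t = text.lower()
    return [cat for cat, kws in lexicon.items()
            if any(kw in t for kw in kws)] or ["unclassified"]

# Aggregate per-category frequencies, the raw material for the visual analytics.
counts = Counter(cat for r in records for cat in classify(r))
print(counts)
```

Real records would first pass through the NLP pre-processing the paper mentions (normalization, spelling repair, abbreviation expansion) before this matching step.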


Discourse ◽  
2020 ◽  
Vol 6 (3) ◽  
pp. 109-117
Author(s):  
O. M. Polyakov

Introduction. The article continues the series of publications on the linguistics of relations (hereinafter R-linguistics) and is devoted to an introduction to the logic of natural language in relation to the approach considered in the series. The problem of natural language logic remains relevant, since this logic differs significantly from traditional mathematical logic; moreover, with the appearance of artificial intelligence systems, the importance of this problem only increases. The article analyzes logical problems that prevent the application of classical logic methods to natural languages. This is possible because R-linguistics forms the semantics of a language in the form of world-model structures in which language sentences are interpreted. Methodology and sources. The results obtained in the previous parts of the series are used as research tools. To develop the necessary mathematical representations in the field of logic and semantics, the previously formulated concept of the interpretation operator is used. Results and discussion. The problems that arise when studying the logic of natural language in the framework of R-linguistics are analyzed. These issues are discussed in three aspects: the logical aspect itself, the linguistic aspect, and the aspect of correlation with reality. A very general approach to language semantics is considered and semantic axioms of the language are formulated. The problems of the language and its logic related to this most general view of semantics are shown. Conclusion. It is shown that the application of mathematical logic, regardless of its type, to the study of natural language logic faces significant problems. This is a consequence of the inconsistency of existing approaches with the world model; but it is precisely coherence with the world model that allows us to build a new logical approach. Matching with the model means a semantic approach to logic.
Even the most general view of semantics allows us to formulate important results about the properties of languages that lack meaning. The simplest examples of semantic interpretation of traditional logic demonstrate its semantic problems (primarily related to negation).
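The core idea of interpreting sentences in a world-model structure can be made concrete with a toy example: a sentence is true iff the model contains the corresponding fact. The model, the sentence encoding, and the closed-world treatment of negation below are all invented for illustration and are not the article's formal apparatus; the negation rule in particular hints at why negation is the first place where model-based semantics diverges from classical logic.

```python
# Toy world model: a set of (predicate, entity) facts, invented for illustration.
model = {("bird", "tweety"), ("flies", "tweety"), ("bird", "opus")}

def interpret(sentence, model):
    """Evaluate a plain (predicate, entity) fact or a ("not", sentence) form."""
    if sentence[0] == "not":
        # Closed-world negation: true when the fact is absent from the model.
        # Classical logic instead needs the negation to be explicitly derivable,
        # which is one source of the semantic problems the article notes.
        return not interpret(sentence[1], model)
    return sentence in model

print(interpret(("flies", "tweety"), model))          # fact present in the model
print(interpret(("not", ("flies", "opus")), model))   # fact absent from the model
```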


2021 ◽  
pp. 1-13
Author(s):  
Lamiae Benhayoun ◽  
Daniel Lang

BACKGROUND: The renewed advent of Artificial Intelligence (AI) is inducing profound changes in the classic categories of technology professions and is creating the need for new specific skills. OBJECTIVE: Identify the gaps in skills between academic training on AI in French engineering and business schools and the requirements of the labour market. METHOD: Extraction of AI training contents from the schools' websites and scraping of a job-advertisement website, followed by analysis based on a text mining approach with Python code for Natural Language Processing. RESULTS: Categorization of occupations related to AI, and characterization of three classes of skills for the AI market: technical, soft, and interdisciplinary. Skill gaps concern some professional certifications, the mastery of specific tools, research abilities, and awareness of the ethical and regulatory dimensions of AI. CONCLUSIONS: A deep analysis using algorithms for Natural Language Processing, with results that provide a better understanding of the AI capability components at the individual and organizational levels. This study can help shape educational programs to respond to the AI market requirements.
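Once skill terms have been extracted and normalized from both sources, the gap analysis itself reduces to set operations over the two vocabularies. The term lists below are invented examples, not the actual scraped curricula or job advertisements from the study.

```python
# Hedged sketch of the gap analysis: both term sets are invented examples
# standing in for normalized skill terms mined from the two text sources.
curricula_terms = {"machine learning", "deep learning", "python", "statistics",
                   "computer vision", "nlp"}
job_ad_terms = {"machine learning", "python", "docker", "mlops",
                "gdpr compliance", "aws certification", "nlp"}

# Skills the labour market asks for that the training contents never mention.
gap = sorted(job_ad_terms - curricula_terms)
# Skills taught but not requested in the advertisements (potential over-supply).
surplus = sorted(curricula_terms - job_ad_terms)

print("market-side gap:", gap)
print("training-side surplus:", surplus)
```

In practice the hard work lies upstream, in extracting and normalizing the terms with NLP so that the two vocabularies are comparable; the set difference is the easy final step.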


2021 ◽  
Vol 26 (4) ◽  
Author(s):  
Alvaro Veizaga ◽  
Mauricio Alferez ◽  
Damiano Torre ◽  
Mehrdad Sabetzadeh ◽  
Lionel Briand

Natural language (NL) is pervasive in software requirements specifications (SRSs). However, despite its popularity and widespread use, NL is highly prone to quality issues such as vagueness, ambiguity, and incompleteness. Controlled natural languages (CNLs) have been proposed as a way to prevent quality problems in requirements documents while maintaining the flexibility to write and communicate requirements in an intuitive and universally understood manner. In collaboration with an industrial partner from the financial domain, we systematically develop and evaluate a CNL, named Rimay, aimed at helping analysts write functional requirements. We rely on Grounded Theory for building Rimay and follow well-known guidelines for conducting and reporting industrial case study research. Our main contributions are: (1) a qualitative methodology to systematically define a CNL for functional requirements; this methodology is intended to be general for use across information-system domains; (2) a CNL grammar to represent functional requirements; this grammar is derived from our experience in the financial domain but should be applicable, possibly with adaptations, to other information-system domains; and (3) an empirical evaluation of our CNL (Rimay) through an industrial case study. Our contributions draw on 15 representative SRSs, collectively containing 3215 NL requirement statements from the financial domain. Our evaluation shows that Rimay is expressive enough to capture, on average, 88% (405 out of 460) of the NL requirement statements in four previously unseen SRSs from the financial domain.
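To make the idea of a CNL concrete, a controlled requirement template can be checked mechanically: a sentence either fits the restricted grammar or is rejected. The pattern below is a toy "actor shall action object [when condition]" template written as a regular expression; it is an illustration of the CNL concept only, not Rimay's actual grammar, which is far richer.

```python
import re

# Toy controlled-requirement template, invented for illustration:
#   "The <actor> shall <action> <object> [when <condition>]."
PATTERN = re.compile(
    r"^The (?P<actor>[A-Za-z ]+) shall (?P<action>[a-z]+) "
    r"(?P<object>[a-z A-Z-]+?)( when (?P<condition>[a-z A-Z-]+))?\.$"
)

def conforms(req):
    """True when the requirement sentence fits the controlled template."""
    return PATTERN.match(req) is not None

reqs = [
    "The system shall send a confirmation email when the payment is approved.",
    "Confirmation emails would be nice.",   # vague free-form NL: rejected
]
for r in reqs:
    print(conforms(r), "-", r)
```

A real CNL toolchain would use a proper grammar with editor support rather than a single regex, but the conformance check shown is the mechanism by which a CNL keeps vagueness and ambiguity out of requirements.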


2020 ◽  
Vol 44 (12) ◽  
Author(s):  
Ishita Dasgupta ◽  
Demi Guo ◽  
Samuel J. Gershman ◽  
Noah D. Goodman

2021 ◽  
pp. 147332502199086
Author(s):  
Stéphanie Wahab ◽  
Gita R Mehrotra ◽  
Kelly E Myers

Expediency, efficiency, and rapid production within compressed time frames represent markers of research and scholarship within the neoliberal academe. Scholars who wish to resist these practices of knowledge production have articulated the need for Slow scholarship: a slower pace that makes room for thinking, creativity, and useful knowledge. While these calls are important for drawing attention to the costs and problems of the neoliberal academy, many scholars have moved beyond "slow" as referring solely to pace and duration, calling instead for different conceptualizations of time, space, and knowing. Guided by post-structural feminisms, we engaged in a research project that moved at the pace of trust in the integrity of our ideas and relationships. Our case study aimed to better understand the ways macro forces such as neoliberalism, criminalization, and professionalization shape domestic violence work. This article discusses our praxis of Slow scholarship by showcasing four key markers of Slow scholarship in our research: time reimagined; a relational ontology; moving inside and towards complexity; and embodiment. We discuss how Slow scholarship complicates how we understand constructs of productivity and knowledge production, and map the ways Slow scholarship offers a praxis of resistance for generating power from the epistemic margins within social work and the neoliberal academy.

