Clustering Techniques for Rule Extraction from Unstructured Text Fragments

Author(s):  
A. Clark ◽  
D. Filev
Author(s):  
SungKu Kang ◽  
Lalit Patil ◽  
Arvind Rangarajan ◽  
Abha Moitra ◽  
Dean Robinson ◽  
...  

Manufacturing companies maintain manufacturing knowledge primarily as unstructured text. To facilitate formal use of such knowledge, previous efforts have utilized natural language processing (NLP) to classify manufacturing documents or extract manufacturing concepts/relations. However, extracting more complex knowledge, such as manufacturing rules, has been elusive due to the lack of methods to resolve ambiguities. Specifically, standard NLP techniques do not address domain-specific ambiguities that arise from manufacturing-specific meanings implicit in the text. To address this important gap, we propose an ambiguity resolution method that utilizes a domain ontology as the mechanism to incorporate the domain context. We demonstrate its feasibility by extending our previously implemented manufacturing rule extraction framework. The effectiveness of the method is demonstrated by resolving all the domain-specific ambiguities in the dataset and by an improvement in correct detection of rules to 70% (increased by about 13%). We expect that this work will contribute to the adoption of semantics-based technology in the manufacturing field by enabling the extraction of precise formal knowledge from text.


2020 ◽  
Author(s):  
Andrea Giani ◽  
de Souza Patricia Borges ◽  
Stefania Bartoletti ◽  
Flavio Morselli ◽  
Andrea Conti ◽  
...  

2019 ◽  
Vol 7 (3) ◽  
pp. 50-54
Author(s):  
N. Thilagavathi ◽  
Christy Wood ◽  
V. Hemalakshumi ◽  
V. Mathumiithaa

Author(s):  
Wing Chiu Tam ◽  
Osei Poku ◽  
R. D. (Shawn) Blanton

Systematic defects due to design-process interactions are a dominant component of integrated circuit (IC) yield loss in nano-scaled technologies. Test structures do not adequately represent the product in terms of feature diversity and feature volume, and therefore are unable to identify all the systematic defects that affect the product. This paper describes a method that uses diagnosis to identify layout features that do not yield as expected. Specifically, clustering techniques are applied to layout snippets of diagnosis-implicated regions from (ideally) a statistically significant number of IC failures to identify feature commonalities. Experiments involving an industrial chip demonstrate the identification of possible systematic yield loss due to lithographic hotspots.


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1028
Author(s):  
Silvia Corigliano ◽  
Federico Rosato ◽  
Carla Ortiz Dominguez ◽  
Marco Merlo

The scientific community is active in developing new models and methods to help reach the ambitious target set by UN SDG 7: universal access to electricity by 2030. Efficient planning of distribution networks is a complex and multivariate task, which is usually split into multiple subproblems to reduce the number of variables. The present work addresses the problem of optimal secondary substation siting by means of different clustering techniques. In contrast with the majority of approaches found in the literature, which are devoted to the planning of MV grids in already electrified urban areas, this work focuses on greenfield planning in rural areas. The K-means algorithm, hierarchical agglomerative clustering, and a method based on optimal weighted tree partitioning are adapted to the problem and run on two real case studies with different population densities. The algorithms are compared in terms of different indicators useful to assess the feasibility of the solutions found. The algorithms have proven to be effective in addressing some of the crucial aspects of substation siting and to constitute relevant improvements to the classic K-means approach found in the literature. However, it is found that it is very challenging to reconcile an acceptable geographical span of the area served by a single substation with a substation power high enough to justify the installation when the load density is very low. In other words, well-known standards adopted in industrialized countries do not fit with developing countries’ requirements.
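
As a rough illustration of the classic K-means baseline that the abstract above mentions, the following minimal sketch groups load points and reads each cluster centroid as a candidate substation site. It is not the authors' implementation: the function name `kmeans`, the coordinates, and the choice of two clusters are all hypothetical, and it ignores load weighting, feeder length limits, and the tree-partitioning variant.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain (unweighted) K-means on 2-D points; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct input points as initial centroids
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, new_assignment) if a == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_assignment == assignment and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        assignment, centroids = new_assignment, new_centroids
    return centroids, assignment


# Hypothetical household coordinates (km) forming two separated villages.
households = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sites, labels = kmeans(households, k=2)
```

In this toy setting each village ends up in its own cluster, and `sites` holds one centroid per village. The difficulty the abstract describes appears when load density is low: fewer clusters widen the area each substation must serve, while more clusters leave each substation with too little load to justify it.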

