An Automatic and General Framework for Domain-Specific Knowledge Bases Extracting

Author(s):  
Kejun Deng ◽  
Hongjie Fan ◽  
Junfei Liu
2004 ◽  
Vol 13 (03) ◽  
pp. 721-738 ◽  
Author(s):  
XIAOYING GAO ◽  
MENGJIE ZHANG

This paper describes a learning/adaptive approach to automatically building knowledge bases for information extraction from text-based Web pages. A frame-based representation is introduced to represent domain knowledge as knowledge unit frames. A frame learning algorithm is developed to automatically learn knowledge unit frames from training examples. Some training examples can be obtained by automatically parsing a number of tabular Web pages in the same domain, which greatly reduces the amount of time-consuming manual work. This approach was investigated on ten Web sites of real estate advertisements and car advertisements, and nearly all the information was successfully extracted with very few false alarms. These results suggest that both the knowledge unit frame representation and the frame learning algorithm work well, that domain-specific knowledge bases can be learned from training examples, and that the domain-specific knowledge base can be used for information extraction from flexible text-based semi-structured Web pages on multiple Web sites. The investigation of the knowledge representation on five other domains suggests that this approach can be easily applied to other domains by simply changing the training examples.


2001 ◽  
Vol 10 (01n02) ◽  
pp. 65-86 ◽  
Author(s):  
DAN I. MOLDOVAN ◽  
ROXANA C. GÎRJU

It is widely accepted that more knowledge means more intelligence. In many knowledge-intensive applications, it is necessary to have extensive domain-specific knowledge in addition to general-purpose knowledge bases. This paper presents a methodology for discovering domain-specific concepts and relationships in an attempt to extend WordNet. The method was tested on five seed concepts selected from the financial domain: interest rate, stock market, inflation, economic growth, and employment. Queries were formed with each of these concepts, and a corpus of 5000 sentences was extracted automatically from the Internet and TREC-8 corpora. On this corpus, the system discovered a total of 264 new concepts not defined in WordNet, of which 221 contain the seeds and 43 are other related concepts. The system also discovered 64 relationships that link these concepts with either WordNet concepts or with each other. The relationships were extracted with the help of 22 distinct lexico-syntactic patterns representing four semantic relations. Working in interactive mode, the system takes approximately 40 minutes per seed to discover the new concepts and relationships on the 5000-sentence corpus.
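
As a rough illustration of what a lexico-syntactic pattern can look like, the sketch below matches a single "X such as Y" construction with a regular expression. The abstract does not list its 22 patterns, so the pattern, the function name `extract_hyponym_pairs`, and the example sentence are all assumptions made for illustration, not the authors' method.

```python
import re

# Hypothetical pattern "<general term> such as <specific term>".
# It is a generic example and is not taken from the paper.
SUCH_AS = re.compile(r"(\w+) such as (\w+)")

def extract_hyponym_pairs(sentence):
    """Return (specific, general) word pairs matched by the 'such as' pattern."""
    text = sentence.lower()
    return [(m.group(2), m.group(1)) for m in SUCH_AS.finditer(text)]

pairs = extract_hyponym_pairs("Economic indicators such as inflation affect markets.")
print(pairs)
```

A single-word pattern like this one misses multi-word concepts such as "interest rate"; handling those would need noun-phrase chunking, which is outside this sketch.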


2020 ◽  
Author(s):  
Abeed Sarker ◽  
Yuan-Chi Yang ◽  
Mohammed Ali Al-Garadi

The performance of current medical text summarization systems relies on resource-heavy domain-specific knowledge sources and on preprocessing methods (e.g., classification or deep learning) for deriving semantic information. Consequently, these systems are often difficult to customize, extend, or deploy in low-resource settings, and are operationally slow. We propose a fast summarization system that can aid practitioners at point-of-care and thus improve evidence-based healthcare. At runtime, our system utilizes similarity measurements derived from pre-trained domain-specific word embeddings in addition to simple features, rather than clunky knowledge bases and resource-heavy preprocessing. Automatic evaluation on a public dataset for evidence-based medicine shows that our system’s performance, despite the simple implementation, is statistically comparable with the state-of-the-art.
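
One common way to derive a similarity measurement from word embeddings is to average the word vectors of a text and compare the averages by cosine similarity. The sketch below ranks candidate sentences against a query in that way. It is only an assumed illustration: the abstract does not specify the system's actual features, and the tiny `EMBEDDINGS` table, its values, and the function names are hypothetical stand-ins for real pre-trained vectors.

```python
import math

# Toy 2-d vectors standing in for pre-trained domain-specific embeddings.
# The words and values are hypothetical and chosen only for illustration.
EMBEDDINGS = {
    "aspirin": [1.0, 0.0],
    "pain": [0.8, 0.6],
    "weather": [0.0, 1.0],
}

def mean_vector(tokens):
    """Average the vectors of known tokens; return None if none are known."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_sentences(query, sentences):
    """Order sentences from most to least similar to the query."""
    q = mean_vector(query.lower().split())
    scored = []
    for s in sentences:
        v = mean_vector(s.lower().split())
        score = cosine(q, v) if q is not None and v is not None else 0.0
        scored.append((score, s))
    return [s for _, s in sorted(scored, key=lambda p: p[0], reverse=True)]

print(rank_sentences("aspirin", ["weather", "pain", "aspirin"]))
```

A summarizer built along these lines would keep the top-ranked sentences; in practice the vectors would be loaded from a pre-trained embedding file rather than written inline.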


2017 ◽  
Vol 10 (12) ◽  
pp. 1965-1968 ◽  
Author(s):  
S. Bharadwaj ◽  
L. Chiticariu ◽  
M. Danilevsky ◽  
S. Dhingra ◽  
S. Divekar ◽  
...  

2014 ◽  
Vol 10 (3) ◽  
pp. 249-261 ◽  
Author(s):  
Tessa Sanderson ◽  
Jo Angouri

The active involvement of patients in decision-making and the focus on patient expertise in managing chronic illness constitute a priority in many healthcare systems, including the NHS in the UK. With easier access to health information, patients are almost expected to be (or to present themselves as) an ‘expert patient’ (Ziebland 2004). This paper draws on the meta-analysis of interview data collected for identifying treatment outcomes important to patients with rheumatoid arthritis (RA). Taking a discourse approach to identity, the discussion focuses on the resources used in the negotiation and co-construction of expert identities, including domain-specific knowledge, access to institutional resources, and ability to self-manage. The analysis shows that expertise is both projected (institutionally sanctioned) and claimed by the patient (self-defined). We close the paper by highlighting the limitations of our pilot study and suggesting avenues for further research.

