Training Set Expansion Using Word Embeddings for Korean Medical Information Extraction

Author(s):  
Young-Min Kim ◽  
Sa-kwang Song ◽  
Sungho Shin ◽  
Choong-Nyoung Seon ◽  
Seunggyun Hong ◽  
...  

Author(s):  
Jingtan Li ◽  
Maolin Xu ◽  
Hongling Xiu

As the resolution of remote sensing images continues to increase, high-resolution remote sensing images are widely used in many areas. Image information extraction is one of their basic applications. Traditional target recognition methods struggle to cope with massive volumes of high-resolution remote sensing image data. This paper therefore proposes a remote sensing image extraction method based on the U-net network. First, the U-net semantic segmentation network is trained on the training set while the validation set is used to validate the model during training; finally, the test set is used for testing. The experimental results show that U-net can be applied to the extraction of buildings.
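
The abstract's three-way use of data (training, validation, test) can be illustrated with a minimal sketch. The abstract gives no split ratios or file names, so the function name `split_dataset`, the 80/10/10 fractions, and the tile names are all assumptions chosen for illustration. This is not the authors' pipeline.

```python
import random


def split_dataset(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle items and split them into train, validation and test lists.

    The fractions are illustrative assumptions; the abstract does not
    state which ratios were used.
    """
    items = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Hypothetical image tile names standing in for remote sensing patches.
tiles = [f"tile_{i:03d}.png" for i in range(100)]
train, val, test = split_dataset(tiles)
print(len(train), len(val), len(test))
```

The model is fitted on `train`, monitored on `val` during training, and evaluated once on `test`. Keeping the three lists disjoint is what makes the final test score meaningful.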


Author(s):  
Maria Antoniak ◽  
David Mimno

Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora.
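
The recommendation to average over bootstrap samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The function `train_cooccurrence_embeddings` is a toy stand-in for a real embedding algorithm, and the corpus, function names, and sample count are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict
from statistics import mean, stdev


def train_cooccurrence_embeddings(docs):
    """Toy stand-in for an embedding algorithm (an assumption): each word's
    vector counts the words that co-occur with it in the same document."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = doc.split()
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def bootstrap_similarity(docs, w1, w2, n_samples=50, seed=0):
    """Resample documents with replacement, retrain on each sample, and
    return (mean, standard deviation, count) of the w1-w2 similarity."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_samples):
        sample = [rng.choice(docs) for _ in docs]
        vectors = train_cooccurrence_embeddings(sample)
        # Skip samples in which either word happens not to occur.
        if w1 in vectors and w2 in vectors:
            sims.append(cosine(vectors[w1], vectors[w2]))
    spread = stdev(sims) if len(sims) > 1 else 0.0
    return (mean(sims) if sims else 0.0), spread, len(sims)


# Hypothetical miniature corpus, for illustration only.
DOCS = [
    "patient reports fever and cough",
    "fever with headache and cough",
    "cough persists without fever",
    "patient reports headache only",
    "headache and nausea reported",
]
avg, spread, used = bootstrap_similarity(DOCS, "fever", "cough")
print(f"mean={avg:.3f} sd={spread:.3f} over {used} samples")
```

Reporting the spread alongside the mean shows how much a single model's similarity value could shift when a few documents are added or removed.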


2002 ◽  
Vol 8 (2-3) ◽  
pp. 167-191 ◽  
Author(s):  
J. TURMO ◽  
H. RODRIGUEZ

The growing availability of textual sources has led to an increase in the use of automatic knowledge acquisition approaches from textual data, as in Information Extraction (IE). Most IE systems use knowledge explicitly represented as sets of IE rules that are usually acquired manually. Recently, however, the acquisition of this knowledge has been addressed by applying a wide variety of Machine Learning (ML) techniques. Within this framework, new problems arise concerning how to select and annotate positive examples, and sometimes negative ones, in supervised approaches, and how to organize unsupervised or semi-supervised approaches. This paper presents a new IE-rule learning system that deals with these training set problems and describes a set of experiments that test this capability of the new learning approach.

