Type-aware Convolutional Neural Networks for Slot Filling

2019 ◽  
Vol 66 ◽  
pp. 297-339
Author(s):  
Heike Adel ◽  
Hinrich Schuetze

The slot filling task aims at extracting answers for queries about entities from text, such as "Who founded Apple?". In this paper, we focus on the relation classification component of a slot filling system. We propose type-aware convolutional neural networks to benefit from the mutual dependencies between entity and relation classification. In particular, we explore different ways of integrating the named entity types of the relation arguments into a neural network for relation classification, including a joint training and a structured prediction approach. To the best of our knowledge, this is the first study on type-aware neural networks for slot filling. The type-aware models lead to the best results of our slot filling pipeline. Joint training performs comparably to structured prediction. To understand the impact of the different components of the slot filling pipeline, we perform a recall analysis, a manual error analysis and several ablation studies. Such analyses are of particular importance to other slot filling researchers since the official slot filling evaluations only assess pipeline outputs. The analyses show that especially coreference resolution and our convolutional neural networks have a large positive impact on the final performance of the slot filling pipeline. The presented models, the source code of our system, as well as our coreference resource, are publicly available.
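The structured prediction idea above can be illustrated with a minimal sketch: relation and argument-type scores are combined, and type-compatibility constraints rule out inadmissible combinations. The relation names, type labels, and compatibility table below are illustrative only, not the paper's actual schema or scoring model.

```python
# Hypothetical type-compatibility table: relation -> (required subject
# type, required object type); None means any type is admissible.
TYPE_CONSTRAINTS = {
    "org:founded_by": ("ORGANIZATION", "PERSON"),
    "per:employee_of": ("PERSON", "ORGANIZATION"),
    "no_relation": (None, None),
}

def structured_decode(rel_scores, subj_type_scores, obj_type_scores):
    """Jointly pick (subject type, object type, relation) maximizing the
    summed scores, subject to the type-compatibility constraints."""
    best, best_score = None, float("-inf")
    for rel, (s_req, o_req) in TYPE_CONSTRAINTS.items():
        for s_type, s_score in subj_type_scores.items():
            if s_req is not None and s_type != s_req:
                continue
            for o_type, o_score in obj_type_scores.items():
                if o_req is not None and o_type != o_req:
                    continue
                total = rel_scores[rel] + s_score + o_score
                if total > best_score:
                    best, best_score = (s_type, o_type, rel), total
    return best

decision = structured_decode(
    rel_scores={"org:founded_by": 2.0, "per:employee_of": 1.8, "no_relation": 0.5},
    subj_type_scores={"ORGANIZATION": 1.5, "PERSON": 0.2},
    obj_type_scores={"PERSON": 1.1, "ORGANIZATION": 0.3},
)
# decision -> ("ORGANIZATION", "PERSON", "org:founded_by")
```

In the joint-training variant, by contrast, the type and relation classifiers would share network parameters rather than interact only at decoding time.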

2017 ◽  
Vol 25 (1) ◽  
pp. 93-98 ◽  
Author(s):  
Yuan Luo ◽  
Yu Cheng ◽  
Özlem Uzuner ◽  
Peter Szolovits ◽  
Justin Starren

We propose Segment Convolutional Neural Networks (Seg-CNNs) for classifying relations from clinical notes. Seg-CNNs use only word-embedding features without manual feature engineering. Unlike typical CNN models, relations between 2 concepts are identified by simultaneously learning separate representations for text segments in a sentence: preceding, concept1, middle, concept2, and succeeding. We evaluate Seg-CNN on the i2b2/VA relation classification challenge dataset. We show that Seg-CNN achieves a state-of-the-art micro-average F-measure of 0.742 for overall evaluation, 0.686 for classifying medical problem–treatment relations, 0.820 for medical problem–test relations, and 0.702 for medical problem–medical problem relations. We demonstrate the benefits of learning segment-level representations. We show that medical domain word embeddings help improve relation classification. Seg-CNNs can be trained quickly for the i2b2/VA dataset on a graphics processing unit (GPU) platform. These results support the use of CNNs computed over segments of text for classifying medical relations, as they show state-of-the-art performance while requiring no manual feature engineering.
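The five-segment decomposition above can be sketched as a simple splitting step; in the full model each segment would then be fed through its own convolution and pooling before the representations are concatenated. The tokenization, span convention, and example sentence below are illustrative assumptions, not the paper's preprocessing.

```python
def split_segments(tokens, c1_span, c2_span):
    """Split a tokenized sentence into the five Seg-CNN segments:
    preceding, concept1, middle, concept2, succeeding.
    Spans are (start, end) token indices, end exclusive, with
    concept1 occurring before concept2."""
    (s1, e1), (s2, e2) = c1_span, c2_span
    return {
        "preceding": tokens[:s1],
        "concept1": tokens[s1:e1],
        "middle": tokens[e1:s2],
        "concept2": tokens[s2:e2],
        "succeeding": tokens[e2:],
    }

tokens = "the patient received aspirin to treat the headache".split()
segs = split_segments(tokens, (3, 4), (7, 8))
# segs["concept1"] -> ["aspirin"]
# segs["middle"]   -> ["to", "treat", "the"]
```

Keeping the two concept mentions in their own segments is what lets the model learn, e.g., that treatment-like words in concept1 and problem-like words in concept2 signal a problem–treatment relation.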


2020 ◽  
Vol 34 (10) ◽  
pp. 13967-13968
Author(s):  
Yuxiang Xie ◽  
Hua Xu ◽  
Congcong Yang ◽  
Kai Gao

Distant supervision (DS) has improved the performance of relation classification (RC) by extending the dataset, but it also introduces wrongly labeled instances. In contrast to DS, the few-shot method relies on only a few supervised examples to predict unseen classes. In this paper, we use word embeddings and position embeddings to construct a multi-channel vector representation, and use a multi-channel convolutional method to extract sentence features. Moreover, to mitigate the sensitivity of few-shot learning to overfitting, we introduce adversarial learning to train a robust model. Experiments on the FewRel dataset show that our model achieves significant and consistent improvements on few-shot RC compared with baselines.
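The position-embedding channel mentioned above is commonly built from each token's relative offsets to the two entity mentions; these offsets index a learned embedding table and are concatenated with the word-embedding channel. The function below is a minimal sketch of that offset computation, not the authors' implementation.

```python
def relative_positions(n_tokens, head_idx, tail_idx):
    """For each token, its signed offset to the head and tail entity
    mentions -- the standard position feature used alongside word
    embeddings in CNN-based relation classifiers."""
    return [(i - head_idx, i - tail_idx) for i in range(n_tokens)]

# 5-token sentence, head entity at index 1, tail entity at index 3.
pos = relative_positions(5, 1, 3)
# pos -> [(-1, -3), (0, -2), (1, -1), (2, 0), (3, 1)]
```

In practice the offsets are clipped to a fixed range before lookup, so that rare extreme distances share an embedding.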


2019 ◽  
Vol 128 (8-9) ◽  
pp. 2126-2145 ◽  
Author(s):  
Zhen-Hua Feng ◽  
Josef Kittler ◽  
Muhammad Awais ◽  
Xiao-Jun Wu

Efficient and robust facial landmark localisation is crucial for the deployment of real-time face analysis systems. This paper presents a new loss function, namely the Rectified Wing (RWing) loss, for regression-based facial landmark localisation with Convolutional Neural Networks (CNNs). We first systematically analyse different loss functions, including L2, L1 and smooth L1. The analysis suggests that the training of a network should pay more attention to small-medium errors. Motivated by this finding, we design a piece-wise loss that amplifies the impact of the samples with small-medium errors. In addition, we rectify the loss function for very small errors to mitigate the impact of inaccurate manual annotation. The use of our RWing loss boosts performance significantly for regression-based CNNs in facial landmarking, especially for lightweight network architectures. To address the under-representation of samples with large pose variations, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them with random image rotation, bounding box translation and other data augmentation strategies. Finally, the proposed approach is extended to create a coarse-to-fine framework for robust and efficient landmark localisation, which is also able to deal with the small sample size problem effectively. The experimental results obtained on several well-known benchmarking datasets demonstrate the merits of our RWing loss and prove the superiority of the proposed method over state-of-the-art approaches.
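A piecewise loss in the spirit described above can be sketched as follows: zero inside a small rectification interval (so annotation noise is ignored), logarithmic for small-medium errors (so their gradients are amplified), and linear for large errors. The parameter values here are illustrative, and the exact form and constants of the paper's RWing loss may differ; the constant C below simply makes the pieces join continuously.

```python
import math

def rwing_loss(x, r=0.5, w=10.0, eps=2.0):
    """Sketch of a rectified-wing-style piecewise loss on a residual x:
    zero for |x| < r, logarithmic for r <= |x| < w, linear beyond w.
    C is chosen so the logarithmic and linear pieces meet at |x| = w.
    Parameter values are illustrative, not the paper's."""
    ax = abs(x)
    C = w - w * math.log(1.0 + (w - r) / eps)
    if ax < r:
        return 0.0
    if ax < w:
        return w * math.log(1.0 + (ax - r) / eps)
    return ax - C

rwing_loss(0.3)  # inside the rectification interval -> 0.0
```

Compared with L2, which squashes small-medium errors, the logarithmic piece keeps their contribution to the gradient large, which is the behaviour the analysis above motivates.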


2020 ◽  
Vol 10 (2) ◽  
pp. 391-400 ◽  
Author(s):  
Ying Chen ◽  
Xiaomin Qin ◽  
Jingyu Xiong ◽  
Shugong Xu ◽  
Jun Shi ◽  
...  

This study aimed to propose a deep transfer learning framework for histopathological image analysis by using convolutional neural networks (CNNs) with visualization schemes, and to evaluate its usage for automated and interpretable diagnosis of cervical cancer. First, in order to examine the potential of the transfer learning for classifying cervix histopathological images, we pre-trained three state-of-the-art CNN architectures on large-size natural image datasets and then fine-tuned them on small-size histopathological datasets. Second, we investigated the impact of three learning strategies on classification accuracy. Third, we visualized both the multiple-layer convolutional kernels of CNNs and the regions of interest so as to increase the clinical interpretability of the networks. Our method was evaluated on a database of 4993 cervical histological images (2503 benign and 2490 malignant). The experimental results demonstrated that our method achieved 95.88% sensitivity, 98.93% specificity, 97.42% accuracy, 94.81% Youden's index and 99.71% area under the receiver operating characteristic curve. Our method can reduce the cognitive burden on pathologists for cervical disease classification and improve their diagnostic efficiency and accuracy. It may be potentially used in clinical routine for histopathological diagnosis of cervical cancer.
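The evaluation metrics reported above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts in the usage example are illustrative, not the study's.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Youden's index from the
    binary confusion-matrix counts (tp = malignant correctly flagged,
    tn = benign correctly cleared, etc.)."""
    sens = tp / (tp + fn)                 # true positive rate
    spec = tn / (tn + fp)                 # true negative rate
    acc = (tp + tn) / (tp + fn + tn + fp)
    youden = sens + spec - 1.0            # Youden's J statistic
    return {"sensitivity": sens, "specificity": spec,
            "accuracy": acc, "youden": youden}

# Illustrative counts only:
m = diagnostic_metrics(tp=9, fn=1, tn=8, fp=2)
# -> sensitivity 0.9, specificity 0.8, accuracy 0.85, Youden's index 0.7
```

Youden's index is useful alongside accuracy because it is insensitive to class imbalance: it rewards a classifier only for doing well on both classes.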


2019 ◽  
Author(s):  
Amr Farahat ◽  
Christoph Reichert ◽  
Catherine M. Sweeney-Reed ◽  
Hermann Hinrichs

Objective: Convolutional neural networks (CNNs) have proven successful as function approximators and have therefore been used for classification problems, including electroencephalography (EEG) signal decoding for brain-computer interfaces (BCIs). Artificial neural networks, however, are considered black boxes, because they usually have thousands of parameters, making interpretation of their internal processes challenging. Here we systematically evaluate the use of CNNs for EEG signal decoding and investigate a method for visualizing the CNN model decision process. Approach: We developed a CNN model to decode the covert focus of attention from EEG event-related potentials during object selection. We compared the performance of the CNN and the commonly used linear discriminant analysis (LDA) classifier on datasets of different dimensionality, and analyzed transfer learning capacity. Moreover, we validated the impact of single model components by systematically altering the model. Furthermore, we investigated the use of saliency maps as a tool for visualizing the spatial and temporal features driving the model output. Main results: The CNN model and the LDA classifier achieved comparable accuracy on the lower-dimensional dataset, but the CNN exceeded LDA performance significantly on the higher-dimensional dataset (without hypothesis-driven preprocessing), achieving an average decoding accuracy of 90.7% (chance level = 8.3%). Parallel convolutions, tanh or ELU activation functions, and dropout regularization proved valuable for model performance, whereas sequential convolutions, the ReLU activation function, and batch normalization reduced accuracy or yielded no significant difference. Saliency maps revealed meaningful features, displaying the typical spatial distribution and latency of the P300 component expected during this task. Significance: Following systematic evaluation, we provide recommendations for when and how to use CNN models in EEG decoding. Moreover, we propose a new approach for investigating the neural correlates of a cognitive task by training CNN models on raw high-dimensional EEG data and utilizing saliency maps for relevant feature extraction.
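A saliency map of the kind described above is the magnitude of the model output's gradient with respect to each input feature. The sketch below estimates that gradient by central finite differences on a toy scoring function; with an actual CNN the same quantity would be obtained by backpropagating the output score to the input. The toy model and input are illustrative assumptions.

```python
def saliency(score_fn, x, h=1e-4):
    """Absolute partial derivative of the model score with respect to
    each input feature, estimated by central finite differences."""
    sal = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        sal.append(abs(score_fn(xp) - score_fn(xm)) / (2 * h))
    return sal

# Toy "model": a linear score in which the first feature dominates,
# so it receives the largest saliency value.
score = lambda v: 3.0 * v[0] + 0.1 * v[1] - 0.5 * v[2]
s = saliency(score, [0.2, 0.4, 0.1])
# s -> approximately [3.0, 0.1, 0.5]
```

For EEG inputs of shape channels x time, the resulting map highlights which electrodes and latencies drive the decision; in the study above it recovered the expected P300 topography.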

