An Approach Based on Multilevel Convolution for Sentence-Level Element Extraction of Legal Text
In the judicial field, the extraction of elements from legal text plays an increasingly important role as the volume of legal text data grows. In this paper, we propose a sentence-level model for legal text element extraction that is built on the structure of multilabel text classification. The proposed model consists of an encoder and an improved decoder. The encoder applies multilevel convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) as feature extraction networks to capture local neighborhood and context information from legal text, and the decoder applies an LSTM with multiattention and a fully connected layer with an improved initialization method to decode and generate label sequences. To the best of our knowledge, this is one of the first attempts to apply a multilabel classification algorithm to element extraction from legal text. To verify the effectiveness of our model, we conduct experiments on three real legal text datasets and on a general multilabel text classification dataset. The experimental results demonstrate that our proposed model outperforms the baseline models on the legal text datasets and is competitive with them on the general multilabel text classification dataset, which indicates that the model is useful for multilabel classification of both ordinary texts and legal texts that are short and contain words with an uncertain number of characters.
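To make the idea of decoding a label sequence concrete, the following is a minimal sketch, not the model described above. It assumes a greedy decoding loop in which a scoring function stands in for the LSTM decoder, previously emitted labels are masked out, and decoding stops when an end marker scores highest. The names `greedy_decode_labels`, `toy_score_step`, and `EOS`, as well as the element labels and scores, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

EOS = "<eos>"  # hypothetical end-of-sequence marker


def greedy_decode_labels(
    score_step: Callable[[List[str]], Dict[str, float]],
    max_labels: int = 10,
) -> List[str]:
    """Emit one label per step until EOS scores highest or max_labels is reached."""
    emitted: List[str] = []
    for _ in range(max_labels):
        scores = score_step(emitted)
        # Mask labels that were already emitted so that no label repeats.
        candidates = {lab: s for lab, s in scores.items() if lab not in emitted}
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        emitted.append(best)
    return emitted


def toy_score_step(emitted: List[str]) -> Dict[str, float]:
    """Stand-in for a trained decoder: fixed label scores, EOS score grows per step."""
    scores = {"loan_amount": 0.9, "interest_rate": 0.7, "guarantor": 0.2}
    scores[EOS] = 0.3 * (len(emitted) + 1)
    return scores


if __name__ == "__main__":
    print(greedy_decode_labels(toy_score_step))
```

Because the loop ends as soon as the end marker wins, the number of labels assigned to a sentence is not fixed in advance, which is the property that makes sequence generation a natural fit for multilabel classification.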