Experience: Automated Prediction of Experimental Metadata from Scientific Publications

2021 ◽  
Vol 13 (4) ◽  
pp. 1-11
Author(s):  
Stuti Nayak ◽  
Amrapali Zaveri ◽  
Pedro Hernandez Serrano ◽  
Michel Dumontier

Although open biomedical data are abundant, the lack of high-quality metadata makes it difficult for others to find relevant datasets and reuse them for other purposes. Metadata are particularly useful for understanding the nature and provenance of the data. A common approach to improving metadata quality relies on human curation, which is expensive, time-consuming, and error-prone. To improve metadata quality, we use scientific publications to automatically predict metadata key:value pairs. For prediction, we use a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BiLSTM). We focus on the NCBI Disease Corpus, which is used to train the CNN and BiLSTM. We perform two kinds of experiments with these two architectures: (1) we predict disease names using their unique IDs in the MeSH ontology, and (2) we use the tree structure of the MeSH ontology to move up the hierarchy of these disease terms, which reduces the number of labels. We also apply various multi-label classification techniques in these experiments. We find that in both cases the CNN achieves the best results, with an accuracy of 83% in predicting the superclasses of diseases.
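
The label reduction in the second experiment can be illustrated with a minimal sketch. It assumes each disease label is represented by a MeSH tree number, a dot-separated code in which every prefix names an ancestor term. The function names, the chosen depth, and the example tree numbers are illustrative assumptions, not the authors' implementation.

```python
def truncate_tree_number(tree_number: str, depth: int) -> str:
    """Return the ancestor of a dot-separated tree number at the given depth.

    A depth of 1 keeps only the top-level branch; a depth larger than the
    number of segments returns the tree number unchanged.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    parts = tree_number.split(".")
    return ".".join(parts[:depth])


def collapse_labels(tree_numbers, depth):
    """Map fine-grained labels to their superclasses and deduplicate them."""
    return sorted({truncate_tree_number(t, depth) for t in tree_numbers})


# Illustrative tree numbers (assumed for this example only).
fine_labels = ["C04.588.180", "C04.557.337", "C10.228.140"]

coarse_labels = collapse_labels(fine_labels, depth=1)
print(coarse_labels)  # ['C04', 'C10']
```

Three fine-grained labels collapse to two superclasses here, which is the sense in which moving up the hierarchy reduces the number of labels a classifier has to predict.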

2021 ◽  
Vol 9 (6) ◽  
pp. 651
Author(s):  
Yan Yan ◽  
Hongyan Xing

To improve the detection of small floating targets in sea clutter, we build on the complete ensemble empirical mode decomposition (CEEMD) algorithm: the high-frequency and low-frequency parts are determined by the energy proportion of each intrinsic mode function (IMF), the high-frequency part is denoised by wavelet packet transform (WPT), and the denoised high-frequency IMFs and the low-frequency IMFs are then combined to reconstruct the pure sea clutter signal. Based on the chaotic characteristics of sea clutter, we proposed an adaptive training-timestep strategy. The training timesteps of the network were determined by the width of the embedded window, and a chaotic long short-term memory (LSTM) detection network was designed. The denoised sea clutter signals were predicted by the chaotic LSTM network, and small-target signals were detected from the prediction errors. The experimental results showed that the CEEMD-WPT algorithm was consistent with the target distribution characteristics of sea clutter, and the denoising performance was improved by 33.6% on average. The proposed chaotic LSTM network, which determines the training step length according to the width of the embedded window, is a new detection method that can accurately detect small targets submerged in sea clutter.
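
The idea of tying the training timesteps to the width of the embedded window can be sketched as follows. The abstract does not give the formula, so this example assumes the common delay-embedding definition, window width = (m - 1) * tau, for embedding dimension m and time delay tau. The function names and the toy signal are hypothetical and are not taken from the paper.

```python
def embedding_window_width(m: int, tau: int) -> int:
    """Window width under the assumed definition (m - 1) * tau."""
    if m < 2 or tau < 1:
        raise ValueError("need m >= 2 and tau >= 1")
    return (m - 1) * tau


def make_windows(signal, timesteps):
    """Split a 1-D signal into (input window, next value) training pairs."""
    if timesteps < 1 or len(signal) <= timesteps:
        raise ValueError("signal must be longer than timesteps")
    count = len(signal) - timesteps
    inputs = [list(signal[i:i + timesteps]) for i in range(count)]
    targets = [signal[i + timesteps] for i in range(count)]
    return inputs, targets


# Toy stand-in for a denoised sea clutter sequence (assumed data).
signal = list(range(10))

timesteps = embedding_window_width(m=4, tau=2)  # 6
inputs, targets = make_windows(signal, timesteps)
print(len(inputs), inputs[0], targets)  # 4 [0, 1, 2, 3, 4, 5] [6, 7, 8, 9]
```

Each input window would feed one LSTM training sample, and the gap between the predicted and observed next value is the prediction error from which targets are detected.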

