classification tasks
Recently Published Documents

2022 ◽  
Vol 16 (4) ◽  
pp. 1-16
Fereshteh Jafariakinabad ◽  
Kien A. Hua

The syntactic structure of the sentences in a document is highly informative of its author's writing style. Sentence representation learning has been widely explored in recent years and has been shown to improve generalization on downstream tasks across many domains. Although probing studies suggest that these learned contextual representations implicitly encode some amount of syntax, explicit syntactic information further improves the performance of deep neural models in authorship attribution. These observations motivated us to investigate explicit representation learning of the syntactic structure of sentences. In this article, we propose a self-supervised framework for learning structural representations of sentences. The self-supervised network contains two components: a lexical sub-network and a syntactic sub-network, which take the sequence of words and their corresponding structural labels as input, respectively. Due to the n-to-1 mapping of words to their structural labels, each word is embedded into a vector representation that mainly carries structural information. We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task. Our experimental results indicate that the structural embeddings significantly improve classification performance when concatenated with existing pre-trained word embeddings.
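The concatenation step mentioned at the end of this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `concat_embeddings`, the toy vocabulary, the vector values, and the dimensions are hypothetical and do not come from the article, which does not describe its implementation.

```python
def concat_embeddings(tokens, word_vectors, structural_vectors):
    """Return one feature vector per token: word embedding followed by structural embedding."""
    # List concatenation appends the structural dimensions after the lexical ones.
    return [word_vectors[t] + structural_vectors[t] for t in tokens]


# Toy 3-d "pre-trained" word vectors and 2-d "structural" vectors (made-up values).
word_vectors = {
    "the": [0.1, 0.2, 0.3],
    "cat": [0.4, 0.5, 0.6],
    "sleeps": [0.7, 0.8, 0.9],
}
structural_vectors = {
    "the": [1.0, 0.0],
    "cat": [0.0, 1.0],
    "sleeps": [0.5, 0.5],
}

sentence = ["the", "cat", "sleeps"]
features = concat_embeddings(sentence, word_vectors, structural_vectors)
print(len(features), len(features[0]))  # 3 5
```

Each token ends up with a 5-dimensional vector (3 lexical plus 2 structural dimensions), which a downstream classifier would consume in place of the word embedding alone.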

Iqra Muneer ◽  
Rao Muhammad Adeel Nawab

Cross-Lingual Text Reuse Detection (CLTRD) has recently attracted the attention of the research community because a large amount of digital text is readily available for reuse in multiple languages through online digital repositories. In addition, efficient machine translation systems are freely and readily available to translate text from one language into another, which makes it easy to reuse text across languages and, consequently, difficult to detect such reuse. In the literature, the most prominent and widely used approach for CLTRD is Translation plus Monolingual Analysis (T+MA). To detect cross-lingual text reuse for the English-Urdu language pair, T+MA has been used only with lexical approaches, namely N-gram Overlap, Longest Common Subsequence, and Greedy String Tiling. This shows that T+MA has not been thoroughly explored for the English-Urdu language pair. To fill this gap, this study presents an in-depth and detailed comparison of 26 approaches that are based on T+MA. These approaches include semantic similarity approaches (semantic tagger-based approaches, WordNet-based approaches), a probabilistic approach (Kullback-Leibler distance approach), monolingual word embedding-based approaches, a Siamese recurrent architecture, and monolingual sentence transformer-based approaches for the English-Urdu language pair. The evaluation was carried out using the CLEU benchmark corpus, for both the binary and the ternary classification tasks. Our extensive experimentation shows that our proposed approach, a combination of 26 approaches, obtained F1 scores of 0.77 and 0.61 for the binary and ternary classification tasks, respectively, and outperformed the previously reported approaches [41] (F1 = 0.73 for the binary and F1 = 0.55 for the ternary classification task) on the CLEU corpus.
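Two of the lexical measures named in this abstract, N-gram Overlap and Longest Common Subsequence, can be sketched as follows. This is only an illustration of the monolingual analysis half of T+MA: it assumes both texts have already been translated into the same language and tokenized. The function names and the normalization choices (containment relative to the suspicious text, LCS length divided by the shorter text) are assumptions, not the definitions used in the study.

```python
def ngrams(tokens, n):
    """Set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(source, suspicious, n=2):
    """Share of the suspicious text's n-grams that also occur in the source."""
    src, sus = ngrams(source, n), ngrams(suspicious, n)
    if not sus:
        return 0.0
    return len(src & sus) / len(sus)


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length normalized by the shorter text (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))


source = "the cat sat on the mat".split()
reused = "the cat lay on the mat".split()
print(ngram_overlap(source, reused))  # 0.6
print(lcs_length(source, reused))     # 5
```

A detector would then threshold such scores, or feed them to a classifier as features, to decide between the reuse classes.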

2022 ◽  
Vol 12 ◽  
Shenda Hong ◽  
Wenrui Zhang ◽  
Chenxi Sun ◽  
Yuxi Zhou ◽  
Hongyan Li

Cardiovascular diseases (CVDs) are one of the most fatal disease groups worldwide. The electrocardiogram (ECG) is a widely used tool for automatically detecting cardiac abnormalities, thereby helping to control and manage CVDs. To encourage more multidisciplinary research, the PhysioNet/Computing in Cardiology Challenge 2020 (Challenge 2020) provided a public platform involving multi-center databases and automatic evaluations for ECG classification tasks. As a result, 41 teams successfully submitted their solutions and qualified for the rankings. Although Challenge 2020 was a success, there has been no in-depth methodological meta-analysis of these solutions, making it difficult for researchers to benefit from the solutions and results. In this study, we systematically review the 41 solutions in terms of data processing, feature engineering, model architecture, and training strategy. For each perspective, we visualize and statistically analyze the effectiveness of the common techniques, and discuss their methodological advantages and disadvantages. Finally, we summarize five practical lessons based on this analysis: (1) Data augmentation should be employed and adapted to specific scenarios; (2) Combining different features can improve performance; (3) A hybrid design of different types of deep neural networks (DNNs) is better than using a single type; (4) The use of end-to-end architectures should depend on the task being solved; (5) Multiple models are better than one. We expect that our meta-analysis will help accelerate research on ECG classification based on machine-learning models.

2022 ◽  
Vol 12 (2) ◽  
pp. 834
Zhuang Li ◽  
Xincheng Tian ◽  
Xin Liu ◽  
Yan Liu ◽  
Xiaorui Shi

To address the currently low accuracy of domestic industrial defect detection, this paper proposes a Two-Stage Industrial Defect Detection Framework based on Improved-YOLOv5 and Optimized-Inception-ResnetV2, which completes the positioning and classification tasks with two dedicated models. To make first-stage recognition more effective at locating inconspicuous small defects with high similarity on the steel surface, we improve YOLOv5 in three places: the backbone network, the feature scales of the feature fusion layer, and the multiscale detection layer. To enable second-stage recognition to better extract defect features and achieve accurate classification, we embed the convolutional block attention module (CBAM) into the Inception-ResnetV2 model, then optimize the network architecture and loss function of that model. Based on the Pascal Visual Object Classes 2007 (VOC2007) dataset, the public dataset NEU-DET, and the optimized dataset Enriched-NEU-DET, we conducted multiple sets of comparative experiments on the Improved-YOLOv5 and Inception-ResnetV2. The test results show a clear improvement. To verify the superiority and adaptability of the two-stage framework, we first test on the Enriched-NEU-DET dataset, and then use an AUBO-i5 robot, an Intel RealSense D435 camera, and other industrial steel equipment to build real industrial scenes. In experiments, the two-stage framework achieves its best performance of 83.3% mean average precision (mAP) on the Enriched-NEU-DET dataset, and 91.0% in our purpose-built industrial defect environment.

2022 ◽  
Vol 4 (1) ◽  
pp. 22-41
Nermeen Abou Baker ◽  
Nico Zengeler ◽  
Uwe Handmann

Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because it achieves high performance while saving training time, memory, and effort in network design. In this paper, we investigate how to select the best pre-trained model that meets the target domain requirements for image classification tasks. In our study, we refined the output layers and general network parameters to apply the knowledge of eleven image processing models, pre-trained on ImageNet, to five different target domain datasets. We measured the accuracy, accuracy density, training time, and model size to evaluate the pre-trained models in training sessions of one episode and of ten episodes.

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 599
Yongsheng Li ◽  
Tengfei Tu ◽  
Hua Zhang ◽  
Jishuai Li ◽  
Zhengping Jin ◽  

In the field of video action classification, existing network frameworks often use only video frames as input. When the object involved in the action does not appear in a prominent position in the video frame, the network cannot classify the action accurately. We introduce a new neural network structure that uses sound to assist in such tasks. The original sound wave is converted into a sound texture that serves as the input of the network. Furthermore, to use the rich modal information (images and sound) in the video, we designed and used a two-stream framework. In this work, we hypothesize that sound data can be used to solve motion recognition tasks. To demonstrate this, we designed a neural network based on sound texture to perform video action classification. We then fuse this network with a deep neural network that uses consecutive video frames to construct a two-stream network, called A-IN. Finally, on the Kinetics dataset, we compare our proposed A-IN with the image-only network. The experimental results show that the recognition accuracy of the two-stream model that uses sound data features is 7.6% higher than that of the network using only video frames. This shows that making rational use of the rich information in a video can improve classification.
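The abstract does not say how the two streams are fused. One common option, shown here purely as an assumed illustration and not as the A-IN design, is late fusion: each stream produces class scores, and their softmax probabilities are averaged with a weight. The names `late_fusion` and `sound_weight` and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(frame_logits, sound_logits, sound_weight=0.5):
    """Weighted average of the per-class probabilities of the two streams."""
    p_frame = softmax(frame_logits)
    p_sound = softmax(sound_logits)
    return [(1 - sound_weight) * f + sound_weight * s
            for f, s in zip(p_frame, p_sound)]


def predict(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up scores: the frame stream is nearly undecided between classes 0 and 1,
# while the sound stream clearly favors class 0.
frame_logits = [1.0, 1.1, 0.2]
sound_logits = [3.0, 0.1, 0.1]

print(predict(softmax(frame_logits)))                    # 1
print(predict(late_fusion(frame_logits, sound_logits)))  # 0
```

In this toy case the frame-only prediction is overturned by a confident sound stream, which is the kind of situation the abstract describes when the acting object is not prominent in the frame.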

2022 ◽  
Jesse I Gilmer ◽  
Michael A Farries ◽  
Zachary P Kilpatrick ◽  
Ioannis Delis ◽  
Abigail L Person

Learning plays a key role in the function of many neural circuits. The cerebellum is considered a learning machine essential for time interval estimation underlying motor coordination and other behaviors. Theoretical work has proposed that the cerebellar input recipient structure, the granule cell layer (GCL), performs pattern separation of inputs that facilitates learning in Purkinje cells (P-cells). However, the relationship between input reformatting and learning outcomes has remained debated, with roles emphasized for pattern separation features from sparsification to decorrelation. We took a novel approach by training a minimalist model of the cerebellar cortex to learn complex time-series data from naturalistic inputs, in contrast to traditional classification tasks. The model robustly produced temporal basis sets from naturalistic inputs, and the resultant GCL output supported learning of temporally complex target functions. Learning favored surprisingly dense granule cell activity, yet the key statistical features in GCL population activity that drove learning differed from those seen previously for classification tasks. Moreover, different cerebellar tasks were supported by diverse pattern separation features that matched the demands of the tasks. These findings advance testable hypotheses for mechanisms of temporal basis set formation and predict that population statistics of granule cell activity may differ across cerebellar regions to support distinct behaviors.

2022 ◽  
Vol 7 ◽  
pp. e831
Xudong Jia ◽  
Li Wang

Text classification is a fundamental task in many applications such as topic labeling, sentiment analysis, and spam detection. The syntactic relationships and word sequence of a text are important and useful for text classification, and how to model and incorporate them to improve performance is a key challenge. Inspired by human behavior in understanding text, in this paper we combine syntactic relationships, sequence structure, and semantics for text representation, and propose an attention-enhanced capsule network-based text classification model. Specifically, we use graph convolutional neural networks to encode syntactic dependency trees, build multi-head attention to encode dependency relationships in the text sequence, and finally merge them with semantic information through a capsule network. Extensive experiments on five datasets demonstrate that our approach can effectively improve the performance of text classification compared with state-of-the-art methods. The results also show that the capsule network, graph convolutional neural network, and multi-head attention have integration effects on text classification tasks.

2022 ◽  
Anguo Zhang ◽  
Ying Han ◽  
Jing Hu ◽  
Yuzhen Niu ◽  
Yueming Gao ◽  

We propose two simple and effective spiking neuron models to improve the response time of the conventional spiking neural network. The proposed neuron models adaptively tune the presynaptic input current depending on the input received from its presynapses and subsequent neuron firing events. We analyze and derive the firing activity homeostatic convergence of the proposed models. We experimentally verify and compare the models on MNIST handwritten digits and FashionMNIST classification tasks. We show that the proposed neuron models significantly increase the response speed to the input signal.
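To make the notion of response time concrete, the following sketch simulates a plain discrete-time leaky integrate-and-fire neuron and reports the step at which it first fires. This is a textbook baseline, not either of the proposed models, whose update rules are not given in the abstract. The function name, the leak factor, and the threshold are assumed values chosen for illustration.

```python
def first_spike_step(input_current, threshold=1.0, leak=0.9, max_steps=100):
    """Return the first time step at which the membrane potential reaches threshold.

    The potential decays by `leak` each step and is driven by a constant input.
    Returns None if the neuron never fires within `max_steps`.
    """
    v = 0.0
    for step in range(1, max_steps + 1):
        v = leak * v + input_current  # leaky integration of the input
        if v >= threshold:
            return step
    return None


print(first_spike_step(0.3))   # 4
print(first_spike_step(0.6))   # 2
print(first_spike_step(0.05))  # None
```

A larger effective input current makes the neuron fire earlier, and too small a current never reaches threshold. This suggests why tuning the presynaptic input current, as the abstract describes, can shorten the response time.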

Raúl Pedro Aceñero Eixarch ◽  
Raúl Díaz-Usechi Laplaza ◽  
Rafael Berlanga Llavori

This paper presents a study on screening the large radiological image streams produced in hospitals for earlier detection of lung nodules. As this is one of the most difficult classification tasks in the literature, our objective is to measure how well state-of-the-art classifiers can screen the image stream so as to keep as many positive cases as possible in an output stream to be inspected by clinicians. We performed several experiments with different image resolutions and training datasets from different sources, always taking ResNet-152 as the base neural network. Results on existing datasets show that, in contrast to diseases like pneumonia, detecting nodules is a hard task when using only radiographs. Indeed, final diagnosis by clinicians is usually performed with much more precise images such as computed tomography scans.
