classification problem Latest Research Papers

“What Can I Cook with these Ingredients?” - Understanding Cooking-Related Information Needs in Conversational Search

ACM Transactions on Information Systems ◽

10.1145/3498330 ◽

2022 ◽

Vol 40 (4) ◽

pp. 1-32

Author(s):

Alexander Frummet ◽

David Elsweiler ◽

Bernd Ludwig

Keyword(s):

Information Needs ◽

Classification Problem ◽

Context Information ◽

Information Need ◽

In Situ Study ◽

Related Information ◽

Home Cooking ◽

Different Levels ◽

Linguistic Means

As conversational search becomes more pervasive, it becomes increasingly important to understand the users’ underlying information needs when they converse with such systems in diverse domains. We conduct an in situ study to understand information needs arising in a home cooking context as well as how they are verbally communicated to an assistant. A human experimenter plays this role in our study. Based on the transcriptions of utterances, we derive a detailed hierarchical taxonomy of diverse information needs occurring in this context, which require different levels of assistance to be solved. The taxonomy shows that needs can be communicated through different linguistic means and require different amounts of context to be understood. In a second contribution, we perform classification experiments to determine the feasibility of predicting the type of information need a user has during a dialogue using the turn provided. For this multi-label classification problem, we achieve average F1 measures of 40% using BERT-based models. We demonstrate with examples which types of needs are difficult to predict and show why, concluding that models need to include more context information in order to improve both information need classification and assistance to make such systems usable.
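To make the reported metric concrete, here is a minimal sketch of a label-averaged F1 score for multi-label predictions. The abstract does not say which averaging scheme produced the 40% figure, so macro averaging is an assumption here, and the label names in the usage note are hypothetical rather than taken from the paper's taxonomy.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Macro-averaged F1 over every label that occurs in gold or pred.

    Each list element is the set of labels assigned to one dialogue turn.
    """
    labels = sorted(set().union(*gold, *pred))
    scores = []
    for label in labels:
        # Per-label counts across all turns.
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never matches.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

For example, with gold labels `[{"recipe"}, {"ingredient", "amount"}, {"time"}]` and predictions `[{"recipe"}, {"ingredient"}, {"amount"}]`, two of the four labels are predicted perfectly and two are missed entirely, so the macro average is 0.5.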


An Unsupervised and Robust Line and Word Segmentation Method for Handwritten and Degraded Printed Document

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3474118 ◽

2022 ◽

Vol 21 (2) ◽

pp. 1-31

Author(s):

Jayati Mukherjee ◽

Swapan K. Parui ◽

Utpal Roy

Keyword(s):

Quantitative Measure ◽

Classification Problem ◽

Input Image ◽

Document Image ◽

Segmentation Method ◽

Analysis Problem ◽

Document Structure ◽

Computational Resources ◽

Degraded Document ◽

Document Page

Segmenting text lines and words in an unconstrained handwritten or degraded machine-printed document is a challenging document analysis problem because of the heterogeneity of document structure. Lines are often unevenly skewed, and words are often broken. This article addresses the segmentation of a document page image into lines and words. We propose an unsupervised, robust, and simple statistical method to segment a document image that is either handwritten or machine-printed (degraded or otherwise). The method treats segmentation as a two-class classification problem, with classification based on the distribution of gap sizes (between lines and between words) in a binary page image. The method is easy to implement: other than binarization of the input image, no pre-processing is necessary, and it does not need high computational resources. It is unsupervised in the sense that no annotated document page images are necessary, so no training database is required; given a document page image, the parameters needed to segment text lines and words are learned in an unsupervised manner. We applied the method to several popular publicly available handwritten and machine-printed datasets (ISIDDI, IAM-Hist, IAM, PBOK) covering different Indian and other languages and different fonts. Several experimental results are presented to show its effectiveness and robustness. On the ICDAR-2013 handwriting segmentation contest dataset, our method outperforms the winning method. In addition, we suggest a quantitative measure of the level of degradation of a document page image.
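The idea of classifying gaps into two classes from their size distribution can be illustrated with a small sketch. The abstract does not describe the statistical model the authors use, so the one-dimensional two-means clustering below is a stand-in chosen for brevity, not the paper's method; the function name `split_gaps` and the sample gap values are hypothetical.

```python
from typing import List, Tuple


def split_gaps(gaps: List[float], iters: int = 50) -> Tuple[float, List[bool]]:
    """Split gap sizes into 'small' and 'large' with 1-D two-means clustering.

    Returns the decision threshold and, for each gap, True if it falls in the
    'large' class (e.g. a between-word gap rather than a within-word gap).
    """
    if not gaps:
        raise ValueError("gaps must not be empty")
    lo, hi = min(gaps), max(gaps)
    if lo == hi:
        # All gaps are identical, so there is nothing to separate.
        return lo, [False] * len(gaps)
    for _ in range(iters):
        mid = (lo + hi) / 2
        small = [g for g in gaps if g <= mid]
        large = [g for g in gaps if g > mid]
        # Both groups are non-empty: min(gaps) <= mid < max(gaps) always holds.
        new_lo = sum(small) / len(small)
        new_hi = sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break  # cluster centres have converged
        lo, hi = new_lo, new_hi
    threshold = (lo + hi) / 2
    return threshold, [g > threshold for g in gaps]
```

Called on pixel gaps measured along one text line, such as `[2, 3, 2, 12, 3, 2, 14, 3]`, the sketch converges to cluster centres 2.5 and 13 and flags the two wide gaps as word boundaries. No labeled data is involved, which mirrors the unsupervised setting the abstract describes.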


Hypergraph Convolution on Nodes-Hyperedges Network for Semi-Supervised Node Classification

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3494567 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-19

Author(s):

Hanrui Wu ◽

Michael K. Ng

Keyword(s):

Deep Learning ◽

Classification Problem ◽

Cross Entropy ◽

Learning Approaches ◽

Entropy Loss ◽

Order Relations ◽

Data Representations ◽

Node Classification ◽

The Cross ◽

Full Consideration

Hypergraphs have shown great power in representing high-order relations among entities, and many hypergraph-based deep learning methods have been proposed to learn informative data representations for the node classification problem. However, most of these deep learning approaches do not fully consider either the hyperedge information or the original relationships among nodes and hyperedges. In this article, we present a simple yet effective semi-supervised node classification method named Hypergraph Convolution on Nodes-Hyperedges network, which performs filtering on both nodes and hyperedges and recovers the original hypergraph with the least information loss. Instead of only reducing the cross-entropy loss over the labeled samples as most previous approaches do, we additionally consider the hypergraph reconstruction loss as prior information to improve prediction accuracy. As a result, by taking both the cross-entropy loss on the labeled samples and the hypergraph reconstruction loss into consideration, we are able to achieve discriminative latent data representations for training a classifier. We perform extensive experiments on the semi-supervised node classification problem and compare the proposed method with state-of-the-art algorithms. The promising results demonstrate the effectiveness of the proposed method.
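The two-part objective described in the abstract can be written schematically as follows. This is only a sketch of the general shape: the abstract does not give the form of the reconstruction term or say whether the two terms are weighted, so the trade-off weight $\lambda$, the labeled node set $\mathcal{V}_L$, the class count $C$, and the incidence matrix $H$ with its reconstruction $\hat{H}$ are all assumed notation.

```latex
\mathcal{L}
  = \underbrace{-\sum_{i \in \mathcal{V}_L} \sum_{c=1}^{C} Y_{ic} \log \hat{Y}_{ic}}_{\text{cross-entropy on labeled nodes}}
  \;+\; \lambda \, \underbrace{\mathcal{L}_{\mathrm{rec}}\bigl(H, \hat{H}\bigr)}_{\text{hypergraph reconstruction}}
```

Here $Y_{ic}$ is 1 if labeled node $i$ belongs to class $c$ and 0 otherwise, and $\hat{Y}_{ic}$ is the predicted probability. Setting $\lambda = 0$ recovers the cross-entropy-only training that the abstract attributes to most previous approaches.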


What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3485135 ◽

2022 ◽

Vol 31 (2) ◽

pp. 1-34

Author(s):

Patrick Keller ◽

Abdoul Kader Kaboré ◽

Laura Plein ◽

Jacques Klein ◽

Yves Le Traon ◽

...

Keyword(s):

Transfer Learning ◽

Language Processing ◽

State Of The Art ◽

Semantic Representation ◽

Source Code ◽

Visual Representations ◽

Representation Learning ◽

Classification Problem ◽

Semantic Code ◽

Code Clone

Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is then to produce code embeddings that capture the maximum of program semantics. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM (“What You See Is What It Means”) approach, where visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction in source code and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). We show with experiments on BigCloneBench (Java) and Open Judge (C) that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show with data from NVD and SARD that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, and discuss the promises and limitations of this research direction.


A Multimodal Deep Framework for Derogatory Social Media Post Identification of a Recognized Person

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3447651 ◽

2022 ◽

Vol 21 (1) ◽

pp. 1-19

Author(s):

Rajat Subhra Bhowmick ◽

Isha Ganguli ◽

Jayanta Paul ◽

Jaya Sil

Keyword(s):

Social Media ◽

Deep Learning ◽

General Population ◽

Language Processing ◽

Character Recognition ◽

Optical Character Recognition ◽

Physical Violence ◽

Research Work ◽

Classification Problem ◽

Indian Society

In today’s era of digitization, social media platforms play a significant role in networking and in influencing the perception of the general population. Social network sites have recently been used to carry out harmful attacks, which may or may not be intentional, against individuals, including political and theological figures, intellectuals, sports and movie stars, and other prominent dignitaries. The exchange of such information across the general population inevitably contributes to socio-economic and socio-political turmoil, and even physical violence in society. By classifying the derogatory content of a social media post, this research work helps to eradicate and discourage the upsetting propagation of such hate campaigns. Social networking posts today often include meme images along with textual remarks and comments, which pose new challenges and opportunities to the research community in identifying the attacks. This article proposes a multimodal deep learning framework that uses ensembles of computer vision and natural language processing techniques to train an encapsulated transformer network for handling the classification problem. The proposed framework uses fine-tuned state-of-the-art deep learning-based models (e.g., BERT, Electra) for multilingual text analysis, along with face recognition and an optical character recognition model for meme image comprehension. For the study, a new Facebook meme-post dataset is created with recorded baseline results. The subject of the created dataset and the context of the work are geared toward multilingual Indian society. The findings demonstrate the efficacy of the proposed method in identifying social media meme posts featuring derogatory content about a famous or recognized individual.


Photometric Classification of Early-time Supernova Light Curves with SCONE

The Astronomical Journal ◽

10.3847/1538-3881/ac39a1 ◽

2022 ◽

Vol 163 (2) ◽

pp. 57

Author(s):

Helen Qu ◽

Masao Sako

Keyword(s):

Neural Networks ◽

Light Curve ◽

Convolutional Neural Networks ◽

Software Package ◽

Early Time ◽

Classification Problem ◽

Light Curves ◽

Model Code ◽

Time Space

In this work, we present classification results on early supernova light curves from SCONE, a photometric classifier that uses convolutional neural networks to categorize supernovae (SNe) by type using light-curve data. SCONE is able to identify SN types from light curves at any stage, from the night of initial alert to the end of their lifetimes. Simulated LSST SNe light curves were truncated at 0, 5, 15, 25, and 50 days after the trigger date and used to train Gaussian processes in wavelength and time space to produce wavelength–time heatmaps. SCONE uses these heatmaps to perform six-way classification between SN types Ia, II, Ibc, Ia-91bg, Iax, and SLSN-I. SCONE is able to perform classification with or without redshift, but we show that incorporating redshift information improves performance at each epoch. SCONE achieved 75% overall accuracy at the date of trigger (60% without redshift), and 89% accuracy 50 days after trigger (82% without redshift). SCONE was also tested on bright subsets of SNe (r < 20 mag) and produced 91% accuracy at the date of trigger (83% without redshift) and 95% five days after trigger (94.7% without redshift). SCONE is the first application of convolutional neural networks to the early-time photometric transient classification problem. All of the data processing and model code developed for this paper can be found in the SCONE software package located at github.com/helenqu/scone (Qu 2021).


A Fine-Tuned BERT-Based Transfer Learning Approach for Text Classification

Journal of Healthcare Engineering ◽

10.1155/2022/3498123 ◽

2022 ◽

Vol 2022 ◽

pp. 1-17

Author(s):

Rukhma Qasim ◽

Waqas Haider Bangyal ◽

Mohammed A. Alqarni ◽

Abdulwahab Ali Almazroi

Keyword(s):

Data Mining ◽

Social Media ◽

Transfer Learning ◽

Language Processing ◽

Text Classification ◽

Hate Speech ◽

Classification Problem ◽

Learning Approaches ◽

Fake News ◽

Targeted Marketing

The text classification problem has been thoroughly studied in information retrieval and data mining. It is useful in many areas, including medical diagnosis in health care departments, targeted marketing, the entertainment industry, and group filtering processes. Recent innovations in both data mining and natural language processing (NLP) have drawn the attention of researchers worldwide to developing automated systems for text classification. NLP allows documents containing different texts to be categorized. Social media users generate a huge amount of data on social media sites. Three datasets are used for the experiments: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches have not previously been tested on the COVID-19 fake news and extremist-non-extremist datasets. The proposed work therefore applies transfer learning classification models to both of these datasets to check their performance. Models are trained and evaluated on accuracy, precision, recall, and F1-score. Heat maps are also generated for every model. Finally, future directions are proposed.


Semantic Graph Neural Network: A Conversion from Spam Email Classification to Graph Classification

Scientific Programming ◽

10.1155/2022/6737080 ◽

2022 ◽

Vol 2022 ◽

pp. 1-8

Author(s):

Weisen Pan ◽

Jian Li ◽

Lisa Gao ◽

Liexiang Yue ◽

Yan Yang ◽

...

Keyword(s):

Neural Network ◽

Classification Problem ◽

Graph Classification ◽

Method Performance ◽

The Public ◽

Semantic Graph ◽

Public Dataset ◽

Public Datasets ◽

Better Than ◽

Email Classification

In this study, we propose a method named Semantic Graph Neural Network (SGNN) to address the challenging task of email classification. This method converts the email classification problem into a graph classification problem by projecting each email into a graph and applying the SGNN model for classification. The email features are generated from the semantic graph, so there is no need to embed the words into a numerical vector representation. The method is tested on several public datasets, where it achieves high accuracy in email classification. Its spam classification performance is better than that of the state-of-the-art deep learning-based method.


Joint Classification and Regression for Visual Tracking with Fully Convolutional Siamese Networks

International Journal of Computer Vision ◽

10.1007/s11263-021-01559-4 ◽

2022 ◽

Author(s):

Ying Cui ◽

Dongyan Guo ◽

Yanyan Shao ◽

Zhenhua Wang ◽

Chunhua Shen ◽

...

Keyword(s):

Visual Tracking ◽

Parameter Tuning ◽

Classification Problem ◽

Backbone Networks ◽

Bounding Box ◽

Classification And Regression ◽

Ablation Study ◽

Joint Classification ◽

Object Status ◽

Siamese Networks

Visual tracking of generic objects is one of the fundamental but challenging problems in computer vision. Here, we propose a novel fully convolutional Siamese network to solve visual tracking by directly predicting the target bounding box in an end-to-end manner. We first reformulate the visual tracking task as two subproblems: a classification problem for pixel category prediction and a regression task for object status estimation at this pixel. With this decomposition, we design a simple yet effective Siamese architecture based classification and regression framework, termed SiamCAR, which consists of two subnetworks: a Siamese subnetwork for feature extraction and a classification-regression subnetwork for direct bounding box prediction. Since the proposed framework is both proposal- and anchor-free, SiamCAR can avoid the tedious hyper-parameter tuning of anchors, considerably simplifying the training. To demonstrate that a much simpler tracking framework can achieve superior tracking results, we conduct extensive experiments and comparisons with state-of-the-art trackers on a few challenging benchmarks. Without bells and whistles, SiamCAR achieves leading performance with a real-time speed. Furthermore, the ablation study validates that the proposed framework is effective with various backbone networks, and can benefit from deeper networks. Code is available at https://github.com/ohhhyeahhh/SiamCAR.


Performance Evaluation of Machine Learning-Based Channel Equalization Techniques: New Trends and Challenges

Journal of Sensors ◽

10.1155/2022/2053086 ◽

2022 ◽

Vol 2022 ◽

pp. 1-14

Author(s):

Shahzad Hassan ◽

Noshaba Tariq ◽

Rizwan Ali Naqvi ◽

Ateeq Ur Rehman ◽

Mohammed K. A. Kaabar

Keyword(s):

Machine Learning ◽

Wireless Communication ◽

Communication Systems ◽

Short Term Memory ◽

Classification Problem ◽

Channel Equalization ◽

Data Rate ◽

Wireless Channel ◽

Support Vector ◽

Functional Link

Wireless communication systems have evolved to offer smarter and more advanced systems, such as ad hoc and sensor-based infrastructure-less networks. These networks are evaluated with two fundamental parameters: data rate and spectral efficiency. To achieve a high data rate and robust wireless communication, the most significant task is channel equalization at the receiver side. Transmitted data symbols passing through the wireless channel suffer from various types of impairments, such as fading, Doppler shifts, and Intersymbol Interference (ISI), which degrade the overall network performance. To mitigate channel-related impairments, many channel equalization algorithms have been proposed for communication systems. The channel equalization problem can also be solved as a classification problem by using Machine Learning (ML) methods. In this paper, channel equalization is performed using ML techniques, which are analyzed and compared in terms of Bit Error Rate (BER). Radial Basis Functions (RBFs), Multilayer Perceptron (MLP), Support Vector Machines (SVM), Functional Link Artificial Neural Network (FLANN), Long-Short Term Memory (LSTM), and Polynomial-based Neural Networks (NNs) are adopted for channel equalization.
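A toy sketch can show what it means to treat equalization as classification: each received sample is assigned to the transmitted symbol class it most likely came from. The nearest-centroid rule below is deliberately simpler than any of the models listed in the abstract and is not taken from the paper; the two-tap channel `CHANNEL`, the BPSK alphabet, and all function names are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List

# Hypothetical two-tap channel: y[k] = 1.0 * s[k] + 0.5 * s[k-1]
CHANNEL = (1.0, 0.5)


def channel_output(cur: int, prev: int) -> float:
    """Noise-free channel output for the current and previous BPSK symbols."""
    return CHANNEL[0] * cur + CHANNEL[1] * prev


def train_centroids() -> Dict[int, float]:
    """Average the noise-free outputs for each value of the current symbol."""
    outputs: Dict[int, List[float]] = {+1: [], -1: []}
    for cur, prev in product((+1, -1), repeat=2):
        outputs[cur].append(channel_output(cur, prev))
    return {sym: sum(vals) / len(vals) for sym, vals in outputs.items()}


def equalize(received: List[float], centroids: Dict[int, float]) -> List[int]:
    """Classify each received sample as the symbol with the nearest centroid."""
    return [
        min(centroids, key=lambda sym: abs(y - centroids[sym]))
        for y in received
    ]
```

With this channel the class centroids are +1.0 and -1.0, so slightly perturbed samples such as `[1.4, -0.6, -1.5, 0.4, 1.6]` are decoded as `[1, -1, -1, 1, 1]`. Stronger interference or noise makes the classes overlap in one dimension, which is the kind of setting where the nonlinear classifiers named in the abstract become relevant.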

