Indonesian news classification using convolutional neural network

Muhammad Ali Ramdhani; Dian Sa’adillah Maylawati; Teddy Mantoro

doi:10.11591/ijeecs.v19.i2.pp1000-1009

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v19.i2.pp1000-1009 ◽

2020 ◽

Vol 19 (2) ◽

pp. 1000

Author(s):

Muhammad Ali Ramdhani ◽

Dian Sa’adillah Maylawati ◽

Teddy Mantoro

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Language Processing ◽

Research Area ◽

Training Data ◽

Text Data ◽

Testing Data ◽

Headline News ◽

Language Characteristics ◽

Area Data

Every language has unique characteristics, structures, and grammar. Thus, different languages call for different processing and produce different results in Natural Language Processing (NLP) research. In current NLP research, Data Mining (DM) and Machine Learning (ML) techniques are popular, especially Deep Learning (DL) methods. This research aims to classify text data in the Indonesian language using a Convolutional Neural Network (CNN), one of the DL algorithms. The CNN algorithm used was modified to follow the characteristics of the Indonesian language. Accordingly, in the text pre-processing phase, stopword removal and stemming were tailored to the Indonesian language. The experiment was conducted using 472 Indonesian news texts from various sources in four categories: ‘hiburan’ (entertainment), ‘olahraga’ (sport), ‘tajuk utama’ (headline news), and ‘teknologi’ (technology). Based on the experiment and evaluation using 377 training data and 95 testing data, producing five models with ten epochs each, the CNN achieved its best accuracy of around 90.74% with a loss value of around 29.05% for 300 hidden layers in classifying the Indonesian news data.
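
The pre-processing phase mentioned in this abstract (stopword removal followed by stemming) could be sketched roughly as follows. The stopword list and suffix rules here are tiny illustrative assumptions, not the resources the authors used; a real pipeline would rely on a full Indonesian stopword list and a proper stemmer that also handles prefixes and exceptions.

```python
import re

# Hypothetical, deliberately tiny stopword list (assumption for illustration).
STOPWORDS = {"yang", "di", "dan", "ke", "dari", "itu", "ini", "untuk"}

# Hypothetical suffixes for a naive stemmer; real Indonesian stemming is
# considerably more involved.
SUFFIXES = ("nya", "kan", "lah")


def naive_stem(token: str) -> str:
    """Strip one known suffix if the remaining stem keeps at least 3 letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


print(preprocess("Pemain itu mencetak golnya di babak kedua"))
```

The resulting token lists would then be mapped to indices or embeddings before being fed to a CNN classifier.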

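KLASIFIKASI CITRA DIGITAL BUMBU DAN REMPAH DENGAN ALGORITMA CONVOLUTIONAL NEURAL NETWORK (CNN) [Digital Image Classification of Herbs and Spices Using the Convolutional Neural Network (CNN) Algorithm]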

Jurnal Gaussian ◽

10.14710/j.gauss.v9i3.27416 ◽

2020 ◽

Vol 9 (3) ◽

pp. 273-282

Author(s):

Isna Wulandari ◽

Hasbi Yasin ◽

Tatik Widiharih

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Activation Function ◽

Training Data ◽

Young Generation ◽

Testing Data ◽

Filter Size ◽

Hidden Layer ◽

Herbs And Spices

Recognition of herbs and spices among the young generation is still low. Research at SMK 9 Bandung showed that 47% of students did not recognize herbs and spices. One method that can be used to address this problem is automatic digital sorting of herbs and spices using the Convolutional Neural Network (CNN) algorithm. In this study, 300 images of herbs and spices are classified into 3 categories: ginseng, ginger, and galangal. The data in each category are divided into training data and testing data with a ratio of 80%:20%. The CNN model used to classify the digital images of herbs and spices has 2 convolutional layers, where the first convolutional layer has 10 filters and the second has 20 filters. Each filter has a kernel matrix of size 3x3. The filter size at the pooling layer is 3x3 and the number of neurons in the hidden layer is 10. The activation function at the convolutional layers and hidden layer is tanh, and the activation function at the output layer is softmax. With this model, the accuracy on the training data is 0.9875 and the loss value is 0.0769. The accuracy on the testing data is 0.85 and the loss value is 0.4773. Meanwhile, testing on new data with 3 images per category produces an accuracy of 88.89%. Keywords: image classification, herbs and spices, CNN.
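
To make the described architecture concrete, the sketch below walks the layer sizes through the network and counts trainable parameters. Several details are assumptions because the abstract does not state them: a 64x64 RGB input, unpadded convolutions with stride 1, a 3x3 pooling layer (stride 3) after each convolutional layer, and a bias term per filter and neuron.

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output side length of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def describe_model(input_side: int = 64, channels: int = 3) -> dict:
    """Trace shapes and count parameters for the assumed layer stack."""
    # Convolutional layer 1: 10 filters of size 3x3 over the input channels.
    side = conv_out(input_side, 3)
    params = (3 * 3 * channels + 1) * 10
    # Pooling 3x3 (assumed stride 3); pooling has no trainable parameters.
    side = conv_out(side, 3, stride=3)
    # Convolutional layer 2: 20 filters of size 3x3 over 10 feature maps.
    side = conv_out(side, 3)
    params += (3 * 3 * 10 + 1) * 20
    side = conv_out(side, 3, stride=3)
    # Flatten, then a hidden layer of 10 neurons and a 3-class softmax output.
    flattened = side * side * 20
    params += (flattened + 1) * 10
    params += (10 + 1) * 3
    return {"flattened": flattened, "parameters": params}


print(describe_model())
```

Under these assumptions the model stays small (under ten thousand parameters), which suits a dataset of 300 images. As a sanity check on the reported figures, an 80%:20% split of 300 images gives 240 training and 60 testing images, and 88.89% on 9 new images corresponds to 8 of 9 classified correctly.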

Skin Cancer Detection using CNN Algorithm

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.e1079.089620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 45-49

Keyword(s):

Neural Network ◽

Skin Cancer ◽

Convolutional Neural Network ◽

Well Being ◽

Training Data ◽

Data Sets ◽

Data Set ◽

Sequential Model ◽

Pigmented Lesions ◽

Testing Data

The project “Disease Prediction Model” focuses on predicting the type of skin cancer. It involves constructing a Convolutional Neural Network (CNN) sequential model to identify the type of skin cancer, a disease that takes a huge toll on human well-being. Since the development of automated methods greatly increases the accuracy of identifying the type of skin cancer, we use the CNN algorithm to build our model, specifically a sequential model. The data set considered for this project, collected from NCBI, is the well-known HAM10000 dataset; it consists of a large number of dermatoscopic images of common pigmented skin lesions, collected from different patients. Once the dataset is collected and cleaned, it is split into training and testing data sets. We trained the CNN model on the training data and later tested it on the testing data. Once the model is applied to the testing data, plots are made to analyze the relation between the epochs and the loss function, and between the epochs and accuracy, for both training and testing data.
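
The split into training and testing sets could look like the minimal sketch below. The 20% test fraction and the fixed seed are assumptions for illustration; the abstract does not state the ratio that was used.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of the items reproducibly and split off a test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
print(len(train), len(test))
```

Using a dedicated `random.Random` instance keeps the split reproducible without touching the global random state.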

Cognitive Brain Tumour Segmentation Using Varying Window Architecture of Cascade Convolutional Neural Network

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2021100102 ◽

2021 ◽

Vol 11 (4) ◽

pp. 21-29

Author(s):

Mukesh Kumar Chandrakar ◽

Anup Mishra

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Brain Tumour ◽

Feature Space ◽

Training Model ◽

Research Area ◽

Training Data ◽

Tumour Segmentation ◽

Specificity And Sensitivity ◽

Brain Tumour Segmentation

Brain tumour segmentation is a growing research area in cognitive science and brain computing that helps clinicians plan treatment according to the severity of the tumour cells or region. Accurate brain tumour detection requires measuring the volume, shape, boundaries, and other features. Deep learning is used to measure these characteristics without human intervention. Proper parameter setting and evaluation play a major role. With this in mind, this paper focuses on a varying window cascade architecture of a convolutional neural network for brain tumour segmentation. Cognitive brain tumour computing is associated with the model through the use of a cognition concept for the training data. Training data mixed from different types of tumour images is applied to the model to ensure effective training. The feature space and training model improve the performance. The proposed architecture improves dice similarity, specificity, and sensitivity. The approach with improved performance is also compared with existing approaches on the same dataset.
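
The three evaluation measures named in this abstract can be computed from binary segmentation masks as sketched below. This uses the standard definitions of the Dice coefficient, sensitivity, and specificity; representing masks as flat lists of 0/1 values is an assumption made for brevity.

```python
def segmentation_scores(pred: list[int], truth: list[int]) -> dict:
    """Dice, sensitivity, and specificity for flat binary masks (1 = tumour)."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return {
        # Overlap between predicted and true tumour regions.
        "dice": 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0,
        # Fraction of true tumour pixels that were found.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of healthy pixels correctly left unmarked.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


print(segmentation_scores([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1]))
```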

CNN for Image Identification of Hiragana Based on Pattern Recognition using CNN

Journal of Applied Intelligent System ◽

10.33633/jais.v6i2.4586 ◽

2021 ◽

Vol 6 (2) ◽

pp. 62-71

Author(s):

Chaerul Umam ◽

Andi Danang Krismawan ◽

Rabei Raad Ali

Keyword(s):

Neural Network ◽

Pattern Recognition ◽

Convolutional Neural Network ◽

Training Data ◽

Training Process ◽

Image Identification ◽

Training Stage ◽

Testing Stage ◽

Testing Data ◽

Network Method

Hiragana is one of the Japanese scripts. In this study, a CNN (Convolutional Neural Network) is used as the identification method, while preprocessing uses thresholding, followed by normalization and filtering stages to remove noise from the image. The training stage uses max pooling and dense layers as connections in the training process, whereas the testing stage uses the Adam optimizer. We use 1000 images of 50 hiragana characters with a ratio of 950:50, with 950 images as training data and 50 as testing data. Our experiment yields an accuracy of 95%.
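
The thresholding step used for preprocessing can be illustrated with a minimal sketch. The fixed threshold of 128 on 8-bit grayscale values, and the choice to map bright pixels to 1, are assumptions; the abstract does not say which threshold or polarity was used.

```python
def binarize(image: list[list[int]], threshold: int = 128) -> list[list[int]]:
    """Map each grayscale pixel to 1 if it reaches the threshold, else 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


print(binarize([[0, 200], [128, 127]]))
```

Reducing each image to two levels removes shading variation before the normalization and filtering stages.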

Pengolahan Citra untuk Membedakan Ikan Segar dan Tidak Segar Menggunakan Convolutional Neural Network [Image Processing to Distinguish Fresh and Non-Fresh Fish Using a Convolutional Neural Network]

Indonesian Journal of Applied Informatics ◽

10.20961/ijai.v5i1.41770 ◽

2021 ◽

Vol 5 (1) ◽

pp. 11

Author(s):

Arif Agustyawan

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Training Data ◽

Independent Learning ◽

Fresh Fish ◽

Testing Data ◽

Neural Network Algorithm ◽

Sorting Process ◽

Manual Methods

Abstract: The fish sorting process carried out by fishermen or sellers to select fish by quality still uses manual methods and is sometimes inaccurate because eyesight is limited when the sorter is tired. So far, the examination has only been done by visual inspection. As a result, the fish has often already spoiled by the time it is to be consumed. This study tries to apply the Convolutional Neural Network (CNN) algorithm to distinguish between fresh and non-fresh fish. The Convolutional Neural Network is a deep learning method that is capable of independent learning for object recognition, object extraction, and object classification. The network learning process produces 100% accuracy on the training data and validation data. Testing on the testing data also results in 100% accuracy. The results of this study indicate that the Convolutional Neural Network method can identify and classify fresh and non-fresh fish very well.

Evaluation of Impact of Neural Networks in Text Classification

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/07257 ◽

2021 ◽

Vol 23 (07) ◽

pp. 1279-1292

Author(s):

Meghana S ◽

Jagadeesh Sai D ◽

Dr. Krishna Raj P. M ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Confusion Matrix ◽

British Broadcasting Corporation ◽

Text Data ◽

N Gram ◽

The Impact

One of the most active and important areas of research in Natural Language Processing (NLP) is the classification of text data, in which the category a text belongs to is determined by its content. Various algorithms, such as the Recurrent Neural Network along with its variant Long Short-Term Memory, Hierarchical Attention Networks, and the Convolutional Neural Network, have been used to analyse how the context of a text can be determined from text data available as datasets. Each of these algorithms has its own special characteristic. While the Recurrent Neural Network maintains the structural sequence of the contexts, the Convolutional Neural Network captures n-gram features and the Hierarchical Attention Network handles the hierarchy of the documents or data. These algorithms have been implemented on the British Broadcasting Corporation News datasets. Various measures such as recall, precision, and accuracy have been considered, along with standards such as the F1-score and the confusion matrix, to deduce the impact.
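
The evaluation measures listed in this abstract follow from the confusion matrix, as the sketch below shows using the standard definitions. The two category labels and the toy predictions are illustrative assumptions, not data from the paper.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, index):
    """Precision, recall, and F1-score for the class at the given index."""
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp
    fp = sum(row[index] for row in matrix) - tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


labels = ["sport", "tech"]  # hypothetical category names
y_true = ["sport", "sport", "tech", "tech", "tech"]
y_pred = ["sport", "tech", "tech", "tech", "sport"]
matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix, per_class_metrics(matrix, 1))
```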

A Robust Text Classifier Based on Denoising Deep Neural Network in the Analysis of Big Data

Scientific Programming ◽

10.1155/2017/3610378 ◽

2017 ◽

Vol 2017 ◽

pp. 1-10 ◽

Cited By ~ 6

Author(s):

Wulamu Aziguli ◽

Yuanyu Zhang ◽

Yonghong Xie ◽

Dezheng Zhang ◽

Xiong Luo ◽

...

Keyword(s):

Neural Network ◽

Big Data ◽

Language Processing ◽

Text Classification ◽

Deep Neural Network ◽

Big Data Analytics ◽

Research Area ◽

Text Data ◽

Computational Performance ◽

Benchmark Datasets

Text classification has always been an interesting issue in the research area of natural language processing (NLP). In the era of big data, a good text classifier is critical to achieving NLP for scientific big data analytics. The ever-increasing size of text data poses important challenges for developing effective text classification algorithms. Given the success of the deep neural network (DNN) in analyzing big data, this article proposes a novel text classifier using a DNN, in an effort to improve computational performance when handling big text data with hybrid outliers. Specifically, through the use of a denoising autoencoder (DAE) and a restricted Boltzmann machine (RBM), our proposed method, named denoising deep neural network (DDNN), achieves significantly better noise resistance and feature extraction than traditional text classification algorithms. Simulations on benchmark datasets verify the effectiveness and robustness of our proposed text classifier.

Research on Inversion Mechanism of Chlorophyll—A Concentration in Water Bodies Using a Convolutional Neural Network Model

Water ◽

10.3390/w13050664 ◽

2021 ◽

Vol 13 (5) ◽

pp. 664

Author(s):

Yun Xue ◽

Lei Zhu ◽

Bin Zou ◽

Yi-min Wen ◽

Yue-hong Long ◽

...

Keyword(s):

Neural Network ◽

Regression Model ◽

Convolutional Neural Network ◽

Chlorophyll A ◽

Language Processing ◽

Water Bodies ◽

Inversion Effect ◽

Least Squares Regression ◽

Chlorophyll A Concentration ◽

Chl A

For Case-II water bodies with relatively complex water quality, it is challenging to establish a chlorophyll-a concentration (Chl-a concentration) inversion model with strong applicability and high accuracy. The Convolutional Neural Network (CNN) shows excellent performance in image target recognition and natural language processing. However, little research exists on the inversion of Chl-a concentration in water using convolutional neural networks. Taking China’s Dongting Lake as an example, 90 water samples and their spectra were collected in this study. Using eight combinations as independent variables and Chl-a concentration as the dependent variable, a CNN model was constructed to invert Chl-a concentration. The results showed that: (1) the CNN model of the original spectrum has a worse inversion effect than the CNN model of the preprocessed spectrum. The determination coefficient (R_P²) of the predicted sample increased from 0.79 to 0.88, and the root mean square error (RMSEP) of the predicted sample decreased from 0.61 to 0.49, indicating that preprocessing can significantly improve the inversion effect of the model; (2) among the combined models, the CNN model with Baseline1_SC (strong correlation factor of 500–750 nm baseline) has the best effect, with R_P² reaching 0.90 and RMSEP only 0.45. The average inversion effect of the eight CNN models is also good: the average R_P² reaches 0.86 and the RMSEP is only 0.52, indicating the feasibility of applying CNN to Chl-a concentration inversion modeling; (3) the performance of the CNN model (Baseline1_SC (R_P² = 0.90, RMSEP = 0.45)) was far better than the traditional models of the same combination, i.e., the linear regression model (R_P² = 0.61, RMSEP = 0.72) and the partial least squares regression model (Baseline1_SC (R_P² = 0.58, RMSEP = 0.95)), indicating the superiority of convolutional neural network inversion modeling of water body Chl-a concentration.
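
The two measures used to compare the models, the determination coefficient and the root mean square error on the prediction set, can be computed as in the sketch below. It uses the common definition of the determination coefficient (one minus the ratio of residual to total sum of squares), which is an assumption since the abstract does not give the formula; the sample values are invented for illustration.

```python
import math


def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical measured concentrations
predicted = [1.5, 2.0, 3.0, 3.5]   # hypothetical model outputs
print(r_squared(observed, predicted), rmse(observed, predicted))
```

A higher determination coefficient together with a lower error is the pattern the abstract reports when moving from the original to the preprocessed spectrum.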

Performance Evaluation of Deep CNN-Based Crack Detection and Localization Techniques for Concrete Structures

Sensors ◽

10.3390/s21051688 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1688

Author(s):

Luqman Ali ◽

Fady Alnajjar ◽

Hamad Al Jassmi ◽

Munkhjargal Gochoo ◽

Wasif Khan ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Crack Detection ◽

Concrete Structures ◽

Model Performance ◽

Training Data ◽

Computational Time ◽

Data Heterogeneity ◽

Public Datasets ◽

Detection And Localization

This paper proposes a customized convolutional neural network for crack detection in concrete structures. The proposed method is compared to four existing deep learning methods based on training data size, data heterogeneity, network complexity, and the number of epochs. The performance of the proposed convolutional neural network (CNN) model is evaluated and compared to pretrained networks, i.e., the VGG-16, VGG-19, ResNet-50, and Inception V3 models, on eight datasets of different sizes, created from two public datasets. For each model, the evaluation considered computational time, crack localization results, and classification measures, e.g., accuracy, precision, recall, and F1-score. Experimental results demonstrated that training data size and heterogeneity among data samples significantly affect model performance. All models demonstrated promising performance on a limited number of diverse training data; however, increasing the training data size and reducing diversity reduced generalization performance, and led to overfitting. The proposed customized CNN and VGG-16 models outperformed the other methods in terms of classification, localization, and computational time on a small amount of data, and the results indicate that these two models demonstrate superior crack detection and localization for concrete structures.

Towards Accurate Deceptive Opinions Detection Based on Word Order-Preserving CNN

Mathematical Problems in Engineering ◽

10.1155/2018/2410206 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 4

Author(s):

Siyuan Zhao ◽

Zhiwei Xu ◽

Limin Liu ◽

Mengjie Guo ◽

Jing Yun

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Convolutional Neural Network ◽

Language Processing ◽

Word Order ◽

Text Analysis ◽

Important Application ◽

Detection Mechanism ◽

Short Text

The convolutional neural network (CNN) has revolutionized the field of natural language processing and is considerably efficient at the semantic analysis that underlies difficult natural language processing problems in a variety of domains. Deceptive opinion detection is an important application of existing CNN models. A detection mechanism based on CNN models has better self-adaptability and can effectively identify all kinds of deceptive opinions. Online opinions are quite short and vary in type and content. In order to effectively identify deceptive opinions, we need to comprehensively study their characteristics and explore novel characteristics besides the textual semantics and emotional polarity that have been widely used in text analysis. In this paper, we optimize the convolutional neural network model by embedding word order characteristics in its convolution layer and pooling layer, which makes the convolutional neural network more suitable for short text classification and deceptive opinion detection. TensorFlow-based experiments demonstrate that the proposed detection mechanism achieves more accurate deceptive opinion detection results.
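
The abstract does not spell out how word order is embedded, so the sketch below only illustrates the motivation. It contrasts ordinary max-over-time pooling, which discards the position of features, with k-max pooling, a different and well-known technique that keeps the strongest activations in their original order. It is a stand-in for illustration, not the authors' method, and the feature values are invented.

```python
def conv1d(sequence: list[float], kernel: list[float]) -> list[float]:
    """Unpadded 1D convolution (cross-correlation, as in deep learning)."""
    width = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]


def global_max_pool(features: list[float]) -> float:
    """Keep only the strongest activation; its position is lost."""
    return max(features)


def k_max_pool(features: list[float], k: int) -> list[float]:
    """Keep the k strongest activations in their original sequence order."""
    top = sorted(range(len(features)), key=lambda i: features[i], reverse=True)[:k]
    return [features[i] for i in sorted(top)]


a = [1.0, 5.0, 2.0, 4.0]   # hypothetical feature map for one word order
b = [4.0, 2.0, 5.0, 1.0]   # same activations arranged in reverse order
print(global_max_pool(a) == global_max_pool(b))  # order is invisible
print(k_max_pool(a, 2), k_max_pool(b, 2))        # order is preserved
```

Two short texts whose features differ only in order are indistinguishable after global max pooling, which is one reason an order-aware pooling step can help with short text classification.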
