IMBALANCE OF CLASSES IN SOLVING THE PROBLEM OF SOCIAL NETWORKS USER CLASSIFICATION FOR PROFESSIONAL ORIENTATION

2021 ◽  
Vol 2 (68) ◽  
pp. 41-43
Author(s):  
V. Obrubova ◽  
M. Ozerova

The problem of data imbalance is often underestimated when solving classification problems. A classification model that appears well trained on the available data and gives a good recognition rate may still be unreliable. Considering this problem in the specific task of classifying social network users makes it possible to understand how, why and, most importantly, when data imbalance needs to be removed.
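
The point about a model that "looks well trained" can be illustrated with a minimal sketch. The class names and counts below are hypothetical and are not taken from the article; the sketch only shows how plain accuracy can hide a classifier that never detects the minority class, while balanced accuracy (the mean of per-class recall) exposes it.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall_per_class(y_true, y_pred):
    """Recall for every class that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Hypothetical imbalanced data: 95 users of class "other", 5 of class "engineer".
y_true = ["other"] * 95 + ["engineer"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["other"] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
print(recall_per_class(y_true, y_pred))   # {'other': 1.0, 'engineer': 0.0}
```

In this toy case the 95% recognition rate comes entirely from the majority class, which is one signal that rebalancing or a different metric may be worth considering.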

2021 ◽  
Vol 2 (68) ◽  
pp. 48-50
Author(s):  
V. Obrubova ◽  
M. Ozerova

The article presents a comprehensive formulation of the problem of classifying social network users in order to determine their professional orientation.


Author(s):  
Andrew S. Brunker ◽  
Richard R. Rosenkranz ◽  
Anetta Van Itallie ◽  
W. Kerry Mummery ◽  
Quang Vinh Nguyen ◽  
...  

2019 ◽  
Vol 15 (2) ◽  
pp. 155-182 ◽  
Author(s):  
Issa Alsmadi ◽  
Keng Hoon Gan

Purpose: Rapid developments in social networks and their usage in everyday life have caused an explosion in the amount of short electronic documents. Thus, the need to classify this type of document based on its content has significant implications in many applications. Classifying these documents into relevant classes according to their text content is of interest for many practical reasons. Short-text classification is an essential step in many applications, such as spam filtering, sentiment analysis, Twitter personalization, customer review and many other applications related to social networks. Reviews on short text and its applications are limited. Thus, this paper aims to discuss the characteristics of short text, its challenges and difficulties in classification. The paper attempts to introduce, in principle, all stages of classification, the techniques used in each stage and the possible development trend in each stage. Design/methodology/approach: The paper is a review of the main aspects of short-text classification. The paper is structured according to the stages of the classification task. Findings: This paper discusses related issues and approaches to these problems. Further research could be conducted to address the challenges in short texts and avoid poor accuracy in classification. Low performance can be addressed by using optimization techniques, such as genetic algorithms, which are powerful in enhancing the quality of selected features. Soft computing solutions that use fuzzy logic make short-text problems a promising area of research. Originality/value: Using a powerful short-text classification method significantly affects many applications in terms of efficiency enhancement. Current solutions still have low performance, implying the need for improvement. This paper discusses related issues and approaches to these problems.


Author(s):  
SIMON GÜNTER ◽  
HORST BUNKE

Handwritten text recognition is one of the most difficult problems in the field of pattern recognition. In this paper, we describe our efforts towards improving the performance of state-of-the-art handwriting recognition systems through the use of classifier ensembles. There are many examples of classification problems in the literature where multiple classifier systems increase the performance over single classifiers. Normally one of the two following approaches is used to create a multiple classifier system. (1) Several classifiers are developed completely independent of each other and combined in a last step. (2) Several classifiers are created out of one prototype classifier by using so-called classifier ensemble creation methods. In this paper an algorithm which combines both approaches is introduced and it is used to increase the recognition rate of a hidden Markov model (HMM) based handwritten word recognizer.
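
As a rough illustration of what combining several classifiers "in a last step" can look like, the sketch below uses plain majority voting. This is an assumed, generic combination rule chosen for illustration, not the algorithm introduced in the paper, and the recognizer outputs are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label among the classifiers' outputs.

    Ties are broken in favour of the earliest classifier in the list
    that voted for one of the tied labels.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical outputs of three word recognizers for two word images.
outputs = [
    ["house", "mouse", "house"],  # two of three agree
    ["cat", "cot", "cut"],        # full disagreement, first recognizer wins
]
combined = [majority_vote(votes) for votes in outputs]
print(combined)  # ['house', 'cat']
```

Real multiple classifier systems often weight the votes or combine confidence scores instead; the unweighted vote is only the simplest instance of the idea.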


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 555
Author(s):  
Jui-Sheng Chou ◽  
Chia-Hsuan Liu

Sand theft or illegal mining in river dredging areas has been a problem in recent decades. For this reason, increasing the use of artificial intelligence in dredging areas, building automated monitoring systems, and reducing human involvement can effectively deter crime and lighten the workload of security guards. In this investigation, a smart dredging construction site system was developed using automated techniques that were arranged to be suitable to various areas. The aim in the initial period of the smart dredging construction was to automate the audit work at the control point, which manages trucks in river dredging areas. Images of dump trucks entering the control point were captured using monitoring equipment in the construction area. The obtained images and the deep learning technique, YOLOv3, were used to detect the positions of the vehicle license plates. Framed images of the vehicle license plates were captured and were used as input in an image classification model, C-CNN-L3, to identify the number of characters on the license plate. Based on the classification results, the images of the vehicle license plates were transmitted to a text recognition model, R-CNN-L3, that corresponded to the characters of the license plate. Finally, the models of each stage were integrated into a real-time truck license plate recognition (TLPR) system; the single character recognition rate was 97.59%, the overall recognition rate was 93.73%, and the speed was 0.3271 s/image. The TLPR system reduces the labor force and time spent to identify the license plates, effectively reducing the probability of crime and increasing the transparency, automation, and efficiency of the frontline personnel’s work. The TLPR is the first step toward an automated operation to manage trucks at the control point. The subsequent and ongoing development of system functions can advance dredging operations toward the goal of being a smart construction site. 
By providing a vehicle LPR system intended to facilitate intelligent and highly efficient management by dredging-related departments, this paper contributes to the current body of knowledge by presenting an objective approach to the TLPR system.


2021 ◽  
Vol 11 (6) ◽  
pp. 1592-1598
Author(s):  
Xufei Liu

The early detection of cardiovascular diseases based on the electrocardiogram (ECG) is very important for the timely treatment of cardiovascular patients, which increases their survival rate. The ECG is a visual representation of changes in cardiac bioelectricity and is the basis for assessing heart health. With the rise of edge machine learning and Internet of Things (IoT) technologies, small machine learning models have received attention. This study proposes an automatic ECG classification method based on IoT technology and an LSTM network to achieve early monitoring and early prevention of cardiovascular diseases. Specifically, this paper first proposes a single-layer bidirectional LSTM network structure, which makes full use of the time-dependent features of the preceding and following sampling points to extract features automatically. The network structure is more lightweight and its computational complexity is lower. To verify the effectiveness of the proposed classification model, it is compared with relevant algorithms on the public MIT-BIH data set. Secondly, the model is embedded in a wearable device to automatically classify the collected ECG. Finally, when an abnormality is detected, the user is alerted by an alarm. The experimental results show that the proposed model has a simple structure and a high classification and recognition rate, which can meet the needs of wearable devices for monitoring the ECG of patients.


2012 ◽  
Vol 24 (06) ◽  
pp. 513-524
Author(s):  
Mohsen Alavash Shooshtari ◽  
Keivan Maghooli ◽  
Kambiz Badie

One of the main objectives of data mining, as a promising multidisciplinary field in computer science, is to provide a classification model to be used for decision support purposes. In the medical imaging domain, mammogram classification is a difficult diagnostic task which calls for the development of automated classification systems. Associative classification, as a special case of association rule mining, has been adopted in classification problems for years. In this paper, an associative classification framework based on parallel mining of image blocks is proposed for mammogram discrimination. Association rule mining is applied to a commonly used mammography image database to classify digital mammograms into three categories, namely normal, benign and malignant. To do so, images are first preprocessed, and then features are extracted from non-overlapping image blocks and discretized for rule discovery. Association rules are then discovered through parallel mining of the transactional databases that correspond to the image blocks, and are used within a unique decision-making scheme to predict the class of unknown samples. Finally, experiments are conducted to assess the effectiveness of the proposed framework. Results show that the proposed framework is successful in terms of accuracy, precision, and recall, and suggest that the framework could be used as the core of any future associative classifier to support mammogram discrimination.
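
The notion of a class association rule over discretized block features can be sketched as follows. The feature names, the tiny transaction set and the function `rule_stats` are hypothetical and serve only to show the standard definitions of support and confidence; they do not reproduce the paper's parallel mining scheme or its decision-making step.

```python
def rule_stats(transactions, antecedent, label):
    """Support and confidence of the rule `antecedent -> label`.

    support    = P(antecedent and label) over all transactions
    confidence = P(label | antecedent)
    """
    antecedent = set(antecedent)
    covered = [t for t in transactions if antecedent <= t["items"]]
    correct = [t for t in covered if t["label"] == label]
    support = len(correct) / len(transactions)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence

# Hypothetical discretized features of four image blocks with class labels.
transactions = [
    {"items": {"mean=high", "var=low"}, "label": "normal"},
    {"items": {"mean=high", "var=high"}, "label": "benign"},
    {"items": {"mean=high", "var=low"}, "label": "normal"},
    {"items": {"mean=low", "var=low"}, "label": "benign"},
]

support, confidence = rule_stats(transactions, {"mean=high", "var=low"}, "normal")
print(support, confidence)  # 0.5 1.0
```

An associative classifier would typically keep only rules whose support and confidence exceed chosen thresholds and then use the retained rules to label unseen samples.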


2020 ◽  
Author(s):  
Gong Yue-hong ◽  
Yang Tie-jun ◽  
Liang Yi-tao ◽  
Ge Hong-yi ◽  
Chen Liang

Abstract: Mould is a common phenomenon in stored wheat. First, mould decreases the quality of wheat kernels. Second, the mycotoxins metabolized by mycetes are very harmful to humans. Therefore, the fast and accurate examination of wheat mould is vitally important for evaluating its storage quality and subsequent processing safety. Existing methods for examining wheat mould mainly rely on chemical methods, which always involve complex and long pretreatment processes, and the auxiliary chemical materials used in these methods may pollute the environment. To improve the determination of wheat mould, this paper proposes a green and nondestructive determination method based on biophotons. The specific implementation process is as follows: first, the ultra-weak luminescence of healthy and mouldy wheat samples is measured repeatedly by a biophotonic analyser; then, the approximate entropy and multiscale approximate entropy are separately introduced as the main classification features. Finally, the classification performance is tested using a support vector machine (SVM). The ROC curve of the newly established classification model shows that the highest recognition rate can reach 93.6%, which shows that the proposed classification model is feasible and promising for detecting wheat mould.
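
For readers unfamiliar with approximate entropy, the following is a minimal sketch of the standard definition (windows of length m compared with the Chebyshev distance, tolerance r, self-matches included). The parameter values and the test signals are assumptions made for illustration; the paper's own settings, its multiscale variant and its SVM stage are not reproduced here.

```python
import math
import random

def _phi(series, m, r):
    """Average log-frequency of windows of length m that lie within r of each other."""
    n = len(series) - m + 1
    windows = [series[i:i + m] for i in range(n)]
    total = 0.0
    for w in windows:
        # Count windows whose Chebyshev distance to w is at most r (w itself included).
        matches = sum(
            1 for v in windows
            if max(abs(a - b) for a, b in zip(w, v)) <= r
        )
        total += math.log(matches / n)
    return total / n

def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) = phi_m - phi_(m+1); requires len(series) > m + 1."""
    return _phi(series, m, r) - _phi(series, m + 1, r)

# Hypothetical signals: a strictly periodic one and a pseudo-random one.
periodic = [0.0, 1.0] * 20
rng = random.Random(0)
noisy = [rng.random() for _ in range(40)]

print(approximate_entropy([1.0] * 40))  # 0.0 for a constant signal
print(approximate_entropy(periodic) < approximate_entropy(noisy))  # True
```

A more regular signal yields a lower value, which is why the measure can serve as a feature separating two kinds of emission series. In practice r is often set relative to the standard deviation of the signal rather than fixed as it is here.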


Entropy ◽  
2019 ◽  
Vol 21 (5) ◽  
pp. 443 ◽  
Author(s):  
Lianmeng Jiao ◽  
Xiaojiao Geng ◽  
Quan Pan

The belief rule-based classification system (BRBCS) is a promising technique for addressing different types of uncertainty in complex classification problems, by introducing the belief function theory into the classical fuzzy rule-based classification system. However, in the BRBCS, high numbers of instances and features generally induce a belief rule base (BRB) of large size, which degrades the interpretability of the classification model for big data sets. In this paper, a BRB learning method based on the evidential C-means clustering (ECM) algorithm is proposed to efficiently design a compact belief rule-based classification system (CBRBCS). First, a supervised version of the ECM algorithm is designed by means of weighted product-space clustering to partition the training set with the goals of obtaining both good inter-cluster separability and inner-cluster pureness. Then, a systematic method is developed to construct belief rules based on the obtained credal partitions. Finally, an evidential partition entropy-based optimization procedure is designed to get a compact BRB with a better trade-off between accuracy and interpretability. The key benefit of the proposed CBRBCS is that it can provide a more interpretable classification model while maintaining comparable accuracy. Experiments based on synthetic and real data sets have been conducted to evaluate the classification accuracy and interpretability of the proposal.

