Arabic Hands-On Analysis, Clustering and Classification of Large Arabic Twitter Data Set on COVID19

Author(s):  
Abdelrahman Hamdy ◽  
Ayman Mahgoub ◽  
Conor Ryan
2020 ◽  
Vol 10 (6) ◽  
pp. 1401-1407
Author(s):  
Hyungtai Kim ◽  
Minhee Lee ◽  
Min Kyun Sohn ◽  
Jongmin Lee ◽  
Deog Yung Kim ◽  
...  

This paper presents a simultaneous clustering and classification approach that discovers internal groupings in an unlabeled data set and, at the same time, classifies the data using the discovered clusters as class labels. During the process, silhouette scores (for clustering) and F1 scores (for classification) were calculated for each candidate number of clusters in order to find an optimal number of clusters that guarantees the desired level of classification performance. In this study, we applied this approach to a data set of ischemic stroke patients in order to discover function recovery patterns where clear diagnoses do not exist. In addition, we developed a classifier that predicts the type of function recovery for new patients from early clinical test scores with clinically meaningful levels of accuracy. This classifier can be a helpful tool for clinicians in the rehabilitation field.
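The selection loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy 2-D points, the plain k-means routine, the nearest-centroid classifier, the even/odd train/test split, and the `MIN_F1` threshold are all assumptions introduced here. For each candidate number of clusters it computes a silhouette score and a macro F1 score, then keeps the best-silhouette value among those that meet the F1 threshold.

```python
import math
from collections import defaultdict

def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic start (evenly spaced rows as seeds)."""
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old center if a cluster happens to be empty.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                           if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = defaultdict(list)
    for p, l in zip(points, labels):
        clusters[l].append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)   # cohesion
        b = min(sum(math.dist(p, q) for q in other) / len(other)  # separation
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)

def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (an assumption): assign each point to the closest class mean."""
    groups = defaultdict(list)
    for p, l in zip(train_x, train_y):
        groups[l].append(p)
    centroids = {l: tuple(sum(x) / len(ps) for x in zip(*ps)) for l, ps in groups.items()}
    return [min(centroids, key=lambda l: math.dist(p, centroids[l])) for p in test_x]

# Hypothetical toy data: three well-separated groups of six points each.
blob = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.2, 0.8)]
points = blob + [(x + 10, y) for x, y in blob] + [(x, y + 10) for x, y in blob]

MIN_F1 = 0.9  # hypothetical "desired level of classification performance"
results = {}
for k in (2, 3, 4):
    labels = kmeans(points, k)                      # clusters become class labels
    train_x, test_x = points[0::2], points[1::2]
    train_y, test_y = labels[0::2], labels[1::2]
    pred = nearest_centroid_predict(train_x, train_y, test_x)
    results[k] = (silhouette(points, labels), macro_f1(test_y, pred))

eligible = {k: s for k, (s, f1) in results.items() if f1 >= MIN_F1}
best_k = max(eligible, key=eligible.get)
print(results, best_k)
```

On real clinical data the clustering algorithm, classifier, and validation scheme would need to be chosen with far more care; the sketch only shows how the two scores interact when picking the number of clusters.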


2018 ◽  
Vol 7 (4.5) ◽  
pp. 40
Author(s):  
Sathish Kumar.P.J ◽  
Dr R.Jagadeesh Kan

The problem of high-dimensional clustering and classification has been well studied in previous articles. This work also considers generating treatment recommendations from input symptoms. A number of approaches to disease prediction and recommendation generation have been discussed in the literature, but the efficiency of such recommendation systems remains unsatisfactory. To improve performance, an efficient multi-level symptom similarity based method for disease prediction and recommendation generation is presented. The method reads the input data set and performs preprocessing to remove noisy records. In the second stage, the method performs Class Level Feature Similarity Clustering. The input symptom set is classified using the MLSS (Multi Level Symptom Similarity) measure estimated between different classes of samples. According to the selected class, the method selects the most frequent medicine set as the recommendation, using drug success rate and frequency measures. The proposed method improves the performance of clustering and disease prediction, with more efficient medicine recommendation.


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

BCI (brain-computer interface) plays a vital role in research in science and technology development. Classification is a data mining technique used to predict group membership for data instances. Analyses of BCI data are challenging because feature extraction and classification of these data are more difficult as compared with those applied to raw data. In this paper, we extracted statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on the BCI data set using different techniques, such as Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI data set.
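The normalization and binning steps mentioned in this abstract might look like the sketch below. The abstract does not say which normalization or binning scheme was used, so min-max scaling and equal-width bins are assumptions here, and the feature values are made up for illustration.

```python
def min_max_normalize(values):
    """Rescale a feature column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bin(values, n_bins):
    """Map values already in [0, 1] to integer bin indices 0 .. n_bins - 1."""
    # The value 1.0 would land in bin n_bins, so it is clamped into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]

# Hypothetical feature column (for example, one texture statistic per EEG segment).
feature = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(feature)
binned = equal_width_bin(normalized, n_bins=4)
print(normalized, binned)
```

Binning replaces small numeric fluctuations with a coarse category, which is one way such a step can reduce noise before a classifier such as Naïve Bayes, k-nearest neighbor, or SVM is trained.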


Author(s):  
Jianping Ju ◽  
Hong Zheng ◽  
Xiaohang Xu ◽  
Zhongyuan Guo ◽  
Zhaohui Zheng ◽  
...  

Abstract: Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.
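The two loss functions named in this abstract can be illustrated with a short sketch. It follows commonly used formulations (center loss as half the summed squared distance between each feature vector and its class center, triplet loss as a hinge on squared Euclidean distances); the abstract does not give the exact variants, weights, or margin the authors used, so those details and all values below are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def center_loss(features, labels, centers):
    """Half the summed squared distance from each feature to its class center."""
    return 0.5 * sum(squared_distance(x, centers[y]) for x, y in zip(features, labels))

def triplet_loss(anchor, positive, negative, margin):
    """Hinge loss that pushes the negative at least `margin` farther than the positive."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)

# Hypothetical 2-D embeddings for two defect classes.
features = [(1.0, 2.0), (3.0, 4.0)]
labels = [0, 1]
centers = {0: (1.0, 1.0), 1: (3.0, 2.0)}
print(center_loss(features, labels, centers))

# An easy triplet (negative far away) and a hard one (negative as close as the positive).
easy = triplet_loss((0.0, 0.0), (1.0, 0.0), (3.0, 0.0), margin=0.5)
hard = triplet_loss((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), margin=0.5)
print(easy, hard)
```

Center loss pulls samples of one class together, while triplet loss separates classes relative to each other, which is why the pair is often used for fine-grained recognition tasks.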


Author(s):  
Usman Naseem ◽  
Imran Razzak ◽  
Matloob Khushi ◽  
Peter W. Eklund ◽  
Jinman Kim

Author(s):  
Ana Villanueva ◽  
Ziyi Liu ◽  
Yoshimasa Kitaguchi ◽  
Zhengzhe Zhu ◽  
Kylie Peppler ◽  
...  

Abstract: Augmented reality (AR) is a unique, hands-on tool to deliver information. However, its educational value has been mainly demonstrated empirically so far. In this paper, we present a modeling approach to provide users with mastery of a skill, using AR learning content to implement an educational curriculum. We illustrate the potential of this approach by applying this to an important but pervasively misunderstood area of STEM learning, electrical circuitry. Unlike previous cognitive assessment models, we break down the area into microskills (the smallest segmentation of this knowledge) and concrete learning outcomes for each. This model empowers the user to perform a variety of tasks that are conducive to the acquisition of the skill. We also provide a classification of microskills and how to design them in an AR environment. Our results demonstrated that aligning the AR technology to specific learning objectives paves the way for high quality assessment, teaching, and learning.


Author(s):  
Xiongzhi Ai ◽  
Jiawei Zhuang ◽  
Yonghua Wang ◽  
Pin Wan ◽  
Yu Fu

Abstract: Ultrasonic image examination is the first choice for the diagnosis of thyroid papillary carcinoma. However, there are some problems in the ultrasonic image of thyroid papillary carcinoma, such as poor definition, tissue overlap and low resolution, which make the ultrasonic image difficult to be diagnosed. Capsule network (CapsNet) can effectively address tissue overlap and other problems. This paper investigates a new network model based on capsule network, which is named as ResCaps network. ResCaps network uses residual modules and enhances the abstract expression of the model. The experimental results reveal that the characteristic classification accuracy of ResCaps3 network model for self-made data set of thyroid papillary carcinoma was 81.06%. Furthermore, Fashion-MNIST data set is also tested to show the reliability and validity of ResCaps network model. Notably, the ResCaps network model not only improves the accuracy of CapsNet significantly, but also provides an effective method for the classification of lesion characteristics of thyroid papillary carcinoma ultrasonic images.


1987 ◽  
Vol 65 (3) ◽  
pp. 691-707
Author(s):  
A. F. L. Nemec ◽  
R. O. Brinkhurst

A data matrix of 23 generic or subgeneric taxa versus 24 characters and a shorter matrix of 15 characters were analyzed by means of ordination, cluster analyses, parsimony, and compatibility methods (the last two of which are phylogenetic tree reconstruction methods) and the results were compared inter alia and with traditional methods. Various measures of fit for evaluating the parsimony methods were employed. There were few compatible characters in the data set, and much homoplasy, but most analyses separated a group based on Stylaria from the rest of the family, which could then be separated into four groups, recognized here for the first time as tribes (Naidini, Derini, Pristinini, and Chaetogastrini). There was less consistency of results within these groups. Modern methods produced results that do not conflict with traditional groupings. The Jaccard coefficient minimizes the significance of symplesiomorphy and complete linkage avoids chaining effects and corresponds to actual similarities, unlike single or average linkage methods, respectively. Ordination complements cluster analysis. The Wagner parsimony method was superior to the less flexible Camin–Sokal approach and produced better measure of fit statistics. All of the aforementioned methods contain areas susceptible to subjective decisions but, nevertheless, they lead to a complete disclosure of both the methods used and the assumptions made, and facilitate objective hypothesis testing rather than the presentation of conflicting phylogenies based on the different, undisclosed premises of manual approaches.
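The combination of the Jaccard coefficient with complete linkage mentioned in this abstract can be sketched on a tiny character matrix. The four taxa and their 0/1 character codings below are hypothetical, and the sketch assumes binary characters in which a shared 0 is simply ignored by the coefficient; it is not a reconstruction of the authors' analysis.

```python
def jaccard(a, b):
    """Shared 1s divided by positions where either taxon has a 1 (shared 0s are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def jaccard_distance(a, b):
    return 1.0 - jaccard(a, b)

def complete_linkage(items, distance, n_clusters):
    """Agglomerative clustering; cluster distance is the LARGEST pairwise distance."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(distance(items[p], items[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Hypothetical taxa scored on five binary characters.
taxa = [
    [1, 1, 0, 0, 0],  # taxon 0
    [1, 1, 1, 0, 0],  # taxon 1
    [0, 0, 0, 1, 1],  # taxon 2
    [0, 0, 1, 1, 1],  # taxon 3
]
groups = complete_linkage(taxa, jaccard_distance, n_clusters=2)
print(groups)
```

Because complete linkage merges two clusters only when even their most dissimilar members are close, it cannot string distant taxa together through a sequence of near neighbors, which is the chaining effect that single linkage is prone to.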


2017 ◽  
Vol 45 (2) ◽  
pp. 66-74
Author(s):  
Yufeng Ma ◽  
Long Xia ◽  
Wenqi Shen ◽  
Mi Zhou ◽  
Weiguo Fan

Purpose: The purpose of this paper is the automatic classification of TV series reviews based on generic categories.
Design/methodology/approach: The authors mainly replace specific role and actor names in reviews with surrogates to make the reviews more generic. In addition, feature selection techniques and different kinds of classifiers are incorporated.
Findings: With roles’ and actors’ names replaced by generic tags, the experimental results showed that the model can generalize well to agnostic TV series, as compared with reviews keeping the original names.
Research limitations/implications: The model presented in this paper must be built on top of an existing knowledge base such as Baidu Encyclopedia. Building such a database takes a lot of work.
Practical implications: As in a digital information supply chain, if reviews are part of the information to be transported or exchanged, the model presented in this paper can help automatically identify individual reviews according to different requirements and support information sharing.
Originality/value: One original contribution is the surrogate-based approach the authors propose to make reviews more generic. In addition, they built a review data set of popular Chinese TV series, which includes eight generic category labels for each review.
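The surrogate idea in this abstract, replacing role and actor names with generic tags before classification, can be illustrated with a small sketch. The names, the `<ACTOR>` and `<ROLE>` tags, and the lookup table standing in for a knowledge base are all hypothetical; the abstract does not describe the authors' actual tag set or matching procedure.

```python
import re

def replace_names(review, name_to_tag):
    """Replace every known name in the review with its generic surrogate tag."""
    if not name_to_tag:
        return review
    # Longest names first, so a full name wins over a shorter name it contains.
    names = sorted(name_to_tag, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))
    return pattern.sub(lambda m: name_to_tag[m.group(0)], review)

# Hypothetical lookup table, as might be derived from an encyclopedia-style knowledge base.
knowledge_base = {
    "Alice Chen": "<ACTOR>",
    "Captain Lin": "<ROLE>",
}

review = "Alice Chen is great as Captain Lin."
generic_review = replace_names(review, knowledge_base)
print(generic_review)
```

After this step, a classifier sees the same tokens regardless of which series a review discusses, which is the intuition behind the reported generalization across series.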

