text mining
Recently Published Documents





2022 ◽  
Vol 59 (2) ◽  
pp. 102822
Anzhong Huang ◽  
Yuling Zhang ◽  
Jianping Peng ◽  
Hong Chen

2022 ◽  
Vol 303 ◽  
pp. 114160
Alessandro Concari ◽  
Gerjo Kok ◽  
Pim Martens

2022 ◽  
Vol 10 (4) ◽  
pp. 583-593
Syiva Multi Fani ◽  
Rukun Santoso ◽  
Suparti Suparti

Social media is computer-based technology that facilitates the sharing of ideas, thoughts, and information through the building of virtual networks and communities. Twitter is one of the most popular social media platforms in Indonesia, with 78 million users. Businesses rely heavily on Twitter for advertising. By knowing which types of tweet content are most retweeted by their followers, businesses can use that content to advertise to Twitter users. This study applies text mining to cluster tweets from the @bliblidotcom Twitter account using the K-means clustering method, with the best number of clusters obtained from the Silhouette Coefficient method, in order to determine the types of tweet content most retweeted by @bliblidotcom followers. Tweets with the most retweets and favorites are discount offers and flash sales, so Blibli Indonesia could use this kind of tweet to advertise on Twitter; prize quiz tweets are also liked by followers of the @bliblidotcom account.
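The pipeline described in this abstract (vectorise text, run K-means, pick the number of clusters by silhouette coefficient) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the toy tweets are invented, and the helper names `tfidf`, `kmeans`, and `silhouette` are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Turn tokenised documents into dense TF-IDF vectors (smoothed IDF)."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * idf[w] for w in vocab])
    return vectors


def kmeans(points, k, seed=0, iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; -1.0 if fewer than two clusters are used."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Invented toy tweets, already tokenised (assumption: not data from the study).
tweets = [
    "flash sale discount today".split(),
    "big discount flash sale".split(),
    "discount sale ends today".split(),
    "join quiz win prize".split(),
    "quiz time win prize".split(),
    "answer quiz win big prize".split(),
]
vectors = tfidf(tweets)
scores = {k: silhouette(vectors, kmeans(vectors, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)  # k with the highest mean silhouette
print(best_k, scores)
```

In practice a library implementation with multiple restarts would likely be preferable, since a single random initialisation of K-means can settle in a poor local optimum.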

2022 ◽  
Astghik Sargsyan ◽  
Philipp Wegner ◽  
Stephan Gebel ◽  
Shounak Baksi ◽  
Geena Mariya Jose ◽  

Motivation: Epilepsy is a multi-faceted complex disorder that requires a precise understanding of the classification, diagnosis, treatment, and disease mechanism governing it. Although scattered resources are available on epilepsy, comprehensive and structured knowledge is missing. To promote multidisciplinary knowledge exchange and facilitate advancement in clinical management, especially in pre-clinical research, a disease-specific ontology is necessary. The presented ontology is designed to enable better interconnection between scientific community members in the epilepsy domain. Results: The Epilepsy Ontology (EPIO) is an assembly of structured knowledge on various aspects of epilepsy, developed according to Basic Formal Ontology (BFO) and Open Biological and Biomedical Ontology (OBO) Foundry principles. Concepts and definitions are collected from the latest International League Against Epilepsy (ILAE) classification, domain-specific ontologies, and scientific literature. This ontology consists of 1,879 classes and 28,151 axioms (2,171 declaration axioms, 2,219 logical axioms) from several aspects of epilepsy. This ontology is intended to be used for data management and text mining purposes.

2022 ◽  
Vol 10 (4) ◽  
pp. 544-553
Ratna Kurniasari ◽  
Rukun Santoso ◽  
Alan Prahutama

Effective communication between the government and society is essential to achieve good governance. The government provides a channel for public complaints through an online aspiration and complaint service called “LaporGub..!”. To make incoming reports easier to group, report topics are identified using clustering. Text mining is used to convert text data into numeric data so that it can be processed further. Clustering is classified as soft (fuzzy) clustering or hard clustering. Hard clustering divides data into clusters strictly, without any overlapping membership between clusters. Soft clustering can assign data to several clusters, each with a certain degree of membership. These differing membership values give fuzzy grouping more natural results than hard clustering, because objects at the boundary between several classes are not forced to fit fully into one class; instead, each object is assigned a degree of membership. Fuzzy c-means has the advantage of placing cluster centers more precisely than other clustering methods, because it refines the cluster centers iteratively. The best number of clusters is chosen based on the maximum silhouette coefficient. A word cloud, a form of text data visualization, is used to determine the dominant topic in each cluster. The results show that the maximum silhouette coefficient for fuzzy c-means clustering is achieved with three clusters. The first cluster produces a word cloud about road conditions, with 449 reports; the second cluster produces a word cloud about covid assistance, with 964 reports; and the third cluster produces a word cloud about farmers' fertilizers, with 176 reports. The covid assistance topic is the cluster with the most members.
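The soft-membership idea described above can be sketched with the standard fuzzy c-means update rules. This is a minimal standard-library illustration under stated assumptions, not the authors' code: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, and the toy 2-D points are all hypothetical choices, and real report text would first need to be converted to numeric vectors.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, memberships); memberships[j][i] is the degree to
    which point j belongs to cluster i, and each row sums to 1."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(c):
            w = [u[j][i] ** m for j in range(len(points))]
            centers.append([sum(wj * p[d] for wj, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Membership update: closer centers receive a larger share.
        new_u = []
        for p in points:
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            new_u.append([1.0 / sum((dist[i] / dist[k]) ** (2 / (m - 1))
                                    for k in range(c))
                          for i in range(c)])
        shift = max(abs(a - b) for ra, rb in zip(u, new_u)
                    for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Toy data (invented): two tight groups plus one point midway between them.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [5.0, 5.5]]
centers, memberships = fuzzy_c_means(data, c=2)
for point, row in zip(data, memberships):
    print(point, [round(v, 3) for v in row])
```

The midway point is the case the abstract highlights: rather than being forced into one cluster, it receives a roughly even split of membership, while the points inside each group belong almost entirely to their own cluster.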

2022 ◽  
Vol 11 (6) ◽  
pp. 634-645
Nimita Kant ◽  
Perumal Jayaraj ◽  

Eyelid sebaceous gland carcinoma (SGC) is a rare but life-threatening condition. However, there is limited computational research associated with underlying protein interactions specific to eyelid sebaceous gland carcinoma. The aim of our study is to identify and analyse the genes associated with eyelid sebaceous gland carcinoma using text mining and to develop a protein-protein interaction (PPI) network to predict significant biological pathways using bioinformatics tools. Genes associated with eyelid sebaceous gland carcinoma were retrieved from the PubMed database using text mining with the key terms ‘eyelid’ and ‘sebaceous gland carcinoma’, excluding the genes for ‘Muir-Torre Syndrome’. The interaction partners were identified using STRING. Cytoscape was used for visualization and analysis of the PPI network. Molecular complexes in the network were predicted using the MCODE plug-in and analyzed for gene ontology terms using DAVID. The PubMed retrieval process identified 79 genes related to eyelid sebaceous gland carcinoma. The PPI network associated with eyelid sebaceous gland carcinoma contained 79 nodes and 1768 edges. Network analysis using Cytoscape identified nine key genes and two molecular complexes to be enriched in the protein-protein interaction network. GO enrichment analysis identified the biological processes of cell fate commitment, Wnt signalling pathway, retinoic acid signalling, and response to cytokines as enriched in our network. Genes identified in the study might play a pivotal role in understanding the underlying molecular pathways involved in the development and progression of eyelid sebaceous gland carcinoma. Furthermore, it may aid in the identification of candidate biomarkers and therapeutic targets in the treatment of eyelid sebaceous gland carcinoma.
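To put the reported network size in context, the density and average degree implied by 79 nodes and 1768 edges can be computed directly. The sketch below assumes an undirected network without self-loops or duplicate edges, which the abstract does not state. The `top_hubs` helper ranks nodes by degree on a hypothetical edge list with placeholder node names; degree ranking is only one possible way to flag key nodes and is not claimed to be the criterion the authors used.

```python
from collections import Counter


def network_summary(n_nodes, n_edges):
    """Return (density, average degree) for an undirected simple graph."""
    max_edges = n_nodes * (n_nodes - 1) / 2  # every possible node pair
    return n_edges / max_edges, 2 * n_edges / n_nodes


def top_hubs(edges, k):
    """Rank nodes by degree and return the k most connected ones."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(k)


# Figures reported in the abstract: 79 nodes, 1768 edges.
density, avg_degree = network_summary(79, 1768)
print(f"density={density:.3f}, average degree={avg_degree:.1f}")

# Hypothetical toy edge list with placeholder node names.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
print(top_hubs(toy_edges, 1))
```

Under these assumptions the network is fairly dense: roughly 57% of all possible node pairs are connected, and each node has about 45 interaction partners on average.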

2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Divya Mittal ◽  
Shiv Ratan Agrawal

Purpose: The current study employs text mining and sentiment analysis to identify core banking service attributes and customer sentiment in online user-generated reviews. Additionally, the study explains customer satisfaction based on the identified predictors. Design/methodology/approach: A total of 32,217 customer reviews, posted from 2014 to 2021, were collected across 29 top banks on bankbazaar.com. In total, three conceptual models were developed and evaluated using regression analysis. Findings: The study revealed that all variables except the interest rate were statistically significant and affect customer satisfaction in their respective models. Research limitations/implications: The study is confined to the geographical representation of its subjects, i.e., Indian customers. A cross-cultural and socioeconomic background analysis of banking customers in different countries may help to better generalize the findings. Practical implications: The study makes essential theoretical and managerial contributions to the existing literature on services, particularly the banking sector. Originality/value: This paper is unique in that it focuses on banking customer satisfaction from online reviews and ratings using text mining and sentiment analysis.
