The Evolution of Topic Modeling

2022 ◽  
Author(s):  
Rob Churchill ◽  
Lisa Singh

Topic models have been applied to everything from books to newspapers to social media posts in an effort to identify the most prevalent themes of a text corpus. We provide an in-depth analysis of unsupervised topic models from their inception to today. We trace the origins of different types of contemporary topic models, beginning in the 1990s, and we compare their proposed algorithms, as well as their different evaluation approaches. Throughout, we also describe settings in which topic models have worked well and areas where new research is needed, setting the stage for the next generation of topic models.

2016 ◽  
Vol 28 (3) ◽  
pp. 1-9 ◽  
Author(s):  
Shahriar Akter ◽  
Mithu Bhattacharyya ◽  
Samuel Fosso Wamba ◽  
Sutapa Aditya

The surge of interest in big social data has led to growing demand for social media analytics (SMA). Having robust SMA can help firms create value and achieve competitive advantages. However, most firms do not always know how to embrace big social data to establish a path to value. This study addresses this key question to deepen our understanding of how different types of SMA can be applied to create value. Specifically, the findings show the significant uses of opinion mining or sentiment analysis, topic modeling, engagement analysis, predictive analysis, social network analysis, and trend analysis. Finally, the study provides directions for the challenges and opportunities of SMA to maximize value.


2017 ◽  
Vol 21 (3) ◽  
pp. 733-765 ◽  
Author(s):  
Vladimer B. Kobayashi ◽  
Stefan T. Mol ◽  
Hannah A. Berkers ◽  
Gábor Kismihók ◽  
Deanne N. Den Hartog

Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
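
As a rough illustration of technique (b), distance and similarity computing, the sketch below scores pairs of short texts by the cosine similarity of their word-count vectors. It is not taken from the article: the vacancy snippets, the whitespace tokenizer, and the function names `bag_of_words` and `cosine_similarity` are all assumptions made for the example.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vacancy snippets, invented for illustration only.
vacancies = {
    "analyst": "analyze sales data and build reports",
    "scientist": "analyze data and build predictive models",
    "nurse": "provide patient care in a hospital ward",
}
vectors = {name: bag_of_words(text) for name, text in vacancies.items()}

# The two data-oriented vacancies share four of their six words.
sim_close = cosine_similarity(vectors["analyst"], vectors["scientist"])
# The analyst and nurse vacancies share no words at all.
sim_far = cosine_similarity(vectors["analyst"], vectors["nurse"])
```

Raw counts are used here only to keep the sketch short; a real pipeline would typically add stop-word removal and term weighting before computing similarities.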


Author(s):  
Maheen Nisar

Rapid progress is being made in the development of next-generation sequencing (NGS) technologies, allowing repeated findings of new genes and a more in-depth analysis of genetic polymorphisms behind the pathogenesis of a disease. In a field such as psychiatry, characterized by vague and highly variable somatic manifestations, these technologies have brought great advances towards diagnosing various psychiatric and mental disorders, identifying high-risk individuals and towards more effective corresponding treatment. Psychiatry has the difficult task of diagnosing and treating mental disorders without being able to invariably and definitively establish the properties of its illness. This calls for diagnostic technologies that go beyond the traditional ways of gene manipulation to more advanced methods mainly focusing on new gene polymorphism discoveries, one of them being NGS. This enables the identification of hundreds of common and rare genetic variations contributing to behavioral and psychological conditions. Clinical NGS has been useful to detect copy number and single nucleotide variants and to identify structural rearrangements that have been challenging for standard bioinformatics algorithms. The main objective of this article is to review the recent applications of NGS in the diagnosis of major psychiatric disorders, and hence gauge the extent of its impact in the field. A comprehensive PubMed search was conducted and papers published from 2013 to 2018 were included, using the keywords “schizophrenia” or “bipolar disorder” or “depressive disorder” or “attention deficit disorder” or “autism spectrum disorder” and “next-generation sequencing”.


2020 ◽  
Author(s):  
Aleksandra Urman ◽  
Stefania Ionescu ◽  
David Garcia ◽  
Anikó Hannák

BACKGROUND Since the beginning of the COVID-19 pandemic, scientists have been willing to share their results quickly to speed up the development of potential treatments and/or a vaccine. At the same time, traditional peer-review-based publication systems are not always able to process new research promptly. This has contributed to a surge in the number of medical preprints published since January 2020. In the absence of a vaccine, preventative measures such as social distancing are most helpful in slowing the spread of COVID-19. Their effectiveness can be undermined if the public does not comply with them. Hence, public discourse can have a direct effect on the progression of the pandemic. Research shows that social media discussions on COVID-19 are driven mainly by the findings from preprints, not peer-reviewed papers, highlighting the need to examine the ways medical preprints are shared and discussed online. OBJECTIVE We examine the patterns of medRxiv preprint sharing on Twitter to establish (1) whether the number of tweets linking to medRxiv increased with the advent of the COVID-19 pandemic; (2) which medical preprints were mentioned on Twitter most often; (3) whether medRxiv sharing patterns on Twitter exhibit political partisanship; (4) whether the discourse surrounding medical preprints among Twitter users has changed throughout the pandemic. METHODS The analysis is based on tweets (n=557,405) containing links to the medRxiv preprint repository that were posted between the creation of the repository in June 2019 and June 2020. The study relies on a combination of statistical techniques and text analysis methods. RESULTS Since January 2020, the number of tweets linking to medRxiv has increased drastically, peaking in April 2020 with a subsequent cool-down. Before the pandemic, preprints were shared predominantly by users we identify as medical professionals and scientists. After January 2020, other users, including politically-engaged ones, have started increasingly tweeting about medRxiv. Our findings indicate a political divide in sharing patterns of the top-10 most-tweeted preprints. All of them were shared more frequently by users who describe themselves as Republicans than by users who describe themselves as Democrats. Finally, we observe a change in the discourse around medRxiv preprints. Pre-pandemic tweets linking to them were predominantly using the word “preprint”. In February 2020 “preprint” was taken over by the word “study”. Our analysis suggests this change is at least partially driven by politically-engaged users. CONCLUSIONS Widely shared medical preprints can have a direct effect on the public discourse around COVID-19, which in turn can affect societies’ willingness to comply with preventative measures. This calls for an increased responsibility when dealing with medical preprints from all parties involved: scientists, preprint repositories, media, politicians, and social media companies.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Ali Feizollah ◽  
Mohamed M. Mostafa ◽  
Ainin Sulaiman ◽  
Zalina Zakaria ◽  
Ahmad Firdaus

This study explores tweets from Oct 2008 to Oct 2018 related to halal tourism. The tweets were extracted from Twitter and underwent various cleaning processes. A total of 33,880 tweets were used for analysis. Analysis intended to (1) identify the topics users tweet about regarding halal tourism, and (2) analyze the emotion-based sentiment of the tweets. To identify and analyze the topics, the study used a word list, concordance graphs, semantic network analysis, and topic-modeling approaches. The NRC emotion lexicon was used to examine the sentiment of the tweets. The analysis illustrated that the word “halal” occurred in the highest number of tweets and was primarily associated with the words “food” and “hotel”. It was also observed that non-Muslim countries such as Japan and Thailand appear to be popular as halal tourist destinations. Sentiment analysis found that there were more positive than negative sentiments among the tweets. The findings have shown that halal tourism is a global market and not only restricted to Muslim countries. Thus, industry players should take the opportunity to use social media to their advantage to promote their halal tourism packages as it is an effective method of communication in this decade.


Author(s):  
Yuming Zhang ◽  
Fan Yang

Companies use corporate social responsibility (CSR) disclosures to communicate their social and environmental policies, practices, and performance to stakeholders. Although the determinants and outcomes of CSR activities are well understood, we know little about how companies use CSR communication to manage a crisis. The few relevant CSR studies have focused on the pressure on corporations exerted by governments, customers, the media, or the public. Although investors have a significant influence on firm value, this stakeholder group has been neglected in research on CSR disclosure. Grounded in legitimacy theory and agency theory, this study uses a sample of Chinese public companies listed on the Shanghai Stock Exchange to investigate CSR disclosure in response to social media criticism posted by investors. The empirical findings show that investors’ social media criticism not only motivates companies to disclose their CSR activities but also increases the substantiveness of their CSR reports, demonstrating that companies’ CSR communication in response to a crisis is substantive rather than merely symbolic. We also find that the impact of social media criticism on CSR disclosure is heterogeneous. Non-state-owned enterprises, companies in regions with high levels of environmental regulations, and companies in regions with local government concern about social issues are most likely to disclose CSR information and report substantive CSR activities. We provide an in-depth analysis of corporate CSR strategies for crisis management and show that crises initiated by investors on social media provide opportunities for corporations to improve their CSR engagement.


2020 ◽  
Vol 48 (12) ◽  
pp. 030006052096777
Author(s):  
Peisong Chen ◽  
Xuegao Yu ◽  
Hao Huang ◽  
Wentao Zeng ◽  
Xiaohong He ◽  
...  

Introduction To evaluate a next-generation sequencing (NGS) workflow in the screening and diagnosis of thalassemia. Methods In this prospective study, blood samples were obtained from people undergoing genetic screening for thalassemia at our centre in Guangzhou, China. Genomic DNA was polymerase chain reaction (PCR)-amplified and sequenced using the Ion Torrent system and results compared with traditional genetic analyses. Results Of the 359 subjects, 148 (41%) were confirmed to have thalassemia. Variant detection identified 35 different types including the most common. The mutational sites identified by NGS were consistent with those identified by Sanger sequencing and Gap-PCR. The sensitivity and specificity of the Ion Torrent NGS were both 100%. In a separate test of 16 samples, results were consistent when repeated ten times. Conclusion Our NGS workflow based on the Ion Torrent sequencer was successful in the detection of large deletions and non-deletional defects in thalassemia with high accuracy and repeatability.
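
The reported figures can be checked with a short calculation. The sketch below is an illustration, not the authors' analysis. It assumes the traditional analyses (Sanger sequencing and Gap-PCR) serve as the reference standard, so that 100% sensitivity and specificity mean no false negatives and no false positives among the 359 subjects. The function and variable names are chosen for the example.

```python
def sensitivity(true_pos, false_neg):
    """Share of affected subjects that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of unaffected subjects that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


total_subjects = 359   # from the abstract
affected = 148         # confirmed thalassemia, from the abstract
unaffected = total_subjects - affected

# Assumption: perfect agreement with the reference methods, so there are
# no false negatives and no false positives.
true_pos, false_neg = affected, 0
true_neg, false_pos = unaffected, 0

prevalence = affected / total_subjects
sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
```

Under these assumptions the prevalence rounds to the 41% given in the abstract, and both ratios equal 1.0.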


Author(s):  
Irina Wedel ◽  
Michael Palk ◽  
Stefan Voß

Social media enable companies to assess consumers’ opinions, complaints and needs. The systematic and data-driven analysis of social media to generate business value is summarized under the term Social Media Analytics, which includes statistical, network-based and language-based approaches. We focus on textual data and investigate which conversation topics arise on Twitter during a new product introduction and what the overall sentiment is during and after the event. The analysis via Natural Language Processing tools is conducted in two languages and four different countries, such that cultural differences in the tonality and customer needs can be identified for the product. Different methods of sentiment analysis and topic modeling are compared to assess their usability on social media and in the respective languages, English and German. Furthermore, we illustrate the importance of preprocessing steps when applying these methods and identify relevant product insights.

