A Text-Mining Analysis on the Review of the Non-Financial Reporting Directive: Bringing Value Creation for Stakeholders into Accounting

The recent Review of the Non-Financial Reporting Directive (NFRD) aims to enhance adequate non-financial information (NFI) disclosure and improve accountability for stakeholders. This study focuses on this regulatory intervention and has a twofold objective: first, it aims to understand the main underlying issues at stake; second, it suggests areas of possible amendment in light of the current debates on sustainability accounting and accounting for stakeholders. In keeping with these aims, the research analyzes the documents annexed to the contribution on the Review of the NFRD by conducting a text-mining analysis with a latent Dirichlet allocation (LDA) probabilistic topic model (PTM). Our findings highlight four main topics at the core of the current debate: quality of NFI, standardization, materiality, and assurance. The research suggests ways of improving managerial policies to achieve more comparable, relevant, and reliable information by bringing value creation for stakeholders into accounting. It further addresses an integrated logic of accounting for stakeholders that contributes to sustainable development.

The application of text mining algorithms in summarizing trends in anti-epileptic drug research

10.1101/269308 ◽

2018 ◽

Cited By ~ 2

Author(s):

Shatrunjai P. Singh ◽

Swagata Karkare ◽

Sudhir M. Baswan ◽

Vijendra P. Singh

Keyword(s):

Text Mining ◽

Latent Dirichlet Allocation ◽

Drug Research ◽

Topic Model ◽

Analysis Model ◽

Data Intensive ◽

Document Frequency ◽

Anti Epileptic Drug ◽

The Us ◽

Mining Algorithms

Content summarization is an important area of research in traditional data mining. The volume of studies published on anti-epileptic drugs (AEDs) has increased exponentially over the last two decades, making it an important area for the application of text-mining-based summarization algorithms. In the current study, we use text analytics algorithms to mine and summarize 10,000 PubMed abstracts related to anti-epileptic drugs published within the last 10 years. Term Frequency-Inverse Document Frequency (TF-IDF) based filtering was applied to identify the drugs with the highest frequency of mentions within these abstracts. The US Food and Drug database was scraped and linked to the results to quantify the most frequently mentioned modes of action and to identify the pharmaceutical entities marketing these drugs. A sentiment analysis model was created to score the abstracts for sentiment positivity or negativity. Finally, a modified Latent Dirichlet Allocation topic model was generated to extract key topics associated with the most frequently mentioned AEDs. Results of this study provide accurate and data-intensive insights on the progress of anti-epileptic drug research.
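
The TF-IDF filtering step described in this abstract could be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the three toy abstracts, the drug list, and the function name `tfidf_scores` are hypothetical, and the textbook tf × log(N/df) weighting is an assumption, since the abstract does not say which variant was used.

```python
import math
from collections import Counter

def tfidf_scores(docs, vocabulary):
    """Return {term: TF-IDF summed over all docs} for terms in `vocabulary`.

    Assumes the textbook weighting tf * log(N / df); the study does not
    state which TF-IDF variant it used.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of docs containing each term at least once.
    df = Counter()
    for tokens in tokenized:
        for term in set(tokens) & vocabulary:
            df[term] += 1
    scores = Counter()
    for tokens in tokenized:
        counts = Counter(tokens)
        for term in vocabulary:
            if counts[term]:
                tf = counts[term] / len(tokens)
                scores[term] += tf * math.log(n_docs / df[term])
    return dict(scores)

# Hypothetical toy corpus standing in for the PubMed abstracts.
toy_abstracts = [
    "gabapentin reduced seizure frequency in adults",
    "levetiracetam and gabapentin were compared",
    "topiramate was well tolerated",
]
drug_names = {"gabapentin", "levetiracetam", "topiramate"}
scores = tfidf_scores(toy_abstracts, drug_names)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because the IDF factor down-weights terms that occur in many documents, a TF-IDF ranking and a raw mention count can disagree; in this toy corpus the drug mentioned in two abstracts ranks below the two mentioned once.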

Identifying User Interests In An Online Discussion Forum With Deep Learning

10.32920/ryerson.14654349.v1 ◽

2021 ◽

Author(s):

Nicholas Buhagiar ◽

Bahram Zahir ◽

Abdolreza Abhari

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Model Framework ◽

User Interests ◽

Online Discussion Forum ◽

Probabilistic Topic Model ◽

Average Accuracy ◽

Discussion Threads ◽

Validation Set ◽

Evaluation Metric

The probabilistic topic model Latent Dirichlet Allocation (LDA) was deployed to model the themes of discourse in discussion threads on the social media aggregation website Reddit. Discussion threads were abstracted as vectors of topic weights, and these vectors were fed into several neural network architectures, each with a different number of hidden layers, to train machine learning models that could identify which discussions a given user would be interested in contributing to. Accuracy was used as the evaluation metric to determine which model framework achieved the best performance on a given user’s validation set; the selected models achieved an average accuracy of 66.1% on the test data for a sample set of 30 users. Using the predicted probabilities of interest made by these neural networks, recommender systems were further built and analyzed for each user.
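
The per-user model selection described in this abstract (choose the architecture with the best validation accuracy, then report accuracy on test data) could be sketched as below. This is an illustration under stated assumptions only: the two candidate "models" are hypothetical threshold rules standing in for trained neural networks, and the topic-weight vectors and labels are invented toy data.

```python
def accuracy(model, examples):
    """Fraction of (topic_vector, label) pairs the model classifies correctly."""
    correct = sum(1 for vector, label in examples if model(vector) == label)
    return correct / len(examples)

def select_best(candidates, validation_set):
    """Return the name of the candidate with the highest validation accuracy."""
    return max(candidates, key=lambda name: accuracy(candidates[name], validation_set))

# Hypothetical stand-ins for trained networks: each maps a topic-weight
# vector to 1 (user would contribute) or 0 (user would not).
candidates = {
    "threshold_topic0": lambda v: int(v[0] > 0.5),
    "threshold_topic1": lambda v: int(v[1] > 0.5),
}

# Hypothetical data for one user: (topic weights of a thread, label).
validation_set = [([0.7, 0.3], 1), ([0.2, 0.8], 0), ([0.6, 0.4], 1)]
test_set = [([0.9, 0.1], 1), ([0.4, 0.6], 0)]

best_name = select_best(candidates, validation_set)
test_accuracy = accuracy(candidates[best_name], test_set)
```

Keeping the validation set separate from the test set means the reported accuracy is not inflated by the selection step itself.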

Lender Trust on the P2P Lending: Analysis Based on Sentiment Analysis of Comment Text

Sustainability ◽

10.3390/su12083293 ◽

2020 ◽

Vol 12 (8) ◽

pp. 3293 ◽

Cited By ~ 2

Author(s):

Beibei Niu ◽

Jinzheng Ren ◽

Ansa Zhao ◽

Xiaotao Li

Keyword(s):

Theoretical Basis ◽

Analytical Approach ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Text Data ◽

The Core ◽

Operational Level ◽

Core Subject ◽

P2p Lending ◽

Subject Areas

Lender trust is important to ensure the sustainability of P2P lending. This paper uses web crawling to collect more than 240,000 unique pieces of comment text data. Based on the mapping relationship between emotion and trust, we use a lexicon-based method and deep learning to assess the trust of a given lender in P2P lending. Further, we use the Latent Dirichlet Allocation (LDA) topic model to mine the topics relevant to this research. The results show that lenders are positive about P2P lending, though this tendency fluctuates downward with time. The security, rate of return, and compliance of P2P lending are the issues of greatest concern to lenders. This study reveals the core subject areas that influence a lender’s emotions and trust and provides a theoretical basis and empirical reference for relevant platforms to improve their operational level while enhancing competitiveness. This analytical approach offers insights for researchers to understand the hidden content behind the text data.
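
A lexicon-based sentiment method of the kind mentioned in this abstract could look roughly like the sketch below. Everything concrete in it is an assumption made for illustration: the six-word English lexicon, the whitespace tokenizer, and the function names are hypothetical, and reading positive sentiment as a proxy for trust simply follows the emotion-to-trust mapping the abstract refers to.

```python
# Hypothetical mini-lexicon; a real study would rely on an established
# sentiment lexicon in the language of the comments.
LEXICON = {
    "safe": 1, "reliable": 1, "profit": 1,
    "fraud": -1, "risky": -1, "late": -1,
}

def lexicon_score(comment):
    """Sum the polarity of every lexicon word that appears in the comment."""
    return sum(LEXICON.get(token, 0) for token in comment.lower().split())

def sentiment_label(comment):
    """Map the summed polarity to a coarse label."""
    score = lexicon_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

labels = [sentiment_label(c) for c in (
    "safe and reliable platform",
    "risky platform with late payments",
    "average platform",
)]
```

A plain word-sum like this ignores negation and intensity, which is one reason such a lexicon method is often paired with a learned model, as the abstract describes.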

Identifying User Interests In An Online Discussion Forum With Deep Learning

10.32920/ryerson.14654349 ◽

2021 ◽

Author(s):

Nicholas Buhagiar ◽

Bahram Zahir ◽

Abdolreza Abhari

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Model Framework ◽

User Interests ◽

Online Discussion Forum ◽

Probabilistic Topic Model ◽

Average Accuracy ◽

Discussion Threads ◽

Validation Set ◽

Evaluation Metric

Understanding the University-Sustainability Link through Media: A Spanish Perspective

Sustainability ◽

10.3390/su12124830 ◽

2020 ◽

Vol 12 (12) ◽

pp. 4830 ◽

Cited By ~ 1

Author(s):

Cecilia Elizabeth Bayas Aldaz ◽

Jesus Rodriguez-Pomeda ◽

Leyla Angélica Sandoval Hamón ◽

Fernando Casani

Keyword(s):

Higher Education ◽

Social Perception ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Higher Education Institutions ◽

News Coverage ◽

News Sources ◽

University Funding ◽

Probabilistic Topic Model ◽

The Social

This article provides a procedure to universities for understanding the social perception of their activities in the sustainability field, through the analysis of news published in the printed media. It identifies the Spanish news sources that have covered this issue the most and the topics that appear in that news coverage. Using a probabilistic topic model called Latent Dirichlet Allocation, the study includes the nine dominant topics within a corpus with more than seventeen thousand published news items (totaling approximately five and a quarter million words) from a database of almost thirteen hundred national press sources between 2014 and 2017. The study identifies the news sources that published the most news on the issue. It is also found that the amount of news on sustainability and universities declined during the covered period. The nine identified topics point towards the relevance of higher education institutions’ activities as drivers of sustainability. The social perception encapsulated within the topics signals how the public is interested in these activities. Therefore, we find some interesting relationships between sustainable development, higher education institutions’ missions and behaviors, governmental policies, university funding and governance, social and economic innovation, and green campuses in terms of the overall goal of sustainability.

Fesztivállátogatók véleményeinek számítógéppel támogatott tematikus modellezése – egy kísérlet eredményei Computer-aided topic modelling based on festival-goers’ opinions – results of an experiment

Turizmus Bulletin ◽

10.14267/turbull.2021v21n1.1 ◽

2021 ◽

Vol 21 (1) ◽

pp. 4-12

Author(s):

Mátyás Hinek

Keyword(s):

Qualitative Research ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Computer Algorithm ◽

Topic Modelling ◽

Computer Tools ◽

Computer Aided ◽

Dirichlet Allocation

In our study, we attempt to determine the typical topics of opinions written on Facebook by Sziget Festival visitors, using a computer algorithm, the structured topic model (stm) based on latent Dirichlet allocation, and to compare them with the topics outlined in our previous research. Based on the text opinions written in English by Sziget Festival visitors over the last seven years, we modelled nine topics with the algorithm. Their content and scope only partly matched the topics identified in our previous qualitative research. The most important result of our study is that visitor opinions can be successfully examined with computer tools, but the quality of the results is determined by the size of the corpus, i.e. the number and length of the analysed posts.

Detection of Cases of Noncompliance to Drug Treatment in Patient Forum Posts: Topic Model Approach (Preprint)

10.2196/preprints.9222 ◽

2017 ◽

Author(s):

Redhouane Abdellaoui ◽

Pierre Foulquié ◽

Nathalie Texier ◽

Carole Faviez ◽

Anita Burgun ◽

...

Keyword(s):

Social Media ◽

Virtual Communities ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Antidepressant Drug ◽

Topic Models ◽

Web Crawler ◽

Probabilistic Topic Model ◽

Manual Review ◽

Model Approach

BACKGROUND: Medication nonadherence is a major impediment to the management of many health conditions. A better understanding of the factors underlying noncompliance to treatment may help health professionals to address it. Patients use peer-to-peer virtual communities and social media to share their experiences regarding their treatments and diseases. Topic models make it possible to model the themes present in a collection of posts and thus to identify cases of noncompliance.

OBJECTIVE: The aim of this study was to detect messages describing patients’ noncompliant behaviors associated with a drug of interest. Thus, the objective was the clustering of posts featuring a homogeneous vocabulary related to nonadherent attitudes.

METHODS: We focused on escitalopram and aripiprazole, used to treat depression and psychotic conditions, respectively. We implemented a probabilistic topic model to identify the topics that occurred in a corpus of messages mentioning these drugs, posted from 2004 to 2013 on three of the most popular French forums. Data were collected using a Web crawler designed by Kappa Santé as part of the Detec’t project to analyze social media for drug safety. Several topics were related to noncompliance to treatment.

RESULTS: Starting from a corpus of 3650 posts related to an antidepressant drug (escitalopram) and 2164 posts related to an antipsychotic drug (aripiprazole), the use of latent Dirichlet allocation allowed us to model several themes, including interruptions of treatment and changes in dosage. The topic model approach detected cases of noncompliant behaviors with a recall of 98.5% (272/276) and a precision of 32.6% (272/844).

CONCLUSIONS: Topic models enabled us to explore patients’ discussions on community websites and to identify posts related to noncompliant behaviors. After a manual review of the messages in the noncompliance topics, we found that noncompliance to treatment was present in 6.17% (276/4469) of the posts.
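
The recall and precision quoted in this abstract follow the usual definitions, which can be sketched as below. The reading of the three counts (272 true positives, 276 noncompliant posts in total, 844 posts flagged by the noncompliance topics) is an assumption inferred from the fractions in the abstract, and the function names are illustrative.

```python
def recall(true_positives, relevant):
    """Share of all truly noncompliant posts that the topics captured."""
    return true_positives / relevant

def precision(true_positives, flagged):
    """Share of flagged posts that are truly noncompliant."""
    return true_positives / flagged

# Counts as quoted in the abstract; their interpretation is assumed.
true_positives = 272  # noncompliant posts found within the flagged topics
relevant = 276        # noncompliant posts confirmed by manual review
flagged = 844         # posts assigned to the noncompliance topics

r = recall(true_positives, relevant)
p = precision(true_positives, flagged)
```

Evaluating the quoted fractions directly gives roughly 0.986 and 0.322, so the percentages printed in the abstract may reflect a different rounding or different counts than the fractions shown. Either way, the pattern is high recall with low precision: nearly all noncompliant posts are captured, but most flagged posts still need manual review.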

Latent Feature Word Representations to Enhance Topic Models for Text Mining Algorithms

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b2503.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 4816-4821

Keyword(s):

Text Mining ◽

Process Model ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Document Clustering ◽

Topic Models ◽

Mining Operations ◽

Document Collections ◽

Generative Process ◽

Document Collection

Dealing with a large number of textual documents requires proven models that process them efficiently. Text mining needs such models to extract latent features from a document collection in a meaningful way. Latent Dirichlet allocation (LDA) is one such probabilistic generative process model that helps represent document collections systematically. LDA is useful in many text mining applications because it supports many models, one of which is the topic model. However, LDA topic models need to be improved so that latent feature vector representations of words trained on large corpora can be exploited to improve the word-topic mapping learnt on a smaller corpus. For document clustering and document classification, a novel topic model is essential to improve performance. In this paper, an improved topic model is proposed and implemented using LDA; it exploits the Word2Vec tool to obtain pre-trained word vectors and thereby achieve the desired enhancement. A prototype application is built to demonstrate the proof of concept with text mining operations such as document clustering.
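
The "probabilistic generative process" that LDA assumes can be sketched in a few lines of plain Python. The sketch below covers only the standard generative story (draw a topic mixture per document, then a topic and a word for each position); it does not implement inference or the Word2Vec-based improvement proposed in the paper. The two toy topics, the vocabulary, and the function names are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, length, rng):
    """Generate one document under the standard LDA generative story.

    topic_word[k] maps each word to its probability under topic k.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    words = []
    for _ in range(length):
        # Choose a topic for this position, then a word from that topic.
        k = rng.choices(range(len(theta)), weights=theta)[0]
        vocab = list(topic_word[k])
        weights = [topic_word[k][w] for w in vocab]
        words.append(rng.choices(vocab, weights=weights)[0])
    return theta, words

# Hypothetical two-topic model over a toy vocabulary.
topic_word = [
    {"seizure": 0.5, "drug": 0.4, "loan": 0.1},
    {"loan": 0.6, "lender": 0.3, "drug": 0.1},
]
rng = random.Random(0)
theta, doc = generate_document(topic_word, alpha=[0.5, 0.5], length=8, rng=rng)
```

Fitting LDA to a corpus runs this story in reverse: given only the words, it estimates the topic-word distributions and the per-document mixtures.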

The Application of Text Mining Algorithms In Summarizing Trends in Anti-Epileptic Drug Research

International Journal of Statistics and Probability ◽

10.5539/ijsp.v7n4p11 ◽

2018 ◽

Vol 7 (4) ◽

pp. 11 ◽

Cited By ~ 2

Author(s):

Shatrunjai P. Singh ◽

Swagata Karkare ◽

Sudhir M. Baswan ◽

Vijendra P. Singh

Keyword(s):

Text Mining ◽

Latent Dirichlet Allocation ◽

Drug Research ◽

Topic Model ◽

Analysis Model ◽

Data Intensive ◽

Anti Epileptic Drug ◽

Key Topics ◽

Negative Sentiment ◽

Mining Algorithms

Content summarization is an important area of research in traditional data mining. The volume of studies published on anti-epileptic drugs (AEDs) has increased exponentially over the last two decades, making it an important area for the application of text-mining-based summarization algorithms. In the current study, we use text analytics algorithms to mine and summarize 10,000 PubMed abstracts related to anti-epileptic drugs published within the last 10 years. Term Frequency-Inverse Document Frequency (TF-IDF) based filtering was applied to identify the drugs with the highest frequency of mentions within these abstracts. The US Food and Drug database was scraped and linked to the results to quantify the most frequently mentioned modes of action and to identify the pharmaceutical entities marketing these drugs. A sentiment analysis model was created to score the abstracts for sentiment positivity or negativity. Finally, a modified Latent Dirichlet Allocation topic model was generated to extract key topics associated with the most frequently mentioned AEDs. The five most frequently mentioned drugs in the analysis were Gabapentin, Levetiracetam, Topiramate, Lamotrigine, and Acetazolamide. We further listed the key topics associated with these drugs and the overall positive or negative sentiment associated with them. Results of this study provide accurate and data-intensive insights on the progress of anti-epileptic drug research.

A Modeling Approach Based on Weighted LDA for Quality of Experience Measurement in Mobile Network Scenarios

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.571-572.404 ◽

2014 ◽

Vol 571-572 ◽

pp. 404-409

Author(s):

Wei Kuang ◽

Ling Han

Keyword(s):

Quality Of Experience ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Service Providers ◽

Mobile Network ◽

Vector Model ◽

Modeling Framework ◽

Modeling Approach ◽

Qoe Model

In this paper, we propose a method for mining a model of mobile users’ Quality of Experience (QoE) based on weighted LDA. In recent years, QoE has become an important concept for the quality of networks and services. QoE has attracted the interest of network operators and service providers because providing a good QoE to their customers can satisfy them and attract more users. We build users’ QoE model through a topic model, a generative modeling approach used in data mining. Latent Dirichlet Allocation (LDA) is a feasible and effective algorithm for text modeling. We propose a weighted LDA-based interest model within the modeling framework and evaluate it on a system that extracts mobile network users’ behavior. In this system, we analyze users’ behaviors and build a vector model for each user in a simple way. With the help of the topic model, which is generated from the vector model, we obtain an exact model of users’ QoE, through which we can learn each user’s experience. Network operators can then provide a better network service for their customers. Finally, we elaborate QoE management requirements for mobile network scenarios and provide a QoE modeling approach for these scenarios.
