Text Mining Using Latent Semantic Analysis: An Illustration through Examination of 30 Years of Research at JIS

Latent Semantic Analysis for Text Mining and Beyond

Intelligent Multimedia Databases and Information Retrieval, 2013, pp. 253-280

DOI: 10.4018/978-1-61350-126-9.ch015

Cited By ~ 2

Author(s): Anne Kao, Steve Poteet, Jason Wu, William Ferng, Rod Tjoelker, ...

Keyword(s): Information Retrieval, Text Mining, Latent Semantic Analysis, Web Mining, Semantic Analysis, Search Space, Latent Semantic Indexing, Cross Language Information Retrieval, Text Information, Cross Language

Latent Semantic Analysis (LSA) or Latent Semantic Indexing (LSI), when applied to information retrieval, has been a major analysis approach in text mining. It is an extension of the vector space method in information retrieval, representing documents as numerical vectors but using a more sophisticated mathematical approach to characterize the essential features of the documents and reduce the number of features in the search space. This chapter summarizes several major approaches to this dimensionality reduction, each of which has strengths and weaknesses, and it describes recent breakthroughs and advances. It shows how the constructs and products of LSA applications can be made user-interpretable and reviews applications of LSA beyond information retrieval, in particular, to text information visualization. While the major application of LSA is for text mining, it is also highly applicable to cross-language information retrieval, Web mining, and analysis of text transcribed from speech and textual information in video.
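The dimensionality reduction described above can be illustrated with a small sketch. It assumes the most common formulation of LSA, a truncated singular value decomposition of a term-by-document count matrix; the chapter surveys several approaches, and this is not presented as the authors' own method. The toy documents, the function names, and the use of power iteration with deflation (chosen only to keep the example dependency-free) are all illustrative assumptions.

```python
import math
import random

def term_document_matrix(docs):
    """Raw-count matrix with one row per term and one column per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    row = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[row[w]][j] += 1.0
    return vocab, A

def truncated_svd(A, k, iters=300, seed=0):
    """Approximate the k largest singular triplets by power iteration with deflation."""
    m, n = len(A), len(A[0])
    R = [r[:] for r in A]  # residual matrix, deflated after each triplet
    rng = random.Random(seed)
    triplets = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sigma, u = 0.0, [0.0] * m
        for _ in range(iters):
            u = [sum(R[i][j] * v[j] for j in range(n)) for i in range(m)]
            norm_u = math.sqrt(sum(x * x for x in u))
            if norm_u == 0.0:
                break
            u = [x / norm_u for x in u]
            v = [sum(R[i][j] * u[i] for i in range(m)) for j in range(n)]
            sigma = math.sqrt(sum(x * x for x in v))
            if sigma == 0.0:
                break
            v = [x / sigma for x in v]
        triplets.append((sigma, u, v))
        for i in range(m):  # remove the component just found
            for j in range(n):
                R[i][j] -= sigma * u[i] * v[j]
    return triplets

def document_vectors(triplets, n_docs):
    """Coordinates of each document in the k-dimensional latent space."""
    return [[s * v[j] for s, _, v in triplets] for j in range(n_docs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical toy corpus: two documents about text analysis, two about aviation.
docs = [
    "latent semantic analysis text",
    "semantic analysis of text documents",
    "aircraft safety report narrative",
    "aviation safety narrative report",
]
vocab, A = term_document_matrix(docs)
vecs = document_vectors(truncated_svd(A, k=2), len(docs))
```

With k=2, each document is described by two latent features instead of one count per vocabulary term, and documents on the same topic end up close together under cosine similarity. A production system would normally apply term weighting (for example TF-IDF) and use a library SVD routine rather than this hand-rolled iteration.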

The Use of Text Mining Techniques in Electronic Discovery for Legal Matters

Next Generation Search Engines, 2012, pp. 174-190

DOI: 10.4018/978-1-4666-0330-1.ch008

Author(s): Michael W. Berry, Reed Esau, Bruce Kiefer

Keyword(s): Information Retrieval, Text Mining, Matrix Factorization, Latent Semantic Analysis, Semantic Analysis, Electronic Documents, Collection Process, Relevance Judgments, Electronic Discovery, Non Negative Matrix Factorization

Electronic discovery (eDiscovery) is the process of collecting and analyzing electronic documents to determine their relevance to a legal matter. Office technology has advanced and eased the requirements necessary to create a document. As such, the volume of data has outgrown the manual processes previously used to make relevance judgments. Methods of text mining and information retrieval have been put to use in eDiscovery to help tame the volume of data; however, the results have been uneven. This chapter looks at the historical bias of the collection process. The authors examine how tools like classifiers, latent semantic analysis, and non-negative matrix factorization deal with nuances of the collection process.

Evaluating the compliance of modern electronic banking and digital cryptocurrency systems with the information society's requirements

Finance and Credit, 2021, Vol 27 (1), pp. 88-112

DOI: 10.24891/fc.27.1.88

Author(s): Ekaterina I. DYUDIKOVA, Natal'ya N. KUNITSYNA

Keyword(s): Big Data, Text Mining, Latent Semantic Analysis, Semantic Analysis, Nearest Neighbor, Systems Approach, Social Needs, K Nearest Neighbor, Electronic Banking, Digital Platforms

Subject. The digital economy emerged as a new generation of financial instruments, such as cryptocurrencies, was invented and proliferated, and these instruments were able to counteract global challenges. Those who oppose the legitimization of digital assets and their integration into the payment infrastructure do not point out material advantages and support drastic transformations of the existing financial system. However, even though digital payments are very risky, the scope of cryptocurrency continues to grow. The article presents the outcome of intelligent text analysis of feedback left by users of electronic banking and digital cryptocurrency systems. In doing so, we determined to what extent they are satisfied with various systems. Objectives. The study is intended to provide the theoretical and methodological rationale for, and to test in practice, a model that identifies key themes in unstructured big data and automatically evaluates user satisfaction with various payment systems. Methods. We used formal logic, a systems approach, comparative analysis, text mining, and latent semantic analysis. Results. We analyzed reviews uploaded to www.banki.ru and www.otzovik.ru through parsing, stop word elimination, stemming, and probabilistic thematic modeling based on latent semantic analysis. We assessed to what extent users are satisfied with various systems by examining their reviews through text tone analysis, the k-nearest neighbor algorithm, and automated scoring of unrated reviews. Conclusions and Relevance. Text mining of unstructured big data shows that digital platforms, notwithstanding their infancy and high risks, already largely satisfy social needs compared with electronic banking systems, which supports the case for integrating them into the payment system to unlock their potential.
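The scoring step named in the Results (stop word elimination followed by k-nearest-neighbor scoring of unrated reviews) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the English stand-in reviews, the tiny stop word list, the 1-to-5 rating scale, and the function names are all hypothetical, and stemming and the LSA step are omitted for brevity.

```python
import math
from collections import Counter

# Toy stop word list (an assumption; a real pipeline would use a full list).
STOP_WORDS = {"the", "a", "is", "and", "to", "of", "was", "very", "it"}

def tokens(text):
    """Lowercase, split on whitespace, keep alphabetic non-stop words."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]

def cosine(c1, c2):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(c1[w] * c2[w] for w in c1 if w in c2)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def knn_score(unrated, rated, k=3):
    """Average the ratings of the k rated reviews most similar to `unrated`."""
    query = Counter(tokens(unrated))
    ranked = sorted(
        rated,
        key=lambda pair: cosine(query, Counter(tokens(pair[0]))),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(rating for _, rating in nearest) / len(nearest)

# Hypothetical rated reviews: (text, rating on a 1-5 scale).
rated_reviews = [
    ("fast transfers and low fees", 5),
    ("low fees and fast support", 4),
    ("slow transfers and hidden fees", 1),
    ("app crashed and support was slow", 2),
]
predicted = knn_score("fast transfers low fees", rated_reviews, k=2)
```

Here the unrated review shares most of its vocabulary with the two positive reviews, so it inherits the mean of their ratings. Comparing reviews in an LSA-reduced space instead of raw word counts would let reviews with related but non-identical vocabulary count as neighbors.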

Put Your Best Text Forward: Introducing Textual Analysis into the Accounting Classroom

Issues in Accounting Education, 2021

DOI: 10.2308/issues-19-108

Author(s): Ingrid Fisher, Mark Hughes, Diane J. Janvrin

Keyword(s): Information Systems, Data Analytics, Textual Analysis, Accounting Information, Digital Data, Financial Statement, Accounting Students, Accounting Profession, Financial Statement Analysis, Accounting Information Systems

The use of textual analysis methods in the accounting profession has grown markedly in recent years. Accounting professionals as well as business and accounting accreditors have called for accounting students to acquire an increased depth and breadth of knowledge of digital data analytics. This case enables accounting instructors, with no previous background or experience in textual analysis, to introduce students to the use of textual analysis in accounting and allows students to conduct simple analyses using freely available software and documents retrieved from publicly available SEC filings. This case is designed for auditing, accounting information systems, fraud examination, and financial statement analysis courses, but it can be used in any accounting course where the content of relevant documents is subject to examination.

Perspectives on Past and Future AIS Research as the Journal of Information Systems Turns Thirty

Journal of Information Systems, 2016, Vol 30 (3), pp. 157-171

DOI: 10.2308/isys-51495

Cited By ~ 5

Author(s): Kevin C. Moffitt, Vernon J. Richardson, Neal M. Snow, Martin M. Weisner, David A. Wood

Keyword(s): Information Systems, Text Mining, Annual Meeting, Accounting Information, Future Research, Accounting Information Systems, Panel Session, Research Themes, Research Questions, Over Time

This paper complements a panel session pertaining to past and future AIS research that was held during the 2015 American Accounting Association Annual Meeting. There are two main parts to this commentary. First, using text mining techniques on AIS article abstracts for the period 1986–2014, we identify the top research themes across three leading AIS journals (Journal of Information Systems, International Journal of Accounting Information Systems, and Journal of Emerging Technologies in Accounting). We chart the usage of these themes over time and discuss their shifting popularity. Second, we speculate on the future of AIS research and identify a series of broad research streams that may garner greater importance over the next 30 years. A host of broad research questions accompany the discussion of emerging and anticipated research streams in order to motivate and guide future research.

Visualization of Large-Scale Narrative Data Describing Human Error

Human Factors: The Journal of the Human Factors and Ergonomics Society, 2017, Vol 59 (4), pp. 520-534

DOI: 10.1177/0018720817709374

Cited By ~ 1

Author(s): William J. Irwin, Saul D. Robinson, Stephen M. Belt

Keyword(s): Information Systems, Geographic Information Systems, Latent Semantic Analysis, Human Error, Large Scale, Semantic Analysis, Difficult Problem, Geographic Information, Data Sets, Systems Software

Objective: Introduced is a visual data exploration technique for compiling, reducing, organizing, visually rendering, and filtering text-based narratives for detailed analysis. Background: The analysis of data sets provides an increasingly difficult problem. The method of visual representation is considered an effective tool in many applications. The focus of this study was to determine if a latent semantic analysis–based projection of narrative data into a geographic information systems software program provided a useful tool for reducing and organizing large sums of narrative data for analysis. Method: This approach utilizes latent semantic analysis to reduce narratives to a high-dimensional vector, truncates the vector to a two-dimensional projection through application of isometric mapping, and then visually renders the result with geographic information systems software. This method is demonstrated on aviation self-reported safety narratives sourced from the Aviation Safety Reporting System. Results: Thematic regions from the corpus are illustrated along with the first five topics identified. Conclusion: Shown is the ability to assimilate a large number of narratives, identify contextual themes, recognize common events and outliers, and organize resultant topics. Application: Large narrative-based data sets present in aviation and other domains may be visualized to facilitate efficient analysis, enhance comprehension, and improve safety.
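The projection step in the Method (isometric mapping of high-dimensional vectors down to two dimensions) can be sketched in a few lines. The sketch below assumes the textbook Isomap recipe: a k-nearest-neighbor graph, shortest-path (geodesic) distances, then classical multidimensional scaling. It is not the authors' code; the input points stand in for LSA narrative vectors and are hypothetical, as are the function names and the choice of k.

```python
import math
import random

def knn_geodesics(points, k):
    """Shortest-path distances over a k-nearest-neighbor graph (Floyd-Warshall)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Index 0 of the sorted list is the point itself, so skip it.
        for j in sorted(range(n), key=lambda j: d[i][j])[1:k + 1]:
            g[i][j] = g[j][i] = d[i][j]
    for via in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][via] + g[via][j] < g[i][j]:
                    g[i][j] = g[i][via] + g[via][j]
    return g  # assumes the graph is connected; otherwise entries stay infinite

def classical_mds(g, dims=2, iters=500, seed=0):
    """Embed a distance matrix via the top eigenvectors of the centered Gram matrix."""
    n = len(g)
    sq = [[g[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    B = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for c in range(dims):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iters):  # power iteration for the dominant eigenvector
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))  # negative eigenvalues contribute nothing
        for i in range(n):
            coords[i][c] = v[i] * scale
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]  # deflate before the next dimension
    return coords

def isomap(points, k, dims=2):
    return classical_mds(knn_geodesics(points, k), dims)

# Hypothetical input: ten points along three quarters of a circle in 3-D.
points = [(math.cos(math.radians(30 * i)), math.sin(math.radians(30 * i)), 0.0)
          for i in range(10)]
coords = isomap(points, k=2)
```

Because distances are measured along the neighbor graph rather than straight through space, the curved sequence is unrolled: the two endpoints, which are close in the original space, land far apart in the 2-D layout. The resulting coordinate pairs are what a mapping tool such as GIS software would then render.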

Synergies of Text Mining and Multiple Attribute Decision Making: A Criteria Selection and Weighting System in a Prospective MADM Outline

Symmetry, 2020, Vol 12 (5), pp. 868

DOI: 10.3390/sym12050868

Author(s): Sarfaraz Hashemkhani Zolfani, Arman Derakhti

Keyword(s): Decision Making, Text Mining, Latent Semantic Analysis, Semantic Analysis, Multiple Attribute Decision Making, General Concept, Futures Studies, Weighting System, Multiple Attribute, Multi Attribute Decision Making

In this study, a new approach to criteria selection and a weighting system are presented in a multidisciplinary framework. Criteria weighting has been developing as the most attractive area of Multi-Attribute Decision Making (MADM). Although many ideas have been developed over the last decades, the literature shows little diversity. This study takes an outside-the-box view and presents a new approach that uses big data and text mining in a Prospective MADM (PMADM) outline. PMADM is a hybrid concept connecting the Futures Studies and MADM fields. Text mining, which is known as a useful tool in Futures Studies, is applied to create a broad pilot system for weighting and criteria selection in the PMADM outline. Latent Semantic Analysis (LSA), as an influential method within the general concept of text mining, is applied to show how the output of a data warehouse, which in this case is Scopus, can lead to the final criteria selection and weighting of the criteria.
