Thesaurus-Based Automatic Indexing

Author(s):  
Luis M. de Campos

In this chapter, we present a thesaurus application in the field of text mining, specifically automatic indexing using the set of descriptors defined by a thesaurus. We begin by presenting various definitions and a mathematical thesaurus model, and describe several examples of real-world thesauri used in official institutions. We then explore the problem of thesaurus-based automatic indexing by describing its difficulties and distinguishing features and reviewing previous work in this area. Finally, we propose various lines of future research.

2021 ◽  
Vol 54 (7) ◽  
pp. 1-36
Author(s):  
Luciano Ignaczak ◽  
Guilherme Goldschmidt ◽  
Cristiano André Da Costa ◽  
Rodrigo Da Rosa Righi

The growth of data volume has changed cybersecurity activities, demanding a higher level of automation. In this new cybersecurity landscape, text mining has emerged as an alternative for improving the efficiency of activities involving unstructured data. This article presents a Systematic Literature Review (SLR) of the application of text mining in the cybersecurity domain. Using a systematic protocol, we identified 2,196 studies, out of which 83 were summarized. As a contribution, we propose a taxonomy to demonstrate the different activities in the cybersecurity domain supported by text mining. We also detail the strategies evaluated in the application of text mining tasks and the use of neural networks to support activities involving unstructured data. The work also discusses text classification performance with a view to its application in real-world solutions. The SLR also highlights open gaps for future research, such as the analysis of non-English content and increased use of neural networks.


This book provides an objective look into the dynamic world of debt markets, products, valuation, and analysis. It also provides an in-depth understanding of this subject from experts in the field, both practitioners and academics. The coverage extends from discussing basic concepts and their application to increasingly intricate and real-world situations. This volume spans the gamut from theoretical to practical, while attempting to offer a useful balance of detailed and user-friendly coverage. The book has several distinguishing features. It blends the contributions of a global array of scholars and practitioners into a single review of some of the most important topics in this area. The book follows an internally consistent approach in format and style. Hence, it is collectively much more than a compilation of chapters from an array of different authors. It presents theory without unnecessary abstraction, quantitative techniques using basic bond mathematics, and conventions at a useful level of detail. It also incorporates how investment professionals analyze and manage fixed income portfolios. The book emphasizes empirical evidence involving debt securities and markets so that it is understandable to a wide array of readers. Each chapter contains discussion questions to help reinforce key concepts. The end of the book contains guideline answers to each question. Readers interested in a broad survey will benefit, as will those looking for more in-depth presentations of specific areas within this field of study. In summary, the book provides a fresh look at this intriguing and dynamic but often complex subject.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 596
Author(s):  
Marco Buzzelli ◽  
Luca Segantin

We address the task of classifying car images at multiple levels of detail, ranging from the top-level car type down to the specific car make, model, and year. We analyze existing datasets for car classification, and identify CompCars as an excellent starting point for our task. We show that convolutional neural networks achieve an accuracy above 90% on the finest-level classification task. This high performance, however, is scarcely representative of real-world situations, as it is evaluated on a biased training/test split. In this work, we revisit the CompCars dataset by first defining a new training/test split, which better represents real-world scenarios by setting a more realistic baseline at 61% accuracy on the new test set. We also propagate the existing (but limited) type-level annotation to the entire dataset, and we finally provide a car-tight bounding box for each image, automatically defined through an ad hoc car detector. To evaluate this revisited dataset, we design and implement three different approaches to car classification, two of which exploit the hierarchical nature of car annotations. Our experiments show that higher-level classification in terms of car type positively impacts classification at a finer grain, now reaching 70% accuracy. The achieved performance constitutes a baseline benchmark for future research, and our enriched set of annotations is made available for public download.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Pei Xu ◽  
Joonghee Lee ◽  
James R. Barth ◽  
Robert Glenn Richey

Purpose: This paper discusses how the features of blockchain technology impact supply chain transparency through the lens of the information security triad (confidentiality, integrity and availability). Ultimately, propositions are developed to encourage future research in supply chain applications of blockchain technology. Design/methodology/approach: Propositions are developed based on a synthesis of the information security and supply chain transparency literature. Findings from text mining of Twitter data and a discussion of three major blockchain use cases support the development of the propositions. Findings: The authors note that confidentiality limits supply chain transparency, which causes tension between transparency and security. Integrity and availability promote supply chain transparency. Blockchain features can preserve security and increase transparency at the same time, despite the tension between confidentiality and transparency. Research limitations/implications: The research was conducted at a time when most blockchain applications were still in pilot stages. The propositions developed should therefore be revisited as blockchain applications become more widely adopted and mature. Originality/value: This study is among the first to examine the way blockchain technology eases the tension between supply chain transparency and security. Unlike other studies that have suggested only positive impacts of blockchain technology on transparency, this study demonstrates that blockchain features can influence transparency both positively and negatively.


2021 ◽  
Author(s):  
Ru-Hsueh Wang ◽  
Yu-Wen Hong ◽  
Chia-Chun Li ◽  
Siao-Ling Li ◽  
Jenn-Long Liu ◽  
...  

BACKGROUND Diabetic patients with poor education about the disease may exhibit poor compliance and thus subsequently experience more complications. However, the conceptual gap between the diabetes education provided by health providers and the non-compliance of patients is still not well understood in the real world. OBJECTIVE Disclosing what people think about diabetes on social media may help to close this gap. METHODS In this study, social media data were collected from the OpView social media platform. After checking the quality of the data, we analyzed the trends in people’s discussions on the Internet using text mining. Natural language processing included word segmentation, word counts, and counting the relationships between words. A word cloud was developed, and clustering analyses were also performed. RESULTS There were 19,565 posts about diabetes collected from forums, community websites, and Q&A websites in 2017. The three most popular aspects of diabetes were diet (33.2%), life adjustment (21.2%), and avoiding complications (15.6%). Most of the discussions about diabetes were negative, and the three aspects with the highest negative ratios were avoiding complications (7.60), problem-solving (4.08) and exercise (3.97). In terms of diet, the most popular topics were Chinese medicine and special diet therapy. In terms of life adjustment, financial issues, weight reduction, and a less painful glucometer were discussed the most. Furthermore, sexual dysfunction, neuropathy, nephropathy, and retinopathy were the most worrying issues in the avoiding complications area. Using text mining, we found that people care most about sexual dysfunction. Health providers care about the benefits of exercise in diabetes care, but people are mostly concerned about sexual functioning. CONCLUSIONS A conceptual gap between health providers and diabetes patients existed in this real-world social media investigation. To spread healthy diabetes education concepts in the media, health providers might wish to provide more information related to patients' actual areas of concern, such as sexual function, Chinese medicine, and weight reduction.
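
The counting steps named in the METHODS paragraph above (word counts and counting relationships between words) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the sample posts, and the whitespace tokenizer (a stand-in for the word segmentation step, which the abstract does not specify) are all assumptions made for illustration, and "relationship" is interpreted here as two words appearing in the same post.

```python
from collections import Counter
from itertools import combinations

def tokenize(post):
    """Hypothetical stand-in for word segmentation: lowercase,
    split on whitespace, strip surrounding punctuation."""
    tokens = (w.strip(".,!?;:\"'()") for w in post.lower().split())
    return [t for t in tokens if t]

def word_counts(posts):
    """Count how often each word occurs across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts

def cooccurrence_counts(posts):
    """Count, for each unordered word pair, the number of posts
    in which both words appear (assumed notion of 'relationship')."""
    pairs = Counter()
    for post in posts:
        unique = sorted(set(tokenize(post)))
        pairs.update(combinations(unique, 2))
    return pairs

# Invented sample posts, for illustration only.
posts = [
    "Diet and exercise help control diabetes.",
    "Diabetes diet tips",
    "Exercise is hard",
]

counts = word_counts(posts)
pairs = cooccurrence_counts(posts)
print(counts.most_common(3))
print(pairs[("diabetes", "diet")])
```

Word frequencies of this kind are what a word cloud visualizes, and a co-occurrence table is one possible input to a clustering step; the abstract does not say which clustering method was used.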


Author(s):  
Rafal Rzepka ◽  
Kenji Araki

This chapter introduces an approach and methods for creating a system that refers to human experiences, and thoughts about these experiences, in order to ethically evaluate other parties' actions and, in the long run, its own. It is shown how applying text mining techniques can enrich a machine's knowledge about the real world and how this knowledge could be helpful in the difficult realm of moral relativity. Possibilities of simulating empathy and applying the proposed methods to various approaches are introduced, together with a discussion of the possibility of applying a growing knowledge base to artificial agents for particular purposes, from simple housework robots to moral advisors, which could refer to millions of different experiences had by people in various cultures. The experimental results show efficiency improvements when compared to previous research, and the chapter also discusses the problems with fair evaluation of moral and immoral acts.


2010 ◽  
pp. 2310-2325
Author(s):  
Adam Slagell ◽  
Kiran Lakkaraju

It is desirable for many reasons to share information, particularly computer and network logs. Researchers need it for experiments, incident responders need it for collaborative security, and educators need it for real-world examples. However, the sensitive nature of this information often prevents its sharing. Anonymization techniques have been developed in recent years that help reduce risk and navigate the trade-offs between privacy, security and the need to openly share information. This chapter looks at the progress made in this area of research over the past several years, identifies the major problems left to solve and sets a roadmap for future research.


Author(s):  
Kingsley Ofosu-Ampong ◽  
Thomas Anning-Dorson

Despite advances in information technology, studies suggest that there is little knowledge of how developing countries are applying gamification in agriculture, education, business, health, and other domains. Thus, through a systematic review, this chapter examines the extent of gamification research in the developing country context. The review covers 56 articles, with the search conducted in the Scopus database. This chapter explains the idea of game design elements in information systems and provides real-world examples of gamified system outcomes from developing countries. The authors conclude with directions for future research to extend our knowledge of gamification and advance the existing methodologies, domains, and theories.

