Bitmap Indices for Data Warehouses

2011 ◽  
pp. 157-178 ◽  
Author(s):  
Kurt Stockinger ◽  
Kesheng Wu

In this chapter we discuss various bitmap index technologies for efficient query processing in data warehousing applications. We review the existing literature and organize the technology into three categories, namely bitmap encoding, compression and binning. We introduce an efficient bitmap compression algorithm and examine the space and time complexity of the compressed bitmap index on large data sets from real applications. According to the conventional wisdom, bitmap indices are only efficient for low-cardinality attributes. However, we show that the compressed bitmap indices are also efficient for high-cardinality attributes. Timing results demonstrate that the bitmap indices significantly outperform the projection index, which is often considered to be the most efficient access method for multi-dimensional queries. Finally, we review the bitmap index technology currently supported by commonly used commercial database systems and discuss open issues for future research and development.
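
The core idea is small enough to sketch. Below is a minimal, uncompressed equality-encoded bitmap index in Python, offered only as an illustration of the querying mechanism, not the compressed index the chapter introduces; the toy column and its values are invented. Python integers stand in for bitsets, and a range query reduces to bitwise ORs over the matching bitmaps.

```python
# Minimal sketch of an equality-encoded bitmap index: one bitset per distinct
# value, queries answered with bitwise operations. Python ints serve as
# uncompressed bitsets; a real index would compress the long runs of zeros.

from collections import defaultdict

def build_bitmap_index(column):
    """Return {value: bitset} where bit i is set iff column[i] == value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)

def query_range(index, lo, hi):
    """Rows with lo <= value <= hi, via a bitwise OR of the matching bitmaps."""
    result = 0
    for value, bitmap in index.items():
        if lo <= value <= hi:
            result |= bitmap
    return [row for row in range(result.bit_length()) if (result >> row) & 1]

# Toy data: a low-cardinality attribute with 8 rows.
column = [3, 1, 3, 2, 1, 3, 2, 1]
index = build_bitmap_index(column)
print(query_range(index, 2, 3))   # rows whose value is 2 or 3 -> [0, 2, 3, 5, 6]
```

A production index would add the techniques the chapter surveys: compressing runs of identical bits, and binning several values into one bitmap for high-cardinality attributes.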


Big Data ◽  
2016 ◽  
pp. 2249-2274
Author(s):  
Chinh Nguyen ◽  
Rosemary Stockdale ◽  
Helana Scheepers ◽  
Jason Sargent

The rapid development of technology and the interactive nature of Government 2.0 (Gov 2.0) are generating large data sets for Government, resulting in a struggle to control, manage, and extract the right information. Therefore, research into these large data sets (termed Big Data) has become necessary. Governments are now spending significant sums on storing and processing vast amounts of information because of the huge proliferation and complexity of Big Data and a lack of effective records management. On the other hand, Electronic Records Management (ERM) is a method for controlling and governing the important data of an organisation. This paper investigates the challenges identified from reviewing the literature for Gov 2.0, Big Data, and ERM in order to develop a better understanding of the application of ERM to Big Data to extract useable information in the context of Gov 2.0. The paper suggests that a key building block in providing useable information to stakeholders could potentially be ERM with its well established governance policies. A framework is constructed to illustrate how ERM can play a role in the context of Gov 2.0. Future research is necessary to address the specific constraints and expectations placed on governments in terms of data retention and use.


2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Christian Montag ◽  
Éilish Duke ◽  
Alexander Markowetz

The present paper provides insight into an emerging research discipline called Psychoinformatics. In the context of Psychoinformatics, we emphasize the cooperation between the disciplines of psychology and computer science in handling large data sets derived from heavily used devices, such as smartphones or online social network sites, in order to shed light on a large number of psychological traits, including personality and mood. New challenges await psychologists in light of the resulting “Big Data” sets, because classic psychological methods will only in part be able to analyze this data derived from ubiquitous mobile devices, as well as other everyday technologies. As a consequence, psychologists must enrich their scientific methods through the inclusion of methods from informatics. The paper provides a brief review of one area of this research field, dealing mainly with social networks and smartphones. Moreover, we highlight how data derived from Psychoinformatics can be combined in a meaningful way with data from human neuroscience. We close the paper with some observations of areas for future research and problems that require consideration within this new discipline.
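
To make the kind of analysis concrete, the sketch below pairs a passively logged smartphone measure with a self-report measure and computes a simple association. The column names and values are invented for illustration and stand in for the far larger data sets and validated instruments such work would actually use.

```python
# Minimal sketch of a Psychoinformatics-style analysis: a behavioural trace
# logged by a device is related to a psychological variable reported by the
# user. Data are hypothetical and serve only to illustrate the workflow.

import pandas as pd

daily = pd.DataFrame({
    "screen_time_min": [210, 95, 180, 60, 240, 130, 75],   # logged by the smartphone
    "mood_rating":     [3, 7, 4, 8, 2, 5, 7],               # self-reported, 1-10 scale
})

# A simple association between device usage and reported mood (Pearson r).
print(daily["screen_time_min"].corr(daily["mood_rating"]))
```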


Machine learning is a technology that uses accumulated data to support better decisions in future applications. It is the scientific study of algorithms that perform a specific task efficiently without explicit instructions, and it may be viewed as a subset of artificial intelligence concerned with the ability to learn and improve automatically from experience without being explicitly programmed. Its primary intention is to allow computers to learn automatically and produce more accurate results in order to identify profitable opportunities. Combining machine learning with AI and cognitive technologies can make it even more effective at processing large volumes of information without human intervention or assistance, adjusting actions accordingly. It is also linked to algorithm-driven study aimed at improving the performance of tasks. In such a scenario, the techniques can be applied to analyze and make predictions over large data sets. This paper concerns the mechanism of supervised learning in database systems, which would be self-driven as well as secure. A case of an organization dealing with student loans is also presented. The paper ends with a discussion, future directions, and a conclusion.
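
As a rough illustration of the supervised-learning mechanism described above, and only as a sketch with invented feature names and toy data rather than the paper's self-driven, secure database system, a classifier can be trained on labelled historical records and then asked to predict the outcome for a new record, for example in a student-loan setting:

```python
# Minimal supervised-learning sketch: learn from labelled historical records,
# then predict for unseen ones. Features and labels are hypothetical.

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical labelled records: [loan_amount, annual_income, years_in_repayment]
X = [[12000, 35000, 2], [30000, 28000, 5], [8000, 42000, 1],
     [25000, 30000, 6], [15000, 50000, 2], [40000, 26000, 8]]
y = [1, 0, 1, 0, 1, 0]   # 1 = loan repaid, 0 = default

# Scale the features, then fit a logistic-regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)                              # learn from accumulated data
print(model.predict([[20000, 38000, 3]]))    # predicted outcome for a new applicant
```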


2019 ◽  
Vol 29 (1) ◽  
pp. 92-101
Author(s):  
Alexander Weiss ◽  
Ian J. Deary

People with higher levels of neuroticism seem to have drawn the short straw of personality. However, there are multiple ways to score highly in neuroticism. Analyses of the short scale of the Eysenck Personality Questionnaire-Revised in three large data sets have revealed that higher neuroticism can mean having elevated scores on all items, elevated scores mainly on items related to anxiety and tension, or elevated scores mainly on items related to worry and vulnerability. Epidemiological and molecular genetic studies have revealed that people in the first group are at greater risk for poorer mental and physical health but that people in the latter two groups, especially those beset by worry and feelings of vulnerability, have better physical health. These findings suggest that future research on neuroticism and health should focus on different ways that people can exhibit high neuroticism.


2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Karen Swan

The second session of the Sloan-C Summer Workshop focused on research and how it might help us meet this challenge. In particular, presenters in this session were charged with addressing what the research to date can tell us about student, faculty and institutional change, what directions for future research seem most promising, and what we really need to do to move research on online learning to more rigorous and more informative levels. The papers they wrote are collected in this section. They include: a critical review of what the research literature can tell us about blended learning relative to each of Sloan-C’s five pillars of quality in online learning; two papers on one of the more promising lines of research in online learning, research involving the Community of Inquiry framework; an intriguing look at what very large data sets and innovative methodologies can tell us about our students and their reactions to blended course offerings; and an equally provocative thought piece on research on online learning in general, which asks us to reconsider how we frame that enterprise, arguing that reframing it might generate more meaningful outcomes. The papers are both informative and thought-provoking, and although they may generate more questions than they answer, they clearly suggest directions for future research that could move our understanding of online education forward in interesting and important ways.


Author(s):  
Gábor Szárnyas ◽  
János Maginecz ◽  
Dániel Varró

The last decade brought considerable improvements in distributed storage and query technologies, known as NoSQL systems. These systems provide quick evaluation of simple retrieval operations and are able to answer certain complex queries in a scalable way, albeit not instantly. Providing scalability and quick response times at the same time for querying large data sets is still a challenging task. Evaluating complex graph queries is particularly difficult, as it requires lots of join, antijoin and filtering operations. This paper presents optimization techniques used in relational database systems and applies them on graph queries. We evaluate various query plans on multiple datasets and discuss the effect of different optimization techniques.
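
As a rough illustration of two of the relational techniques mentioned, a hash join and an antijoin, the sketch below evaluates a small graph pattern over an invented edge set; it is not the paper's query engine or benchmark. The pattern asks for pairs (a, c) connected through some intermediate node b but with no direct edge from a to c.

```python
# Hypothetical edge relation for a tiny directed graph.
edges = {("n1", "n2"), ("n2", "n3"), ("n1", "n3"), ("n2", "n4"), ("n4", "n5")}

# Hash join of the edge relation with itself on the shared node b:
# build a hash table on the join key once, then probe it, instead of
# re-scanning the relation in a nested loop.
by_source = {}
for src, dst in edges:
    by_source.setdefault(src, []).append(dst)

two_hop = {(a, c) for a, b in edges for c in by_source.get(b, [])}

# Antijoin: keep only pairs (a, c) that have no direct a -> c edge.
result = {(a, c) for (a, c) in two_hop if (a, c) not in edges}
print(sorted(result))   # [('n1', 'n4'), ('n2', 'n5')]
```

The same plan-level choices the paper studies, such as which joins to hash, how to order them, and where to apply filters and antijoins, determine how well such pattern queries scale on large data sets.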


2020 ◽  
Vol 10 (2) ◽  
pp. 7-10
Author(s):  
Deepti Pandey

This article provides insight into an emerging research discipline called Psychoinformatics. In the context of Psychoinformatics, we emphasize the cooperation between the disciplines of Psychology and Information Science in handling large data sets derived from heavily used devices, such as smartphones or online social networks, in order to highlight psychological qualities, including both personality and mood. New challenges await psychologists in light of the resulting “Big Data” sets, because classic psychological methods will only in part be able to analyze this data derived from ubiquitous mobile devices as well as other everyday technologies. Consequently, psychologists must enrich their scientific methods through the inclusion of methods from informatics. Furthermore, we emphasize how data derived from Psychoinformatics can be combined in a meaningful way with data from human neuroscience. We close the article with some observations of areas for future research and problems that require consideration within this new discipline.


2001 ◽  
Vol 7 ◽  
pp. 185-206 ◽  
Author(s):  
Lindsey R. Leighton

Because of their great abundance, widespread distribution, excellent preservation potential (Foote and Sepkoski, 1999), and a tendency not to disarticulate after death, brachiopods are ideal subjects for paleoecological research involving morphometrics, population analysis, and phylogenetics. Paleoecology is a subdiscipline that demands large data sets and statistical tests, and brachiopods provide the opportunity to create such databases. I fully expect to see brachiopods play a major role in the coming years in studies on the cutting edge of paleoecology. My approach in this chapter is to provide some background and tools that hopefully will inspire many new ideas for using brachiopods in the study of paleoecology. My intent is not to convince anyone of the correctness of my ideas, but rather to encourage future research in these directions.


