data warehouses
Recently Published Documents

TOTAL DOCUMENTS: 1045 (FIVE YEARS: 99)
H-INDEX: 38 (FIVE YEARS: 3)

2021 ◽ pp. 711-719
Author(s): Mary Ayala-Bush, John Jordan, Walter Kuketz

2021 ◽ pp. 243-255
Author(s): Md Badiuzzaman Biplob, Md. Mokammel Haque

2021 ◽ Vol 2111 (1) ◽ pp. 012030
Author(s): A D Barahama, R Wardani

Abstract The utilization of data warehouses in various fields has become a necessity. A data warehouse is a database holding large amounts of data that aims to help organizations, fields, and institutions, specifically with decision making, and it can produce information that will be important in the future. Loading data from various sources and processing it through an ETL (Extract, Transform, Load) process that presents data consistently is the basis for creating a data warehouse architecture. Developing a data warehouse for education will provide significant benefits for educational progress: the integrated data and processing results stored in the warehouse can serve as a basis for evaluation and better planning. The data warehouse is developed using the multidimensional modelling method, which consists of four stages: select the business process, declare the grain, select the dimensions, and identify the facts. These stages produce a data warehouse architecture and contribute to advancing information technology in education.
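The four multidimensional-modelling stages can be illustrated with a minimal star schema. The sketch below uses SQLite and an invented educational example (student course enrolment); all table and column names are illustrative assumptions, not the paper's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stage 1: select the business process -> student course enrolment.
# Stage 2: declare the grain -> one row per student, per course, per term.
# Stage 3: select the dimensions -> student, course, term.
cur.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_course  (course_key  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE dim_term    (term_key    INTEGER PRIMARY KEY, year INTEGER, semester TEXT);

-- Stage 4: identify the facts -> the measures recorded at the declared grain.
CREATE TABLE fact_enrolment (
    student_key INTEGER REFERENCES dim_student(student_key),
    course_key  INTEGER REFERENCES dim_course(course_key),
    term_key    INTEGER REFERENCES dim_term(term_key),
    grade REAL
);
""")

# The "load" step of an ETL pipeline: insert already-transformed rows.
cur.execute("INSERT INTO dim_student VALUES (1, 'Alice')")
cur.execute("INSERT INTO dim_course  VALUES (1, 'Databases')")
cur.execute("INSERT INTO dim_term    VALUES (1, 2021, 'Fall')")
cur.execute("INSERT INTO fact_enrolment VALUES (1, 1, 1, 3.7)")

# An analytical query joining the fact table with a dimension.
avg_grade = cur.execute("""
    SELECT AVG(f.grade)
    FROM fact_enrolment f
    JOIN dim_term t ON f.term_key = t.term_key
    WHERE t.year = 2021
""").fetchone()[0]
print(avg_grade)
```

The fact table carries only foreign keys and measures; each dimension table describes one axis of analysis, which is what makes the schema a star.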


2021 ◽ Vol 17 (4) ◽ pp. 1-28
Author(s): Waqas Ahmed, Esteban Zimányi, Alejandro A. Vaisman, Robert Wrembel

Data warehouses (DWs) evolve in both their content and their schema due to changes in user requirements, business processes, or external sources, to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution for this problem is still lacking. The authors' approach is to separate concerns: temporal DWs deal with content changes, while multiversion DWs deal with schema changes. To address the former, they previously proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. The paper gives the semantics of the schema modification operators (SMOs) used to derive the various schema versions, and shows how online analytical processing (OLAP) operations such as roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given, along with the OLAP operations in standard SQL.
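The roll-up operation mentioned above aggregates measures from a finer level of a dimension hierarchy to a coarser one. A minimal sketch, using an invented city-to-country hierarchy rather than anything from the paper's model:

```python
from collections import defaultdict

# One level of a dimension hierarchy: city -> country.
city_to_country = {"Brussels": "Belgium", "Liège": "Belgium", "Warsaw": "Poland"}

# Fact rows at (city, month) granularity: (city, month, sales_amount).
facts = [
    ("Brussels", "2021-01", 100.0),
    ("Liège",    "2021-01",  50.0),
    ("Warsaw",   "2021-01",  70.0),
]

def roll_up(rows, hierarchy):
    """Aggregate the measure to the parent level of the hierarchy."""
    totals = defaultdict(float)
    for city, month, amount in rows:
        totals[(hierarchy[city], month)] += amount
    return dict(totals)

print(roll_up(facts, city_to_country))
```

In a relational mapping such as the one the paper describes, the same operation becomes a `GROUP BY` over the coarser dimension attribute.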


2021 ◽ Vol 14 (2) ◽ pp. 67-73
Author(s): Valeriy Suhanov, Oleg Lankin

The article deals with new information technologies for building a data warehouse in a distributed information system of critical application. The existing principles for creating data warehouses, as well as the proposed ways to improve them, are invariably associated with collecting, storing, and using information recorded at a certain point in time: the warehouse holds only the data corresponding to the latest time count. This approach to developing and applying data warehouses can be called static, since the behavior of objects at past points in time is neither stored nor displayed. However, the objects included in the data warehouse have pronounced dynamic properties and therefore must be represented dynamically. The way out of this situation is to create analytical data warehouses, which will make it possible to solve both traditional and qualitatively new tasks in the system under consideration more effectively.
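The static-versus-dynamic distinction drawn above can be sketched in a few lines: instead of overwriting an object's state, every change is appended with its timestamp, so the state at any past point in time can be replayed. The class and names below are illustrative assumptions, not the article's design.

```python
class TemporalStore:
    """Append-only store keeping the full timestamped history per key."""

    def __init__(self):
        self._history = {}  # key -> sorted list of (timestamp, value)

    def put(self, key, timestamp, value):
        self._history.setdefault(key, []).append((timestamp, value))
        self._history[key].sort()  # keep versions ordered by time

    def get(self, key, timestamp):
        """Return the value that was current at the given time, if any."""
        current = None
        for ts, value in self._history.get(key, []):
            if ts <= timestamp:
                current = value
            else:
                break
        return current

store = TemporalStore()
store.put("sensor-1", 10, "offline")
store.put("sensor-1", 20, "online")
print(store.get("sensor-1", 15))  # state as of time 15
```

A static warehouse would keep only the `"online"` row; the temporal variant can still answer what the state was at time 15.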


2021
Author(s): Yuzhao Yang, Jérôme Darmont, Franck Ravat, Olivier Teste

JAMIA Open ◽ 2021 ◽ Vol 4 (3)
Author(s): Suparno Datta, Jan Philipp Sachs, Harry Freitas Da Cruz, Tom Martensen, Philipp Bode, ...

Abstract
Objectives: The development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from clinicians, data scientists, and engineers. Specifically, these efforts are focused on data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning electronic health records (EHRs) stored in star schema clinical data warehouses, an approach often adopted in practice. In this article, we introduce the FlexIBle EHR Retrieval (FIBER) tool: a Python library built on top of a star schema (i2b2) clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames.
Materials and Methods: FIBER was developed on top of a large-scale star schema EHR database which contains data from 8 million patients and over 120 million encounters. To illustrate FIBER's capabilities, we present its application by building a heart surgery patient cohort with subsequent prediction of acute kidney injury (AKI) with various machine learning models.
Results: Using FIBER, we were able to build the heart surgery cohort (n = 12 061), identify the patients that developed AKI (n = 1005), and automatically extract relevant features (n = 774). Finally, we trained machine learning models that achieved area under the curve values of up to 0.77 for this exemplary use case.
Conclusion: FIBER is an open-source Python library developed for extracting information from star schema clinical data warehouses and reduces time-to-modeling, helping to streamline the clinical modeling process.
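FIBER's own API is not reproduced here; the sketch below only illustrates the kind of star-schema cohort query such a library wraps, using SQLite and heavily simplified i2b2-style table names (`observation_fact`, `patient_dimension`). The concept codes and data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE patient_dimension (patient_num INTEGER PRIMARY KEY, birth_year INTEGER);
CREATE TABLE observation_fact  (patient_num INTEGER, concept_cd TEXT);
""")
cur.executemany("INSERT INTO patient_dimension VALUES (?, ?)",
                [(1, 1950), (2, 1960), (3, 1945)])
cur.executemany("INSERT INTO observation_fact VALUES (?, ?)",
                [(1, "PROC:heart_surgery"), (2, "PROC:heart_surgery"),
                 (1, "DIAG:aki"), (3, "DIAG:other")])

# Cohort: patients with a heart-surgery procedure code; the last column
# labels whether an AKI diagnosis was also recorded for the patient.
cohort = cur.execute("""
    SELECT p.patient_num, p.birth_year,
           EXISTS (SELECT 1 FROM observation_fact o
                   WHERE o.patient_num = p.patient_num
                     AND o.concept_cd = 'DIAG:aki') AS aki
    FROM patient_dimension p
    WHERE p.patient_num IN (SELECT patient_num FROM observation_fact
                            WHERE concept_cd = 'PROC:heart_surgery')
    ORDER BY p.patient_num
""").fetchall()
print(cohort)
```

A library like FIBER hides this join-and-filter boilerplate behind cohort abstractions and returns the result as a modeling-ready data frame.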


Author(s): Marco Johns, Armin Müller, Felix Nikolaus Wirth, Fabian Prasser

Data-driven methods in biomedical research can help to obtain new insights into the development, progression and therapy of diseases. Clinical and translational data warehouses such as Informatics for Integrating Biology and the Bedside (i2b2) and tranSMART are important solutions for this. Of the well-known FAIR data principles, which address the aspects of findability, accessibility, interoperability and reusability, this paper focuses on findability. For this purpose, we describe a portal solution that acts as a catalogue for a wide range of data warehouse instances, featuring a central access point and links to training material, such as user manuals and video tutorials. Moreover, the portal provides an overview of the status of multiple warehouses for developers and a set of statistics about the data currently loaded. Due to its modular design and the use of modern web technologies, the portal is easy to extend and customize to reflect different corporate designs and institutional requirements.


2021 ◽ Vol 27 (5) ◽ pp. 259-266
Author(s): L. V. Arshinskiy, G. N. Shurkhovetsky

The article considers the application of the dissection-placing method for the secure storage of information in external, primarily cloud-based, data warehouses. Various approaches to implementing the method, including those described in patents, are analyzed. It is shown that the method is most effective when files are dissected bitwise and the bits are placed at random into the data streams sent to the separate warehouses.
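The bitwise dissection with random placement described above can be sketched as follows: a seeded PRNG decides which warehouse stream receives each bit, so no single store holds a usable fragment, and replaying the same seed reassembles the file. This is a toy illustration under invented parameters, not the article's actual scheme (which a real system would combine with encryption and integrity checks).

```python
import random

def to_bits(data: bytes):
    # Least-significant-bit-first expansion of each byte.
    return [(byte >> i) & 1 for byte in data for i in range(8)]

def from_bits(bits):
    out = bytearray()
    for i in range(0, len(bits), 8):
        out.append(sum(bit << j for j, bit in enumerate(bits[i:i + 8])))
    return bytes(out)

def dissect(data: bytes, n_stores: int, seed: int):
    """Scatter the file's bits at random across n_stores streams."""
    rng = random.Random(seed)
    stores = [[] for _ in range(n_stores)]
    for bit in to_bits(data):
        stores[rng.randrange(n_stores)].append(bit)
    return stores

def reassemble(stores, n_bits: int, seed: int):
    """Replay the seeded placement to pull bits back in order."""
    rng = random.Random(seed)
    cursors = [0] * len(stores)
    bits = []
    for _ in range(n_bits):
        k = rng.randrange(len(stores))
        bits.append(stores[k][cursors[k]])
        cursors[k] += 1
    return from_bits(bits)

secret = b"payload"
shares = dissect(secret, n_stores=3, seed=42)
restored = reassemble(shares, n_bits=len(secret) * 8, seed=42)
print(restored == secret)
```

The seed plays the role of the placement key: without it, each warehouse sees only an unordered fraction of the bit stream.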

