data warehouses
Recently Published Documents

TOTAL DOCUMENTS: 1045 (FIVE YEARS: 99)
H-INDEX: 38 (FIVE YEARS: 3)

2021 ◽ pp. 711-719
Author(s): Mary Ayala-Bush, John Jordan, Walter Kuketz

2021 ◽ pp. 243-255
Author(s): Md Badiuzzaman Biplob, Md. Mokammel Haque

2021 ◽ Vol 2111 (1) ◽ pp. 012030
Author(s): A D Barahama, R Wardani

Abstract The utilization of data warehouses in various fields has become a necessity. A data warehouse is a database holding large amounts of data that aims to help organizations, fields, and institutions, specifically with decision making, and it can produce information that will be important in the future. Loading data from various sources and processing it through an ETL (Extract, Transform, Load) process that presents data consistently is the basis for creating a data warehouse architecture. Developing a data warehouse for education will provide significant benefits for educational progress: the integrated data and processing results stored in the warehouse can serve as a basis for evaluation and better planning. The data warehouse is developed using the multidimensional modelling method, which consists of four stages: select the business process, declare the grain, select the dimensions, and identify the facts. These stages produce a data warehouse architecture and contribute to advancing information technology in education.
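The four multidimensional-modelling stages can be illustrated with a minimal star schema. The sketch below uses SQLite and an invented educational example (student course enrolment); all table and column names are illustrative assumptions, not the paper's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stage 1: select the business process -> student course enrolment.
# Stage 2: declare the grain -> one row per student, per course, per term.
# Stage 3: select the dimensions -> student, course, term.
cur.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_course  (course_key  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE dim_term    (term_key    INTEGER PRIMARY KEY, year INTEGER, semester TEXT);

-- Stage 4: identify the facts -> the measures recorded at the declared grain.
CREATE TABLE fact_enrolment (
    student_key INTEGER REFERENCES dim_student(student_key),
    course_key  INTEGER REFERENCES dim_course(course_key),
    term_key    INTEGER REFERENCES dim_term(term_key),
    grade REAL
);
""")

# The "load" step of an ETL pipeline: insert already-transformed rows.
cur.execute("INSERT INTO dim_student VALUES (1, 'Alice')")
cur.execute("INSERT INTO dim_course  VALUES (1, 'Databases')")
cur.execute("INSERT INTO dim_term    VALUES (1, 2021, 'Fall')")
cur.execute("INSERT INTO fact_enrolment VALUES (1, 1, 1, 3.7)")

# An analytical query joining the fact table with a dimension.
avg_grade = cur.execute("""
    SELECT AVG(f.grade)
    FROM fact_enrolment f
    JOIN dim_term t ON f.term_key = t.term_key
    WHERE t.year = 2021
""").fetchone()[0]
print(avg_grade)
```

The fact table carries only foreign keys and measures; each dimension table describes one axis of analysis, which is what makes the schema a star.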


2021 ◽ Vol 17 (4) ◽ pp. 1-28
Author(s): Waqas Ahmed, Esteban Zimányi, Alejandro A. Vaisman, Robert Wrembel

Data warehouses (DWs) evolve in both their content and their schema due to changes in user requirements, business processes, or external sources, to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution for this problem is still lacking. The authors' approach is to separate concerns: temporal DWs deal with content changes, while multiversion DWs deal with schema changes. To address the former, they previously proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. The paper gives the semantics of the schema modification operators (SMOs) used to derive the various schema versions, and shows how online analytical processing (OLAP) operations such as roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given, along with the OLAP operations in standard SQL.
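The roll-up operation mentioned above aggregates measures from a finer level of a dimension hierarchy to a coarser one. A minimal sketch, using an invented city-to-country hierarchy rather than anything from the paper's model:

```python
from collections import defaultdict

# One level of a dimension hierarchy: city -> country.
city_to_country = {"Brussels": "Belgium", "Liège": "Belgium", "Warsaw": "Poland"}

# Fact rows at (city, month) granularity: (city, month, sales_amount).
facts = [
    ("Brussels", "2021-01", 100.0),
    ("Liège",    "2021-01",  50.0),
    ("Warsaw",   "2021-01",  70.0),
]

def roll_up(rows, hierarchy):
    """Aggregate the measure to the parent level of the hierarchy."""
    totals = defaultdict(float)
    for city, month, amount in rows:
        totals[(hierarchy[city], month)] += amount
    return dict(totals)

print(roll_up(facts, city_to_country))
```

In a relational mapping such as the one the paper describes, the same operation becomes a `GROUP BY` over the coarser dimension attribute.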


2021 ◽ Vol 14 (2) ◽ pp. 67-73
Author(s): Valeriy Suhanov, Oleg Lankin

The article deals with new information technologies for building a data warehouse in a distributed information system of critical application. The existing principles for creating data warehouses, as well as the proposed ways to improve them, are invariably associated with collecting, storing, and using information recorded at a certain point in time: the warehouse holds only the data corresponding to the latest time count. This approach to developing and applying data warehouses can be called static, since the behavior of objects at past points in time is neither stored nor displayed. However, the objects included in the data warehouse have pronounced dynamic properties and therefore must be represented dynamically. The way out of this situation is to create analytical data warehouses, which will make it possible to solve both traditional and qualitatively new tasks in the system under consideration more effectively.
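The static-versus-dynamic distinction drawn above can be sketched in a few lines: instead of overwriting an object's state, every change is appended with its timestamp, so the state at any past point in time can be replayed. The class and names below are illustrative assumptions, not the article's design.

```python
class TemporalStore:
    """Append-only store keeping the full timestamped history per key."""

    def __init__(self):
        self._history = {}  # key -> sorted list of (timestamp, value)

    def put(self, key, timestamp, value):
        self._history.setdefault(key, []).append((timestamp, value))
        self._history[key].sort()  # keep versions ordered by time

    def get(self, key, timestamp):
        """Return the value that was current at the given time, if any."""
        current = None
        for ts, value in self._history.get(key, []):
            if ts <= timestamp:
                current = value
            else:
                break
        return current

store = TemporalStore()
store.put("sensor-1", 10, "offline")
store.put("sensor-1", 20, "online")
print(store.get("sensor-1", 15))  # state as of time 15
```

A static warehouse would keep only the `"online"` row; the temporal variant can still answer what the state was at time 15.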


2021
Author(s): Yuzhao Yang, Jérôme Darmont, Franck Ravat, Olivier Teste

JAMIA Open ◽ 2021 ◽ Vol 4 (3)
Author(s): Suparno Datta, Jan Philipp Sachs, Harry Freitas Da Cruz, Tom Martensen, Philipp Bode, ...

Abstract
Objectives: The development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from clinicians, data scientists, and engineers. Specifically, these efforts are focused on data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning electronic health records (EHRs) stored in star schema clinical data warehouses, an approach often adopted in practice. In this article, we introduce the FlexIBle EHR Retrieval (FIBER) tool: a Python library built on top of a star schema (i2b2) clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames.
Materials and Methods: FIBER was developed on top of a large-scale star schema EHR database which contains data from 8 million patients and over 120 million encounters. To illustrate FIBER's capabilities, we present its application by building a heart surgery patient cohort with subsequent prediction of acute kidney injury (AKI) with various machine learning models.
Results: Using FIBER, we were able to build the heart surgery cohort (n = 12 061), identify the patients that developed AKI (n = 1005), and automatically extract relevant features (n = 774). Finally, we trained machine learning models that achieved area under the curve values of up to 0.77 for this exemplary use case.
Conclusion: FIBER is an open-source Python library developed for extracting information from star schema clinical data warehouses and reduces time-to-modeling, helping to streamline the clinical modeling process.
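FIBER's own API is not reproduced here; the sketch below only illustrates the kind of star-schema cohort query such a library wraps, using SQLite and heavily simplified i2b2-style table names (`observation_fact`, `patient_dimension`). The concept codes and data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE patient_dimension (patient_num INTEGER PRIMARY KEY, birth_year INTEGER);
CREATE TABLE observation_fact  (patient_num INTEGER, concept_cd TEXT);
""")
cur.executemany("INSERT INTO patient_dimension VALUES (?, ?)",
                [(1, 1950), (2, 1960), (3, 1945)])
cur.executemany("INSERT INTO observation_fact VALUES (?, ?)",
                [(1, "PROC:heart_surgery"), (2, "PROC:heart_surgery"),
                 (1, "DIAG:aki"), (3, "DIAG:other")])

# Cohort: patients with a heart-surgery procedure code; the last column
# labels whether an AKI diagnosis was also recorded for the patient.
cohort = cur.execute("""
    SELECT p.patient_num, p.birth_year,
           EXISTS (SELECT 1 FROM observation_fact o
                   WHERE o.patient_num = p.patient_num
                     AND o.concept_cd = 'DIAG:aki') AS aki
    FROM patient_dimension p
    WHERE p.patient_num IN (SELECT patient_num FROM observation_fact
                            WHERE concept_cd = 'PROC:heart_surgery')
    ORDER BY p.patient_num
""").fetchall()
print(cohort)
```

A library like FIBER hides this join-and-filter boilerplate behind cohort abstractions and returns the result as a modeling-ready data frame.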


Author(s): Marco Johns, Armin Müller, Felix Nikolaus Wirth, Fabian Prasser

Data-driven methods in biomedical research can help to obtain new insights into the development, progression and therapy of diseases. Clinical and translational data warehouses such as Informatics for Integrating Biology and the Bedside (i2b2) and tranSMART are important solutions for this. Of the well-known FAIR data principles, which address the aspects of findability, accessibility, interoperability and reusability, this paper focuses on findability. For this purpose, we describe a portal solution that acts as a catalogue for a wide range of data warehouse instances, featuring a central access point and links to training material, such as user manuals and video tutorials. Moreover, the portal provides an overview of the status of multiple warehouses for developers and a set of statistics about the data currently loaded. Due to its modular design and the use of modern web technologies, the portal is easy to extend and customize to reflect different corporate designs and institutional requirements.


2021 ◽ Vol 27 (5) ◽ pp. 259-266
Author(s): L. V. Arshinskiy, G. N. Shurkhovetsky

The article considers the application of the dissection-placing method for the secure storage of information in external, primarily cloud-based, data warehouses. Various approaches to implementing the method, including those described in patents, are analyzed. It is shown that the method is most effective when files are dissected bitwise and the bits are placed at random into the data streams sent to the separate warehouses.
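The bitwise dissection with random placement described above can be sketched as follows: a seeded PRNG decides which warehouse stream receives each bit, so no single store holds a usable fragment, and replaying the same seed reassembles the file. This is a toy illustration under invented parameters, not the article's actual scheme (which a real system would combine with encryption and integrity checks).

```python
import random

def to_bits(data: bytes):
    # Least-significant-bit-first expansion of each byte.
    return [(byte >> i) & 1 for byte in data for i in range(8)]

def from_bits(bits):
    out = bytearray()
    for i in range(0, len(bits), 8):
        out.append(sum(bit << j for j, bit in enumerate(bits[i:i + 8])))
    return bytes(out)

def dissect(data: bytes, n_stores: int, seed: int):
    """Scatter the file's bits at random across n_stores streams."""
    rng = random.Random(seed)
    stores = [[] for _ in range(n_stores)]
    for bit in to_bits(data):
        stores[rng.randrange(n_stores)].append(bit)
    return stores

def reassemble(stores, n_bits: int, seed: int):
    """Replay the seeded placement to pull bits back in order."""
    rng = random.Random(seed)
    cursors = [0] * len(stores)
    bits = []
    for _ in range(n_bits):
        k = rng.randrange(len(stores))
        bits.append(stores[k][cursors[k]])
        cursors[k] += 1
    return from_bits(bits)

secret = b"payload"
shares = dissect(secret, n_stores=3, seed=42)
restored = reassemble(shares, n_bits=len(secret) * 8, seed=42)
print(restored == secret)
```

The seed plays the role of the placement key: without it, each warehouse sees only an unordered fraction of the bit stream.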

