star schema
Recently Published Documents

TOTAL DOCUMENTS: 81 (five years: 22)
H-INDEX: 7 (five years: 1)
Author(s): Eka Praja Wiyata Mandala, Randy Permana, Dewi Eka Putri

Motorcycle sales have increased significantly, and motorcycle manufacturers are competing to produce the latest models, which are then sold to consumers. As a result, motorcycle dealers are overwhelmed with ever more data and do not know what to do with it; they also have difficulty calculating total motorcycle sales. We try to provide a solution to this data overflow by proposing the design of a star schema as the basis for creating a data warehouse. To build it, we propose a four-step sequence for creating an effective star schema: analyzing requirements and reporting needs, understanding the business processes, connecting and matching the business processes with suitable entities, and determining the dimensions of the business processes. We obtain a star schema with one fact table, motorcycle_sales, and 11 dimension tables: brand, color, customer, customer_contract, distributor, district, motorcycle, repair_workshop, sell_location, type and time. The star schema is an optimized model that provides the best performance in presenting more complex information.
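
To make the resulting design concrete, the sketch below expresses a fragment of such a star schema in SQLite via Python. It is a minimal illustration, not the authors' actual design: the fact table name and a few of the 11 dimension names come from the abstract, while the columns and types are assumptions.

```python
import sqlite3

# Minimal sketch of the proposed star schema. Fact and dimension table names
# follow the abstract (only 4 of the 11 dimensions are shown); column names
# and types are illustrative assumptions.
ddl = """
CREATE TABLE dim_time       (time_id INTEGER PRIMARY KEY, sale_date TEXT, month INTEGER, year INTEGER);
CREATE TABLE dim_brand      (brand_id INTEGER PRIMARY KEY, brand_name TEXT);
CREATE TABLE dim_customer   (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_motorcycle (motorcycle_id INTEGER PRIMARY KEY, model_name TEXT);

-- The fact table references every dimension and holds the additive measures.
CREATE TABLE motorcycle_sales (
    time_id       INTEGER REFERENCES dim_time(time_id),
    brand_id      INTEGER REFERENCES dim_brand(brand_id),
    customer_id   INTEGER REFERENCES dim_customer(customer_id),
    motorcycle_id INTEGER REFERENCES dim_motorcycle(motorcycle_id),
    units_sold    INTEGER,
    sale_amount   REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(ddl)

# Total sales per brand and year: the aggregate the dealers had trouble computing.
# (Prints one row per brand and year once the warehouse has been loaded.)
query = """
SELECT b.brand_name, t.year, SUM(f.units_sold) AS units, SUM(f.sale_amount) AS revenue
FROM motorcycle_sales f
JOIN dim_brand b ON b.brand_id = f.brand_id
JOIN dim_time  t ON t.time_id  = f.time_id
GROUP BY b.brand_name, t.year;
"""
for row in conn.execute(query):
    print(row)
```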


JAMIA Open, 2021, Vol 4 (3)
Author(s): Suparno Datta, Jan Philipp Sachs, Harry Freitas da Cruz, Tom Martensen, Philipp Bode, ...

Abstract Objectives The development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from clinicians, data scientists, and engineers. Specifically, these efforts are focused on data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning electronic health records (EHRs) stored in star schema clinical data warehouses, an approach often adopted in practice. In this article, we introduce the FlexIBle EHR Retrieval (FIBER) tool: a Python library built on top of a star schema (i2b2) clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames. Materials and Methods FIBER was developed on top of a large-scale star schema EHR database which contains data from 8 million patients and over 120 million encounters. To illustrate FIBER’s capabilities, we present its application by building a heart surgery patient cohort with subsequent prediction of acute kidney injury (AKI) with various machine learning models. Results Using FIBER, we were able to build the heart surgery cohort (n = 12 061), identify the patients that developed AKI (n = 1005), and automatically extract relevant features (n = 774). Finally, we trained machine learning models that achieved area under the curve values of up to 0.77 for this exemplary use case. Conclusion FIBER is an open-source Python library developed for extracting information from star schema clinical data warehouses and reduces time-to-modeling, helping to streamline the clinical modeling process.
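
FIBER's own API is not reproduced here; the sketch below only illustrates the kind of work such a library automates, namely joining an i2b2-style observation_fact table with its dimension tables and pivoting the result into a modeling-ready data frame. The connection string, concept codes and selected columns are assumptions.

```python
import pandas as pd
import sqlalchemy as sa

# Generic sketch of cohort extraction from an i2b2-style star schema
# (observation_fact joined to patient_dimension). This is NOT FIBER's API;
# the connection string and concept codes are illustrative assumptions.
engine = sa.create_engine("postgresql://user:password@localhost/i2b2")  # hypothetical DSN

cohort_sql = """
SELECT f.patient_num, p.sex_cd, f.concept_cd, f.nval_num
FROM   observation_fact  f
JOIN   patient_dimension p ON p.patient_num = f.patient_num
WHERE  f.concept_cd IN ('ICD-10:I21', 'LOINC:2160-0')   -- assumed codes of interest
"""

# One row per (patient, observation); pivot into a patient-level feature matrix.
observations = pd.read_sql(cohort_sql, engine)
features = observations.pivot_table(index="patient_num",
                                    columns="concept_cd",
                                    values="nval_num",
                                    aggfunc="mean")
print(features.head())
```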


2021, pp. 84-91
Author(s): Muhammad Qusyairi, Made Sudarma, Agus Dharma, ...

The data warehouse serves to integrate and condense a company's scattered data, thereby helping executives analyze the existing data to reach quick and accurate strategic decisions. This research aims to design a data warehouse that applies the benefit-cost ratio within its scope, as a way to assess the feasibility of the company's business: data from different sources can be unified and combined with the results of the company's in-depth analysis. In designing the model, this research succeeded in producing a data warehouse that applies the benefit-cost ratio method to an in-depth analysis of the financial sector, providing the feasibility and percentage results of the current business. In summary, the source data processed through the extract, transform, and load process built on the star schema determines the quality of the data generated for queries. In addition, the results of the data warehouse are used for decision making and for formulating a feasible business strategy.
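
The benefit-cost ratio itself is the ratio of discounted benefits to discounted costs. A minimal sketch of that calculation follows, with the cash flows and discount rate as purely illustrative assumptions.

```python
# Benefit-cost ratio (BCR) = present value of benefits / present value of costs.
# Cash flows and the discount rate below are illustrative assumptions.

def present_value(cash_flows, rate):
    """Discount a list of yearly cash flows (year 0 first) back to today."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))

benefits = [0, 400_000, 450_000, 500_000]        # assumed yearly benefits
costs    = [800_000, 100_000, 100_000, 100_000]  # assumed yearly costs
rate = 0.10                                      # assumed discount rate

bcr = present_value(benefits, rate) / present_value(costs, rate)
print(f"Benefit-cost ratio: {bcr:.2f}")          # > 1.0 suggests the business is feasible
```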


2021, Vol 14 (11), pp. 2101-2113
Author(s): Yifei Yang, Matt Youill, Matthew Woicik, Yizhou Liu, Xiangyao Yu, ...

Modern cloud databases adopt a storage-disaggregation architecture that separates the management of computation and storage. A major bottleneck in such an architecture is the network connecting the computation and storage layers. Two solutions have been explored to mitigate the bottleneck: caching and computation pushdown. While both techniques can significantly reduce network traffic, existing DBMSs consider them as orthogonal techniques and support only one or the other, leaving potential performance benefits unexploited. In this paper, we present FlexPushdownDB (FPDB), an OLAP cloud DBMS prototype that supports fine-grained hybrid query execution to combine the benefits of caching and computation pushdown in a storage-disaggregation architecture. We build a hybrid query executor based on a new concept called separable operators to combine data from the cache with results from pushdown processing. We also propose a novel Weighted-LFU cache replacement policy that takes into account the cost of pushdown computation. Our experimental evaluation on the Star Schema Benchmark shows that the hybrid execution outperforms both the conventional caching-only architecture and the pushdown-only architecture by 2.2X. In the hybrid architecture, our experiments show that Weighted-LFU can outperform the baseline LFU by 37%.
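
FPDB itself is a DBMS prototype and its code is not reproduced here; the sketch below only illustrates the idea behind a cost-aware Weighted-LFU policy, where each cached block's score is its access frequency weighted by the pushdown/network cost a hit saves. Block names, capacity and saving values are assumptions.

```python
from collections import defaultdict

# Conceptual sketch of a weighted-LFU cache replacement policy: instead of
# evicting the least-frequently-used block, evict the block with the lowest
# accumulated (frequency x estimated pushdown-saving) score. This is an
# illustrative simplification, not FPDB's actual implementation.

class WeightedLFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}                      # block_id -> data
        self.score = defaultdict(float)      # block_id -> accumulated weighted frequency

    def get(self, block_id, saving):
        """`saving` estimates the computation/network cost avoided by a cache hit."""
        if block_id in self.store:
            self.score[block_id] += saving
            return self.store[block_id]
        return None

    def put(self, block_id, data, saving):
        if block_id not in self.store and len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda b: self.score[b])  # lowest weighted score
            del self.store[victim]
        self.store[block_id] = data
        self.score[block_id] += saving

cache = WeightedLFUCache(capacity=2)
cache.put("lineorder_seg_1", b"...", saving=5.0)   # expensive to recompute via pushdown
cache.put("date_dim_seg_1", b"...", saving=0.5)    # cheap to push down
cache.get("lineorder_seg_1", saving=5.0)
cache.put("customer_seg_1", b"...", saving=1.0)    # evicts the lowest-scoring block
print(sorted(cache.store))
```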


2021, pp. 115226
Author(s): Non Sanprasit, Katechan Jampachaisri, Taravichet Titijaroonroj, Kraisak Kesorn

Author(s): Wildan Suharso, Abims Fardiansa, Yuda Munarko, Hardianto Wibowo

Libraries are service units with high storage complexity, as evidenced by the growing amount of data stored each year. Data that is not integrated makes the problem complex, because the processes carried out keep increasing every year, especially loan circulation: as the number of books grows, borrowing circulation grows with it. At the same time, the library must know exactly which collection of books it holds and which transactions it has made. Much of the data owned by the library cannot be used optimally, so management is unable to make full use of it. In university-scale libraries, this problem worsens when the data is not fully integrated. In this study, a star schema was implemented to solve these data integration problems using a nine-step methodology: choosing the process, choosing the grain, identifying the dimensions, choosing the facts, storing pre-calculations in the fact table, rounding out the dimension tables, choosing the duration of the database, tracking slowly changing dimensions, and deciding the query priorities and query modes. The results of this study indicate that the star schema can be implemented for the library case, together with a data warehouse and OLAP, to support decisions about adding books, and produced 3 dimensions from the 4 grains found.
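
To illustrate the analytical side of such a warehouse, the sketch below rolls up loan circulation along two assumed dimensions (book subject and year); the table and column names are not taken from the paper.

```python
import pandas as pd

# Small sketch of an OLAP-style roll-up over a library loans star schema.
# Dimension and column names are assumptions; the abstract does not list them.
dim_book   = pd.DataFrame({"book_id": [1, 2], "subject": ["Databases", "Networks"]})
dim_member = pd.DataFrame({"member_id": [10, 11], "faculty": ["Engineering", "Economics"]})
fact_loans = pd.DataFrame({
    "book_id":   [1, 1, 2, 2],
    "member_id": [10, 11, 10, 11],
    "year":      [2019, 2019, 2020, 2020],
    "loans":     [3, 1, 2, 4],
})

# Join the fact table with its dimensions, then roll up loans by subject and year.
cube = (fact_loans
        .merge(dim_book, on="book_id")
        .merge(dim_member, on="member_id")
        .pivot_table(index="subject", columns="year", values="loans", aggfunc="sum"))
print(cube)
```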


Author(s): Evi Triandini, Muhammad Syamsu Rijal, Made Pradnyana Ambara

One of the factors that supports a company's progress is its ability to analyze the market well. The company must be able to capture consumer behavior properly, so that management can evaluate and analyze it to produce strategic policies. A data warehouse can serve as a new solution to management's problem of tracking sales information over time. The problem addressed in this research is how to implement a data warehouse to manage tour product sales data over several periods. The goal of this research is to produce a data warehouse application that provides the information the company needs to determine strategic policy. The research uses Ralph Kimball's Nine Steps Design Methodology to design the data warehouse schema, and the dimensional model used is the star schema, chosen for its speed in retrieving data. The research has produced a tour product sales data warehouse that presents sales information in charts, with detailed data available when needed. The system also makes it easy for users to view the data along the required dimensions, for example the client, vendor and category dimensions, which helps management see how product sales are distributed across those dimensions and obtain the information it needs.
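
As a rough illustration of the "view sales along a chosen dimension" feature described above, the sketch below slices assumed tour sales data by the category dimension and renders it as a bar chart; the data and column names are invented for the example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative tour sales data; dimension names (client, category) follow the
# abstract, while the values and column names are assumptions.
sales = pd.DataFrame({
    "category": ["Adventure", "Adventure", "Culinary", "Culinary", "Beach"],
    "client":   ["Acme", "Borneo Travel", "Acme", "Acme", "Borneo Travel"],
    "period":   ["2020-Q1", "2020-Q2", "2020-Q1", "2020-Q2", "2020-Q2"],
    "revenue":  [1200, 800, 450, 600, 950],
})

# Slice the fact data along the dimension management wants to inspect.
by_category = sales.groupby("category")["revenue"].sum()
print(by_category)

# The system described above presents the result as a chart.
by_category.plot(kind="bar", title="Tour product revenue by category")
plt.tight_layout()
plt.show()
```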


Author(s): Claudivan Cruz Lopes, Valéria Cesário-Times, Stan Matwin, Cristina Dutra de Aguiar Ciferri, Ricardo Rodrigues Ciferri

A cloud data warehouse (cloud DW) is a subject-oriented, integrated, time-variant, voluminous, nonvolatile and multidimensional distributed database that is hosted in a cloud. One solution for ensuring data confidentiality in a cloud DW is cryptography. In this article, the authors propose an encryption methodology for a cloud DW stored according to the star schema, considering both the maintenance of the DW's data confidentiality and the capability of processing analytical queries directly over the encrypted DW. The proposed methodology comprises an encryption strategy for DWs called MV-HO (MultiValued and HOmomorphic), which defines how the different types of DW attributes must be encrypted. The proposed MV-HO encryption strategy was compared with strategies based on symmetric encryption, order-preserving symmetric encryption and homomorphic encryption. Results indicated that MV-HO is the best solution found, as MV-HO is Pareto-optimal with respect to the other strategies investigated.
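
The MV-HO strategy itself is not reproduced here, but the homomorphic ingredient such strategies build on can be illustrated with a toy additively homomorphic (Paillier-style) scheme: an additive measure can be summed directly over its ciphertexts, so an aggregate query can be answered without decrypting individual rows. The tiny primes below keep the example readable; it is not secure and is not the authors' scheme.

```python
import math, random

# Toy Paillier cryptosystem illustrating additive homomorphism: multiplying
# ciphertexts adds the underlying plaintexts, so SUM() over an encrypted fact
# column can be evaluated on ciphertexts. Parameters are toy-sized on purpose.
p, q = 293, 433
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = n + 1
mu = pow(lam, -1, n)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:               # r must be invertible modulo n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

sales = [120, 75, 310]                        # plaintext measures in a fact table
ciphertexts = [encrypt(s) for s in sales]

# Homomorphic aggregation: the product of ciphertexts decrypts to the sum.
encrypted_sum = math.prod(ciphertexts) % n2
assert decrypt(encrypted_sum) == sum(sales)
print(decrypt(encrypted_sum))                 # 505
```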


2020, Vol 7 (2-1), pp. 31-43
Author(s): Nidia Rodríguez Mazahua, Lisbeth Rodríguez Mazahua, Asdrúbal López Chau, Giner Alor Hernández

One of the main problems faced by data warehouse designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods; however, no horizontal fragmentation technique exists that uses a decision tree. This paper presents an analysis of different decision tree algorithms to select the best one for implementing the fragmentation method. The analysis was performed with Weka version 3.9.4, considering four evaluation metrics (Precision, ROC Area, Recall and F-measure) on different data sets selected from the Star Schema Benchmark. The results showed that the two best algorithms were J48 and Random Forest in most cases; nevertheless, J48 was selected because it is more efficient in building the model.
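
The paper's comparison uses Weka's J48 (C4.5) and Random Forest; the sketch below is a rough Python analogue using scikit-learn, where DecisionTreeClassifier (CART) stands in for J48 and synthetic data stands in for the data sets derived from the Star Schema Benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the benchmark-derived data sets; shapes are arbitrary.
X, y = make_classification(n_samples=1000, n_features=12, n_informative=6, random_state=0)

# Cross-validated comparison on the same four metrics used in the paper.
for name, model in [("decision tree", DecisionTreeClassifier(random_state=0)),
                    ("random forest", RandomForestClassifier(n_estimators=100, random_state=0))]:
    scores = cross_validate(model, X, y, cv=5,
                            scoring=["precision", "recall", "f1", "roc_auc"])
    summary = {m: scores[f"test_{m}"].mean() for m in ["precision", "recall", "f1", "roc_auc"]}
    print(name, {k: round(v, 3) for k, v in summary.items()})
```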

