Data warehousing
Recently Published Documents

TOTAL DOCUMENTS: 1154 (FIVE YEARS: 72)
H-INDEX: 32 (FIVE YEARS: 4)

2022 ◽  
Author(s):  
M. Asif Naeem ◽  
Wasiullah Waqar ◽  
Farhaan Mirza ◽  
Ali Tahir

Abstract Semi-stream join is an emerging research problem in the domain of near-real-time data warehousing. A semi-stream join is a join between a fast data stream (S) and a slow disk-based relation (R). In the modern era, huge amounts of data are generated daily and must be analyzed promptly to support successful business decisions. With this in mind, a well-known algorithm called CACHEJOIN (Cache Join) was proposed. The limitation of CACHEJOIN is that it does not deal efficiently with frequently changing trends in the stream. To overcome this limitation, in this paper we propose TinyLFU-CACHEJOIN, a modified version of the original CACHEJOIN algorithm designed to enhance its performance. TinyLFU-CACHEJOIN employs an intelligent strategy that keeps in the cache only those records of R that have a high hit rate in S. This mechanism allows it to cope with sudden and abrupt trend changes in S. We developed a cost model for TinyLFU-CACHEJOIN and validated it empirically. We also compared the performance of TinyLFU-CACHEJOIN against the existing CACHEJOIN algorithm on a skewed synthetic dataset. The experiments showed that TinyLFU-CACHEJOIN significantly outperforms CACHEJOIN.
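
The abstract gives no pseudocode, so the following is a minimal Python sketch of the admission idea as we read it: a TinyLFU-style frequency sketch counts how often each join key is probed by S, and a candidate R record is admitted to the cache only if its estimated frequency beats that of the eviction victim. All names (CountMinSketch, TinyLFUCache, offer, lookup_r_on_disk) are our own illustrative assumptions, not the authors' implementation.

```python
import random
from collections import OrderedDict

class CountMinSketch:
    """Approximate per-key frequency counter (the TinyLFU sketch)."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.tables = [[0] * width for _ in range(depth)]
        self.seeds = [random.randrange(1 << 30) for _ in range(depth)]

    def _idx(self, key, i):
        return hash((self.seeds[i], key)) % self.width

    def add(self, key):
        for i in range(self.depth):
            self.tables[i][self._idx(key, i)] += 1

    def estimate(self, key):
        return min(self.tables[i][self._idx(key, i)] for i in range(self.depth))

class TinyLFUCache:
    """Cache over R, keyed by the join attribute; admission is frequency-gated."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.sketch = CountMinSketch()
        self.store = OrderedDict()            # join key -> R tuple, LRU order

    def lookup(self, key):
        self.sketch.add(key)                  # every probe from S is counted
        if key in self.store:
            self.store.move_to_end(key)       # refresh recency
            return self.store[key]
        return None

    def offer(self, key, r_tuple):
        """Called after a disk lookup of R; admit only keys that are hot in S."""
        if len(self.store) < self.capacity:
            self.store[key] = r_tuple
        else:
            victim = next(iter(self.store))   # least-recently-used entry
            if self.sketch.estimate(key) > self.sketch.estimate(victim):
                del self.store[victim]        # evict the colder record
                self.store[key] = r_tuple

def semi_stream_join(stream, lookup_r_on_disk, cache):
    """Join each tuple of S with its matching R tuple, cache first."""
    for s_tuple in stream:
        key = s_tuple["join_key"]
        r_tuple = cache.lookup(key)           # fast path: cached R record
        if r_tuple is None:
            r_tuple = lookup_r_on_disk(key)   # slow path: disk-based R
            cache.offer(key, r_tuple)
        yield {**s_tuple, **r_tuple}
```

On a skewed stream, keys that are probed often accumulate high sketch counts and displace colder entries, which is how a frequency-gated cache can track the abrupt trend changes in S that the abstract describes.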


2022 ◽  
Vol 18 (1) ◽  
pp. 0-0

Social media data have become an integral part of business data and should be integrated into the decision-making process, since they reflect the true state of a business in any field more faithfully. However, social media data are unstructured and generated at a very high frequency that exceeds the capacity of the data warehouse. In this work, we propose to extend the data warehousing process with a staging area whose core is a large-scale system implementing an information extraction process using the Storm and Hadoop frameworks, to better manage the volume and frequency of the data. For structured information extraction, mainly events, we combine techniques from NLP, linguistic rules, and machine learning to accomplish the task. Finally, we propose an appropriate data warehouse conceptual model for modeling events and integrating them with the enterprise data warehouse through an intermediate table called a bridge table. For application and experiments, we focus on extracting drug-abuse events from Twitter data and modeling them in the event data warehouse.
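
The abstract names a bridge table for integrating extracted events with the enterprise warehouse but gives no schema, so below is a minimal sqlite3 sketch of the general bridge-table pattern under our own assumptions; every table and column name (event_fact, drug_dim, event_drug_bridge) is illustrative, not taken from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- Event fact table populated by the extraction pipeline
CREATE TABLE event_fact (
    event_id   INTEGER PRIMARY KEY,
    event_type TEXT,      -- e.g. 'drug_abuse'
    event_date TEXT,
    source     TEXT       -- e.g. 'twitter'
);
-- An existing enterprise dimension referenced by events
CREATE TABLE drug_dim (
    drug_id   INTEGER PRIMARY KEY,
    drug_name TEXT
);
-- Bridge table: resolves the many-to-many link between the two
CREATE TABLE event_drug_bridge (
    event_id INTEGER REFERENCES event_fact(event_id),
    drug_id  INTEGER REFERENCES drug_dim(drug_id)
);
""")
cur.execute("INSERT INTO event_fact VALUES (1, 'drug_abuse', '2021-06-01', 'twitter')")
cur.executemany("INSERT INTO drug_dim VALUES (?, ?)",
                [(10, 'opioid'), (11, 'stimulant')])
cur.executemany("INSERT INTO event_drug_bridge VALUES (?, ?)", [(1, 10), (1, 11)])

# Querying through the bridge joins one extracted event to many dimension rows.
for row in cur.execute("""
    SELECT e.event_date, d.drug_name
    FROM event_fact e
    JOIN event_drug_bridge b ON b.event_id = e.event_id
    JOIN drug_dim d          ON d.drug_id  = b.drug_id
"""):
    print(row)
```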


Author(s):  
Monika Soni

The aim of this paper is to explain the concept of data warehousing and how it is implemented. Data warehousing supports the analysis of an organisation's data and makes the analysis process easier for the organisation's workers. The paper also explains the two approaches commonly followed in data warehousing, discusses the process of implementing a data warehouse, and considers the challenges involved in creating one.


2021 ◽  
Author(s):  
Flavio de Assis Vilela ◽  
Ricardo Rodrigues Ciferri

ETL (Extract, Transform, and Load) is an essential process for data extraction in knowledge discovery in databases and in data warehousing environments. The ETL process gathers data available from operational sources, processes it, and stores it in an integrated data repository; it can also be performed in a real-time data warehousing environment, storing data directly into a data warehouse. This paper presents a new method named Data Extraction Magnet (DEM) that performs the extraction phase of the ETL process in a real-time data warehousing environment, based on the concepts of non-intrusiveness, tags, and parallelism. DEM was validated in a dairy farming domain using synthetic data. The results showed a large performance gain over the traditional trigger technique and compliance with real-time requirements.
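
DEM's internals are not described in the abstract, so the sketch below illustrates only the three concepts it names (non-intrusive capture, tags, parallelism) using Python's standard queue and threading modules: the source write path enqueues a lightweight tag instead of firing trigger logic, and parallel extractor workers drain the tags into a staging area. All names (on_source_write, tag_queue, extractor_worker) are hypothetical.

```python
import queue
import threading

source_table = {1: ("cow-42", 17.5), 2: ("cow-43", 21.0)}  # toy operational data
tag_queue = queue.Queue()          # lightweight tags instead of DB triggers
staging_area = []
staging_lock = threading.Lock()

def on_source_write(row_id, row):
    """Non-intrusive capture: the write path only enqueues a tag."""
    source_table[row_id] = row
    tag_queue.put(row_id)          # O(1); no extraction work in the source path

def extractor_worker():
    """Parallel extraction: workers drain tags and stage the tagged rows."""
    while True:
        row_id = tag_queue.get()
        if row_id is None:         # shutdown sentinel
            break
        with staging_lock:
            staging_area.append((row_id, source_table[row_id]))
        tag_queue.task_done()

workers = [threading.Thread(target=extractor_worker) for _ in range(4)]
for w in workers:
    w.start()

on_source_write(3, ("cow-44", 19.2))   # simulated operational writes
on_source_write(4, ("cow-45", 22.8))

tag_queue.join()                       # wait until every tag has been extracted
for _ in workers:
    tag_queue.put(None)
for w in workers:
    w.join()
print(staging_area)
```

The design point, as we read the abstract, is that a trigger runs extraction logic inside the source transaction, whereas a tag defers that work to concurrent extractors, keeping the operational write path cheap.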


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Vera Yakovchenko ◽  
Timothy R. Morgan ◽  
Matthew J. Chinman ◽  
Byron J. Powell ◽  
Rachel Gonzalez ◽  
...  

Abstract Background While few countries and healthcare systems are on track to meet the World Health Organization's hepatitis C virus (HCV) elimination goals, the US Veterans Health Administration (VHA) has been a leader in these efforts. We aimed to determine which implementation strategies were associated with successful national HCV elimination efforts within the VHA. Methods We conducted a five-year, longitudinal cohort study of the VHA Hepatic Innovation Team (HIT) Collaborative between October 2015 and September 2019. Participants from 130 VHA medical centers treating HCV were sent annual electronic surveys about their use of 73 implementation strategies, organized into nine clusters as described by the Expert Recommendations for Implementing Change taxonomy. Descriptive and nonparametric analyses assessed strategy use over time, strategy attribution to the HIT, and strategy associations with site HCV treatment volume and rate of adoption, following the Theory of Diffusion of Innovations. Results Between 58 and 109 medical centers provided responses in each year, including 127 (98%) responding at least once and 54 (42%) responding in all four implementation years. A median of 13–27 strategies were endorsed per year, and 8–36 individual strategies were significantly associated with treatment volume per year. Data warehousing, tailoring, and patient-facing strategies were the most commonly endorsed. One strategy, "identify early adopters to learn from their experiences", was significantly associated with HCV treatment volume in each year. The peak implementation year was associated with revising professional roles, providing local technical assistance, using data warehousing (i.e., dashboard population management), and identifying and preparing champions. Many of the strategies were driven by a national learning collaborative, which was instrumental in the successful elimination effort. Conclusions VHA's tremendous success in rapidly treating nearly all Veterans with HCV can provide a roadmap for other HCV elimination initiatives.


Author(s):  
A. Sai Ram

Abstract: Across the world, in our day-to-day lives, we come across medical inaccuracies caused by patients' unreliable recollection. Statistically, communication problems are the most significant factor hampering the diagnosis of patients' diseases. This paper therefore presents a theoretical solution for achieving adequate patient care. In these pandemic days, communication between patient and physician has declined to a nominal level. This paper demonstrates a solution and a stepping stone toward the complete digitalization of the client's illness catalogue. To attain this solution, we use diverse pre-existing technologies such as data warehousing, database management systems, cloud computing, and big data. We also maintain a secure infrastructure that protects the client's data privacy. Keywords: illness catalogue, cloud computing, data warehousing, database management systems, big data.

