Analysis on Data Migration Strategies in Heterogeneous Databases

Author(s):  
Ms. Latha S S ◽  
Pavan Kumar S

Data required for a new application frequently come from other existing application systems. If the data required for the new application are available from existing systems and the volume of data is large, the necessary data should be migrated from the existing systems (source systems) to the new application (target system) rather than recreated for the target system. Transformation of the data is generally a necessary step in data migration, because the data requirements and architecture of the target system differ from those of the source systems. This paper surveys data migration techniques that focus on improving data quality across different types of databases.
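To make the transform-while-migrating idea concrete, here is a minimal sketch of a source-to-target migration step in Python, using stdlib sqlite3 for self-containment. The table and column names (src_customers, tgt_customers, first_name, full_name) are illustrative assumptions, not the paper's schemas; a real migration would use the actual source and target systems.

```python
# Minimal sketch: read from the source schema, reshape each record to the
# target schema, and load it. Assumes both tables already exist.
import sqlite3

def migrate_customers(source_db: str, target_db: str) -> int:
    src = sqlite3.connect(source_db)
    tgt = sqlite3.connect(target_db)
    moved = 0
    # Hypothetical schema mismatch: the source stores first/last name
    # separately, while the target expects a single full_name field.
    for first, last, email in src.execute(
            "SELECT first_name, last_name, email FROM src_customers"):
        full_name = f"{first.strip()} {last.strip()}"
        tgt.execute(
            "INSERT INTO tgt_customers (full_name, email) VALUES (?, ?)",
            (full_name, (email or "").lower()))
        moved += 1
    tgt.commit()
    src.close()
    tgt.close()
    return moved
```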

2021 ◽  
Vol 5 (2) ◽  
pp. 24
Author(s):  
Otmane Azeroual ◽  
Meena Jha

Data migration is required to run data-intensive applications. Legacy data storage systems are not capable of accommodating the changing nature of data. In many companies, data migration projects fail because their importance and complexity are not taken seriously enough. Data migration strategies include storage migration, database migration, application migration, and business process migration. Regardless of which migration strategy a company chooses, there should always be a strong focus on data cleansing. Complete, correct, and clean data not only reduce the cost, complexity, and risk of the changeover; they also provide a sound basis for quick and strategic company decisions and are therefore essential for today's dynamic business processes. Data quality is an important issue for companies undertaking data migration and should not be overlooked. In order to determine the relationship between data quality and data migration, an empirical study with 25 large German and Swiss companies was carried out to find out how important data quality is to companies' data migration efforts. In this paper, we present our findings on how data quality plays an important role in data migration plans and must not be ignored. Without acceptable data quality, data migration is impossible.
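A pre-migration cleansing pass of the kind the abstract argues for can be sketched as follows; this is a minimal illustration under assumed rules (required name, well-formed email, no duplicate emails), not the study's methodology, and the field names are hypothetical.

```python
# Minimal sketch: records failing completeness/correctness checks are
# quarantined rather than migrated; duplicates are eliminated.
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def cleanse(records):
    clean, rejected, seen = [], [], set()
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        name = (rec.get("name") or "").strip()
        if not name or not EMAIL_RE.match(email):  # completeness/correctness
            rejected.append(rec)
            continue
        if email in seen:                          # duplicate elimination
            rejected.append(rec)
            continue
        seen.add(email)
        clean.append({"name": name, "email": email})
    return clean, rejected

clean, rejected = cleanse([
    {"name": "Ada Lovelace", "email": "Ada@Example.org"},
    {"name": "", "email": "no-name@example.org"},  # fails completeness
])
```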


2019 ◽  
Vol 83 (3) ◽  
pp. 598-626 ◽  
Author(s):  
Caroline Roberts ◽  
Emily Gilbert ◽  
Nick Allum ◽  
Léïla Eisner

Abstract Herbert Simon’s (1956) concept of satisficing provides an intuitive explanation for the reasons why respondents to surveys sometimes adopt response strategies that can lead to a reduction in data quality. As such, the concept rapidly gained popularity among researchers after it was first introduced to the field of survey methodology by Krosnick and Alwin (1987), and it has become a widely cited buzzword linked to different forms of response error. In this article, we present the findings of a systematic review involving a content analysis of journal articles published in English-language journals between 1987 and 2015 that have drawn on the satisficing concept to evaluate survey data quality. Based on extensive searches of online databases, and an initial screening exercise to apply the study’s inclusion criteria, 141 relevant articles were identified. Guided by the theory of survey satisficing described by Krosnick (1991), the methodological features of the shortlisted articles were coded, including the indicators of satisficing analyzed, the main predictors of satisficing, and the presence of main or interaction effects on the prevalence of satisficing involving indicators of task difficulty, respondent ability, and respondent motivation. Our analysis sheds light on potential differences in the extent to which satisficing theory holds for different types of response error, and highlights a number of avenues for future research.


Author(s):  
Marcel von Lucadou ◽  
Thomas Ganslandt ◽  
Hans-Ulrich Prokosch ◽  
Dennis Toddenroth

Abstract Background The secondary use of electronic health records (EHRs) promises to facilitate medical research. We reviewed general data requirements in observational studies and analyzed the feasibility of conducting observational studies with structured EHR data, in particular diagnosis and procedure codes. Methods After reviewing published observational studies from the University Hospital of Erlangen for general data requirements, we identified three study populations for the feasibility analysis, with eligibility criteria drawn from three exemplary observational studies. For each study population, we evaluated the availability of relevant patient characteristics in our EHR, including outcome and exposure variables. To assess data quality, we computed distributions of relevant patient characteristics from the available structured EHR data and compared them to those of the original studies. We implemented computed phenotypes for patient characteristics where necessary. In random samples, we evaluated how well structured patient characteristics agreed with a gold standard derived from manually interpreted free texts. We categorized our findings using the four data quality dimensions “completeness”, “correctness”, “currency” and “granularity”. Results Reviewing general data requirements, we found that some investigators supplement routine data with questionnaires, interviews and follow-up examinations. We included 847 subjects in the feasibility analysis (Study 1 n = 411, Study 2 n = 423, Study 3 n = 13). All eligibility criteria from two studies were available in structured data, while one study required computed phenotypes for its eligibility criteria. In one study, all necessary patient characteristics were documented at least once in either structured or unstructured data. In a second study, all exposure and outcome variables were available in structured data, while in the third, unstructured data had to be consulted. Comparing distributions of patient characteristics computed from structured data with those from the original studies yielded similar distributions as well as indications of underreporting. We observed violations in all four data quality dimensions. Conclusions While we found relevant patient characteristics available in structured EHR data, data quality problems mean that it remains a case-by-case decision whether diagnosis and procedure codes are sufficient to underpin observational studies. Free-text data or subsequently collected supplementary study data may be important to complement a comprehensive patient history.
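A computed phenotype of the kind described can be sketched in a few lines; the ICD-10 prefixes, record layout, and agreement measure below are illustrative assumptions, not the study's actual phenotype definitions or validation procedure.

```python
# Minimal sketch: derive a phenotype from structured diagnosis codes and
# measure agreement against a manually derived gold standard.
DIABETES_PREFIXES = ("E10", "E11")  # hypothetical phenotype definition

def has_phenotype(diagnosis_codes):
    return any(code.startswith(DIABETES_PREFIXES) for code in diagnosis_codes)

def percent_agreement(patients, gold):
    # patients: {patient_id: [icd_codes]}, gold: {patient_id: bool}
    hits = sum(has_phenotype(codes) == gold[pid]
               for pid, codes in patients.items())
    return hits / len(patients)

patients = {"p1": ["E11.9", "I10"], "p2": ["J45.0"]}
gold = {"p1": True, "p2": False}
print(percent_agreement(patients, gold))  # 1.0 in this toy sample
```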


AI Magazine ◽  
2010 ◽  
Vol 31 (1) ◽  
pp. 65 ◽  
Author(s):  
Clint R. Bidlack ◽  
Michael P Wellman

Recent advances in enterprise web-based software have created a need for sophisticated yet user-friendly data quality solutions. A new category of data quality solutions is discussed that fills this need using intelligent matching and retrieval algorithms. The solutions focus on customer and sales data and include real-time inexact search, batch processing, and data migration. Users are empowered to maintain higher-quality data, resulting in more efficient sales and marketing operations: sales managers spend more time with customers and less time managing data.
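The abstract's matching algorithms are not published, but the idea of inexact search over customer records can be illustrated with a minimal similarity-scoring sketch using stdlib difflib; the threshold and sample names are assumptions.

```python
# Minimal sketch: fuzzy matching of a query against customer names,
# returning candidates above a similarity threshold, best match first.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_matches(query, customers, threshold=0.8):
    scored = [(similarity(query, name), name) for name in customers]
    return sorted([p for p in scored if p[0] >= threshold], reverse=True)

# A misspelled query still finds the intended record.
print(find_matches("Jon Smiht", ["John Smith", "Jane Doe"]))
```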


2013 ◽  
Vol 756-759 ◽  
pp. 3766-3770
Author(s):  
Hui Pang ◽  
Li Ting Gao ◽  
Fan Xing Meng ◽  
Bing Liu

Intelligent building is the product of computer and information network technologies penetrating the construction industry, fully combining architectural art with information technology. This paper focuses on the information platform used in intelligent building construction and operation, addressing data integration, processing, change data capture, transmission, and synchronization across the different types of databases used by different sectors. By studying methods of data transformation and change data capture, the data synchronization issues that arise when integrating distributed heterogeneous databases are satisfactorily resolved.
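One common form of change data capture is polling on a last-modified column; the sketch below shows that pattern under assumed table and column names (orders, updated_at), using sqlite3 syntax for brevity. It is not the paper's method; log- or trigger-based CDC would be more robust in practice.

```python
# Minimal sketch: timestamp-based change data capture between two stores.
# Assumes `orders(id PRIMARY KEY, status, updated_at)` on both sides and
# ISO-8601 timestamps, so string comparison orders them correctly.
import sqlite3

def sync_changes(src_conn, tgt_conn, since: str) -> str:
    latest = since
    rows = src_conn.execute(
        "SELECT id, status, updated_at FROM orders WHERE updated_at > ?",
        (since,))
    for row_id, status, updated_at in rows:
        # Upsert keeps the target in step with the source.
        tgt_conn.execute(
            "INSERT INTO orders (id, status, updated_at) VALUES (?, ?, ?) "
            "ON CONFLICT(id) DO UPDATE SET status=excluded.status, "
            "updated_at=excluded.updated_at",
            (row_id, status, updated_at))
        latest = max(latest, updated_at)
    tgt_conn.commit()
    return latest  # persist as the watermark for the next polling cycle
```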


2021 ◽  
pp. 1-12
Author(s):  
Jing Wang ◽  
Jie Wei ◽  
Long Li ◽  
Lijian Zhang

With the rapid development of evidence-based medicine, translational medicine, and pharmacoeconomics in China, as well as the country’s strong commitment to clinical research, the demand for physician-led research continues to increase. In recent years, real-world studies (RWS) have attracted more and more attention in the field of health care; as a method of post-marketing re-evaluation of drugs, RWS can better reflect the effects of drugs in real clinical settings. However, because of the large sample sizes required and the large amount of medical data involved, such studies are not only time-consuming and labor-intensive but also prone to human error, making it difficult to ensure data quality and the efficiency of research implementation. This paper analyzes and summarizes existing big data analytics platforms and concludes that platforms using natural language processing, machine learning, and other artificial intelligence technologies can help RWS quickly complete the collection, integration, processing, statistics, and analysis of large amounts of medical data, and deeply mine the intrinsic value of the data. Such platforms have broad application prospects for multi-level and multi-angle needs such as new drug development, drug discovery, pharmacoeconomics, medical insurance cost control, evaluation of indications/contraindications, and clinical guidance.
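As a toy stand-in for the NLP components such platforms use, the sketch below extracts one structured variable from free-text notes with a rule-based pattern; the drug name, pattern, and note format are illustrative assumptions, and production systems would use far richer NLP models.

```python
# Minimal sketch: rule-based extraction of drug doses from clinical notes.
import re

DOSE_RE = re.compile(r"(?P<drug>metformin)\s+(?P<dose>\d+)\s*mg", re.I)

def extract_doses(notes):
    records = []
    for note_id, text in notes.items():
        for m in DOSE_RE.finditer(text):
            records.append({"note": note_id,
                            "drug": m.group("drug").lower(),
                            "dose_mg": int(m.group("dose"))})
    return records

print(extract_doses({"n1": "Started Metformin 500 mg twice daily."}))
```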


Author(s):  
Nurhidayah Muhamad Zahari ◽  
Wan Ya Wan Hussin ◽  
Mohd Yunus Mohd Yussof ◽  
Fauzi Mohd Saman

1998 ◽  
Vol 3 (2) ◽  
pp. 90-95
Author(s):  
Kayvon Safavi ◽  
Martin A. Weinstock

Background: Increasingly, large collections of pre-existing data are being used to analyze the occurrence, burden, and health care resources directed to the management of various skin diseases. Objective: This article discusses a number of different types of large datasets along with their common uses. Various concerns about the use of this information are also discussed. Conclusion: Although large datasets provide significant statistical power with readily available data, there are significant concerns, particularly regarding data quality and statistical analysis. Readers need to be aware of how an investigator has addressed these issues. Furthermore, the profession needs to be cognizant of very legitimate public concerns regarding confidentiality of personal information.


2012 ◽  
Vol 241-244 ◽  
pp. 1589-1592
Author(s):  
Jun Tan

In recent years, many application systems have generated large quantities of data, so it is no longer practical to rely on traditional database techniques to analyze them. Data mining offers tools for extracting knowledge from data, leading to significant improvements in the decision-making process. Association rule mining is one of the most important data mining technologies. This paper first presents the basic concepts of association rule mining, then discusses several different types of association rule mining, including multi-level association rules, multidimensional association rules, weighted association rules, multi-relational association rules, and fuzzy association rules.
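The basic single-level, Boolean case the paper starts from can be sketched as follows: frequent itemsets are found by minimum support, then rules are filtered by minimum confidence. This brute-force version omits Apriori's candidate pruning for brevity, and the transactions are a toy example.

```python
# Minimal sketch of classic association rule mining:
# support(X) = fraction of transactions containing X,
# confidence(X -> Y) = support(X u Y) / support(X).
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    n = len(transactions)
    items = {i for t in transactions for i in t}
    freq = {}
    for size in range(1, len(items) + 1):
        found = False
        for cand in combinations(sorted(items), size):
            support = sum(set(cand) <= t for t in transactions) / n
            if support >= min_support:
                freq[cand] = support
                found = True
        if not found:  # no frequent itemset of this size: stop growing
            break
    return freq

def rules(freq, min_conf):
    out = []
    for itemset, supp in freq.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(itemset, k):
                conf = supp / freq[lhs]
                if conf >= min_conf:
                    rhs = tuple(i for i in itemset if i not in lhs)
                    out.append((lhs, rhs, conf))
    return out

tx = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"bread"}]
print(rules(frequent_itemsets(tx, 0.5), 0.7))  # milk -> bread, conf 1.0
```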

