Unleashing The Power of Your Master Linkage Map – Is There A Role For Business Intelligence Tools In Supporting Data Linkage?

Author(s):  
Brian Stokes

Background with rationale
Business Intelligence (BI) software applications collect and process large amounts of data from one or more sources, and for a variety of purposes. These can include generating operational or sales reports, developing dashboards and data visualisations, and ad-hoc analysis and querying of enterprise databases.

Main Aim
To explore whether BI tools have a role in supporting data linkage by unlocking the data stored in the Tasmanian Data Linkage Unit's (TDLU) Master Linkage Map (MLM).

Methods/Approach
In deciding to develop a series of dashboards to visually represent data stored in its MLM, the TDLU identified routine requests for these data and critically examined existing techniques for extracting data from its MLM. Traditionally, single-purpose Structured Query Language (SQL) queries were developed for each request. By critically analysing the limitations of this approach, the TDLU identified the power of BI tools and their ease of use for both technical and non-technical staff.

Results
Implementing a BI tool is enabling the quick and accurate production of a comprehensive array of information. Such information assists with cohort size estimation, producing data for routine and ad-hoc reporting, identifying data quality issues, and answering questions from prospective users of linked-data services, including instantly producing estimates of links stored across disparate datasets.

Conclusion
BI tools are not traditionally considered integral to the operations of data linkage units. However, the TDLU has successfully applied a BI tool to make a rich set of data locked in its MLM quickly available in multiple, easy-to-use formats and to both technical and non-technical staff.
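The contrast drawn above, between single-purpose SQL queries and the reusable aggregates a BI dashboard issues, can be sketched as follows. The table layout, column names, and sample rows are invented for illustration; they are not the TDLU's actual MLM schema.

```python
import sqlite3

# A toy stand-in for a Master Linkage Map: linkage keys joining person
# identifiers to records in source datasets (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linkage_map (
        person_key  INTEGER,
        dataset     TEXT,
        record_id   TEXT
    );
    INSERT INTO linkage_map VALUES
        (1, 'hospital_admissions', 'H001'),
        (1, 'emergency_dept',      'E001'),
        (2, 'hospital_admissions', 'H002'),
        (3, 'emergency_dept',      'E002');
""")

# The kind of ad-hoc question a BI dashboard can answer instantly:
# how many persons are linked in *both* datasets (cohort estimation)?
cohort_size = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT person_key FROM linkage_map
        GROUP BY person_key
        HAVING COUNT(DISTINCT dataset) = 2
    )
""").fetchone()[0]

# Link counts per dataset, the raw material for a routine report.
links_per_dataset = dict(conn.execute(
    "SELECT dataset, COUNT(*) FROM linkage_map GROUP BY dataset"
).fetchall())
print(cohort_size, links_per_dataset)
```

The point of the sketch is that once such aggregates are exposed through a BI tool rather than hand-written per request, non-technical staff can filter and recombine them without writing new SQL each time.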

Author(s):  
Brian Stokes ◽  
Nadine Wiggins ◽  
Tim Albion ◽  
Alison Venn

As a member of the Population Health Research Network (PHRN), an Australian collaboration established to support the use of linked data for research and other purposes, the Tasmanian Data Linkage Unit (TDLU) provides linked-data services in Australia's smallest state as part of the Menzies Institute for Medical Research at the state's only university, the University of Tasmania. The TDLU works in close collaboration with the Tasmanian Government Department of Health and other key stakeholders in Tasmania and across Australia representing government, education, research, and the community sector. The TDLU is one of the newest data linkage services in Australia and the smallest node of the PHRN, having operated for almost nine years with fewer than three full-time-equivalent staff. Despite its small size and relative youth as a provider of linked-data services, however, the TDLU continues to grow the number of datasets linked on a routine and ad-hoc basis, the number of projects completed, the size of its Master Linkage Map, and the number of 'keys' stored in this Map. The TDLU places a high emphasis on security, privacy preservation, innovation, quality assurance, stakeholder engagement, and providing responsive and exemplary services to users of linked data.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Peter Baumann ◽  
Dimitar Misev ◽  
Vlad Merticariu ◽  
Bang Pham Huu

Abstract
Multi-dimensional arrays (also known as raster data or gridded data) play a key role in many, if not all, science and engineering domains, where they typically represent spatio-temporal sensor, image, simulation-output, or statistics “datacubes”. As classic database technology does not support arrays adequately, such data today are maintained mostly in silo solutions, with architectures that tend to erode and not keep up with the increasing requirements on performance and service quality. Array Database systems attempt to close this gap by providing declarative query support for flexible ad-hoc analytics on large n-D arrays, similar to what SQL offers on set-oriented data, XQuery on hierarchical data, and SPARQL and Cypher on graph data. Today, Petascale Array Database installations exist, employing massive parallelism and distributed processing. Hence, questions arise about the technology and standards available, usability, and overall maturity. Several papers have compared models and formalisms, and benchmarks have been undertaken as well, typically comparing two systems against each other. While each of these represents valuable research, to the best of our knowledge there is no comprehensive survey combining model, query language, architecture, practical usability, and performance aspects. The scale of this comparison also differentiates our study: 19 systems are compared, four of them benchmarked, to an extent and depth clearly exceeding previous papers in the field; for example, subsetting tests were designed in a way that systems cannot be tuned specifically to these queries. It is hoped that this gives a representative overview to all who want to immerse themselves in the field, as well as clear guidance to those who need to choose the best-suited datacube tool for their application. This article presents results of the Research Data Alliance (RDA) Array Database Assessment Working Group (ADA:WG), a subgroup of the Big Data Interest Group.
It has elicited the state of the art in Array Databases, technically supported by IEEE GRSS and CODATA Germany, to answer the question: how can data scientists and engineers benefit from Array Database technology? As it turns out, Array Databases can offer significant advantages in terms of flexibility, functionality, extensibility, performance, and scalability; in total, the database approach of offering analysis-ready “datacubes” heralds a new level of service quality. Investigation shows that there is a lively ecosystem of technology with increasing uptake, and proven array analytics standards are in place. Consequently, such approaches have to be considered a serious option for datacube services in science, engineering, and beyond. Tools, though, vary greatly in functionality and performance.
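The declarative subsetting ("trim") operation that the survey's benchmarks exercise can be illustrated in miniature. The 3-D cube and the `subset()` helper below are invented stand-ins, not any particular Array Database's API; real systems express the same operation in a query language and execute it with parallel, tile-based processing.

```python
# A minimal sketch of datacube subsetting as Array Databases offer it
# declaratively (e.g. a rasql/WCPS-style trim over a time/y/x cube).
def make_cube(nt, ny, nx):
    """Build a toy time/y/x datacube with predictable cell values."""
    return [[[t * 100 + y * 10 + x for x in range(nx)]
             for y in range(ny)]
            for t in range(nt)]

def subset(cube, t_range, y_range, x_range):
    """Trim the cube to half-open index ranges on each axis, analogous
    to an array-query expression like cube[t0:t1, y0:y1, x0:x1]."""
    return [[[cube[t][y][x] for x in range(*x_range)]
             for y in range(*y_range)]
            for t in range(*t_range)]

cube = make_cube(4, 3, 3)
window = subset(cube, (1, 3), (0, 2), (1, 3))

# The trim preserves dimensionality: 2 time steps x 2 rows x 2 columns.
dims = (len(window), len(window[0]), len(window[0][0]))
print(dims, window[0][0])
```

The benchmark-design point in the abstract, that systems cannot be tuned to specific subsetting queries, matters precisely because a trim like this can land anywhere in the cube, so performance depends on general tiling and indexing rather than precomputation.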


2017 ◽  
Vol 13 (3) ◽  
pp. 65-85 ◽  
Author(s):  
Mohammad Daradkeh ◽  
Radwan Moh'd Al-Dwairi

Despite the growing popularity of self-service business intelligence (SSBI) tools, empirical research that investigates their acceptance by business professionals is still scarce. This paper presents and tests an integrated model of the antecedents of users' acceptance of SSBI tools in business enterprises. The proposed model is developed based on the technology acceptance model (TAM) and incorporates information and system quality from the DeLone and McLean IS success model. It also includes an important factor from the business intelligence literature: analysis quality. To test the model, data were collected through a questionnaire survey of 331 business users working in a variety of industries in Jordan. Data were analysed using structural equation modeling (SEM) techniques. The results demonstrated that the three quality factors – information quality, system quality, and analysis quality – are key antecedents of perceived usefulness and ease of use, which in turn were found to be strong predictors of users' intention to use SSBI tools. The findings of this study provide several implications for research and practice, and thus should help in the design and deployment of more user-accepted SSBI tools.


Biostatistics ◽  
2018 ◽  
Vol 21 (3) ◽  
pp. 432-448 ◽  
Author(s):  
William J Artman ◽  
Inbal Nahum-Shani ◽  
Tianshuang Wu ◽  
James R Mckay ◽  
Ashkan Ertefaie

Summary
Sequential, multiple assignment, randomized trial (SMART) designs have become increasingly popular in the field of precision medicine by providing a means for comparing more than two sequences of treatments tailored to the individual patient, i.e., dynamic treatment regimes (DTRs). The construction of evidence-based DTRs promises a replacement for the ad hoc one-size-fits-all decisions pervasive in patient care. However, there are substantial statistical challenges in sizing SMART designs due to the correlation structure between the DTRs embedded in the design (EDTRs). Since a primary goal of SMARTs is the construction of an optimal EDTR, investigators are interested in sizing SMARTs based on the ability to screen out EDTRs inferior to the optimal EDTR by a given amount, which cannot be done using existing methods. In this article, we fill this gap by developing a rigorous power analysis framework that leverages the multiple-comparisons-with-the-best methodology. Our method employs Monte Carlo simulation to compute the number of individuals to enroll in an arbitrary SMART. We evaluate our method through extensive simulation studies, and we illustrate it by retrospectively computing the power in the Extending Treatment Effectiveness of Naltrexone (EXTEND) trial. An R package implementing our methodology is available from the Comprehensive R Archive Network.
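The Monte Carlo idea behind such a power analysis can be sketched in a few lines. The outcome means, the common standard deviation, and the simple "largest sample mean wins" selection rule below are illustrative stand-ins; the paper's actual framework accounts for the correlation between embedded DTRs and uses the multiple-comparisons-with-the-best machinery rather than this naive rule.

```python
import random
import statistics

def simulate_power(edtr_means, sd, n_per_edtr, n_sims=2000, seed=7):
    """Estimate P(the truly best EDTR also has the largest sample mean)
    by repeatedly simulating the trial and tallying correct selections."""
    rng = random.Random(seed)
    best = edtr_means.index(max(edtr_means))
    hits = 0
    for _ in range(n_sims):
        sample_means = [
            statistics.fmean(rng.gauss(mu, sd) for _ in range(n_per_edtr))
            for mu in edtr_means
        ]
        if sample_means.index(max(sample_means)) == best:
            hits += 1
    return hits / n_sims

# Power to identify the best of three EDTRs grows with the sample size:
low_n  = simulate_power([0.0, 0.0, 0.5], sd=1.0, n_per_edtr=10)
high_n = simulate_power([0.0, 0.0, 0.5], sd=1.0, n_per_edtr=80)
print(low_n, high_n)
```

Inverting this relationship, searching for the smallest n that achieves a target power, is what sizing a SMART by simulation amounts to.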


2022 ◽  
Vol 13 (2) ◽  
pp. 1-28
Author(s):  
Yan Tang ◽  
Weilong Cui ◽  
Jianwen Su

A business process (workflow) is an assembly of tasks to accomplish a business goal. Real-world workflow models often need to change due to new laws and policies, changes in the environment, and so on. To understand the inner workings of a business process and facilitate changes, workflow logs can enable inspecting, monitoring, diagnosing, analyzing, and improving the design of a complex workflow. Querying workflow logs, however, is still mostly an ad hoc practice by workflow managers. In this article, we focus on the problem of querying workflow logs concerning both control-flow and dataflow properties. We develop a query language based on “incident patterns” to allow the user to query workflow logs directly instead of having to transform such queries into database operations. We provide the formal semantics and a query evaluation algorithm for our language. By deriving an accurate cost model, we develop an optimization mechanism to accelerate query evaluation. Our experimental results demonstrate the effectiveness of the optimization, which achieves up to a 50× speedup over an adaptation of an existing evaluation method.
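A combined control-flow and dataflow query of the kind described above can be illustrated on a toy log: find cases where task B follows task A and reads the value A wrote. The log layout, the pattern encoding, and `match_incidents()` are invented for illustration, not the paper's actual language.

```python
# A toy workflow log: one event per row, with the case it belongs to,
# the task executed, and a data variable the task touched.
log = [
    {"case": 1, "task": "approve", "var": "amount", "value": 900},
    {"case": 1, "task": "pay",     "var": "amount", "value": 900},
    {"case": 2, "task": "approve", "var": "amount", "value": 50},
    {"case": 2, "task": "reject",  "var": "amount", "value": 50},
]

def match_incidents(log, first_task, second_task, var):
    """Return cases where second_task occurs after first_task with the
    same value of `var`: a control-flow plus dataflow condition."""
    by_case = {}
    for event in log:                      # group events per case, in order
        by_case.setdefault(event["case"], []).append(event)
    hits = []
    for case, events in by_case.items():
        for i, e1 in enumerate(events):
            if e1["task"] != first_task or e1["var"] != var:
                continue
            for e2 in events[i + 1:]:      # later events in the same case
                if e2["task"] == second_task and e2["value"] == e1["value"]:
                    hits.append(case)
    return hits

print(match_incidents(log, "approve", "pay", "amount"))
```

Evaluating such patterns naively scans every event pair per case, which is why a cost model and optimization of the sort the article develops pay off on large logs.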


Author(s):  
Albert N. Badre ◽  
Tiziana Catarci ◽  
Antonio Massari ◽  
Giuseppe Santucci
Author(s):  
Henrike Berthold ◽  
Philipp Rösch ◽  
Stefan Zöller ◽  
Felix Wortmann ◽  
Alessio Carenini ◽  
...  

Author(s):  
Steven Noel ◽  
Stephen Purdy ◽  
Annie O’Rourke ◽  
Edward Overly ◽  
Brianna Chen ◽  
...  

This paper describes the Cyber Situational Understanding (Cyber SU) Proof of Concept (CySUP) software system for exploring advanced Cyber SU capabilities. CySUP distills complex interrelationships among cyberspace entities to provide the “so what” of cyber events for tactical operations. It combines a variety of software components to build an end-to-end pipeline for live data ingest that populates a graph knowledge base, with query-driven exploratory analysis and interactive visualizations. CySUP integrates with the core infrastructure environment supporting command posts to provide a cyber overlay onto a common operating picture oriented to tactical commanders. It also supports detailed analysis of cyberspace entities and relationships driven by ad hoc graph queries, including the conversion of natural language inquiries to formal query language. To help assess its Cyber SU capabilities, CySUP leverages automated cyber adversary emulation to carry out controlled cyberattack campaigns that impact elements of tactical missions.
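The graph-knowledge-base style of query described above can be sketched with a tiny reachability example: given cyberspace entities and their relationships, which mission tasks are exposed by a compromised host? The entity names, edges, and `reachable_tasks()` helper are invented; CySUP's actual knowledge base and query language are not described at this level of detail in the abstract.

```python
from collections import deque

# A toy graph knowledge base: directed edges from cyberspace entities
# (hosts, services) toward the mission tasks that depend on them.
edges = {
    "host:web01":    ["service:httpd"],
    "service:httpd": ["host:db01"],
    "host:db01":     ["task:logistics"],
    "host:mail01":   ["task:comms"],
}

def reachable_tasks(graph, start):
    """Breadth-first search for task nodes reachable from `start`,
    i.e. the mission impact of compromising that entity."""
    seen, queue, tasks = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        if node.startswith("task:"):
            tasks.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return tasks

print(reachable_tasks(edges, "host:web01"))
```

Answering "so what" questions like this one, mapping a cyber event to the tactical tasks it threatens, is the kind of query-driven analysis the pipeline supports.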


Author(s):  
Arijit Sengupta ◽  
V. Ramesh

This chapter presents DSQL, a conservative extension of SQL, as an ad-hoc query language for XML. The development of DSQL follows the theoretical foundations of first-order logic and uses common query semantics already accepted for SQL. DSQL represents a core subset of XQuery that lends itself well to query optimization techniques, while at the same time allowing easy integration into current databases and applications that use SQL. The intent of DSQL is not to replace XQuery, the current W3C-recommended XML query language, but to serve as an ad-hoc querying frontend to XQuery. Further, the authors present proofs of important query language properties such as complexity and closure. An empirical study comparing DSQL and XQuery for ad-hoc querying demonstrates that users perform better with DSQL for both flat and tree structures, in terms of both accuracy and efficiency.
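The flavor of an SQL-style SELECT/FROM/WHERE over XML, as opposed to a full XQuery FLWOR expression, can be sketched with the standard library's ElementTree. The document, the path syntax, and the `select_where()` helper are illustrative; they are not DSQL's actual grammar.

```python
import xml.etree.ElementTree as ET

# A small XML document to query (invented for illustration).
doc = ET.fromstring("""
<library>
  <book year="1999"><title>SQL Basics</title></book>
  <book year="2005"><title>XML in Depth</title></book>
  <book year="2003"><title>Query Languages</title></book>
</library>
""")

def select_where(root, path, child_tag, pred):
    """SELECT child_tag FROM path WHERE pred(element): flat, SQL-style
    query semantics applied to a tree, in document order."""
    return [e.findtext(child_tag) for e in root.iterfind(path) if pred(e)]

# Roughly: SELECT title FROM book WHERE year > 2000
recent = select_where(doc, "book", "title",
                      lambda e: int(e.get("year")) > 2000)
print(recent)
```

The appeal of the SQL-like form for ad-hoc users is visible even here: the query names what to select, where from, and under what condition, without XQuery's explicit iteration and binding constructs.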

