Thoughts and musings from the new International Population Data Linkage Network (IPDLN) Co-directors

EDITORIAL The beginning of 2017 triggered a change in leadership at the International Population Data Linkage Network (IPDLN), as Professor David Ford’s two-year term as director of the network wound down, and the baton was handed on to new Canadian co-leadership, from the Institute for Clinical Evaluative Sciences in Ontario, and the O’Brien Institute for Public Health at the University of Calgary. David Ford’s success and energy during his term were such that our two organizations decided we needed to team up to jointly direct the IPDLN for the next two years. This was endorsed by the IPDLN membership in the lead-up to the August 2016 IPDLN conference in Wales, and we are delighted to co-direct this growing network. First off, we want to thank David for his outstanding leadership of the IPDLN over the last two years. He and his team put on an excellent conference in Swansea, with record-breaking attendance, excellent presentations, thought-provoking plenaries and wonderful facilities and social events that made the conference a huge success. Even the weather cooperated, as the sun shone most of the time (well done David!). In addition, membership of the IPDLN has grown substantially in the last two years, both in number of members and number of different countries represented. David played a key role in the launch of a new journal for our network, the International Journal of Population Data Science (IJPDS), with an editorial board composed of several IPDLN members, and with Associate Professor Kerina Jones (Swansea University) as founding editor-in-chief. The IPDLN remains a young organization, having had its inaugural conference only in 2008, and we have come a long way in that time. Each subsequent biennial conference has grown in size, sophistication and quality of content, and our membership has steadily increased. In addition, new collaborations between IPDLN researchers and member organizations are continuing to develop. 
External trends, such as widespread interest in the possibilities of “Big Data”, make the work and expertise of IPDLN members more relevant than ever. Common challenges across our countries, such as the perpetual concerns around quality and sustainability of our health and social care systems, make the IPDLN’s ability to foster cross-jurisdiction research and evidence more important than ever. We are committed to building on David’s work and that of the directors before him to further the aims of the IPDLN. As a first step, we have established an interim executive committee, consisting of the current IPDLN co-directors and leaders of the organizations that have previously led the network (James Semmens, Curtin University; Kim McGrail, University of British Columbia; David Ford, Swansea University). Its purpose is to provide strategic advice on the overall direction and deliverables of the IPDLN. The interim executive committee will serve until the next conference in 2018, at which time a new director and a new executive committee will be elected by the membership. One key deliverable will be another outstanding IPDLN conference, and we are pleased to announce it will take place in Banff, Alberta, from September 12-14, 2018. Please mark your calendars now and plan to attend. Dr Hude Quan and Dr Astrid Guttmann have agreed to lead the Scientific Programme Committee, and we will be providing more information on opportunities to participate in the work of that committee in due course. In addition, we aim to continue to increase IPDLN membership in terms of number and geography, especially among countries that are not (or only modestly) represented. We will also help IPDLN member institutions to create more opportunities for collaboration, cross-jurisdictional studies, and shared tools and learning. The scientific focus of our young network is evolving, reflected in the recent decision to substitute “population data” for “health data” in the network’s name. 
This was welcome in that it reflected the fact that many of our members study more than health data or health outcomes, but it might make it harder to describe the mission of the network. Similarly, how the IPDLN can better contribute to shaping a conversation around our collective mission, advance science between conferences, and ensure that the network and its members are collectively seen as leaders internationally, are questions being explored and will inform the network development plan. The IPDLN is well-positioned to shape the global scientific agenda in emerging areas such as ‘population data science’ and/or ‘Big Data’. Such terms, describing domains of scientific inquiry, mean different things to different people. Our network and its broad and interdisciplinary international membership can help to define what these areas are, by identifying high priority population data science questions and opportunities for ambitious collaborative efforts on a global scale. These are exciting times for the IPDLN and its members, as in this digital health information age the potential for our network is limitless. In our early days as co-directors, we welcome your input in response to these ideas on the role of the IPDLN (please email us with your ideas and thoughts). We are honoured to have the opportunity to co-lead the IPDLN for the next two years, and recognize that our efforts will be made easier because we are building on the existing strengths of the network and its members. Please email us your ideas and thoughts on the role of IPDLN to: Michael Schull at [email protected] Ghali at [email protected]


The International Population Data Linkage Network – Banff and Beyond

International Journal for Population Data Science ◽

10.23889/ijpds.v3i1.697 ◽

2018 ◽

Vol 3 (4) ◽

Author(s):

William A Ghali ◽

Michael J Schull

Keyword(s):

Executive Committee ◽

Data Science ◽

Data Linkage ◽

South Australia ◽

Population Data ◽

Governance Structure ◽

Social Program ◽

Research Network ◽

Foundational Work ◽

Poster Presentations

We write to you, here in the pages of the International Journal of Population Data Science, for the second time in our capacity as co-directors of the International Population Data Linkage Network (IPDLN, www.ipdln.org). Time has certainly passed quickly since our first communication, where we introduced ourselves and discussed planned initiatives for our tenure as leads of the IPDLN. Our network’s scientific community is steadily growing and thriving in an era of heightened interest around all things ‘data’. Indeed, there is great enthusiasm for all initiatives that explore ways of harnessing information systems and multisource data to enhance collective knowledge of health matters so that better decisions can be made by governments, system planners, providers, and patients. Never before have such initiatives attracted more attention. It is in this context of heightened interest and relevance around IPDLN and its science that we prepare to convene in Banff, Alberta, Canada for the 5th biennial IPDLN Conference, September 11-14. The conference, to be held at the inspiring Banff Centre (www.banffcentre.ca), is almost sold out, with only limited space remaining for late registrants. A tremendous program has been created through the oversight of Scientific Program co-chairs, Drs. Astrid Guttmann and Hude Quan. A compelling roster of plenary lectures from Drs. Diane Watson, Jennifer Walker, and Osmar Zaïane is eagerly anticipated, as are topical panel discussions, an entertaining Science Slam session, and a terrific social program. These sessions will be surrounded by rich scientific oral and poster presentations arising from the more than 450 scientific abstracts submitted for review. We are so pleased to see this vibrant scientific engagement from the IPDLN membership and students, and look forward to hosting all delegates in Banff. 
The Banff conference will also be the venue at which we announce the new Directorship of the IPDLN for the next two years (2019 and 2020). As co-directors, we engaged with a number of individuals and organizations with interest in leading the IPDLN. In the end, two compelling Directorship applications were submitted: one a joint bid from Australia’s Population Health Research Network and the South Australia Northern Territory DataLink, and the other from the US-based Actionable Intelligence for Social Policy. IPDLN members submitted votes on these strong leadership bids through an online voting process, and while the excellence and appeal of both bids were apparent in strong voter support for both, a winning bid has been confirmed, and it will (as mentioned) be announced at the upcoming September conference. As we look forward to the Banff meeting with great anticipation, we are compelled to acknowledge the growing IPDLN legacy created by past directors. We are particularly indebted to our immediate predecessor, Dr. David Ford, and his team at Swansea University. Their work in hosting the 2016 IPDLN conference has been an inspiration to us in the planning of this year’s conference, and their crucial and foundational work in creating an IT platform for the IPDLN website, the membership database, and the new International Journal for Population Data Science has brought the IPDLN to a new level of organizational sophistication. Over the last 18 months, our co-directorship teams from the Institute for Clinical Evaluative Sciences in Ontario and the O’Brien Institute for Public Health at the University of Calgary have built on the foundation established by prior directors to update and enhance the IPDLN website and membership database. 
The IPDLN has more members than ever before representing a greater number of countries, and we have a more formalized governance structure with the creation of an Executive Committee that will include immediate past-Directors in order to better ensure continuity. A new Executive Committee will be elected by the IPDLN membership following the Banff conference. The waiting is almost over and IPDLN 2018 is upon us! Our scientific domain has never had the prominence or level of anticipation that we currently see. And the IPDLN has grown in its size, vibrancy and scientific scope. The opportunities for us are boundless, and the timing of our upcoming conference could not be better. We are honoured, with our respective organizations, to have had this opportunity to serve as co-directors over the past two years, and look forward to seeing many of you very soon. For those of you who are unable to travel to Canada’s Rocky Mountains this year, we look forward to connecting with you at a later time in the IPDLN’s continuing upward journey.


Planting the S.E.E.D.S of Indigenous Population Health Data Linkage

International Journal for Population Data Science ◽

10.23889/ijpds.v5i5.1626 ◽

2020 ◽

Vol 5 (5) ◽

Author(s):

Robyn K Rowe ◽

Jennifer D Walker

Keyword(s):

Population Health ◽

Indigenous Peoples ◽

Health Care Policy ◽

Indigenous Population ◽

Data Linkage ◽

Indigenous Health ◽

Population Data ◽

Health Data ◽

Indigenous Populations ◽

Data Stewardship

Introduction: The increasing accessibility of data through digitization and linkage has resulted in Indigenous and allied individuals, scholars, practitioners, and data users recognizing a need to advance ways that assert Indigenous sovereignty and governance within data environments. Advances are being discussed around the world in how Indigenous data is collected, used, stored, shared, linked, and analysed. Objectives and Approach: During the International Population Data Linkage Network Conference in September of 2018, two sessions were hosted and led by international collaborators that focused on regional Indigenous health data linkage. Notes, discussions, and artistic contributions gathered from the conference led to collaborative efforts to highlight the common approaches to Indigenous data linkage, as discussed internationally. This presentation will share the braided culmination of these discussions and offer S.E.E.D.S as a set of guiding Indigenous data linkage principles. Results: S.E.E.D.S emerges as a living and expanding set of guiding principles that: 1) prioritizes Indigenous Peoples’ right to Self-determination; 2) makes space for Indigenous Peoples to Exercise sovereignty; 3) adheres to Ethical protocols; 4) acknowledges and respects Data stewardship and governance; and 5) works to Support reconciliation between Indigenous Peoples and settler states. S.E.E.D.S aims to centre and advance Indigenous-driven population data linkage and research while weaving together common global approaches to Indigenous data linkage. Conclusion / Implications: The five elements of S.E.E.D.S interweave and need to be enacted together to create a positive Indigenous data linkage environment. When implemented together, the primary goal of the S.E.E.D.S Principles is to guide positive Indigenous population health data linkage in an effort to create more meaningful research approaches through improved Indigenous-based research processes. 
The implementation of these principles can, in turn, lead to better measurements of health progress that are critical to enhancing health care policy and improving health and wellness outcomes for Indigenous populations.


Role of Urban Big Data in Travel Behavior Research

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198120975029 ◽

2020 ◽

pp. 036119812097502

Author(s):

Chihuangji Wang ◽

Daniel Baldwin Hess

Keyword(s):

Big Data ◽

Survey Data ◽

Travel Behavior ◽

Data Science ◽

Location Based Service ◽

Labor Costs ◽

Public And Private ◽

Public And Private Sector ◽

Data Analytic

Understanding urban travel behavior (TB) is critical for advancing urban transportation planning practice and scholarship; however, traditional survey data is expensive (because of labor costs) and error-prone. With advances in data collection techniques and data analytic approaches, urban big data (UBD) is currently generated at an unprecedented scale in relation to volume, variety, and speed, producing new possibilities for applying UBD for TB research. A review of more than 50 scholarly articles confirms the remarkable and expanding role of UBD in TB research and its advantages over traditional survey data. Using this body of published work, a typology is developed of four key types of UBD—social media, GPS log, mobile phone/location-based service, and smart card—focusing on the features and applications of each type in the context of TB research. This paper discusses in significant detail the opportunities and challenges in the use of UBD from three perspectives: conceptual, methodological, and political. The paper concludes with recommendations for researchers to develop data science knowledge and programming skills for analysis of UBD, for public and private sector agencies to cooperate on the collection and sharing of UBD, and for legislators to enforce data security and confidentiality. UBD offers both researchers and practitioners opportunities to capture urban phenomena and deepen knowledge about the TB of individuals.


Population Data BC: Supporting population data science in British Columbia

International Journal for Population Data Science ◽

10.23889/ijpds.v4i2.1133 ◽

2020 ◽

Vol 4 (2) ◽

Author(s):

Tavinder Kaur Ark ◽

Sarah Kesselring ◽

Brent Hills ◽

Kim McGrail

Keyword(s):

Linked Data ◽

Large Scale ◽

Data Science ◽

Data Linkage ◽

Population Data ◽

Cost Effective ◽

Data Access ◽

Well Being ◽

Third Party ◽

Individual Level

Background: Population Data BC (PopData) was established as a multi-university data and education resource to support training and education, data linkage, and access to individual level, de-identified data for research in a wide variety of areas including human and community development and well-being. Approach: A combination of deterministic and probabilistic linkage is conducted based on the quality and availability of identifiers for data linkage. PopData utilizes a harmonized data request and approval process for data stewards and researchers to increase efficiency and ease of access to linked data. Researchers access linked data through a secure research environment (SRE) that is equipped with a wide variety of tools for analysis. The SRE also allows for ongoing management and control of data. PopData continues to expand its data holdings and to evolve its services as well as governance and data access process. Discussion: PopData has provided efficient and cost-effective access to linked data sets for research. After two decades of learning, future planned developments for the organization include, but are not limited to, policies to facilitate programs of research, access to reusable datasets, evaluation and use of new data linkage techniques such as privacy preserving record linkage (PPRL). Conclusion: PopData continues to maintain and grow the number and type of data holdings available for research. Its existing models support a number of large-scale research projects and demonstrate the benefits of having a third-party data linkage and provisioning center for research purposes. Building further connections with existing data holders and governing bodies will be important to ensure ongoing access to data and changes in policy exist to facilitate access for researchers.
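The abstract above mentions combining deterministic and probabilistic linkage depending on the identifiers available. The following is a minimal sketch of that general idea, not PopData's actual procedure. The field names (`health_id`, `surname`, `birth_date`, `postal_code`), the m/u probabilities, and the threshold are all hypothetical values chosen for illustration. The probabilistic step uses Fellegi-Sunter style log-weights, which is a common approach but an assumption here.

```python
import math

# Hypothetical m/u probabilities per field (illustrative values only):
# m = P(field agrees | records truly match)
# u = P(field agrees | records do not match)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "postal_code": (0.85, 0.05),
}


def deterministic_match(a, b, key="health_id"):
    """Exact agreement on a unique identifier, when both records carry one."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def match_weight(a, b, params=FIELD_PARAMS):
    """Sum Fellegi-Sunter style log2 agreement/disagreement weights."""
    total = 0.0
    for field, (m, u) in params.items():
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # a missing value contributes no evidence either way
        if va == vb:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link(a, b, threshold=10.0):
    """Try the deterministic rule first, then fall back to the weighted score."""
    if deterministic_match(a, b):
        return "deterministic"
    if match_weight(a, b) >= threshold:
        return "probabilistic"
    return "no link"


# Hypothetical records for illustration.
rec_a = {"health_id": "123", "surname": "smith",
         "birth_date": "1980-01-01", "postal_code": "V6T"}
rec_b = {"health_id": "123", "surname": "smyth",
         "birth_date": "1980-01-01", "postal_code": "V6T"}
rec_c = {"health_id": None, "surname": "smith",
         "birth_date": "1980-01-01", "postal_code": "V5K"}
rec_d = {"health_id": None, "surname": "jones",
         "birth_date": "1975-06-30", "postal_code": "V6T"}

results = [link(rec_a, rec_b), link(rec_a, rec_c), link(rec_a, rec_d)]
```

In this sketch a shared identifier short-circuits the comparison, and the weighted score is only consulted when the identifier is missing or disagrees. In practice the m/u values would be estimated from data rather than fixed by hand.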


Pandemic--The Role of the Electronic Sharing of Public Health Data, Public Health Data Science, and Public Health Action

Public Health - Open Journal ◽

10.17140/phoj-5-151 ◽

2020 ◽

Vol 5 (3) ◽

pp. 73-74

Author(s):

Gregory Fant ◽

Keyword(s):

Public Health ◽

Data Science ◽

Health Data ◽

Public Health Action ◽

Public Health Data ◽

Health Action


Population Data Science: The science of data about people

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.918 ◽

2018 ◽

Vol 3 (4) ◽

Cited By ~ 3

Author(s):

Kim McGrail ◽

Kerina Jones

Keyword(s):

Data Science ◽

Data Linkage ◽

Positive Impact ◽

Population Data ◽

Population Level ◽

Multiple Sources ◽

Science Field ◽

Data Intensive ◽

Collective Work ◽

The Impact

Introduction: Societal and individual benefits of data-intensive science are substantial but raise challenges of balancing individual privacy and public good, while building appropriate governance and socio-technical systems to support data-intensive science. We set out to define a new field of inquiry to move collective interests forward. Objectives and Approach: Our objectives were: 1. To create a concise definition of the emerging field of Population Data Science; 2. To highlight the characteristics and challenges of Population Data Science; 3. To differentiate Population Data Science from existing fields of data science and informatics; and 4. To discuss the implications and future opportunities for Population Data Science. Objectives 1 and 2 were met largely through International Population Data Linkage Network (IPDLN) member engagement, Objective 3 was evaluated via literature review, and Objective 4 was achieved through iterative and collective work on a peer-reviewed position paper. Results: We define Population Data Science succinctly as the science of data about people. It is related to, but distinct from, the fields of data science and informatics. A broader definition includes four characteristics of: i) data use for positive impact on individuals and populations; ii) bringing together and analyzing data from multiple sources; iii) identifying population-level insights; and iv) developing safe, privacy-sensitive and ethical infrastructure to support research. One implication of these characteristics is that few individuals or organisations possess all of the requisite knowledge and skills comprising Population Data Science, so this is by nature a multi-disciplinary “team science” field. There is a need to advance various aspects of science, such as data linkage technology, various forms of analytics, and methods of public engagement. 
Conclusion/Implications: These implications are the beginnings of a research agenda for Population Data Science, which if approached as a collective field, will catalyze significant advances in our understanding of society, health, and human behavior and increase the impact of our research.


The role of 3S in big data quality: a perspective on operational performance indicators using an integrated approach

The TQM Journal ◽

10.1108/tqm-02-2021-0062 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Pratima Verma ◽

Vimal Kumar ◽

Ankesh Mittal ◽

Bhawana Rathore ◽

Ajay Jha ◽

...

Keyword(s):

Big Data ◽

Data Quality ◽

Data Science ◽

Fuzzy Ahp ◽

Integrated Approach ◽

Preference Ranking ◽

Content Type ◽

Quality Dimensions ◽

Data Quality Dimensions

Purpose: This study aims to provide insight into the operational factors of big data. The operational indicators/factors are categorized into three functional parts, namely synthesis, speed and significance. Based on these factors, the organization enhances its big data analytics (BDA) performance, followed by the selection of data quality dimensions that contribute to any organization's success. Design/methodology/approach: A fuzzy analytic hierarchy process (AHP) based research methodology has been proposed and utilized to assign the criterion weights and to prioritize the identified speed, synthesis and significance (3S) indicators. Further, the PROMETHEE (Preference Ranking Organization METHod for Enrichment of Evaluations) technique has been used to measure the data quality dimensions considering 3S as criteria. Findings: The effective indicators are identified from the past literature, and the model was confirmed with industry experts to measure these indicators. The results of this fuzzy AHP model show that synthesis is recognized as the top-positioned and most significant indicator, followed by speed and significance at the next level. These operational indicators contribute toward BDA and are explored along with the priority of their sub-categories. Research limitations/implications: The outcomes of this study will help businesses that are contemplating this technology as a breakthrough, but it is both a challenge and an opportunity for developers and experts. Big data has many risks and challenges related to economic, social, operational and political performance. The understanding of data quality dimensions provides insightful guidance to forecast accurate demand, solve complex problems and collaborate on supply chain management performance. Originality/value: Big data is one of the most popular technology concepts in the market today. People live in a world where every facet of life increasingly depends on big data and data science. 
This study creates awareness about the role of 3S encountered during big data quality by prioritizing using fuzzy AHP and PROMETHEE.
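The ranking step described above can be illustrated with a small PROMETHEE II calculation. This is a minimal sketch under stated assumptions, not the study's model: the criterion weights, the three data quality dimensions, and their scores are hypothetical, and the "usual" (strict) preference function is assumed in place of whatever preference functions the authors used. In the study the weights would come from the fuzzy AHP stage.

```python
def promethee_ii(scores, weights):
    """Return the PROMETHEE II net outranking flow for each alternative.

    scores:  {alternative: [value per criterion]}, higher is better.
    weights: one weight per criterion, summing to 1.
    Uses the 'usual' preference function: full preference on a criterion
    whenever one alternative strictly beats the other on it.
    """
    names = list(scores)
    n = len(names)

    def pi(a, b):
        # Aggregated preference of a over b: total weight of criteria a wins.
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    net = {}
    for a in names:
        others = [b for b in names if b != a]
        leaving = sum(pi(a, b) for b in others) / (n - 1)   # phi+
        entering = sum(pi(b, a) for b in others) / (n - 1)  # phi-
        net[a] = leaving - entering
    return net


# Hypothetical weights for the criteria (synthesis, speed, significance).
weights = [0.5, 0.3, 0.2]

# Hypothetical scores of three data quality dimensions on those criteria.
scores = {
    "accuracy": [3, 2, 3],
    "completeness": [2, 3, 1],
    "timeliness": [1, 1, 2],
}

net_flows = promethee_ii(scores, weights)
ranking = sorted(net_flows, key=net_flows.get, reverse=True)
```

Alternatives with a higher net flow rank higher. A useful sanity check on any implementation is that the net flows sum to zero across alternatives.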


Big Data-Based System

Interdisciplinary Approaches to Altering Neurodevelopmental Disorders - Advances in Medical Diagnosis, Treatment, and Care ◽

10.4018/978-1-7998-3069-6.ch017 ◽

2020 ◽

pp. 303-319

Author(s):

Tanu Wadhera ◽

Deepti Kakkar

Keyword(s):

Big Data ◽

Data Storage ◽

Big Data Analytics ◽

Autism Spectrum ◽

Health Data ◽

Second Phase ◽

Tremendous Amount ◽

New Methodologies ◽

Phase Data

In the health domain, the move toward generating big data is opening new methodologies for the detection as well as prediction of various diseases and disorders. The first phase of the present chapter provides insights into the role of big data analytics in the detection of one such neuro-disorder, autism spectrum disorder (ASD). The data lake concept offers a way to resolve the storage issue by providing a common platform for storing a tremendous amount of data in all formats (structured, unstructured, or raw). However, if all of the data have potential value, data lakes need to be strategically designed, as otherwise they can turn into data swamps. Therefore, in the second phase, a data lake based on the Hadoop architecture and the Apache Spark engine is provided for the analysis of the health data. The proposed system handles data storage, management, and analytics on a single platform. Hence, the novelty of the chapter is that it points towards faster exploration and management of data, so that timely generation of hypotheses can help in analyzing ASD.


Integrating Population Data: Challenges and Prospects

Voprosy statistiki ◽

10.34023/2313-6383-2021-28-3-5-14 ◽

2021 ◽

Vol 28 (3) ◽

pp. 5-14

Author(s):

M. A. Klupt ◽

O. N. Nikiforov

Keyword(s):

Big Data ◽

Social Life ◽

Necessary Conditions ◽

Population Data ◽

Social Structures ◽

The Individual ◽

The City ◽

Different Sources ◽

Clear Description

The article deals with methodological, organizational, and technological issues of integrating population data obtained from various administrative sources and corporate big data. The article demonstrates the particular relevance of the interaction between official statistics and other governmental and corporate information systems amidst the digitization of the economy and social life and the incipient establishment of the federal population data register. The authors propose a system of interrelated aggregates, characterizing various categories of population, which differ according to criteria of citizenship, permanent residence, duration, and purposes of stay on the territory of Russia. Challenges associated with estimating these aggregates are analyzed. The article considers possibilities and legal limitations in the work of statisticians on systematizing information, rationalizing the selection and subsequent joint use of information characterizing an individual (i.e. matching) for addressing various tasks faced by social and demographic statistics. Special attention is paid to the various options for resolving the issue of a personal code (one or more) that allows linking information on the individual from different databases. The need to ensure the transparency of the methodology used by the various participants of informational interaction is emphasized, which in turn should pave the way for the harmonization and, where possible, the unification of such methodology. The paper demonstrates the crucial role of preliminary qualitative analysis of data from different sources and explains mechanisms for further interaction of statistical authorities with organizations interested in this information, and with social structures. 
Using mobile operators’ and providers’ data on the population of the city, necessary conditions for their adequate interpretation – transparent methodology, clear description of population aggregates to estimate, and assumptions used for such estimations – are characterized.


Data Science for Finance: Best-Suited Methods and Enterprise Architectures

Applied System Innovation ◽

10.3390/asi4030069 ◽

2021 ◽

Vol 4 (3) ◽

pp. 69

Author(s):

Galena Pisoni ◽

Bálint Molnár ◽

Ádám Tarcsi

Keyword(s):

Big Data ◽

Financial Services ◽

Data Science ◽

Enterprise Architecture ◽

Data Driven ◽

Financial Company ◽

Desk Research ◽

Qualitative Literature ◽

Analyze Data

We live in an era of big data. Large volumes of complex and difficult-to-analyze data exist in a variety of industries, including the financial sector. In this paper, we investigate the role of big data in enterprise and technology architectures for financial services. We followed a two-step qualitative process for this. First, using a qualitative literature review and desk research, we analyzed and present the data science tools and methods financial companies use; second, we used case studies to showcase the de facto standard enterprise architecture for financial companies and examined how the data lakes and data warehouses play a central role in a data-driven financial company. We additionally discuss the role of knowledge management and the customer in the implementation of such an enterprise architecture in a financial company. The emerging technological approaches offer opportunities for finance companies to plan and develop additional services as presented in this paper.
