Investigating Health Context: Using Geospatial Big Data Ecosystem (Preprint)
BACKGROUND Enabling the use of spatial context is vital to understanding today’s digital health problems. Any given location is associated with many different contexts. The strategic transformation of population health, epidemiology, and eHealth studies requires vast amounts of integrated digital data. What is needed is a novel analytical framework designed to leverage location to create new contextual knowledge. GeoARK, a research resource, has the robust, locationally integrated social, environmental, and infrastructural information needed to address today’s complex questions, investigate context, and spatially enable health investigations. GeoARK differs from other GIS resources in that it takes the layered world of GIS and flattens it into a single big data table that ties all the data and information together using location and develops its context.
OBJECTIVE It is paramount to build a robust spatial data analytics framework that integrates social, environmental, and infrastructural knowledge bases, empowering health researchers to use geospatial context to answer population health questions in a timely manner. The goal is twofold: it embodies an innovative technological approach, and it eases the educational burden on health researchers of learning to think spatially about their problems.
METHODS A unique analytical tool that uses location as the key was developed. It allows integration across source, geography, and time to create a geospatial big table with over 162 million individual locations (X-Y points that serve as rows) and 5549 attributes (represented as columns). The concept of context (adjacency, proximity, distance, etc.) has been quantified through geo-analytics and captured as new distance, density, or neighbor attributes within the system. The development of geospatial analytics permits contextual extraction and investigator-initiated eHealth and mHealth analysis across multiple attributes.
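To make the idea of a location-keyed big table concrete, the following is a minimal illustrative sketch and not the GeoARK implementation. It assumes a toy table of three locations with planar X-Y coordinates, a hypothetical clinic point layer, and hypothetical column names (`dist_nearest_clinic`, `clinics_within_radius`, `neighbor_count`). It shows how context that would normally sit in a separate GIS layer can be quantified as distance, density, and neighbor attributes and stored as ordinary columns on each location row.

```python
import math

# Hypothetical toy table: each row is one location (an X-Y point) keyed by an ID.
# Coordinates are assumed to be in a planar (projected) system, so straight-line
# distance is meaningful; a real system would likely need geodesic distances.
locations = {
    "loc_a": {"x": 0.0, "y": 0.0, "median_income": 42000},
    "loc_b": {"x": 3.0, "y": 4.0, "median_income": 58000},
    "loc_c": {"x": 6.0, "y": 8.0, "median_income": 51000},
}

# Hypothetical point layer (for example, clinics) that a layered GIS would
# keep separate from the locations above.
clinics = [(0.0, 1.0), (6.0, 6.0)]


def distance(x1, y1, x2, y2):
    """Straight-line distance between two planar points."""
    return math.hypot(x2 - x1, y2 - y1)


def add_context_attributes(table, facilities, radius):
    """Return a copy of the table with distance, density, and neighbor columns."""
    enriched = {}
    for loc_id, row in table.items():
        facility_dists = [
            distance(row["x"], row["y"], fx, fy) for fx, fy in facilities
        ]
        neighbor_count = sum(
            1
            for other_id, other in table.items()
            if other_id != loc_id
            and distance(row["x"], row["y"], other["x"], other["y"]) <= radius
        )
        new_row = dict(row)  # keep the existing attributes untouched
        new_row["dist_nearest_clinic"] = min(facility_dists)      # distance
        new_row["clinics_within_radius"] = sum(
            1 for d in facility_dists if d <= radius
        )                                                          # density
        new_row["neighbor_count"] = neighbor_count                 # neighbors
        enriched[loc_id] = new_row
    return enriched


big_table = add_context_attributes(locations, clinics, radius=5.5)
```

Once the context is flattened into columns in this way, a query such as "locations with no clinic within the radius" becomes a simple filter on one attribute rather than a spatial operation. The brute-force pairwise loop is only workable for a toy example; at the scale of hundreds of millions of rows, a spatial index or precomputed lookups would presumably be required.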
RESULTS We built a unique geospatial big data ecosystem called the Geospatial Analytical Research Knowledgebase (GeoARK). Analytics on this big table operate across resolution groups, sources, and geographies to extract and analyze information and gain new insights. Case studies, including a telehealth assessment, an analysis of income inequality and health outcome disparities, and a COVID-19 risk assessment, demonstrate its capability to support a robust and efficient geospatial understanding of a wide spectrum of population health questions.
CONCLUSIONS This research has identified, compiled, transformed, standardized, and integrated, within a large location-enabled database, the multifaceted data required to better understand the context of health events. The GeoARK system empowers health professionals to take on more complex research in which the synergies between health and geospatial information can be studied more robustly than is possible today. The need to know how to do geospatial processing is no longer an impediment to the health researcher; rather, learning how to think spatially becomes the greater challenge.