Big Data Cohort Extraction to Facilitate Machine Learning to Improve Statin Treatment

2016 ◽  
Vol 39 (1) ◽  
pp. 42-62 ◽  
Author(s):  
Chih-Lin Chi ◽  
Jin Wang ◽  
Thomas R. Clancy ◽  
Jennifer G. Robinson ◽  
Peter J. Tonellato ◽  
...  

Health care Big Data studies hold substantial promise for improving clinical practice. Among analytic tools, machine learning (ML) is an important approach that has been widely used by many industries for data-driven decision support. In Big Data, thousands of variables and millions of patient records are commonly encountered, but most data elements cannot be directly used to support decision making. Although many feature-selection tools can help identify relevant data, these tools are typically insufficient to determine a patient data cohort to support learning. Therefore, domain experts with nursing or clinical knowledge play critical roles in determining value criteria or the type of variables that should be included in the patient cohort to maximize project success. We demonstrate this process by extracting a patient cohort (37,506 individuals) to support our ML work (i.e., the production of a proactive strategy to prevent statin adverse events) from 130 million de-identified lives in the OptumLabs™ Data Warehouse.

Author(s):  
Fanhui Kong ◽  
Jian Li ◽  
Bin Jiang ◽  
Tianyuan Zhang ◽  
Houbing Song

Author(s):  
Peter V. Coveney ◽  
Edward R. Dougherty ◽  
Roger R. Highfield

The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut to reveal correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches with particular focus on biology and medicine, which fail to provide conceptual accounts for the processes to which they are applied. No matter their ‘depth’ and the sophistication of data-driven methods, such as artificial neural nets, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote ‘blind’ big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare. This article is part of the themed issue ‘Multiscale modelling at the physics–chemistry–biology interface’.


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Charles E. Knott ◽  
Stephen Gomori ◽  
Mai Ngyuen ◽  
Susan Pedrazzani ◽  
Sridevi Sattaluri ◽  
...  

Combining survey data with alternative data sources (e.g., wearable technology, apps, physiological, ecological monitoring, genomic, neurocognitive assessments, brain imaging, and psychophysical data) to paint a complete biobehavioral picture of trauma patients comes with many complex system challenges and solutions. Starting in emergency departments and incorporating these diverse, broad, and separate data streams presents technical, operational, and logistical challenges but allows for a greater scientific understanding of the long-term effects of trauma. Our manuscript describes incorporating and prospectively linking these multi-dimensional big data elements into a clinical, observational study at US emergency departments with the goal to understand, prevent, and predict adverse posttraumatic neuropsychiatric sequelae (APNS) that affect over 40 million Americans annually. We outline key data-driven system challenges and solutions and investigate eligibility considerations, compliance, and response rate outcomes incorporating these diverse “big data” measures using integrated data-driven cross-discipline system architecture.


10.2196/16607 ◽  
2019 ◽  
Vol 21 (11) ◽  
pp. e16607 ◽  
Author(s):  
Christian Lovis

Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economic, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together.


2016 ◽  
Vol 23 (3) ◽  
pp. 269-278 ◽  
Author(s):  
R. Andrew Taylor ◽  
Joseph R. Pare ◽  
Arjun K. Venkatesh ◽  
Hani Mowafi ◽  
Edward R. Melnick ◽  
...  

Author(s):  
Pål Sundsøy ◽  
Johannes Bjelland ◽  
Asif M. Iqbal ◽  
Alex “Sandy” Pentland ◽  
Yves-Alexandre de Montjoye
