Partially versus Purely Data-Driven Approaches in SARS-CoV-2 Prediction

2020 ◽  
Vol 10 (16) ◽  
pp. 5696 ◽  
Author(s):  
Samar A. Shilbayeh ◽  
Abdullah Abonamah ◽  
Ahmad A. Masri

Prediction models of coronavirus disease that use machine learning algorithms range from forecasting future suspected cases and predicting mortality rates to building a pattern for country-specific pandemic end dates. To predict future suspected infections and deaths, we categorized the approaches found in the literature into two groups. The first is a purely data-driven approach, whose goal is to build a mathematical model that relates the output variables to the input variables in order to detect general patterns. The discovered patterns can then be used to predict future infected cases without any expert input. The second approach is partially data-driven: it uses historical data but allows expert input, such as the SIR epidemic algorithm. This approach assumes that the epidemic will end according to medical reasoning. In this paper, we compare the purely data-driven and partially data-driven approaches by applying them to data from three countries with different past patterns of behavior: the US, Jordan, and Italy. We find that the two prediction approaches yield significantly different results. The purely data-driven approach depends entirely on past behavior and does not show any decline in the number of infected cases if the country has not yet experienced one. The partially data-driven approach, on the other hand, guarantees that the infection curve declines to zero in a timely manner. Using the two approaches highlights the importance of human intervention in pandemic prediction to guide the learning process, as opposed to the purely data-driven approach, which predicts future cases based only on the pattern detected in the data.
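To illustrate why an SIR-based forecast always bends back toward zero, the following is a minimal sketch of the classic SIR compartmental equations integrated with a forward Euler step. It is not the authors' implementation; the function name `simulate_sir` and the parameter values (`beta`, `gamma`, population size) are hypothetical choices made only for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the SIR equations with forward Euler.

    beta  = transmission rate per day (hypothetical value in the demo below)
    gamma = recovery rate per day (hypothetical value in the demo below)
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    dt = 1.0 / steps_per_day
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; they are not fitted to any country's data.
    history = simulate_sir(population=1_000_000, initial_infected=10,
                           beta=0.3, gamma=0.1, days=365)
    peak_day = max(range(len(history)), key=lambda d: history[d][1])
    print(f"Infections peak on day {peak_day}")
    print(f"Infected on the final day: {history[-1][1]:.6f}")
```

Because the susceptible pool only shrinks, the infected compartment must eventually decline regardless of what the most recent data look like, which is the structural assumption that a purely data-driven model does not carry.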

Tábula ◽  
2021 ◽  
Author(s):  
Miguel Ángel Amutio Gómez

The data-driven approach in the context of digital transformation entails new regulations, governance dynamics and roles, and services, together with the corresponding practices, instruments, and standards. At the same time, new challenges arise in relation to cybersecurity and data preservation. This article presents digital transformation and the data-driven approach, their impact on digital administration, the context of the European Union and its trajectory and orientation, aspects of interoperability, cybersecurity, and data preservation, issues of governance and roles in data orientation, and, finally, some conclusions.


2017 ◽  
Vol 4 (suppl_1) ◽  
pp. S403-S404
Author(s):  
Maggie Makar ◽  
Jeeheh Oh ◽  
Christopher Fusco ◽  
Joseph Marchesani ◽  
Robert McCaffrey ◽  
...  

Abstract Background: An estimated 293,300 healthcare-associated cases of Clostridium difficile infection (CDI) occur annually in the United States. Prior research on risk-prediction models for CDI has focused on a small number of risk factors with the goal of developing a model that works well across hospitals. We hypothesize that risk factors are, in part, hospital-specific. We applied a generalizable machine learning approach to discovering, or “learning”, hospital-specific risk-stratification models using electronic health record (EHR) data collected during the course of patient care from the Massachusetts General Hospital (MGH) and the University of Michigan Health System (UM). Methods: We utilized EHR data from 115,958 adult inpatient admissions from 2012–2014 (MGH) and 258,050 adult inpatient admissions from 2010–2016 (UM) (Fig 1). We extracted patient demographics, admission details, patient history, and daily hospitalization details, resulting in 2,964 and 4,739 features in the MGH and UM models, respectively. We used L2 regularized logistic regression to learn the models and measured the discriminative performance of the models on a year of held-out data from each hospital. Results: The MGH and UM models achieved AUROCs of 0.74 (CI: 0.73–0.75) and 0.77 (CI: 0.75–0.80), respectively. The relative importance of risk factors varied significantly across hospitals. In particular, in-hospital locations appeared in the set of top risk factors at one hospital and in the set of protective factors at the other. On average, both models were able to predict CDI five days in advance of clinical diagnosis (Fig 2). Conclusion: We used EHR data to generate a daily estimate of the risk of CDI for each inpatient hospitalization. We applied a generalizable data-driven approach to existing data from two large institutions with different patient populations and different data formats and content.
In contrast to approaches that focus on learning models that apply generally across hospitals, our proposed approach yields risk stratification models tailored to an institution’s EHR system and patient population. In turn, these hospital-specific models could allow for earlier and more accurate identification of high-risk patients. Disclosures: All authors: No reported disclosures.
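The modelling recipe described above (L2 regularized logistic regression evaluated by AUROC on held-out data) can be sketched in a few lines. The following is a toy illustration only, assuming synthetic data with three made-up features; it does not reproduce the authors' features, data, or training code, and all function names (`train_l2_logistic`, `auroc`, `make_toy_admissions`, `run_demo`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_l2_logistic(X, y, l2=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on mean log-loss plus (l2 / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # The l2 * w[j] term is the gradient of the L2 penalty.
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n  # the intercept is left unpenalized
    return w, b


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_toy_admissions(n, rng):
    # Synthetic stand-in for EHR features: two informative features, one pure noise.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
        p = sigmoid(1.5 * x[0] - 1.0 * x[1] - 1.0)
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def run_demo(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_toy_admissions(400, rng)
    X_test, y_test = make_toy_admissions(200, rng)  # held-out data
    w, b = train_l2_logistic(X_train, y_train)
    scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X_test]
    return auroc(scores, y_test)


if __name__ == "__main__":
    print(f"Held-out AUROC on toy data: {run_demo():.2f}")
```

Training a separate model per hospital, as the abstract proposes, would amount to calling the training function once per institution on that institution's own feature matrix and comparing the learned weights.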


2020 ◽  
Vol 29 (06) ◽  
pp. 2030001
Author(s):  
Abeer M. Mahmoud ◽  
Hanen Karamti ◽  
Fadwa Alrowais

Functional Magnetic Resonance Imaging (fMRI) has for many decades served as an aid in diagnosing medical problems. Several successful machine learning algorithms have been proposed in the literature to extract valuable knowledge from fMRI. One of these algorithms is the convolutional neural network (CNN), which is highly capable of learning optimal abstractions of fMRI. This is because the CNN learns features in a way similar to the human brain: it preserves local structure and avoids distortion of the global feature space. Building on the achievements of CNNs for fMRI, the Deep Convolutional Auto-Encoder (DCAE) benefits from the data-driven approach with the CNN’s optimal features to strengthen fMRI classification. In this paper, a new deep discriminative approach for classifying fMRI images, based on two consecutive multilayer DCAEs, is proposed. The first DCAE is an unsupervised sub-model composed of four CNNs. It focuses on learning weights that exploit the discriminative characteristics of the extracted features for robust reconstruction of fMRI at lower dimensionality, capturing fine details and refining them through its multiple deep layers. The second DCAE is a supervised sub-model that focuses on training with labels to achieve superior results. The proposed approach proved effective and improved on results reported in the literature on a large brain disorder fMRI dataset.


2018 ◽  
Vol 39 (4) ◽  
pp. 425-433 ◽  
Author(s):  
Jeeheh Oh ◽  
Maggie Makar ◽  
Christopher Fusco ◽  
Robert McCaffrey ◽  
Krishna Rao ◽  
...  

OBJECTIVE: An estimated 293,300 healthcare-associated cases of Clostridium difficile infection (CDI) occur annually in the United States. To date, research has focused on developing risk prediction models for CDI that work well across institutions. However, this one-size-fits-all approach ignores important hospital-specific factors. We focus on a generalizable method for building facility-specific models. We demonstrate the applicability of the approach using electronic health records (EHR) from the University of Michigan Hospitals (UM) and the Massachusetts General Hospital (MGH). METHODS: We utilized EHR data from 191,014 adult admissions to UM and 65,718 adult admissions to MGH. We extracted patient demographics, admission details, patient history, and daily hospitalization details, resulting in 4,836 features from patients at UM and 1,837 from patients at MGH. We used L2 regularized logistic regression to learn the models, and we measured the discriminative performance of the models on held-out data from each hospital. RESULTS: Using the UM and MGH test data, the models achieved area under the receiver operating characteristic curve (AUROC) values of 0.82 (95% confidence interval [CI], 0.80–0.84) and 0.75 (95% CI, 0.73–0.78), respectively. Some predictive factors were shared between the 2 models, but many of the top predictive factors differed between facilities. CONCLUSION: A data-driven approach to building models for estimating daily patient risk for CDI was used to build institution-specific models at 2 large hospitals with different patient populations and EHR systems. In contrast to traditional approaches that focus on developing models that apply across hospitals, our generalizable approach yields risk-stratification models tailored to an institution. These hospital-specific models allow for earlier and more accurate identification of high-risk patients and better targeting of infection prevention strategies. Infect Control Hosp Epidemiol 2018;39:425–433


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Francesca Loia ◽  
Nunzia Capobianco ◽  
Roberto Vona

Purpose: This study aims to investigate the collective perception regarding the future of offshore platforms and to frame the main categories of meaning that the community associates with the investigated phenomenon. Design/methodology/approach: A data-driven approach was adopted. People’s opinions were collected from two social network communities: Twitter and Instagram. Text mining was used to carry out sentiment and cluster analyses. Findings: The sentiment analysis of the most frequent words is presented. Four main homogeneous categories of words emerged in relation to the decommissioning of offshore platforms: technological areas, green governance (GG), circular economy, and socio-economic sphere. Research limitations/implications: The alternative use of offshore platforms, including tourism initiatives, aquaculture, alternative energy generation, hydrogen storage, and environmental research, could improve the resilience of communities by supporting the development of new jobs and the growth of local and innovative green businesses. Practical implications: The adoption of a circular model and GG initiatives aims to limit the input of resources and energy, minimize waste and losses, adopt a sustainable approach, and realize new social and territorial value. Originality/value: The analysis underlines the importance of adopting a systems perspective, which takes into account the social, economic, and environmental system as a whole, the different phenomena that occur, and the variety of categories of stakeholders, from users to local governments, that participate in territorial development.


2020 ◽  
Vol 13 (1) ◽  
pp. 153-173 ◽  
Author(s):  
Andrea Gentili ◽  
Fabiano Compagnucci ◽  
Mauro Gallegati ◽  
Enzo Valentini

Abstract This study aims to contribute empirical evidence to the debate about the future of work in an increasingly robotised world. We implement a data-driven approach to study the technological transition in six leading Organisation for Economic Co-operation and Development (OECD) countries. First, we perform a cross-country and cross-sector cluster analysis based on the OECD-STAN database. Second, using the International Federation of Robotics database, we bridge these results with those regarding the sectoral density of robots. We show that the process of robotisation is industry- and country-sensitive. In the future, participants in the political and academic debate may be split into optimists and pessimists regarding the future of human labour; however, the two stances may not be contradictory.


Author(s):  
Vedat Bayram ◽  
Gohram Baloch ◽  
Fatma Gzara ◽  
Samir Elhedhli

Optimizing warehouse processes has a direct impact on supply chain responsiveness, timely order fulfillment, and customer satisfaction. In this work, we focus on the picking process in warehouse management and study it from a data perspective. Using historical data from an industrial partner, we introduce, model, and study the robust order batching problem (ROBP), which groups orders into batches to minimize total order processing time while accounting for uncertainty caused by system congestion and human behavior. We provide a generalizable, data-driven approach that overcomes the warehouse-specific assumptions characterizing most of the work in the literature. We analyze historical data to understand the processes in the warehouse, to predict processing times, and to improve order processing. We develop an efficient learning-based branch-and-price algorithm for the ROBP based on simultaneous column and row generation, embedded with alternative prediction models, such as linear regression and random forest, that predict the processing time of a batch. We conduct extensive computational experiments to test the performance of the proposed approach and to derive managerial insights based on real data. The data-driven prescriptive analytics tool we propose achieves savings of seven to eight minutes per order, which translates into a 14.8% increase in the daily picking operations capacity of the warehouse.


2019 ◽  
Vol 75 (6) ◽  
pp. 876-888 ◽  
Author(s):  
Yintao Song ◽  
Nobumichi Tamura ◽  
Chenbo Zhang ◽  
Mostafa Karami ◽  
Xian Chen

A novel data-driven approach based on machine learning algorithms is proposed for analyzing synchrotron Laue X-ray microdiffraction scans. The basic architecture and major components of the method are formulated mathematically. The method is demonstrated through typical examples, including polycrystalline BaTiO3, multiphase transforming alloys, and finely twinned martensite. The computational pipeline is implemented for beamline 12.3.2 at the Advanced Light Source, Lawrence Berkeley National Laboratory. The conventional analytical pathway for X-ray diffraction scans is based on a slow pattern-by-pattern crystal indexing process. This work provides a new way of analyzing 2D X-ray diffraction patterns, independent of the indexing process, and motivates further studies of X-ray diffraction patterns from the machine learning perspective for the development of suitable feature extraction, clustering, and labeling algorithms.

