Using Machine Learning to investigate Heat Waves and Myocardial Infarctions in Augsburg, Germany

Author(s):  
Lennart Marien ◽  
Mahyar Valizadeh ◽  
Wolfgang zu Castell ◽  
Alexandra Schneider ◽  
Kathrin Wolf ◽  
...  

Myocardial infarctions (MI) are a major cause of death worldwide. In addition to well-known individual risk factors, studies have shown that temperature extremes, such as those encountered during heat waves, lead to increases in MI. The relationship between health impacts and climate is complex, depending on a multitude of climatic, environmental, sociodemographic and behavioral factors. Machine Learning (ML) is a powerful tool for investigating complex and unknown relationships between extreme environmental conditions and their adverse impacts, and it has already been applied to other climate extremes, for example in the prediction of flood damages. By combining heterogeneous health, climatic, environmental and socio-economic datasets, this study is a first step in developing an ML model for predicting past and future MI risk due to heat waves.
Here, we present first results of our ML approach for modelling heat-related health effects in Augsburg based on the KORA MI and environmental data. The basis of our data-driven approach is the KORA cohort study and the MI Registry in the Augsburg region of Bavaria, Germany, comprising detailed information on MI and underlying health conditions. Additionally, weather and climate data, air pollution data (e.g., PM10, PM2.5, nitrogen oxides, and ozone), as well as socio-economic data (household income, education) are used for this study. One of the key challenges is to assemble and integrate heterogeneous data from various sources and prepare them for the appropriate spatial scales. We outline major challenges in combining these data and deriving quantitative models from them.
Moreover, we present initial results based on both regression and classification models, discussing model performance for the period between 2000 and 2015, with a focus on two major heat wave events in Germany during 2003 and 2006. Ultimately, this research may be useful in better understanding heat-related MI risks, supporting possible adaptation options in urban areas and in identifying high-risk groups within society.

2019 ◽  
Vol 12 (1) ◽  
pp. 21 ◽  
Author(s):  
Liangliang Zhang ◽  
Zhao Zhang ◽  
Yuchuan Luo ◽  
Juan Cao ◽  
Fulu Tao

Maize is an extremely important grain crop, and the demand has increased sharply throughout the world. China alone contributes nearly one-fifth of the total production despite its decreasing arable land. Timely and accurate prediction of maize yield in China is critical for ensuring global food security. Previous studies primarily used either visible or near-infrared (NIR) based vegetation indices (VIs), or climate data, or both to predict crop yield. However, other satellite data from different spectral bands, which contain unique information on crop growth and yield, have been underutilized. In addition, although a joint application of multi-source data significantly improves crop yield prediction, the combinations of input variables that could achieve the best results have not been well investigated. Here we integrated optical, fluorescence, thermal satellite, and environmental data to predict county-level maize yield across four agro-ecological zones (AEZs) in China using a regression-based method (LASSO), two machine learning (ML) methods (RF and XGBoost), and a deep learning (DL) network (LSTM). The results showed that combining multi-source data explained more than 75% of yield variation. Satellite data at the silking stage contributed more information than other variables, and solar-induced chlorophyll fluorescence (SIF) had an almost equivalent performance with the enhanced vegetation index (EVI) largely due to the low signal-to-noise ratio and coarse spatial resolution. The extremely high temperature and vapor pressure deficit during the reproductive period were the most important climate variables affecting maize production in China. Soil properties and management factors contained extra information on crop growth conditions that cannot be fully captured by satellite and climate data. We found that ML and DL approaches outperformed regression-based methods, and ML was more computationally efficient and generalized more easily than DL. Our study is an important effort to combine multi-source remotely sensed and environmental data for large-scale yield prediction. The proposed methodology provides a paradigm for yield prediction of other crops and in other regions.


2014 ◽  
Vol 2014 (1) ◽  
pp. 660-672
Author(s):  
Zachary Nixon

ABSTRACT For significant oil spills in remote areas with complex shoreline geometry, apportioning Shoreline Cleanup Assessment Technique (SCAT) survey effort is a complicated and difficult task. Aerial surveys are often used to select shoreline areas for ground survey after an initial prioritization based upon anecdotal reports or trajectory models, but aerial observers may have difficulty locating cryptic surface shoreline oiling in vegetated or other complex environments. In dynamic beach environments, stranded shoreline oiling may be rapidly buried, making aerial observation difficult. A machine learning-based model is presented for estimating shoreline oiling probabilities via satellite-derived surface oil analysis products, wind summary data, and shoreline habitat type and geometry data. These inputs are increasingly available at spatial and temporal scales sufficient for tactical use, enabling model predictions to be generated within hours after satellite remote sensing products are available. The model was constructed using SCAT data from the Deepwater Horizon oil spill, satellite-derived surface oil analysis products generated during the spill by NOAA's National Environmental Satellite, Data, and Information Service (NESDIS) using a variety of satellite platforms of opportunity, and available shoreline geometry, character, and other preexisting data. The model involves the generation of a set of spatial indices of relative over-water proximity of surface oil slicks based upon the satellite-derived analysis products. The model then uses boosted regression trees (BRT), a flexible and relatively recently developed modeling methodology, to generate calibrated estimates of probability of subsequent shoreline oiling based upon these indices, wind climatological data over the time period of interest, and other shoreline data. The model can be implemented via data preparation in any Geographic Information System (GIS) software coupled with the open-source statistical computing language, R. The model is entirely probabilistic and makes no attempt to reproduce the physics of oil moving through the environment, as do trajectory models. It is best used in concert with such models to make estimates at different spatial scales, or when time and data requirements make implementation of fine-scale trajectory modeling impractical for tactical use. The details of model development and implementation, and assessments of model performance and limitations, are presented.


2021 ◽  
Vol 13 (18) ◽  
pp. 3760
Author(s):  
Linghua Meng ◽  
Huanjun Liu ◽  
Susan L. Ustin ◽  
Xinle Zhang

Timely and reliable maize yield prediction is essential for the agricultural supply chain and food security. Previous studies using either climate or satellite data or both to build empirical or statistical models have prevailed for decades. However, to what extent climate and satellite data can improve yield prediction is still unknown. In addition, fertilizer information may also improve crop yield prediction, especially in regions with different fertilizer systems, such as cover crop, mineral fertilizer, or compost. Machine learning (ML) has been widely and successfully applied in crop yield prediction. Here, we attempted to predict maize yield from 1994 to 2007 at the plot scale by integrating multi-source data, including monthly climate data, satellite data (i.e., vegetation indices (VIs)), fertilizer data, and soil data, to explore how different inputs affect the accuracy of yield prediction. The results show that incorporating all of the datasets using random forests (RF) and adaptive boosting (AB) can achieve better performance in yield prediction (R²: 0.85-0.98). In addition, the combination of VIs, climate data, and soil data (VCS) can predict maize yield more effectively than other combinations (e.g., combinations of all data and combinations of VIs and soil data). Furthermore, we found that prediction accuracy differed among fertilizer systems. This paper aggregates data from multiple sources and distinguishes the effects of different fertilization scenarios on crop yield predictions. In addition, the effects of different data on crop yield were analyzed in this study. Our study provides a paradigm that can be used to improve yield predictions for other crops. It is an important effort to combine multi-source remotely sensed and environmental data for maize yield prediction at the plot scale and to develop timely and robust methods for predicting the yield of maize grown under different fertilizing systems.


2022 ◽  
Author(s):  
Lennart Marien ◽  
Mahyar Valizadeh ◽  
Wolfgang zu Castell ◽  
Christine Nam ◽  
Diana Rechid ◽  
...  

Abstract. Myocardial infarctions (MI) are a major cause of death worldwide, and temperature extremes, e.g., during heat waves and cold winters, may increase the risk of MI. The relationship between health impacts and climate is complex and is influenced by a multitude of climatic, environmental, socio-demographic, and behavioral factors. Here, we present a Machine Learning (ML) approach for predicting MI events based on multiple environmental and demographic variables. We derived data on MI events from the KORA MI registry dataset for Augsburg, Germany between 1998 and 2015. Multivariable predictors include weather and climate, air pollution (PM10, NO, NO2, SO2, and O3), surrounding vegetation, as well as demographic data. We tested the following ML regression algorithms: Decision Tree, Random Forest, Multi-layer Perceptron, Gradient Boosting and Ridge Regression. The models are able to predict the total annual number of MI reasonably well (adjusted R² = 0.59-0.71). Inter-annual variations and long-term trends are captured. Across models, the most important predictors are air pollution and daily temperatures. Variables not related to environmental conditions, such as demographics, need to be considered as well. This ML approach provides a promising basis to model future MI under changing environmental conditions, as projected by scenarios for climate and other environmental changes.
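
The adjusted R² quoted in this abstract is the ordinary coefficient of determination penalized for the number of predictors, using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors. The following is a minimal sketch of that arithmetic only; the function names and the sample values are illustrative assumptions and are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Ordinary coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_predictors):
    """Adjusted R²: penalizes R² for the number of predictors used."""
    n = len(y_true)
    dof = n - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / dof


if __name__ == "__main__":
    # Made-up observed and predicted values, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(r_squared(observed, predicted))
    print(adjusted_r_squared(observed, predicted, n_predictors=2))
```

Because the penalty grows with the number of predictors, adjusted R² is never larger than R² for the same fit, which makes it the more conservative figure when many environmental variables are included.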


Finisterra ◽  
2012 ◽  
Vol 45 (89) ◽  
Author(s):  
Paulo Canário

The impact of heat waves on mortality has been the subject of numerous studies and the focus of attention of various national and international governmental bodies. In the summer of 2003 alone, which was exceptionally hot, the number of deaths in 12 European countries increased by 70,000. The overall warming trend will lead to an increase in the frequency, duration and intensity of heat waves and to an increase in heat-related mortality. The need to assess the risk of death due to extreme heat at a detailed spatial scale has motivated a research project based on a general model of risk for potentially destructive natural phenomena; the model uses the relationship between hazard and vulnerability and was designed primarily for urban areas. The major hazardous meteorological variables are those that determine the thermal complex (air temperature, radiative temperature, wind and humidity) and the variables related to air quality (mainly ozone and particulate matter). Vulnerability takes into account the population's sensitivity (at various spatial scales) and its exposure to thermal extremes.


2021 ◽  
Vol 13 (6) ◽  
pp. 1224
Author(s):  
Izar Azpiroz ◽  
Noelia Oses ◽  
Marco Quartulli ◽  
Igor G. Olaizola ◽  
Diego Guidotti ◽  
...  

Machine-learning algorithms used for modelling olive-tree phenology generally and largely rely on temperature data. In this study, we developed a prediction model on the basis of climate data and geophysical information. Remote measurements of weather conditions, terrain slope, and surface spectral reflectance were considered for this purpose. The accuracy of the temperature data worsened when replacing weather-station measurements with remote-sensing records, though the addition of more complete environmental data resulted in an efficient prediction model of olive-tree phenology. Filtering and embedded feature-selection techniques were employed to analyze the impact of variables on olive-tree phenology prediction, facilitating the inclusion of measurable information in decision support frameworks for the sustainable management of olive-tree systems.


Atmosphere ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 499
Author(s):  
Chris G. Tzanis ◽  
Anastasios Alimissis ◽  
Ioannis Koutsogiannis

An important aspect of environmental sciences is the study of air quality using statistical methods (environmental statistics) that draw on large datasets of climatic parameters. The air-quality-monitoring networks that operate in urban areas provide data on the most important pollutants, which, via environmental statistics, can be used for the development of continuous surfaces of pollutants’ concentrations. Generating ambient air-quality maps can help policy makers and researchers formulate measures to minimize adverse effects. The information needed for a mapping application can be obtained by applying spatial interpolation methods to the available data to estimate air-quality distributions. This study used point-monitoring data from the network of stations that operates in Athens, Greece. A machine-learning scheme was applied as a method to spatially estimate pollutants’ concentrations, and the results can be used to fill in missing values and provide representative data for statistical analysis.


Author(s):  
Fahad Kamran ◽  
Kathryn Harrold ◽  
Jonathan Zwier ◽  
Wendy Carender ◽  
Tian Bao ◽  
...  

Abstract Background: Recently, machine learning techniques have been applied to data collected from inertial measurement units to automatically assess balance, but these approaches rely on hand-engineered features. We explore the utility of machine learning to automatically extract important features from inertial measurement unit data for balance assessment. Findings: Ten participants with balance concerns performed multiple balance exercises in a laboratory setting while wearing an inertial measurement unit on their lower back. Physical therapists watched video recordings of participants performing the exercises and rated balance on a 5-point scale. We trained machine learning models using different representations of the unprocessed inertial measurement unit data to estimate physical therapist ratings. On a held-out test set, we compared these learned models to one another, to participants’ self-assessments of balance, and to models trained using hand-engineered features. Utilizing the unprocessed kinematic data from the inertial measurement unit provided significant improvements over both self-assessments and models using hand-engineered features (AUROC of 0.806 vs. 0.768, 0.665). Conclusions: Unprocessed data from an inertial measurement unit used as input to a machine learning model produced accurate estimates of balance performance. The ability to learn from unprocessed data presents a potentially generalizable approach for assessing balance without the need for labor-intensive feature engineering, while maintaining comparable model performance.
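
The AUROC values compared in this abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes that quantity directly from pairwise comparisons. It assumes binary labels; the abstract does not say how the 5-point therapist ratings were turned into the classes behind the reported AUROC, so the labels, scores and function name here are purely illustrative.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Illustrative labels and scores, not data from the study.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small held-out test set; a rank-based implementation would be preferable for large datasets.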


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

ABSTRACT Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
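
The step of converting continuous MAP predictions into binary AHE predictions and then scoring them with recall and precision can be sketched as follows. The abstract does not give the rule used to define an AHE, so the 60 mmHg cut-off below is a hypothetical threshold chosen for illustration, as are the function names and the sample readings.

```python
def map_to_ahe_flags(map_values, threshold_mmhg=60.0):
    """Flag an episode wherever MAP falls below the threshold.

    threshold_mmhg is a hypothetical cut-off, not the study's definition.
    """
    return [value < threshold_mmhg for value in map_values]


def precision_recall(actual, predicted):
    """Precision and recall for two equal-length lists of booleans."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up MAP readings in mmHg, for illustration only.
    observed_map = [72, 58, 61, 55, 80, 59]
    predicted_map = [70, 57, 63, 62, 58, 54]
    actual = map_to_ahe_flags(observed_map)
    predicted = map_to_ahe_flags(predicted_map)
    print(precision_recall(actual, predicted))
```

Lowering the threshold applied to the predicted values trades recall for precision, which is one way a regression output can be tuned toward a pattern like the high-recall, lower-precision result reported above.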

