A multi-step machine learning approach to assess the impact of COVID-19 lockdown on NO2 attributable deaths in Milan and Rome, Italy

2022 ◽  
Vol 21 (1) ◽  
Author(s):  
Luca Boniardi ◽  
Federica Nobile ◽  
Massimo Stafoggia ◽  
Paola Michelozzi ◽  
Carla Ancona

Abstract. Background: Air pollution is one of the main concerns for the health of European citizens, and cities are currently striving to comply with EU air pollution regulation. The 2020 COVID-19 lockdown measures can be seen as an unintended but effective experiment to assess the impact of traffic restriction policies on air pollution. Our objective was to estimate the impact of the lockdown measures on NO2 concentrations and health in the two largest Italian cities. Methods: NO2 concentration datasets were built using data from a 1-month citizen science monitoring campaign that took place in Milan and Rome just before the Italian lockdown period. Annual mean NO2 concentrations were estimated for a lockdown scenario (Scenario 1) and a scenario without lockdown (Scenario 2) by applying city-specific annual adjustment factors to the 1-month data. These adjustment factors were estimated from Air Quality Network station data using a machine learning approach. The NO2 spatial distribution was estimated at a neighbourhood scale by applying Land Use Random Forest models to the two scenarios. Finally, the impact of lockdown on health was estimated as the difference between the attributable deaths for Scenario 2 and those for Scenario 1, both estimated by applying a literature-based dose–response function with a counterfactual concentration of 10 μg/m3. Results: The Land Use Random Forest models were able to capture 41–42% of the total NO2 variability. Passing from Scenario 2 (annual NO2 without lockdown) to Scenario 1 (annual NO2 with lockdown), the population-weighted exposure to NO2 for Milan and Rome decreased by 15.1% and 15.3%, respectively, on an annual basis. Considering the 10 μg/m3 counterfactual, prevented deaths were 213 and 604, respectively. Conclusions: Our results show that the lockdown had a beneficial impact on air quality and human health. However, compliance with the current EU legal limit is not enough to avoid a high number of NO2 attributable deaths. This contribution reaffirms the potential of the citizen science approach and calls for more ambitious traffic calming policies and a re-evaluation of the legal annual limit value for NO2 for the protection of human health.
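
The health impact step described in the abstract above can be illustrated with a short sketch. It is not the authors' code: it assumes a log-linear dose-response function, and the relative risk per 10 μg/m3 (`rr_per_10`), the baseline deaths, the concentrations, and the populations are all hypothetical inputs supplied by the caller. Only the 10 μg/m3 counterfactual comes from the abstract.

```python
import math

COUNTERFACTUAL = 10.0  # counterfactual NO2 concentration (ug/m3) named in the abstract


def population_weighted_mean(concentrations, populations):
    """Average neighbourhood concentrations, weighting each by its residents."""
    total_population = sum(populations)
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


def attributable_deaths(concentration, baseline_deaths, rr_per_10,
                        counterfactual=COUNTERFACTUAL):
    """Deaths attributable to exposure above the counterfactual level.

    Assumes a log-linear dose-response function (an assumption of this sketch).
    """
    excess = max(concentration - counterfactual, 0.0)
    beta = math.log(rr_per_10) / 10.0  # log relative risk per 1 ug/m3
    attributable_fraction = 1.0 - math.exp(-beta * excess)
    return baseline_deaths * attributable_fraction


def prevented_deaths(conc_without_lockdown, conc_with_lockdown,
                     baseline_deaths, rr_per_10):
    """Attributable deaths without lockdown minus those with lockdown."""
    return (attributable_deaths(conc_without_lockdown, baseline_deaths, rr_per_10)
            - attributable_deaths(conc_with_lockdown, baseline_deaths, rr_per_10))
```

Because both scenarios share the same counterfactual and baseline, the prevented deaths depend only on how far each scenario's exposure sits above 10 μg/m3.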

10.2196/23948 ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. e23948
Author(s):  
Yuanfang Chen ◽  
Liu Ouyang ◽  
Forrest S Bao ◽  
Qian Li ◽  
Lei Han ◽  
...  

Background: Effectively and efficiently diagnosing patients who have COVID-19 with the accurate clinical type of the disease is essential to achieve optimal outcomes for the patients as well as to reduce the risk of overloading the health care system. Currently, severe and nonsevere COVID-19 types are differentiated by only a few features, which do not comprehensively characterize the complicated pathological, physiological, and immunological responses to SARS-CoV-2 infection in the different disease types. In addition, these type-defining features may not be readily testable at the time of diagnosis. Objective: In this study, we aimed to use a machine learning approach to understand COVID-19 more comprehensively, accurately differentiate severe and nonsevere COVID-19 clinical types based on multiple medical features, and provide reliable predictions of the clinical type of the disease. Methods: For this study, we recruited 214 confirmed patients with nonsevere COVID-19 and 148 patients with severe COVID-19. The clinical characteristics (26 features) and laboratory test results (26 features) upon admission were acquired as two input modalities. Exploratory analyses demonstrated that these features differed substantially between the two clinical types. Machine learning random forest models based on all the features in each modality, as well as on the top 5 features in each modality combined, were developed and validated to differentiate COVID-19 clinical types. Results: Using clinical and laboratory results independently as input, the random forest models achieved >90% and >95% predictive accuracy, respectively. The importance scores of the input features were further evaluated, and the top 5 features from each modality were identified (age, hypertension, cardiovascular disease, gender, and diabetes for the clinical features modality, and dimerized plasmin fragment D, high sensitivity troponin I, absolute neutrophil count, interleukin 6, and lactate dehydrogenase for the laboratory testing modality, in descending order). Using these top 10 multimodal features as the only input instead of all 52 features combined, the random forest model was able to achieve 97% predictive accuracy. Conclusions: Our findings shed light on how the human body reacts to SARS-CoV-2 infection as a unit and provide insights on effectively evaluating the disease severity of patients with COVID-19 based on more common medical features when gold standard features are not available. We suggest that clinical information can be used as an initial screening tool for self-evaluation and triage, while laboratory test results should be applied when accuracy is the priority.


Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, causing economic and environmental loss. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, these studies focused on specific perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict the wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters, such as powerlines, terrain, and vegetation, from different perspectives, which improved the spatial and temporal accuracy of predicting wildfire risk, including fire ignition. The combined model uses the spatial and temporal parameters as a single combined dataset to train and predict the fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiment shows that the combined model produced more accurate results than the ensemble of random forest models trained on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved a cutting-edge accuracy of 92% in predicting wildfire risk, including ignition, by utilizing regional spatial and temporal data along with standard data parameters in Northern California.
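
ROC-based validation of the kind mentioned above is often summarized by the area under the ROC curve (AUC). The sketch below is an illustration and not the authors' pipeline: it computes AUC as the fraction of positive/negative pairs that a model ranks correctly, and the function name `roc_auc` and the label convention (1 = fire, 0 = no fire) are assumptions of this example.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one (ties count as 0.5).
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small validation set; a sort-based version would be preferable for large ones.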


2021 ◽  
Author(s):  
Merlin James Rukshan Dennis

A Distributed Denial of Service (DDoS) attack is a serious threat on today’s Internet. As the traffic across the Internet increases day by day, it is a challenge to distinguish between legitimate and malicious traffic. This thesis proposes two different approaches to build an efficient DDoS attack detection system in the Software Defined Networking (SDN) environment. SDN is the latest networking approach, which implements a centralized, programmable controller. The central control and the programming capability of the controller are used in this thesis to implement the detection and mitigation mechanisms. Two approaches, a statistical approach and a machine-learning approach, are proposed for DDoS detection. The statistical approach implements entropy computation and flow statistics analysis. It uses the mean and standard deviation of destination entropy, new flow arrival rate, packets per flow, and flow duration to compute various thresholds. These thresholds are then used to distinguish normal and attack traffic. The machine-learning approach uses a Random Forest classifier to detect the DDoS attack. We fine-tune the Random Forest algorithm to make it more accurate in DDoS detection. In particular, we introduce weighted voting instead of the standard majority voting to improve the accuracy. Our results show that the proposed machine-learning approach outperforms the statistical approach. Furthermore, it also outperforms other machine-learning approaches found in the literature.
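
The two ideas in this abstract, entropy thresholds and weighted voting, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: the multiplier `k`, the choice of a lower bound on destination entropy, and the source of the per-tree weights (for example, each tree's validation accuracy) are all hypothetical.

```python
import math
from collections import Counter
from statistics import mean, stdev


def destination_entropy(dest_ips):
    """Shannon entropy (bits) of the destination addresses in one window."""
    counts = Counter(dest_ips)
    total = len(dest_ips)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lower_threshold(normal_values, k=3.0):
    """Threshold = mean - k * std of a statistic measured on normal traffic."""
    return mean(normal_values) - k * stdev(normal_values)


def is_attack_window(dest_ips, threshold):
    """Flag a window whose destination entropy falls below the threshold.

    Traffic converging on a few destinations lowers the entropy.
    """
    return destination_entropy(dest_ips) < threshold


def weighted_vote(predictions, weights):
    """Combine per-tree labels using per-tree weights, not a plain majority."""
    scores = Counter()
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return scores.most_common(1)[0][0]
```

With weights of 0.9, 0.3, and 0.3, a single high-weight tree voting "attack" overrides two low-weight trees voting "normal", which is the behavior that distinguishes weighted voting from majority voting.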


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-29
Author(s):  
Alessandro Canossa ◽  
Dmitry Salimov ◽  
Ahmad Azadvar ◽  
Casper Harteveld ◽  
Georgios Yannakakis

Is it possible to detect toxicity in games just by observing in-game behavior? If so, what are the behavioral factors that will help machine learning to discover the unknown relationship between gameplay and toxic behavior? In this initial study, we examine whether it is possible to predict toxicity in the MOBA game For Honor by observing in-game behavior for players that have been labeled as toxic (i.e. players that have been sanctioned by Ubisoft community managers). We test our hypothesis of detecting toxicity through gameplay with a dataset of almost 1,800 sanctioned players, comparing these sanctioned players with unsanctioned players. Sanctioned players are defined by their toxic action type (offensive behavior vs. unfair advantage) and degree of severity (warned vs. banned). Our findings, based on supervised learning with random forests, suggest that it is not only possible to behaviorally distinguish sanctioned from unsanctioned players based on selected features of gameplay; it is also possible to predict both the sanction severity (warned vs. banned) and the sanction type (offensive behavior vs. unfair advantage). In particular, all random forest models predict toxicity, its severity, and type, with an accuracy of at least 82%, on average, on unseen players. This research shows that observing in-game behavior can support the work of community managers in moderating and possibly containing the burden of toxic behavior.


2021 ◽  
Author(s):  
Enzo Losi ◽  
Mauro Venturini ◽  
Lucrezia Manservigi ◽  
Giuseppe Fabio Ceschini ◽  
Giovanni Bechini ◽  
...  

Abstract. A gas turbine trip is an unplanned shutdown, of which the most relevant consequences are business interruption and a reduction of equipment remaining useful life. Thus, understanding the underlying causes of gas turbine trip would allow predicting its occurrence in order to maximize gas turbine profitability and improve its availability. In the ever-competitive Oil & Gas sector, data mining and machine learning are increasingly being employed to support a deeper insight into, and improved operation of, gas turbines. Among the various machine learning tools, Random Forests are an ensemble learning method consisting of an aggregation of decision tree classifiers. This paper presents a novel methodology that exploits information embedded in the data and develops Random Forest models to predict gas turbine trip based on information gathered during a timeframe of historical data acquired from multiple sensors. The novel approach exploits time series segmentation to increase the amount of training data, thus reducing overfitting. First, data are transformed according to a feature engineering methodology developed in a separate work by the same authors. Then, Random Forest models are trained and tested on unseen observations to demonstrate the benefits of the novel approach. The superiority of the novel approach is demonstrated by considering two real-world case studies, involving field data taken during three years of operation of two fleets of Siemens gas turbines located in different regions. The novel methodology achieves values of Precision, Recall and Accuracy in the range 75–85%, thus demonstrating the industrial feasibility of the predictive methodology.
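
The abstract does not say how the time series are segmented, so the following is only a sketch of one common option: overlapping fixed-length windows, which turn one long sensor record into several training examples. The `window` and `step` parameters and both function names are assumptions of this example; the second function computes the three reported metrics for binary labels (1 = trip, 0 = no trip).

```python
def segment_series(series, window, step):
    """Split one time series into fixed-length, possibly overlapping segments."""
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, step)]


def precision_recall_accuracy(y_true, y_pred):
    """Precision, recall, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy
```

A smaller `step` yields more, but more strongly correlated, segments, so segments cut from the same record should be kept on the same side of any train/test split.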


DYNA ◽  
2020 ◽  
Vol 87 (212) ◽  
pp. 63-72
Author(s):  
Jorge Iván Pérez Rave ◽  
Favián González Echavarría ◽  
Juan Carlos Correa Morales

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. This article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest and bagging; ii) it studies the effect of a group of text attributes on the predictive capability of the models; and iii) it identifies the most stable and important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 observations of real estate. The ensemble methods (random forest and bagging) show predictive superiority over the others. The attributes derived from the text had a significant relationship with the property price (on a log scale). However, their contribution to the predictive capacity was almost nil, since four different attributes achieved highly accurate predictions and remained stable when the sample changed.

