Medium and Long-Term Precipitation Forecasting Method Based on Data Augmentation and Machine Learning Algorithms

Author(s):  
Tiantian Tang ◽  
Donglai Jiao ◽  
Tao Chen ◽  
Guan Gui


2020 ◽ 
Vol 12 (15) ◽  
pp. 5972
Author(s):  
Nicholas Fiorentini ◽  
Massimo Losa

Screening procedures for road blackspot detection are essential tools that let road authorities quickly gather insights on the safety level of each road site they manage. This paper proposes a road blackspot screening procedure for two-lane rural roads, relying on five different machine learning algorithms (MLAs) and real long-term traffic data. The network analyzed is the one managed by the Tuscany Region Road Administration, composed mainly of two-lane rural roads. A total of 995 road sites, where at least one accident occurred in 2012–2016, were labeled as “Accident Case”. Accordingly, an equal number of sites where no accident occurred in the same period were randomly selected and labeled as “Non-Accident Case”. Five different MLAs, namely Logistic Regression, Classification and Regression Tree, Random Forest, K-Nearest Neighbor, and Naïve Bayes, were trained and validated. The output response of the MLAs, i.e., crash occurrence susceptibility, is a binary categorical variable. Therefore, the algorithms aim to classify a road site over a five-year horizon as likely safe (“Non-Accident Case”) or potentially susceptible to an accident occurrence (“Accident Case”). Finally, the algorithms were compared by a set of performance metrics, including precision, recall, F1-score, overall accuracy, the confusion matrix, and the Area Under the Receiver Operating Characteristic curve. Outcomes show that Random Forest outperforms the other MLAs with an overall accuracy of 73.53%. Furthermore, none of the MLAs show overfitting issues. Road authorities could use MLAs to draw up a priority list of on-site inspections and maintenance interventions.
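The balanced binary-classification setup described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data (the Tuscany road dataset is not reproduced here); the feature count and model hyperparameters are illustrative assumptions, not the study's actual configuration.

```python
# Hedged sketch: train and compare the five MLAs named in the abstract on a
# synthetic balanced dataset standing in for the 995 + 995 labeled road sites.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, f1_score

# Balanced classes, as in the study (1 = "Accident Case", 0 = "Non-Accident Case")
X, y = make_classification(n_samples=1990, n_features=10,
                           weights=[0.5, 0.5], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "CART": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "K-Nearest Neighbor": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    scores[name] = (accuracy_score(y_te, pred), f1_score(y_te, pred))

for name, (acc, f1) in scores.items():
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

The same loop extends naturally to the other metrics the paper reports (confusion matrix, ROC-AUC) via `sklearn.metrics`.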


Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 104 ◽  
Author(s):  
Ahmed ◽  
Yigit ◽  
Isik ◽  
Alpkocak

Leukemia is a fatal cancer with two main types, acute and chronic, each of which has two subtypes, lymphoid and myeloid; in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNNs), which require a large training data set. We therefore also investigated the effect of data augmentation, i.e., of synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and the ASH Image Bank. We then applied seven different image transformation techniques as data augmentation and designed a CNN architecture capable of recognizing all subtypes of leukemia. In addition, we explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a series of experiments using 5-fold cross-validation. The experimental results showed that our CNN model achieves 88.25% accuracy in leukemia-versus-healthy classification and 81.74% accuracy in multiclass classification of all subtypes. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
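The idea of applying several image transformations to multiply the training set can be sketched in a few lines of NumPy. The abstract does not specify which seven transforms were used, so the choices below (flips, rotations, brightness scaling, shift) are common illustrative examples, not the paper's actual pipeline.

```python
# Hedged sketch: generate seven synthetic variants of a grayscale image,
# mirroring the "seven image transformation techniques" idea in the abstract.
import numpy as np

def augment(img):
    """Return 7 transformed copies of a square (H, W) grayscale image."""
    return [
        np.fliplr(img),               # horizontal flip
        np.flipud(img),               # vertical flip
        np.rot90(img, 1),             # 90-degree rotation
        np.rot90(img, 2),             # 180-degree rotation
        np.rot90(img, 3),             # 270-degree rotation
        np.clip(img * 1.2, 0, 255),   # brightness scaling
        np.roll(img, 5, axis=1),      # horizontal shift
    ]

cell = np.random.default_rng(0).integers(0, 256, size=(64, 64))
augmented = augment(cell)
print(len(augmented))  # 7 variants per original image
```

Each original training image thus yields eight samples (the original plus seven variants), which is how augmentation enlarges a small medical-imaging dataset for CNN training.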


Cancers ◽  
2019 ◽  
Vol 11 (5) ◽  
pp. 606 ◽  
Author(s):  
Pablo Sala Elarre ◽  
Esther Oyaga-Iriarte ◽  
Kenneth H. Yu ◽  
Vicky Baudin ◽  
Leire Arbea Moreno ◽  
...  

Background: Although surgical resection is the only potentially curative treatment for pancreatic cancer (PC), long-term outcomes of this treatment remain poor. The aim of this study is to describe the feasibility of a neoadjuvant treatment with induction polychemotherapy (IPCT) followed by chemoradiation (CRT) in resectable PC, and to develop a machine-learning algorithm to predict the risk of relapse. Methods: Forty patients with resectable PC treated at our institution with IPCT (based on mFOLFOXIRI, GEMOX or GEMOXEL) followed by CRT (50 Gy with concurrent capecitabine) were retrospectively analyzed. Additionally, clinical, pathological and analytical data were collected in order to build a 2-year relapse-risk predictive population model using machine-learning techniques. Results: An R0 resection was achieved in 90% of the patients. After a median follow-up of 33.5 months, median progression-free survival (PFS) was 18 months and median overall survival (OS) was 39 months. The 3- and 5-year actuarial PFS were 43.8% and 32.3%, respectively; the 3- and 5-year actuarial OS were 51.5% and 34.8%, respectively. Grade 3–4 IPCT toxicity was reported in 40% of patients, and grade 3 CRT toxicity in 29.7%. Considering the use of granulocyte colony-stimulating factors, the number of resected lymph nodes, the presence of perineural invasion and the surgical margin status, a logistic regression algorithm predicted the individual 2-year relapse risk with an accuracy of 0.71 (95% confidence interval [CI] 0.56–0.84, p = 0.005). The model-predicted outcome matched 64% of the observed outcomes in an external dataset. Conclusion: An intensified multimodal neoadjuvant approach (IPCT + CRT) in resectable PC is feasible, with an encouraging long-term outcome. Machine-learning algorithms may be a useful tool for predicting the individual risk of relapse. The small sample size and therapy heterogeneity remain potential limitations.
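The relapse-risk model described above is a logistic regression on four clinical predictors. The sketch below uses entirely synthetic data (the patient cohort is obviously not public); feature encodings and value ranges are assumptions made for illustration only.

```python
# Hedged sketch: logistic regression on the four predictors named in the
# abstract, with synthetic stand-in data for a 40-patient cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 40  # cohort size in the study
X = np.column_stack([
    rng.integers(0, 2, n),    # use of G-CSF (binary, assumed encoding)
    rng.integers(5, 40, n),   # number of resected lymph nodes
    rng.integers(0, 2, n),    # perineural invasion (binary)
    rng.integers(0, 2, n),    # surgical margin status (e.g., R0 vs. R1)
])
y = rng.integers(0, 2, n)     # 2-year relapse label (synthetic)

model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba(X)[:, 1]  # individual 2-year relapse probability
print(np.round(risk[:5], 3))
```

In practice the predicted probability would be thresholded (or ranked) to flag high-risk patients, and accuracy would be estimated on held-out or external data, as the study does.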


Materials ◽  
2020 ◽  
Vol 13 (18) ◽  
pp. 4133
Author(s):  
Seungbum Koo ◽  
Jongkwon Choi ◽  
Changhyuk Kim

Soundproofing materials are widely used within the structural components of multi-dwelling residential buildings to alleviate neighborhood noise problems. One of the critical mechanical properties these materials must exhibit, to ensure appropriate structural and soundproofing performance, is limited long-term compressive deformation under service loading conditions. The test method in the current specifications evaluates resilient materials only over a limited period (90 days) and then extrapolates the results with a polynomial function to predict the long-term compressive deformation. However, the extrapolation is applied uniformly to all materials without considering the level of load; thus, the calculated deformation may not accurately represent the actual compressive deformation of the materials. In this regard, long-term compressive deformation tests were performed on selected soundproofing resilient materials (polystyrene, polyethylene, and ethylene-vinyl acetate). Four load levels were chosen, compressive loads were applied continuously for 350 to 500 days, and the deformations of the test specimens were monitored periodically. Three machine learning algorithms were then used to predict the long-term compressive deformations. The predictions based on machine learning and on the ISO 20392 method are compared with the experimental results, and the accuracy of the machine learning algorithms and the ISO 20392 method is discussed.
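The extrapolation step the paper benchmarks against can be illustrated with a polynomial fit. The creep curve, coefficients and polynomial degree below are synthetic stand-ins; ISO 20392's actual fitting procedure is not reproduced from the standard here.

```python
# Hedged sketch: fit a polynomial in log-time to 90 days of synthetic creep
# data and extrapolate to 500 days, mirroring the extrapolation-based
# prediction the paper compares with ML models.
import numpy as np

days = np.arange(1, 91)                    # 90-day test window
true = 0.8 * np.log(days) + 0.002 * days   # synthetic creep curve (mm)
noisy = true + np.random.default_rng(0).normal(0.0, 0.02, days.size)

# Compressive creep is near-linear on a log-time axis, so fit in log(t)
coef = np.polyfit(np.log(days), noisy, deg=2)

def predict(t_days):
    """Predicted compressive deformation (mm) at time t_days."""
    return np.polyval(coef, np.log(t_days))

print(f"deformation at day 90:   {predict(90):.3f} mm")
print(f"extrapolated to day 500: {predict(500):.3f} mm")
```

The paper's point is precisely that such a single fitted curve, applied regardless of load level, can drift away from the measured 350–500-day deformations, which is where load-aware ML regressors come in.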


Author(s):  
S.I. Gabitova ◽  
L.A. Davletbakova ◽  
V.Yu. Klimov ◽  
D.V. Shuvaev ◽  
I.Ya. Edelman ◽  
...  

The article describes a new decline curve (DC) forecasting method for project wells. The method is based on integrating manual grouping of DCs with the application of machine learning (ML) algorithms; ML makes it possible to find hidden connections between the features and the output. The article includes a decline curve analysis of two well completion types, horizontal and slanted wells, which illustrates that horizontal wells are more effective than slanted ones.
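A basic decline-curve fit of the kind such forecasting methods build on can be sketched as follows. The exponential (Arps-type) model and all rates below are illustrative assumptions; the article's actual grouping and ML steps are not reproduced.

```python
# Hedged sketch: fit an exponential decline curve q(t) = q_i * exp(-D * t)
# to synthetic monthly production data, then forecast a future rate.
import numpy as np

t = np.arange(0, 36)                                   # months on production
rng = np.random.default_rng(1)
q = 500.0 * np.exp(-0.05 * t) * (1 + rng.normal(0.0, 0.02, t.size))

# Linear regression on log-rate recovers the initial rate q_i and decline D
slope, log_qi = np.polyfit(t, np.log(q), 1)
qi, D = np.exp(log_qi), -slope

def forecast(months):
    """Forecast production rate after the given number of months."""
    return qi * np.exp(-D * months)

print(f"q_i ~ {qi:.0f}, D ~ {D:.3f}, 5-year rate ~ {forecast(60):.1f}")
```

In the article's scheme, curves like this would first be grouped manually by shape, with ML then used to relate well features (e.g., completion type) to the fitted decline parameters.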


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4519
Author(s):  
Livia Petrescu ◽  
Cătălin Petrescu ◽  
Ana Oprea ◽  
Oana Mitruț ◽  
Gabriela Moise ◽  
...  

This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We performed a mapping between the discrete and dimensional emotional information considering the participants’ ratings, and extracted a substantial set of 40 types of features from the physiological data. These features served as input to various machine learning algorithms—Decision Trees, k-Nearest Neighbors, Support Vector Machine and artificial neural networks—accompanied by dimensionality reduction, feature selection and the tuning of the most relevant hyperparameters, boosting classification accuracy. Our methodology tackled several practical issues, such as resolving the imbalance of the dataset through data augmentation, reducing overfitting, computing various metrics in order to obtain the most reliable classification scores, and applying the Local Interpretable Model-Agnostic Explanations method to interpret and explain predictions in a human-understandable manner. The results show that fear can be predicted very well (with accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and a Support Vector Machine) by extracting the most relevant features from the physiological data and by searching for the parameters that maximize the machine learning algorithms’ classification scores.
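The best-performing combination mentioned above (dimensionality reduction feeding a Support Vector Machine) can be sketched as a scikit-learn pipeline. The data, component count and SVM settings below are illustrative assumptions standing in for the DEAP features and the paper's tuned hyperparameters.

```python
# Hedged sketch: standardize 40 features, reduce dimensionality with PCA,
# classify with an RBF-kernel SVM, and score with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the 40 physiological feature types (fear vs. no fear)
X, y = make_classification(n_samples=300, n_features=40,
                           n_informative=10, random_state=0)

clf = make_pipeline(
    StandardScaler(),          # scale before PCA and SVM
    PCA(n_components=10),      # dimensionality reduction (count assumed)
    SVC(kernel="rbf", C=1.0),  # hyperparameters would be tuned in practice
)
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Hyperparameter tuning of the kind the paper describes would wrap this pipeline in, e.g., `GridSearchCV` over `C`, the kernel width, and the number of PCA components.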


Over the past decades, machine learning (ML) has advanced from the pastime of a few computer enthusiasts exploring whether computers could learn to play games, and a branch of statistics that only occasionally considered computational approaches, into an independent research discipline. It has not only provided the theoretical foundations for the statistical-computational principles of learning systems, but has also produced practical algorithms that are widely applied to text interpretation, pattern recognition, and many other commercial applications, and has sparked distinct research interest in data mining for discovering hidden regularities or anomalies in steadily growing volumes of data. This thesis focuses on explaining the concept and evolution of machine learning and some of the mainstream machine learning algorithms, and compares three of the most common algorithms on the basis of some essential criteria. The Sentiment140 dataset was used, and the performance of each algorithm in terms of training time, prediction time and prediction accuracy was documented and analyzed.
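The comparison described in the thesis abstract (training time, prediction time and accuracy for three common algorithms) can be sketched as below. The abstract does not name the three algorithms, so the choices here, and the synthetic data standing in for Sentiment140 features, are illustrative assumptions.

```python
# Hedged sketch: time training and prediction for three common classifiers
# and record accuracy, mirroring the thesis's three-way comparison.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

results = {}
for name, model in [("Naive Bayes", GaussianNB()),
                    ("Logistic Regression", LogisticRegression(max_iter=1000)),
                    ("Decision Tree", DecisionTreeClassifier(random_state=0))]:
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    train_t = time.perf_counter() - t0

    t0 = time.perf_counter()
    pred = model.predict(X_te)
    pred_t = time.perf_counter() - t0

    results[name] = (train_t, pred_t, accuracy_score(y_te, pred))

for name, (tt, pt, acc) in results.items():
    print(f"{name}: train={tt:.4f}s predict={pt:.4f}s accuracy={acc:.3f}")
```

For real Sentiment140 tweets, the feature matrix would come from a text vectorizer (e.g., `TfidfVectorizer`) rather than `make_classification`.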

