Predicting the Mechanical Properties of RCA-Based Concrete Using Supervised Machine Learning Algorithms

Meijun Shang; Hejun Li; Ayaz Ahmad; Waqas Ahmad; Krzysztof Adam Ostrowski; Fahid Aslam; Panuwat Joyklad; Tomasz M. Majka

doi:10.3390/ma15020647

Predicting the Mechanical Properties of RCA-Based Concrete Using Supervised Machine Learning Algorithms

Materials ◽

10.3390/ma15020647 ◽

2022 ◽

Vol 15 (2) ◽

pp. 647

Author(s):

Meijun Shang ◽

Hejun Li ◽

Ayaz Ahmad ◽

Waqas Ahmad ◽

Krzysztof Adam Ostrowski ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Mean Square Error ◽

Coarse Aggregate ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Environmental Damage ◽

Fine Aggregate ◽

Mean Square ◽

The Impact

Environment-friendly concrete is gaining popularity because it consumes less energy and causes less damage to the environment. Rapid population growth and rising demand for construction throughout the world are significantly depleting natural resources. Meanwhile, construction waste continues to grow at a high rate as older buildings are demolished. As a result, the use of recycled materials may contribute to improving the quality of life and preventing environmental damage. In particular, the application of recycled coarse aggregate (RCA) in concrete is essential for minimizing environmental issues. The compressive strength (CS) and splitting tensile strength (STS) of concrete containing RCA are predicted in this article using decision tree (DT) and AdaBoost machine learning (ML) techniques. A total of 344 data points with nine input variables (water, cement, fine aggregate, natural coarse aggregate, RCA, superplasticizers, water absorption of RCA, maximum size of RCA, and density of RCA) were used to run the models. The models were validated using k-fold cross-validation and the coefficient of determination (R2), mean square error (MSE), mean absolute error (MAE), and root mean square error (RMSE). The models' performance was also assessed using statistical checks. Additionally, sensitivity analysis was used to determine the impact of each variable on the forecasting of mechanical properties.
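The error measures named in this abstract (MAE, MSE, RMSE, and R2) follow standard definitions. The sketch below computes them in plain Python; the function name `regression_metrics` and the sample strength values are illustrative assumptions and are not taken from the paper.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean square error
    rmse = math.sqrt(mse)                        # root mean square error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    # R^2 is undefined when all actual values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical compressive strengths in MPa (illustrative values only).
measured = [30.0, 42.0, 25.0, 51.0]
estimated = [32.0, 40.0, 26.0, 49.0]
metrics = regression_metrics(measured, estimated)
```

MSE and RMSE penalize large errors more heavily than MAE, which is one reason studies of this kind tend to report all three alongside R2.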

Application of Advanced Machine Learning Approaches to Predict the Compressive Strength of Concrete Containing Supplementary Cementitious Materials

Materials ◽

10.3390/ma14195762 ◽

2021 ◽

Vol 14 (19) ◽

pp. 5762

Author(s):

Waqas Ahmad ◽

Ayaz Ahmad ◽

Krzysztof Adam Ostrowski ◽

Fahid Aslam ◽

Panuwat Joyklad ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Compressive Strength ◽

Mean Square Error ◽

Gene Expression Programming ◽

Cementitious Materials ◽

Supplementary Cementitious Materials ◽

Supervised Machine Learning ◽

Mean Square ◽

Compressive Strength Of Concrete

Casting and testing specimens to determine the mechanical properties of concrete is a time-consuming activity. This study employed supervised machine learning techniques (bagging, AdaBoost, gene expression programming, and decision tree) to estimate the compressive strength of concrete containing supplementary cementitious materials (fly ash and blast furnace slag). The performance of the models was compared and assessed using the coefficient of determination (R2), mean absolute error, mean square error, and root mean square error. The performance of the models was further validated using the k-fold cross-validation approach. Compared to the other employed approaches, the bagging model was more effective in predicting results, with an R2 value of 0.92. A sensitivity analysis was also conducted to determine the level of contribution of each parameter used to run the models. The use of machine learning (ML) techniques to predict the mechanical properties of concrete will be beneficial to the field of civil engineering because it will save time, effort, and resources. The proposed techniques are efficient at forecasting the strength properties of concrete containing supplementary cementitious materials (SCM) and pave the way towards the intelligent design of concrete elements and structures.
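The k-fold cross-validation mentioned here can be sketched as follows. This is a minimal illustration in plain Python, not the authors' procedure: the helper names `k_fold_indices` and `cross_validate` are hypothetical, the sample values are made up, and a mean-value baseline stands in for the fitted models (bagging, AdaBoost, and so on) that the study evaluated.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle 0..n_samples-1 and split into k nearly equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds

def cross_validate(targets, k, seed=0):
    """Score a mean-value baseline on each held-out fold; return per-fold MSE."""
    scores = []
    for held_out in k_fold_indices(len(targets), k, seed):
        held = set(held_out)
        train = [targets[i] for i in range(len(targets)) if i not in held]
        prediction = sum(train) / len(train)  # stand-in for a fitted model
        mse = sum((targets[i] - prediction) ** 2 for i in held_out) / len(held_out)
        scores.append(mse)
    return scores

# Hypothetical strength values in MPa (illustrative only).
strengths = [28.0, 35.5, 41.2, 30.1, 47.8, 39.0, 33.3, 44.6, 36.7, 29.9]
fold_mse = cross_validate(strengths, k=5)
```

Each sample is held out exactly once, so the spread of the per-fold scores gives a rough indication of how sensitive a model is to the particular train/test partition.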

Comparative Study of Supervised Machine Learning Algorithms for Predicting the Compressive Strength of Concrete at High Temperature

Materials ◽

10.3390/ma14154222 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4222

Author(s):

Ayaz Ahmad ◽

Krzysztof Adam Ostrowski ◽

Mariusz Maślak ◽

Furqan Farooq ◽

Imran Mehmood ◽

...

Keyword(s):

Machine Learning ◽

Compressive Strength ◽

High Temperature ◽

Mean Square Error ◽

Supervised Machine Learning ◽

Gradient Boosting ◽

Fine Aggregate ◽

Mean Square ◽

Coefficient Correlation ◽

Compressive Strength Of Concrete

High temperature severely affects the nature of the ingredients used to produce concrete, which in turn reduces the strength properties of the concrete. Achieving the desired compressive strength of concrete is a difficult and time-consuming task. However, the application of supervised machine learning (ML) approaches makes it possible to predict the targeted result in advance with high accuracy. This study presents the use of a decision tree (DT), an artificial neural network (ANN), bagging, and gradient boosting (GB) to forecast the compressive strength of concrete at high temperatures on the basis of 207 data points. Python code run through Anaconda Navigator was used to run the selected models. The software requires information regarding both the input variables and the output parameter. A total of nine input parameters (water, cement, coarse aggregate, fine aggregate, fly ash, superplasticizers, silica fume, nano silica, and temperature) were used as inputs, while one variable (compressive strength) was selected as the output. The performance of the employed ML algorithms was evaluated with regard to statistical indicators, including the coefficient of determination (R2), mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE). The individual DT and ANN models gave R2 values of 0.83 and 0.82, respectively, while the ensemble algorithms, bagging and gradient boosting, gave R2 values of 0.90 and 0.88, respectively. This indicates a strong correlation between the actual and predicted outcomes. The k-fold cross-validation, R2, and lower errors (MAE, MSE, and RMSE) confirmed the better performance of the ensemble algorithms. Sensitivity analyses were also conducted to check the contribution of each input variable. The results show that the use of ensemble machine learning algorithms enhances the performance level of the model.

Forecasting the Market with Machine Learning Algorithms: An Application of NMC-BERT-LSTM-DQN-X Algorithm in Quantitative Trading

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3488378 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-22

Author(s):

Chang Liu ◽

Jie Yan ◽

Feiyue Guo ◽

Min Guo

Keyword(s):

Machine Learning ◽

Stock Market ◽

Mean Square Error ◽

Short Term Memory ◽

The State ◽

Machine Learning Algorithms ◽

Future Market ◽

Mean Square ◽

Market Data ◽

Market Trends

Although machine learning (ML) algorithms have been widely used in forecasting the trend of stock market indices, they have failed to consider the following crucial aspects of market forecasting: (1) investors’ emotions and attitudes toward future market trends have material impacts on market trend forecasting; (2) the length of past market data should be dynamically adjusted according to the market status; and (3) the transition of market states should be considered when forecasting market trends. In this study, we proposed an innovative ML method to forecast China's stock market trends by addressing the three issues above. Specifically, sentimental factors (see Appendix [1] for full trans) were first collected to measure investors’ emotions and attitudes. Then, a non-stationary Markov chain (NMC) model was used to capture dynamic transitions of market states. We chose the state-of-the-art (SOTA) method, namely, Bidirectional Encoder Representations from Transformers (BERT), to predict the state of the market at time t, and a long short-term memory (LSTM) model was used to estimate the varying length of past market data in market trend prediction, where the input of LSTM (the state of the market at time t) was the output of BERT and the probabilities for opening and closing the gates in the LSTM model were based on outputs of the NMC model. Finally, the optimum parameters of the proposed algorithm were calculated using a reinforcement learning-based deep Q-Network. Compared to existing forecasting methods, the proposed algorithm achieves better results, with a forecasting accuracy of 61.77%, an annualized return of 29.25%, and maximum losses of −8.29%. Furthermore, the proposed model achieved the lowest forecasting error: mean square error (0.095), root mean square error (0.0739), mean absolute error (0.104), and mean absolute percent error (15.1%). As a result, the proposed market forecasting model can help investors obtain more accurate market forecast information.

Insider Threat Detection Using Supervised Machine Learning Algorithms on an Extremely Imbalanced Dataset

International Journal of Cyber Warfare and Terrorism ◽

10.4018/ijcwt.2020040101 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1-26

Author(s):

Naghmeh Moradpoor Sheykhkanloo ◽

Adam Hall

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Third Party ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Insider Threat ◽

Threat Detection ◽

Imbalanced Dataset ◽

The Impact

An insider threat can take many forms and fall under different categories. These include the malicious insider, the careless/unaware/uneducated/naïve employee, and the third-party contractor. Machine learning techniques have been studied in published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. Therefore, this article addresses insider threat detection on an extremely imbalanced dataset, which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did reduce the time taken to build the model and the time taken to test the model. Additionally, the authors found that running the chosen classifiers with parameters other than the default ones has an impact in both the balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.

Model Evaluation for Forecasting Traffic Accident Severity in Rainy Seasons Using Machine Learning Algorithms: Seoul City Study

Applied Sciences ◽

10.3390/app10010129 ◽

2019 ◽

Vol 10 (1) ◽

pp. 129 ◽

Cited By ~ 3

Author(s):

Jonghak Lee ◽

Taekwan Yoon ◽

Sangil Kwon ◽

Jongtae Lee

Keyword(s):

Machine Learning ◽

Random Forest ◽

Linear Regression ◽

Mean Square Error ◽

Model Evaluation ◽

Traffic Accident ◽

Negative Binomial ◽

Machine Learning Algorithms ◽

Mean Square ◽

Road Geometry

There have been numerous studies on traffic accidents and their severity, particularly in relation to weather conditions and road geometry. In these studies, traditional statistical methods have been employed, such as linear regression, logistic regression, and negative binomial regression modeling, which are the most common linear and non-linear regression analysis methods. In this research, machine learning architecture was applied to this problem using the random forest, artificial neural network, and decision tree techniques to ascertain the strengths and weaknesses of these methods. Three data sets were used: road geometry data, precipitation data, and traffic accident data over nine years corresponding to the Naebu Expressway, which is located in Seoul, Korea. For the model evaluation, three measures were employed: the out-of-bag estimate of error rate (OOB), mean square error (MSE), and root mean square error (RMSE). The low mean OOB, MSE, and RMSE observed in the results obtained using the proposed random forest model demonstrate its accuracy.

Teleconsultations between Patients and Healthcare Professionals in Primary Care in Catalonia: The Evaluation of Text Classification Algorithms Using Supervised Machine Learning

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17031093 ◽

2020 ◽

Vol 17 (3) ◽

pp. 1093 ◽

Cited By ~ 4

Author(s):

Francesc López Seguí ◽

Ricardo Ander Egg Aguilar ◽

Gabriel de Maeztu ◽

Anna García-Altés ◽

Francesc García Cuyàs ◽

...

Keyword(s):

Machine Learning ◽

Primary Care ◽

Text Classification ◽

Learning Strategy ◽

Care Service ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Face To Face ◽

The Impact

Background: The primary care service in Catalonia has operated an asynchronous teleconsulting service between GPs and patients since 2015 (eConsulta), which has generated some 500,000 messages. New developments in big data analysis tools, particularly those involving natural language, can be used to accurately and systematically evaluate the impact of the service. Objective: The study was intended to assess the predictive potential of eConsulta messages through different combinations of vector representation of text and machine learning algorithms and to evaluate their performance. Methodology: Twenty machine learning algorithms (based on five types of algorithms and four text representation techniques) were trained using a sample of 3559 messages (169,102 words) corresponding to 2268 teleconsultations (1.57 messages per teleconsultation) in order to predict the three variables of interest (avoiding the need for a face-to-face visit, increased demand and type of use of the teleconsultation). The performance of the various combinations was measured in terms of precision, sensitivity, F-value and the ROC curve. Results: The best-trained algorithms are generally effective, proving themselves to be more robust when approximating the two binary variables “avoiding the need of a face-to-face visit” and “increased demand” (precision = 0.98 and 0.97, respectively) rather than the variable “type of query” (precision = 0.48). Conclusion: To the best of our knowledge, this study is the first to investigate a machine learning strategy for text classification using primary care teleconsultation datasets. The study illustrates the possible capacities of text analysis using artificial intelligence. The development of a robust text classification tool could be feasible by validating it with more data, making it potentially more useful for decision support for health professionals.
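The precision, sensitivity, and F-value reported for the binary variables can be computed from confusion counts. The sketch below assumes the common F1 definition (the harmonic mean of precision and sensitivity); the function name `binary_classification_scores` and the labels are illustrative and not taken from the study.

```python
def binary_classification_scores(y_true, y_pred, positive=1):
    """Return precision, sensitivity (recall) and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity, "f1": f1}

# Hypothetical labels: 1 = "face-to-face visit avoided", 0 = otherwise.
truth = [1, 1, 1, 0, 0, 1, 0, 1]
guess = [1, 1, 0, 0, 1, 1, 0, 1]
scores = binary_classification_scores(truth, guess)
```

For a multi-class variable such as "type of query", the same function can be applied once per class by changing `positive`, which is one way a per-class precision could be obtained.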

The Impact of Psychopathology, Social Adversity and Stress-relevant DNAm on Prospective Risk for Post-traumatic Stress: A Machine Learning Approach

10.1101/2020.09.25.313635 ◽

2020 ◽

Author(s):

Agaz H. Wani ◽

Allison E. Aiello ◽

Grace S. Kim ◽

Fei Xue ◽

Chantel L. Martin ◽

...

Keyword(s):

Machine Learning ◽

Mean Square Error ◽

Traumatic Stress ◽

High Accuracy ◽

Mean Square ◽

Post Traumatic Stress ◽

Social Adversity ◽

Post Traumatic ◽

Prospective Risk ◽

The Impact

Background: A range of factors have been identified that contribute to greater incidence, severity, and prolonged course of post-traumatic stress disorder (PTSD), including: comorbid and/or prior psychopathology; social adversity such as low socioeconomic position, perceived discrimination, and isolation; and biological factors such as genomic variation at glucocorticoid receptor regulatory network (GRRN) genes. This complex etiology and clinical course make identification of people at higher risk of PTSD challenging. Here we leverage machine learning (ML) approaches to identify a core set of factors that may together predispose persons to PTSD. Methods: We used multiple ML approaches to assess the relationship among DNA methylation (DNAm) at GRRN genes, prior psychopathology, social adversity, and prospective risk for PTS severity (PTSS). Results: ML models predicted prospective risk of PTSS with high accuracy. The Gradient Boost approach was the top-performing model with mean absolute error of 0.135, mean square error of 0.047, root mean square error of 0.217, and R2 of 95.29%. Prior PTSS ranked highest in predicting the prospective risk of PTSS, accounting for >88% of the prediction. The top ranked GRRN CpG site was cg05616442, in AKT1, and the top ranked social adversity feature was loneliness. Conclusion: Multiple factors including prior PTSS, social adversity, and DNAm play a role in predicting prospective risk of PTSS. ML models identified factors accounting for increased PTSS risk with high accuracy, which may help to target risk factors that reduce the likelihood or course of PTSD, potentially pointing to approaches that can lead to early intervention.

Reinforced XGBoost machine learning model for sustainable intelligent agrarian applications

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200862 ◽

2020 ◽

Vol 39 (5) ◽

pp. 7605-7620 ◽

Cited By ~ 1

Author(s):

Dhivya Elavarasan ◽

Durai Raj Vincent

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Mean Square Error ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

The Other ◽

Gradient Boosting ◽

Model Assessment ◽

Mean Square ◽

Extreme Gradient Boosting

Developments in science and technical intelligence have generated an extensive amount of data from various fields of agriculture. This creates a need to examine the available data and integrate it with processes such as crop enhancement, yield prediction, and examination of plant infections. Machine learning has surged with powerful processing techniques that open new possibilities in multi-disciplinary agrarian advancements. In this paper, a novel hybrid regression algorithm, reinforced extreme gradient boosting, is proposed, which displays substantially improved performance over traditional machine learning algorithms such as artificial neural networks, deep Q-Network, gradient boosting, random forest, and decision tree. Extreme gradient boosting constructs new models, which are essentially decision trees learning from the mistakes of their predecessors, by optimizing the loss function via gradient descent. The proposed hybrid model performs reinforcement learning at every node during the node splitting process of the decision tree construction. This leads to effective utilization of the samples by selecting the appropriate split attribute for enhanced performance. The model’s performance is evaluated by means of Mean Square Error, Root Mean Square Error, Mean Absolute Error, and Coefficient of Determination. To assure a fair assessment of the results, the model assessment is performed on both the training and test datasets. The regression diagnostic plots from residuals and the results obtained show that the proposed hybrid approach performs better than the other machine learning algorithms, with reduced error measures and an improved accuracy of 94.15%. In addition, the probability density function for the proposed model shows that it preserves the actual distributional characteristics of the original crop yield data more closely than the other experimented machine learning models.

Integrating water quality and streamflow into prediction of chemical dosage in a drinking water treatment plant using machine learning algorithms

Water Science & Technology Water Supply ◽

10.2166/ws.2021.435 ◽

2021 ◽

Author(s):

Hui Wang ◽

Tirusew Asefa ◽

Jack Thornburgh

Keyword(s):

Machine Learning ◽

Water Quality ◽

Drinking Water ◽

Water Treatment ◽

Mean Square Error ◽

Learning Algorithms ◽

Drinking Water Treatment ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mean Square

Understanding the relationship between raw water quality and chemical dosage is especially important for drinking water treatment plants (DWTP) that have multiple water sources, where the ratio of different supply sources could change with seasons or in a matter of weeks in response to changing hydrologic conditions. In this study, the potential for deploying machine learning algorithms, including principal component regression (PCR), support vector regression (SVR), and long short-term memory (LSTM) neural network, is tested to build predictive models. These tools were used to estimate chemical dosage at a daily time scale. Influent water quality such as pH, color, turbidity, and alkalinity, as well as chemical dosage including sulfuric acid, ferric sulfate, and liquid oxygen, were used to build and test these models. An 80/20 percent data split was used for training and testing, and model performance was assessed using correlation coefficients, relative mean square error, relative root mean square error, and Nash-Sutcliffe efficiency. Results indicate that, compared to PCR, both SVR and LSTM were able to capture the nonlinear relationship between chemical dose and source water quality changes and displayed higher predictive skill. These types of models have application in real-time operational support without requiring computationally expensive physics-based models.
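The 80/20 split and the Nash-Sutcliffe efficiency (NSE) named in this abstract can be sketched as below. The NSE formula is the standard one (one minus the ratio of the squared-error sum to the variance sum of the observations). The chronological split is an assumption made here because the data are daily time series; the abstract does not say how the split was drawn. The function names and dosage values are illustrative only.

```python
def chronological_split(records, train_fraction=0.8):
    """Split an ordered sequence into leading train and trailing test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]

def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_var = sum((o - mean_obs) ** 2 for o in observed)
    if ss_var == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - ss_err / ss_var

# Hypothetical daily dosage observations and model output (illustrative only).
daily_dose = [10.0, 11.0, 12.0, 12.5, 13.0, 13.5, 14.0, 15.0, 15.5, 16.0]
train, test = chronological_split(daily_dose)
nse = nash_sutcliffe_efficiency([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 17.0])
```

An NSE of 1 indicates a perfect match, while a value at or below 0 means the model predicts no better than the mean of the observations.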

Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189120 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6579-6590

Author(s):

Sandy Çağlıyor ◽

Başar Öztayşi ◽

Selime Sezgin

Keyword(s):

Machine Learning ◽

Global Economy ◽

Learning Algorithms ◽

Forecast Model ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

High Stakes ◽

Box Office ◽

Industry Forecast ◽

The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting tree, and random forest) are employed, and the impact of each group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.
