Using Machine Learning To Improve the Accuracy of Genomic Prediction on Reproduction Traits in Pigs

Author(s):  
Xue Wang ◽  
Shaolei Shi ◽  
Guijiang Wang ◽  
Wenxue Luo ◽  
Xia Wei ◽  
...  

Abstract Background Recently, machine learning (ML) has become attractive for genomic prediction, but its superiority in genomic prediction and the choice of optimal ML methods still need investigation. Results In this study, 2566 Chinese Yorkshire pigs with reproduction trait records were used; they were genotyped with the GenoBaits Porcine SNP 50K and PorcineSNP50 panels. Four ML methods were implemented: support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2. The genomic prediction abilities of the ML methods were explored through 20 replicates of five-fold cross-validation. Compared with genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE, the ML methods performed significantly better. The prediction accuracy of ML methods was improved by 19.3%, 15.0% and 20.8% on average over GBLUP, ssGBLUP and BayesHE, ranging from 8.9–24.0%, 7.6–17.5% and 11.1–24.6%, respectively. In addition, ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP yielded an improvement of 3.7% on average compared to GBLUP, and the performance of BayesHE was close to that of GBLUP. Among the four ML methods, SVR and KRR had the most robust prediction abilities, yielding higher accuracies, lower bias, lower MSE and MAE, and computing efficiency comparable to GBLUP. RF demonstrated the lowest prediction ability and computational efficiency among the ML methods. Conclusion Our findings demonstrated that ML methods are more efficient than traditional genomic selection methods and could be new options for genomic prediction.
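The "20 replicates of five-fold cross-validation" design can be pictured with a short sketch. This is only an illustration of the general scheme, not the authors' pipeline: the function name `repeated_kfold_indices`, the seed, and the way animals are assigned to folds are all assumptions.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=5, n_repeats=20, seed=0):
    """Yield (replicate, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: each replicate reshuffles the sample indices and
    splits them into n_folds disjoint test sets.
    """
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every replicate
        folds = [idx[f::n_folds] for f in range(n_folds)]
        for f, test_idx in enumerate(folds):
            # Training set = all folds except the held-out one.
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train_idx, test_idx


# 2566 is the number of genotyped pigs reported in the abstract.
splits = list(repeated_kfold_indices(n_samples=2566))
print(len(splits))  # 20 replicates x 5 folds = 100 train/test splits
```

In such a design, a model would be fitted on `train_idx` and scored on `test_idx` for each split, and the accuracy would then be averaged over all splits.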

Author(s):  
Ahmed Hassan Mohammed Hassan ◽  
Arfan Ali Mohammed Qasem ◽  
Walaa Faisal Mohammed Abdalla ◽  
Omer H. Elhassan

Day by day, the cumulative incidence of COVID-19 is rapidly increasing. After the spread of the coronavirus epidemic and the death of more than a million people around the world, scientists and researchers have turned to modern technologies such as machine learning to help the world overcome the coronavirus (COVID-19) epidemic. Machine Learning (ML) can be deployed very effectively to track and predict the disease. ML techniques have been applied in areas that need to identify dangerous negative factors and define their priorities. The aim of the proposed system is to predict the number of people infected with COVID-19 using ML. Four standard models are used for COVID-19 prediction: Neural Network (NN), Support Vector Machines (SVM), Bayesian Network (BN) and Polynomial Regression (PR). The data used to test these models consist of the numbers of deaths, newly infected cases, and recoveries in the next 20 days. Five measures were used to evaluate the performance of each model, namely root mean squared error (RMSE), mean squared error (MSE), mean absolute error (MAE), explained variance score and R2 score (R2). The value of the proposed system lies in offering a promising mechanism for applying these models to the current scenario of the COVID-19 epidemic. The results showed that NN outperformed the other models, while on the available dataset SVM performed poorly in all the predictions. Our results suggest that cases will increase slightly in the coming days. We also find that the results give rise to hope due to the low death rate. For future work, case explanation and data amalgamation must be kept up persistently.
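The five evaluation measures named above have standard definitions, sketched below in plain Python. The abstract does not give formulas, so the conventional ones are assumed here; the function name `regression_metrics` and the sample numbers are made up for illustration and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, R2 and explained variance (standard definitions)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n    # mean squared error
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    rmse = math.sqrt(mse)                   # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot           # coefficient of determination

    # Explained variance ignores a constant bias in the errors, unlike R2.
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    explained_variance = 1.0 - var_err / (ss_tot / n)

    return {"mse": mse, "mae": mae, "rmse": rmse,
            "r2": r2, "explained_variance": explained_variance}


# Hypothetical values, for illustration only.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

Lower MSE, MAE and RMSE indicate a better fit, while R2 and explained variance approach 1 as predictions improve.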


Author(s):  
Gaurav Singh ◽  
Shivam Rai ◽  
Himanshu Mishra ◽  
Manoj Kumar

The prime objective of this work is to predict and analyse the Covid-19 pandemic around the world using Machine Learning algorithms such as Polynomial Regression, Support Vector Machine and Ridge Regression, and furthermore to assess and compare the performance of these regression algorithms in terms of R squared, Mean Absolute Error, Mean Squared Error and Root Mean Squared Error. In this work, we have used the dataset available in the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. We have analyzed the Covid-19 cases from 22/1/2020 till now. We applied a supervised machine learning prediction model to forecast the possible confirmed cases for the next ten days.


2020 ◽  
Vol 17 (9) ◽  
pp. 4703-4708
Author(s):  
K. Anitha Kumari ◽  
Avinash Sharma ◽  
S. Nivethitha ◽  
V. Dharini ◽  
V. Sanjith ◽  
...  

Electrical appliances most commonly consist of two electrical devices, namely, electrical motors and transformers. Typically, electrical motors are used for all sorts of industrial purposes. Failures of such motors result in serious problems, such as overheating, shutdown and even burnout, in their host systems. Thus, more attention has to be paid to detecting the outliers. Similarly, to avoid unexpected power reliability problems and system damage, the prediction of failures in transformers is expected to quantify the impacts. By predicting the failures, the lifetime of the transformers increases and unnecessary accidents are avoided. Therefore, this paper presents the detection of outliers in electrical motors and failures in transformers using supervised machine learning algorithms. Machine learning techniques such as Support Vector Machine (SVM) and Random Forest (RF), and regression techniques such as Support Vector Regression (SVR) and Polynomial Regression (PR), are used to analyze the use cases of different motor specifications. The findings are evaluated using accuracy, precision, F-measure, and recall for motors. Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Square Error (RMSE) and R-squared (R2) are considered as metrics for transformers. The proposed approach helps to identify anomalies such as vibration loss, copper loss and overheating in the industrial motor and to determine the abnormal functioning of the transformer, which in turn helps to ascertain its lifetime. The proposed system analyses the behaviour of the electrical machines using the energy meter data and reports the outliers to users. It also analyses the abnormalities occurring in the transformer using the parameters involved in the degradation of the paper-oil insulation system and the operating voltage, which together lead to a prediction of the lifetime.


Water ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 547 ◽  
Author(s):  
Ahmed Elbeltagi ◽  
Nikul Kumari ◽  
Jaydeo K. Dharpure ◽  
Ali Mokhtar ◽  
Karam Alsafadi ◽  
...  

Drought is a fundamental physical feature of the climate pattern worldwide. Over the past few decades, this natural disaster has occurred with increasing frequency, which has significantly impacted agricultural systems, economies, environments, water resources, and supplies. Therefore, it is essential to develop new techniques that enable comprehensive determination and observation of droughts over large areas with satisfactory spatial and temporal resolution. This study modeled a new drought index called the Combined Terrestrial Evapotranspiration Index (CTEI), developed in the Ganga river basin. For this, five Machine Learning (ML) techniques, derived from artificial intelligence theories, were applied: the Support Vector Machine (SVM) algorithm, decision trees, Matern 5/2 Gaussian process regression, boosted trees, and bagged trees. These techniques were driven by twelve different models generated from input combinations of satellite data and hydrometeorological parameters. The results indicated that the eighth model performed best among all the models, with the SVM algorithm resulting in an R2 value of 0.82 and the lowest errors in terms of the Root Mean Squared Error (RMSE) (0.33) and Mean Absolute Error (MAE) (0.20), followed by the Matern 5/2 Gaussian model with an R2 value of 0.75 and RMSE and MAE of 0.39 and 0.21 mm/day, respectively. Moreover, among all five methods, the SVM and Matern 5/2 Gaussian methods were the best-performing ML algorithms in our study of CTEI predictions for the Ganga basin.


2021 ◽  
Vol 14 (4) ◽  
pp. 829-834
Author(s):  
Thanida Sananmuang ◽  
Kanchanarat Mankong ◽  
Suppawiwat Ponglowhapan ◽  
Kaj Chokeshaiusaha

Background and Aim: Fetal biparietal diameter (BPD) is a feasible parameter to predict canine parturition date due to its inverse correlation with days before parturition (DBP). Although such a relationship is generally described using a simple linear regression (SLR) model, the imprecision of this model in predicting the parturition date in small- to medium-sized dogs is a common problem among veterinary practitioners. Support vector regression (SVR) is a useful machine learning model for prediction. This study aimed to compare the accuracy of SVR with that of SLR in predicting DBP. Materials and Methods: After measuring 101 BPDs in 35 small- to medium-sized pregnant bitches, we fitted the data to the routine SLR model and to SVR models using three different kernel functions: radial basis function SVR, linear SVR, and polynomial SVR. The predicted DBP acquired from each model was then used to calculate the coefficient of determination (R2), mean absolute error, and mean squared error scores for determining the prediction accuracy. Results: All SVR models were more accurate than the SLR model at predicting DBP. The linear and polynomial SVRs were identified as the two most accurate models (p<0.01). Conclusion: With available machine learning software, linear and polynomial SVRs can be applied to predicting DBP in small- to medium-sized pregnant bitches.
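For readers unfamiliar with the SLR baseline mentioned above, the following sketch fits a least-squares line of DBP on BPD. It assumes ordinary least squares, which the abstract does not state explicitly; the function name and the measurements are invented for illustration and are not data from the study.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements (not from the study): BPD in mm, DBP in days.
bpd_mm = [10.0, 15.0, 20.0, 25.0]
dbp_days = [30.0, 20.0, 10.0, 0.0]

slope, intercept = fit_simple_linear_regression(bpd_mm, dbp_days)
predicted_dbp = intercept + slope * 18.0  # prediction for a new BPD of 18 mm
print(slope, intercept, predicted_dbp)
```

The negative slope reflects the inverse relationship described in the abstract: a larger BPD corresponds to fewer days before parturition.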


2020 ◽  
Author(s):  
Mang Liang ◽  
Tianpeng Chang ◽  
Bingxing An ◽  
Xinghai Duan ◽  
Lili Du ◽  
...  

Abstract Background: Machine learning (ML) is perhaps the most useful approach for the interpretation of large genomic datasets. However, the performance of a single machine learning method in genomic selection (GS) was unsatisfactory in existing research. To improve genomic predictions, we constructed a stacking ensemble learning framework (SELF) that integrates three machine learning methods to predict genomic estimated breeding values (GEBVs). Results: We evaluated the prediction ability of SELF on three real datasets and compared the prediction accuracy of SELF, the base learners, GBLUP and BayesB. For each trait, SELF performed better than the base learners, which included support vector regression (SVR), kernel ridge regression (KRR) and elastic net (ENET). The prediction accuracy of SELF showed an average improvement of 7.70% compared with GBLUP in the three datasets. Except for the milk fat percentage (MFP) trait of the German Holstein dairy cattle dataset, SELF was more robust than BayesB in the remaining traits. Conclusions: In this study, we applied a stacking ensemble learning framework (SELF) to genomic prediction, and it performed much better than GBLUP and BayesB in three real datasets with different genetic architectures. Therefore, we believe SELF has the potential to be used to estimate GEBVs in other animals and plants.


Water ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 1734 ◽  
Author(s):  
Samit Thapa ◽  
Zebin Zhao ◽  
Bo Li ◽  
Lu Lu ◽  
Donglei Fu ◽  
...  

Although machine learning (ML) techniques are increasingly popular in water resource studies, they are not extensively utilized in modeling snowmelt. In this study, we developed a model based on a deep learning long short-term memory (LSTM) network for snowmelt-driven discharge modeling in a Himalayan basin. For comparison, we developed the nonlinear autoregressive exogenous model (NARX), Gaussian process regression (GPR), and support vector regression (SVR) models. The snow area derived from moderate resolution imaging spectroradiometer (MODIS) snow images, along with remotely sensed meteorological products, was used as input to the models. The Gamma test was conducted to determine the appropriate input combination for the models. The shallow LSTM model with one hidden layer achieved better results than the deeper LSTM models with multiple hidden layers. Out of seven optimizers tested, Adamax proved to be the most suitable optimizer for this study. The ML models were evaluated using the coefficient of determination (R2), mean absolute error (MAE), modified Kling–Gupta efficiency (KGE’), Nash–Sutcliffe efficiency (NSE), and root-mean-squared error (RMSE). The LSTM model (KGE’ = 0.99) enriched with snow cover input achieved the best results, followed by NARX (KGE’ = 0.974), GPR (KGE’ = 0.95), and SVR (KGE’ = 0.949). The outcome of this study demonstrates the applicability of ML models, especially the LSTM model, in predicting snowmelt-driven discharge in data-scant mountainous watersheds.
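The two hydrological scores named above, NSE and the modified KGE, can be computed as in the sketch below. The abstract does not state the formulas; the commonly used definitions are assumed (for KGE’, the variability term is the ratio of coefficients of variation), and the helper names and discharge values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge_modified(obs, sim):
    """Modified Kling-Gupta efficiency (assumed standard form)."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    std_obs, std_sim = _std(obs), _std(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (std_obs * std_sim)                         # linear correlation
    beta = mean_sim / mean_obs                            # bias ratio
    gamma = (std_sim / mean_sim) / (std_obs / mean_obs)   # variability ratio (CV)
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


# Hypothetical discharge values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.9]
print(nse(observed, simulated), kge_modified(observed, simulated))
```

Both scores equal 1 for a perfect simulation, so values such as KGE’ = 0.99 indicate a very close match between simulated and observed discharge.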


Transport ◽  
2020 ◽  
Vol 35 (5) ◽  
pp. 462-473
Author(s):  
Aleksandar Vorkapić ◽  
Radoslav Radonja ◽  
Karlo Babić ◽  
Sanda Martinčić-Ipšić

The aim of this article is to enhance performance monitoring of a two-stroke electronically controlled ship propulsion engine over its operating envelope. This is achieved by setting up a machine learning model capable of monitoring influential operating parameters and predicting the fuel consumption. The model is tested with different machine learning algorithms, namely linear regression, multilayer perceptron, Support Vector Machines (SVM) and Random Forests (RF). After verifying the modelling framework and analysing the results in order to improve the prediction accuracy, the best algorithm is selected based on standard evaluation metrics, i.e. Root Mean Square Error (RMSE) and Relative Absolute Error (RAE). Experimental results show that, with an adequate combination and processing of relevant sensory data, SVM exhibits the lowest RMSE (7.1032) and RAE (0.5313%). RF achieves the lowest RMSE (22.6137) and RAE (3.8545%) in a setting where a minimal number of input variables is considered, i.e. cylinder indicated pressures and propulsion engine revolutions. Further, the article deals with the detection of anomalies in operating parameters, which enables the evaluation of the propulsion engine condition and the early identification of failures and deterioration. Such a time-dependent, self-adapting anomaly detection model can be used for comparison with the initial condition recorded during the test and sea run or after survey and docking. Finally, we propose a unified model structure that incorporates the fuel consumption prediction and anomaly detection models into the on-board decision-making process regarding navigation and maintenance.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Hai-Bang Ly ◽  
Thuy-Anh Nguyen ◽  
Binh Thai Pham

Soil cohesion (C) is one of the critical soil properties and is closely related to basic soil properties such as particle size distribution, pore size, and shear strength. Hence, it is mainly determined by experimental methods. However, the experimental methods are often time-consuming and costly. Therefore, developing an alternative approach based on machine learning (ML) techniques to solve this problem is highly recommended. In this study, machine learning models, namely, support vector machine (SVM), Gaussian process regression (GPR), and random forest (RF), were built based on a data set of 145 soil samples collected from the Da Nang-Quang Ngai expressway project, Vietnam. The database includes six input parameters, that is, clay content, moisture content, liquid limit, plastic limit, specific gravity, and void ratio. The performance of the models was assessed by three statistical criteria, namely, the correlation coefficient (R), mean absolute error (MAE), and root mean square error (RMSE). The results demonstrated that the proposed RF model could predict soil cohesion with high accuracy (R = 0.891) and low error (RMSE = 3.323 and MAE = 2.511), and that its predictive capability is better than that of SVM and GPR. Therefore, the RF model can be used as a cost-effective approach in predicting soil cohesion forces used in the design and inspection of constructions.


2021 ◽  
Author(s):  
Hangsik Shin

BACKGROUND Arterial stiffness due to vascular aging is a major indicator for evaluating cardiovascular risk. OBJECTIVE In this study, we propose a method of estimating age by applying machine learning to the photoplethysmogram (PPG) for non-invasive vascular age assessment. METHODS A machine learning-based age estimation model, consisting of three convolutional layers and two fully connected layers, was developed using photoplethysmograms segmented by pulse from a total of 752 adults aged 19–87 years. The performance of the developed model was quantitatively evaluated using mean absolute error, root-mean-squared error, Pearson’s correlation coefficient, and coefficient of determination. Grad-CAM was used to explain the contribution of photoplethysmogram waveform characteristics to vascular age estimation. RESULTS A mean absolute error of 8.03, a root mean squared error of 9.96, a correlation coefficient of 0.62, and a coefficient of determination of 0.38 were obtained through 10-fold cross-validation. Grad-CAM, used to determine the weight that the input signal contributes to the result, confirmed that the contribution of the photoplethysmogram segment to the age estimation was high around the systolic peak. CONCLUSIONS The machine learning-based vascular aging analysis method using the PPG waveform showed comparable or superior performance compared to previous studies, without complex feature detection, in evaluating vascular aging. CLINICALTRIAL 2015-0104

