Improving eQTL Analysis Using a Machine Learning Approach for Data Integration: A Logistic Model Tree Solution

2018 ◽  
Vol 25 (10) ◽  
pp. 1091-1105 ◽  
Author(s):  
Stefano Beretta ◽  
Mauro Castelli ◽  
Ivo Gonçalves ◽  
Ivan Kel ◽  
Valentina Giansanti ◽  
...  
PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0252096
Author(s):  
Maria B. Rabaglino ◽  
Alan O’Doherty ◽  
Jan Bojsen-Møller Secher ◽  
Patrick Lonergan ◽  
Poul Hyttel ◽  
...  

Pregnancy rates for in vitro produced (IVP) embryos are usually lower than for embryos produced in vivo after ovarian superovulation (MOET). This is potentially due to alterations in their trophectoderm (TE), the outermost layer in physical contact with the maternal endometrium. The main objective was to apply a multi-omics data integration approach to identify genes that are both temporally differentially expressed and differentially methylated (DEG and DMG) between IVP and MOET embryos and that could impact TE function. First, four published transcriptomic datasets and five published epigenomic datasets were processed for data integration. Second, DEG from day 7 to days 13 and 16 and DMG from day 7 to day 17 were determined in the TE from IVP vs. MOET embryos. Third, genes that were both DE and DM were subjected to hierarchical clustering and functional enrichment analysis. Finally, findings were validated through a machine learning approach with two additional datasets from day 15 embryos. There were 1535 DEG and 6360 DMG, with 490 overlapping genes, whose expression profiles at days 13 and 16 resulted in three main clusters. Genes in Cluster 1 (188) and Cluster 2 (191) were down-regulated at day 13 or day 16, respectively, while genes in Cluster 3 (111) were up-regulated at both days, in IVP embryos compared to MOET embryos. The top enriched terms were the KEGG pathway "focal adhesion" in Cluster 1 (FDR = 0.003) and the cellular component "extracellular exosome" in Cluster 2 (FDR < 0.0001), which was also enriched in Cluster 1 (FDR = 0.04). According to the machine learning approach, genes in Cluster 1 showed a similar expression pattern between IVP and less developed (short) MOET conceptuses, and between MOET and DKK1-treated (advanced) IVP conceptuses. In conclusion, these results suggest that early conceptuses derived from IVP embryos exhibit epigenomic and transcriptomic changes that later affect their elongation and focal adhesion, impairing post-transfer survival.
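
The overlap step in the abstract above (genes that are both differentially expressed and differentially methylated) amounts to a set intersection. The following is a minimal Python sketch of that idea only, not the authors' pipeline. The gene identifiers are hypothetical placeholders, and the function name `overlap_genes` is an assumption made for illustration. The last lines simply check that the three reported cluster sizes add up to the reported number of overlapping genes.

```python
def overlap_genes(deg, dmg):
    """Return the sorted genes present in both the DEG and the DMG list."""
    return sorted(set(deg) & set(dmg))


# Hypothetical placeholder identifiers, for illustration only.
deg = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
dmg = ["GENE_C", "GENE_D", "GENE_E"]
both = overlap_genes(deg, dmg)
print(both)

# Cluster sizes as reported in the abstract.
cluster_sizes = {"Cluster 1": 188, "Cluster 2": 191, "Cluster 3": 111}
total_overlap = sum(cluster_sizes.values())
print(total_overlap)
```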


2020 ◽  
Vol 12 (17) ◽  
pp. 2742
Author(s):  
Ehsan Kamali Maskooni ◽  
Seyed Amir Naghibi ◽  
Hossein Hashemi ◽  
Ronny Berndtsson

Groundwater (GW) is being exploited uncontrollably in various parts of the world because of the large demand for water supply driven by population growth and industrialization. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study uses remote sensing (RS)-derived driving factors as inputs to advanced machine learning algorithms (MLAs), namely deep boosting and logistic model trees, and evaluates their efficiency. To do so, their results are compared with three benchmark MLAs: boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we first assembled topographical, hydrological, RS-based, and lithological driving factors: altitude, slope degree, aspect, slope length, plan curvature, profile curvature, relative slope position (RSP), distance from rivers, river density, topographic wetness index, land use/land cover (LULC), normalized difference vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring indicator dataset was divided into training (434 springs) and validation (186 springs) subsets in a 70:30 proportion. The training dataset of springs, together with the driving factors, was incorporated into the MLAs, and the outputs were validated with several indices: accuracy, kappa, the receiver operating characteristic (ROC) curve, specificity, and sensitivity. Based on the area under the ROC curve, the logistic model tree (87.813%) performed similarly to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%). The findings confirm the strong performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Thus, their application can be suggested for other areas to gain insight into GW-related barriers to sustainability. Further, the outcome of the logistic model tree algorithm indicates the high impact of RS-based factors such as NDVI, with a relative influence of 100, as well as the high influence of distance from rivers, altitude, and RSP, with relative influences of 46.07, 43.47, and 37.20, respectively, on GW potential.
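
The 70:30 split and the AUC-based ranking described in this abstract can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the study's code: the function names `split_train_validation` and `auc`, the fixed random seed, and the toy scores are all hypothetical. The AUC is computed here with the rank-based (Mann-Whitney) formulation, that is, the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
import random


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(positive_scores, negative_scores):
    """Area under the ROC curve as the share of correctly ordered pairs."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(positive_scores) * len(negative_scores))


# 620 spring records in total, matching the 434 + 186 reported in the abstract.
train, validation = split_train_validation(range(620))
print(len(train), len(validation))

# Hypothetical model scores for spring (positive) and non-spring (negative) sites.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

With a real model, the scores would come from the fitted classifier applied to the validation subset rather than from hand-written lists.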


Forests ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 830 ◽  
Author(s):  
Viet-Ha Nhu ◽  
Ayub Mohammadi ◽  
Himan Shahabi ◽  
Baharin Bin Ahmad ◽  
Nadhir Al-Ansari ◽  
...  

We used remote sensing techniques and machine learning to detect and map landslides and landslide susceptibility in the Cameron Highlands, Malaysia. We located 152 landslides using a combination of interferometric synthetic aperture radar (InSAR), Google Earth (GE), and field surveys. Of the total slide locations, 80% (122 landslides) were used to train the selected algorithms, and the remaining 20% (30 landslides) were used for validation. We employed 17 conditioning factors, including slope angle, aspect, elevation, curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), lithology, soil type, land cover, normalized difference vegetation index (NDVI), distance to river, distance to fault, distance to road, river density, fault density, and road density, which were derived from satellite imagery, geological maps, soil maps, and a digital elevation model (DEM). We used these factors to produce landslide susceptibility maps using logistic regression (LR), logistic model tree (LMT), and random forest (RF) models. To assess the prediction accuracy of the models, we employed the following statistical measures: negative predictive value (NPV), sensitivity, positive predictive value (PPV), specificity, root-mean-squared error (RMSE), accuracy, and area under the receiver operating characteristic (ROC) curve (AUC). Our results indicated that the AUC was 92%, 90%, and 88% for the LMT, LR, and RF algorithms, respectively. To assess model performance, we also applied the non-parametric Friedman and Wilcoxon statistical tests, which revealed no practical differences among the models used in the study area. While landslide mapping in tropical environments such as the Cameron Highlands remains difficult, remote sensing (RS) combined with machine learning techniques, such as the LMT model, shows promise for landslide susceptibility mapping in the study area.
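
Most of the statistical measures listed in this abstract follow directly from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions. The function names and the counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures for a binary classifier."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def rmse(predicted, observed):
    """Root-mean-squared error between predicted probabilities and 0/1 labels."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical validation counts (not taken from the paper).
metrics = classification_metrics(tp=25, fp=6, tn=24, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical predicted probabilities against observed labels.
print(rmse([0.9, 0.2, 0.6], [1, 0, 1]))
```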


2013 ◽  
Vol 5 (1) ◽  
Author(s):  
Francesco Napolitano ◽  
Yan Zhao ◽  
Vânia M Moreira ◽  
Roberto Tagliaferri ◽  
Juha Kere ◽  
...  

2021 ◽  
Vol 19 (3) ◽  
pp. 184-193
Author(s):  
G. Geetharamani ◽  
K. Dhinakaran ◽  
Janarthanan Selvaraj ◽  
S. Christopher Ezhil Singh

Data mining and machine learning analytics in the manufacturing field is one of the major research areas in Information Technology, and it poses many challenges. The goal of this research is to design a classification solution that decides whether a customer is eligible for and interested in purchasing a sport-utility vehicle (SUV), based on previous records collected from banks. Data from customers of various ages who have previously purchased a sport-utility vehicle are collected and used to build this logistic model. Age range and estimated salary across different ages are the factors on which the model is built. The model predicts a binary logistic outcome that indicates whether or not a customer can purchase a sport-utility vehicle. With an enhanced cloud platform, the machine learning deployment can handle a larger volume of data while keeping the algorithm the same for predicting the customer's mindset in purchasing a sport-utility vehicle.
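
A binary logistic model on age and estimated salary, as described in this abstract, can be sketched with the standard library alone. The abstract does not state how the model was fitted, so the batch gradient descent below, the feature standardization, the function names, and all customer records are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def column_stats(rows):
    """Per-feature mean and population standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(row, means, stds):
    """Standardize one record so age and salary are on comparable scales."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
                  for xi, yi in zip(X, y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b


# Hypothetical (age, estimated salary) records; 1 means the customer bought an SUV.
customers = [(22, 20000), (25, 32000), (28, 28000), (30, 40000),
             (35, 60000), (41, 72000), (46, 90000), (52, 110000)]
purchased = [0, 0, 0, 0, 1, 1, 1, 1]

means, stds = column_stats(customers)
X = [scale(row, means, stds) for row in customers]
w, b = fit_logistic(X, purchased)


def purchase_probability(age, salary):
    """Predicted probability that a customer with this profile buys an SUV."""
    x = scale((age, salary), means, stds)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


print(purchase_probability(24, 25000))
print(purchase_probability(50, 100000))
```

Thresholding the predicted probability at 0.5 gives the yes/no outcome the abstract refers to; a larger dataset would change only the inputs, not the algorithm.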

