Supervised Machine Learning Algorithms for Bioelectromagnetics: Prediction Models and Feature Selection Techniques Using Data from Weak Radiofrequency Radiation Effect on Human and Animals Cells

Author(s):  
Malka N. Halgamuge

The emergence of new technologies to incorporate and analyze data with high-performance computing has expanded our capability to make accurate predictions. Supervised machine learning (ML) can be utilized for fast and consistent prediction, and to better capture the underlying patterns in the data. We developed a prediction strategy, for the first time, using supervised ML to observe the possible impact of weak radiofrequency electromagnetic fields (RF-EMF) on human and animal cells without performing in-vitro laboratory experiments. We extracted laboratory experimental data from 300 peer-reviewed scientific publications (1990–2015) describing 1127 experimental case studies of human and animal cell responses to RF-EMF. We used domain knowledge, Principal Component Analysis (PCA), and Chi-squared feature selection techniques to select six optimal features for computational and cost efficiency. We then developed grouping or clustering strategies to allocate these selected features into five different laboratory experiment scenarios. The dataset was tested with ten different classifiers, and the outputs were estimated using the k-fold cross-validation method. Because the assessment of a classifier's prediction performance is critical for judging its suitability, we compared in detail the percentage of correct classifications (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area Under the ROC Curve (AUC), and precision-recall (PRC Area) for each classification method. Our findings suggest that the Random Forest algorithm outperforms the others in all groups on all performance measures, reaching AUC = 0.903 with k = 60 folds. A robust correlation was observed between the specific absorption rate (SAR) and frequency, and between exposure time (cumulative effect) and SAR×time (the impact of SAR accumulated over the exposure time) of RF-EMF. In contrast, the relationship between frequency and exposure time was not significant.
In the future, more experimental data can increase the sample size, leading to more accurate predictions.
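The pipeline the abstract describes (Chi-squared feature selection, then a Random Forest scored by k-fold cross-validation) can be sketched with scikit-learn. The dataset below is synthetic stand-in data, not the paper's 1127-case RF-EMF dataset, and the choice of 12 candidate features is an assumption made purely for illustration:

```python
# Sketch of the described pipeline on synthetic stand-in data:
# chi-squared feature selection, then Random Forest with k-fold CV.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           random_state=0)
X = X - X.min(axis=0)  # chi2 requires non-negative feature values

# Keep the six best-scoring features, mirroring the paper's six optimal features.
X_sel = SelectKBest(chi2, k=6).fit_transform(X, y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X_sel, y, cv=10)  # 10-fold cross-validation
print(round(scores.mean(), 3))
```

The paper evaluated ten classifiers this way; swapping `clf` for any other scikit-learn estimator reuses the same selection and cross-validation scaffolding.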

Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4821
Author(s):  
Rami Ahmad ◽  
Raniyah Wazirali ◽  
Qusay Bsoul ◽  
Tarik Abu-Ain ◽  
Waleed Abu-Ain

Wireless Sensor Networks (WSNs) continue to face two major challenges: energy and security. One WSN security task is therefore to protect them from Denial of Service (DoS) and Distributed DoS (DDoS) attacks. Machine-learning-based systems are the only viable option against these types of attacks, as traditional deep packet inspection depends on reading open fields in transport-layer security packets, and the trend is toward encrypting those fields. Moreover, network data traffic will become more complex as the amount of data transmitted between WSN nodes grows with increasing future usage. There is therefore a need to combine feature selection techniques with machine learning in order to determine which data matter most in the DoS detection process. This paper examined techniques for improving DoS anomaly detection while conserving power in WSNs, balancing the two goals. A new clustering technique, the CH_Rotations algorithm, was introduced to improve anomaly detection efficiency over a WSN's lifetime. Furthermore, the use of feature selection techniques with machine learning algorithms to examine WSN node traffic, and the effect of these techniques on the lifetime of WSNs, was evaluated. The evaluation results showed that Water Cycle (WC) feature selection displayed the best average performance, with accuracy 2%, 5%, 3%, and 3% higher than Particle Swarm Optimization (PSO), Simulated Annealing (SA), Harmony Search (HS), and Genetic Algorithm (GA), respectively. Moreover, WC with a Decision Tree (DT) classifier achieved 100% accuracy with only one feature. In addition, the CH_Rotations algorithm improved network lifetime by 30% compared to the standard LEACH protocol. Network lifetime using the WC + DT technique was reduced by only 5% compared to WC + DT-free scenarios.
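The headline result, a Decision Tree reaching high accuracy with a single selected feature, can be illustrated in miniature. The Water Cycle metaheuristic has no scikit-learn implementation, so a univariate ANOVA filter stands in as the selector here, and the traffic-like data is synthetic:

```python
# Stand-in sketch: pick the single highest-scoring feature with a
# univariate filter, then train a Decision Tree on it alone.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           random_state=1)
X_one = SelectKBest(f_classif, k=1).fit_transform(X, y)  # best single feature

X_tr, X_te, y_tr, y_te = train_test_split(X_one, y, random_state=1)
acc = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr).score(X_te, y_te)
print(round(acc, 3))
```

Fewer input features also means fewer sensor readings to collect and transmit, which is the link the paper draws between feature selection and WSN energy budgets.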


This paper demonstrates the use of machine learning algorithms to predict housing selling prices on a real dataset collected from the Petaling Jaya area, Selangor, Malaysia. To date, literature on machine learning prediction of housing selling prices in Malaysia is scarce. This paper provides a brief review of existing machine learning algorithms for the prediction problem and presents the characteristics of the collected datasets under different groups of selected features. The findings indicate that including irrelevant features from the dataset can decrease the accuracy of the prediction models.
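The finding that irrelevant features degrade accuracy is easy to reproduce on toy data. The housing-style variables below (floor area, room count, price formula) are invented for illustration, not taken from the Petaling Jaya dataset:

```python
# Toy illustration: cross-validated R^2 of a housing-style regression
# with only relevant features vs. the same features plus noise columns.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
n = 120
area = rng.uniform(800, 3000, n)        # hypothetical floor area (sq ft)
rooms = rng.randint(2, 6, n)            # hypothetical room count
price = 300 * area + 20000 * rooms + rng.normal(0, 30000, n)

X_relevant = np.column_stack([area, rooms])
X_noisy = np.column_stack([X_relevant, rng.normal(size=(n, 50))])  # 50 junk columns

r2_rel = cross_val_score(LinearRegression(), X_relevant, price, cv=5).mean()
r2_noisy = cross_val_score(LinearRegression(), X_noisy, price, cv=5).mean()
print(r2_rel > r2_noisy)
```

With only 96 training rows per fold, fitting 50 extra noise coefficients inflates out-of-fold error, which is the mechanism behind the paper's observation.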


2021 ◽  
Author(s):  
Ravi Arkalgud ◽  
Andrew McDonald ◽  
Ross Brackenridge ◽  
...  

Automation is becoming an integral part of our daily lives as technology and techniques rapidly develop. Many automation workflows are now routinely applied within the geoscience domain. The success of automated modelling fundamentally hinges on the appropriate choice of parameters and on processing speed, and the entire process demands that the data fed into any machine learning model be of good quality. Technological advances in well logging over the decades have enabled the collection of vast amounts of data across wells and fields. This poses a major issue in automating petrophysical workflows: it must be ensured that the data being fed in are appropriate and fit for purpose. The selection of features (logging curves) and parameters for machine learning algorithms has therefore become a topic at the forefront of related research. Inappropriate feature selection can lead to erroneous results and reduced precision, and has proved computationally expensive. Experienced Eye (EE) is a novel methodology, derived from Domain Transfer Analysis (DTA), which seeks to identify and elicit the optimum input curves for modelling. During the EE solution process, relationships between the input variables and target variables are developed based on characteristics and attributes of the inputs rather than statistical averages. The relationships so developed can then be ranked appropriately and selected for the modelling process. This paper focuses on three distinct petrophysical data scenarios where inputs are ranked prior to modelling: prediction of continuous permeability from discrete core measurements, porosity from multiple logging measurements, and key geomechanical properties. Each input curve is ranked against a target feature.
For each case study, the best-ranked features were carried forward to the modelling stage, and the results were validated alongside conventional interpretation methods. Ranked features were also compared between different machine learning algorithms: DTA, Neural Networks, and Multiple Linear Regression. Results are compared with the available data for the various case studies. The new feature selection approach is shown to improve the accuracy and precision of prediction results from multiple modelling algorithms.
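DTA and Experienced Eye are proprietary, so the ranking step can only be sketched with a stand-in criterion. Below, mutual information ranks synthetic "log curves" against a porosity-style target; the curve names and the target formula are invented for illustration:

```python
# Stand-in sketch of ranking input curves against a target before
# modelling, using mutual information as the ranking criterion.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.RandomState(0)
n = 400
gr = rng.normal(size=n)           # hypothetical gamma-ray curve
rhob = rng.normal(size=n)         # hypothetical bulk-density curve
junk = rng.normal(size=n)         # curve unrelated to the target
porosity = 0.3 - 0.05 * rhob + 0.01 * gr + rng.normal(0, 0.01, n)

X = np.column_stack([gr, rhob, junk])
scores = mutual_info_regression(X, porosity, random_state=0)
ranking = np.argsort(scores)[::-1]  # best-ranked curve first
print(ranking[0])                   # index of the top-ranked curve
```

Only the curves at the top of `ranking` would then be carried forward to the modelling stage, which is the workflow the paper applies with its own ranking method.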


2020 ◽  
Vol 183 (6) ◽  
pp. 657-667
Author(s):  
Jacopo Burrello ◽  
Alessio Burrello ◽  
Jacopo Pieroni ◽  
Elisa Sconfienza ◽  
Vittorio Forestiero ◽  
...  

Objective Adrenal venous sampling (AVS) is the gold standard to discriminate patients with unilateral primary aldosteronism (UPA) from bilateral disease (BPA). AVS is technically demanding, and in cases of unsuccessful cannulation of the adrenal veins, the results may not always be interpretable. The aim of our study was to develop diagnostic models to distinguish UPA from BPA in cases of unilaterally successful AVS with contralateral suppression of aldosterone secretion. Design Retrospective evaluation of 158 patients referred to a tertiary hypertension unit who underwent AVS. We randomly assigned 110 patients to a training cohort and 48 patients to a validation cohort to develop and test the diagnostic models. Methods Supervised machine learning algorithms and regression models were used to develop and validate two prediction models and a simple 19-point score to stratify patients according to their subtype diagnosis. Results Aldosterone levels at screening and after confirmatory testing, lowest potassium, ipsilateral and contralateral imaging findings on CT, and the contralateral ratio at AVS were associated with a diagnosis of UPA and were included in the diagnostic models. Machine learning algorithms correctly classified the majority of patients at both training and validation (accuracy: 82.9–95.7%). The score displayed a sensitivity/specificity of 95.2%/96.9%, with an AUC of 0.971. A flow-chart integrating our score correctly managed all patients except 3 (98.1% accuracy), avoiding the potential repetition of 77.2% of AVS procedures. Conclusions Our score could be integrated into clinical practice and guide surgical decision-making in patients with unilaterally successful AVS and contralateral suppression.
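The sensitivity, specificity, and AUC figures reported for the 19-point score are standard derivations from a confusion matrix and a ranking of scores. The sketch below shows the computation on made-up values; the scores, labels, and cut-off are not the study's data:

```python
# How sensitivity, specificity, and AUC are derived from a point score
# and binary labels (all values below are hypothetical).
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])      # 1 = unilateral PA (UPA)
score = np.array([17, 15, 12, 4, 6, 9, 14, 3])   # hypothetical 19-point score
y_pred = (score >= 10).astype(int)               # hypothetical cut-off

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, score)
print(sensitivity, specificity, round(auc, 3))   # → 1.0 1.0 1.0
```

AUC is computed from the raw score (not the thresholded prediction), which is why a score can have AUC near 1 while a particular cut-off trades sensitivity against specificity.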


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242943
Author(s):  
Sutanu Nandi ◽  
Piyali Ganguli ◽  
Ram Rup Sarkar

Essential gene prediction helps to find the minimal set of genes indispensable for the survival of an organism. Machine learning (ML) algorithms have been useful for predicting gene essentiality. However, currently available ML pipelines perform poorly for organisms with limited experimental data. The objective here is a new ML pipeline to help annotate essential genes of less-explored disease-causing organisms for which minimal experimental data are available. The proposed strategy combines an unsupervised feature selection technique, dimension reduction using the Kamada-Kawai algorithm, and a semi-supervised ML algorithm employing the Laplacian Support Vector Machine (LapSVM) to predict essential and non-essential genes from genome-scale metabolic networks using a very limited labeled dataset. A novel scoring technique, the Semi-Supervised Model Selection Score, equivalent to the area under the ROC curve (auROC), is proposed for selecting the best model when supervised performance metrics are difficult to calculate for lack of data. The unsupervised feature selection followed by dimension reduction revealed a distinct circular pattern in the clustering of essential and non-essential genes. LapSVM then fit a curve dissecting this circle, classifying and predicting essential genes with high accuracy (auROC > 0.85) even with only 1% labeled data for model training. After successful validation of this ML pipeline on both eukaryotes and prokaryotes, showing high accuracy even when the labeled dataset is very limited, the strategy was used to predict essential genes of organisms with inadequate experimentally known data, such as Leishmania sp. Using a graph-based semi-supervised machine learning scheme, this novel integrative approach to essential gene prediction applies universally to both prokaryotes and eukaryotes with limited labeled data.
The essential genes predicted using the pipeline provide an important lead for the prediction of gene essentiality and identification of novel therapeutic targets for antibiotic and vaccine development against disease-causing parasites.

