Machine Learned Metamodeling of a Computationally Intensive Accident Simulation Code

Abstract The nuclear power industry is increasingly identifying applications of machine learning to reduce design, engineering, manufacturing, and operational costs. In some cases, applications have been deployed and are realizing value, in particular in the higher volume and data rich manufacturing areas of the nuclear industry. In this paper, we use machine learning to develop metamodel approximations of a computationally intense safety analysis code used to simulate a postulated loss-of-coolant accident (LOCA). The benefit of an accurate metamodel is that it runs at a fraction of the computational cost (milliseconds) compared to the LOCA analysis code. Metamodels can therefore support applications requiring a high volume of runs, for example optimization, uncertainty analysis, and probabilistic decision analysis, which would otherwise not be possible using the computationally intense code. We first generate training data by running the safety analysis code over a design of experiment. We then perform exploratory data analysis and an initial fitting of several model forms, including neighbor-based models, tree-based models, support vector machines, and artificial neural networks. We select neural network as the most promising candidate and perform hyperparameter optimization using a genetic algorithm. We discuss the resulting model, its potential applications, and areas for further research.

Download Full-text

NLOS Multipath Classification of GNSS Signal Correlation Output Using Machine Learning

Sensors ◽

10.3390/s21072503 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2503

Author(s):

Taro Suzuki ◽

Yoshiharu Amano

Keyword(s):

Machine Learning ◽

Satellite System ◽

Training Data ◽

Support Vector ◽

Positioning Errors ◽

Automated Method ◽

Global Navigation Satellite ◽

Better Than ◽

Signal Correlation

This paper proposes a method for detecting non-line-of-sight (NLOS) multipath, which causes large positioning errors in a global navigation satellite system (GNSS). We use GNSS signal correlation output, which is the most primitive GNSS signal processing output, to detect NLOS multipath based on machine learning. The shape of the multi-correlator outputs is distorted due to the NLOS multipath. The features of the shape of the multi-correlator are used to discriminate the NLOS multipath. We implement two supervised learning methods, a support vector machine (SVM) and a neural network (NN), and compare their performance. In addition, we also propose an automated method of collecting training data for LOS and NLOS signals of machine learning. The evaluation of the proposed NLOS detection method in an urban environment confirmed that NN was better than SVM, and 97.7% of NLOS signals were correctly discriminated.

Download Full-text

Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time, if the number of mammals to be observed is sufficiently large or if the observation is conducted for a prolonged period. In this study, machine learning methods as hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to detect and estimate whether a goat is in estrus based on the goat’s behavior; thus, the adequacy of the method was verified. Goat’s tracking data was obtained using a video tracking system and used to estimate whether they, which are in “estrus” or “non-estrus”, were in either states: “approaching the male”, or “standing near the male”. Totally, the PC of random forest seems to be the highest. However, The percentage concordance (PC) value besides the goats whose data were used for training data sets is relatively low. It is suggested that random forest tend to over-fit to training data. Besides random forest, the PC of HMMs and SVMs is high. However, considering the calculation time and HMM’s advantage in that it is a time series model, HMM is better method. The PC of neural network is totally low, however, if the more goat’s data were acquired, neural network would be an adequate method for estimation.

Download Full-text

Drill-Core Mineral Abundance Estimation Using Hyperspectral and High-Resolution Mineralogical Data

Remote Sensing ◽

10.3390/rs12071218 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1218

Author(s):

Laura Tuşa ◽

Mahdi Khodadadzadeh ◽

Cecilia Contreras ◽

Kasra Rafiezadeh Shahi ◽

Margret Fuchs ◽

...

Keyword(s):

Machine Learning ◽

High Resolution ◽

Ore Deposits ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Drill Core ◽

Data Types ◽

Mineralogical Characterization ◽

Core Samples

Due to the extensive drilling performed every year in exploration campaigns for the discovery and evaluation of ore deposits, drill-core mapping is becoming an essential step. While valuable mineralogical information is extracted during core logging by on-site geologists, the process is time consuming and dependent on the observer and individual background. Hyperspectral short-wave infrared (SWIR) data is used in the mining industry as a tool to complement traditional logging techniques and to provide a rapid and non-invasive analytical method for mineralogical characterization. Additionally, Scanning Electron Microscopy-based image analyses using a Mineral Liberation Analyser (SEM-MLA) provide exhaustive high-resolution mineralogical maps, but can only be performed on small areas of the drill-cores. We propose to use machine learning algorithms to combine the two data types and upscale the quantitative SEM-MLA mineralogical data to drill-core scale. This way, quasi-quantitative maps over entire drill-core samples are obtained. Our upscaling approach increases result transparency and reproducibility by employing physical-based data acquisition (hyperspectral imaging) combined with mathematical models (machine learning). The procedure is tested on 5 drill-core samples with varying training data using random forests, support vector machines and neural network regression models. The obtained mineral abundance maps are further used for the extraction of mineralogical parameters such as mineral association.

Download Full-text

Headnote Prediction Using Machine Learning

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/5/7 ◽

2021 ◽

Vol 18 (5) ◽

Author(s):

Sarmad Mahar ◽

Sahar Zafar ◽

Kamran Nishat

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Active Learning ◽

Text Classification ◽

Extraction Methods ◽

Text Summarization ◽

Training Data ◽

Second Step ◽

Support Vector ◽

Classification Algorithms

Headnotes are the precise explanation and summary of legal points in an issued judgment. Law journals hire experienced lawyers to write these headnotes. These headnotes help the reader quickly determine the issue discussed in the case. Headnotes comprise two parts. The first part comprises the topic discussed in the judgment, and the second part contains a summary of that judgment. In this thesis, we design, develop and evaluate headnote prediction using machine learning, without involving human involvement. We divided this task into a two steps process. In the first step, we predict law points used in the judgment by using text classification algorithms. The second step generates a summary of the judgment using text summarization techniques. To achieve this task, we created a Databank by extracting data from different law sources in Pakistan. We labelled training data generated based on Pakistan law websites. We tested different feature extraction methods on judiciary data to improve our system. Using these feature extraction methods, we developed a dictionary of terminology for ease of reference and utility. Our approach achieves 65% accuracy by using Linear Support Vector Classification with tri-gram and without stemmer. Using active learning our system can continuously improve the accuracy with the increased labelled examples provided by the users of the system.

Download Full-text

Abstract 1122‐000047: Machine Learning to Predict Stroke Outcomes after Mechanical Thrombectomy

Stroke: Vascular and Interventional Neurology ◽

10.1161/svin.01.suppl_1.000047 ◽

2021 ◽

Vol 1 (S1) ◽

Author(s):

Mehdi Bouslama ◽

Leonardo Pisani ◽

Diogo Haussen ◽

Raul Nogueira

Keyword(s):

Machine Learning ◽

Decision Making ◽

Cerebral Artery ◽

Training Data ◽

Gradient Boosting ◽

Support Vector ◽

Anterior Circulation ◽

Discriminative Performance ◽

Post Procedure ◽

Procedural Models

Introduction : Prognostication is an integral part of clinical decision‐making in stroke care. Machine learning (ML) methods have gained increasing popularity in the medical field due to their flexibility and high performance. Using a large comprehensive stroke center registry, we sought to apply various ML techniques for 90‐day stroke outcome predictions after thrombectomy. Methods : We used individual patient data from our prospectively collected thrombectomy database between 09/2010 and 03/2020. Patients with anterior circulation strokes (Internal Carotid Artery, Middle Cerebral Artery M1, M2, or M3 segments and Anterior Cerebral Artery) and complete records were included. Our primary outcome was 90‐day functional independence (defined as modified Rankin Scale score 0–2). Pre‐ and post‐procedure models were developed. Four known ML algorithms (support vector machine, random forest, gradient boosting, and artificial neural network) were implemented using a 70/30 training‐test data split and 10‐fold cross‐validation on the training data for model calibration. Discriminative performance was evaluated using the area under the receiver operator characteristics curve (AUC) metric. Results : Among 1248 patients with anterior circulation large vessel occlusion stroke undergoing thrombectomy during the study period, 1020 had complete records and were included in the analysis. In the training data (n = 714), 49.3% of the patients achieved independence at 90‐days. Fifteen baseline clinical, laboratory and neuroimaging features were used to develop the pre‐procedural models, with four additional parameters included in the post‐procedure models. For the preprocedural models, the highest AUC was 0.797 (95%CI [0.75‐ 0.85]) for the gradient boosting model. Similarly, the same ML technique performed best on post‐procedural data and had an improved discriminative performance compared to the pre‐procedure model with an AUC of 0.82 (95%CI [0.77‐ 0.87]). Conclusions : Our pre‐and post‐procedural models reliably estimated outcomes in stroke patients undergoing thrombectomy. They represent a step forward in creating simple and efficient prognostication tools to aid treatment decision‐making. A web‐based platform and related mobile app are underway.

Download Full-text

A sentiment analysis system for social media using machine learning techniques: Social enablement

Digital Scholarship in the Humanities ◽

10.1093/llc/fqy037 ◽

2018 ◽

Vol 34 (3) ◽

pp. 569-581 ◽

Cited By ~ 1

Author(s):

Sujata Rani ◽

Parteek Kumar

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Media Analysis ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Tool ◽

Data Set ◽

Learning Techniques

Abstract In this article, an innovative approach to perform the sentiment analysis (SA) has been presented. The proposed system handles the issues of Romanized or abbreviated text and spelling variations in the text to perform the sentiment analysis. The training data set of 3,000 movie reviews and tweets has been manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system has been tested on 100 movie reviews and tweets, and it has been observed that SVM has performed best in comparison to other classifiers, and it has an accuracy of 68% for movie reviews and 82% in case of tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be used in other cultural/social benefits like predicting/fighting human riots.

Download Full-text

Automatic Grading of Stroke Symptoms for Rapid Assessment Using Optimized Machine Learning and 4-Limb Kinematics: Clinical Validation Study (Preprint)

10.2196/preprints.20641 ◽

2020 ◽

Author(s):

Eunjeong Park ◽

Kijeong Lee ◽

Taehwa Han ◽

Hyo Suk Nam

Keyword(s):

Machine Learning ◽

Medical Staff ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Grading System ◽

Operating Characteristics ◽

Stroke Patients ◽

Motor Weakness ◽

Automatic Grading

BACKGROUND Subtle abnormal motor signs are indications of serious neurological diseases. Although neurological deficits require fast initiation of treatment in a restricted time, it is difficult for nonspecialists to detect and objectively assess the symptoms. In the clinical environment, diagnoses and decisions are based on clinical grading methods, including the National Institutes of Health Stroke Scale (NIHSS) score or the Medical Research Council (MRC) score, which have been used to measure motor weakness. Objective grading in various environments is necessitated for consistent agreement among patients, caregivers, paramedics, and medical staff to facilitate rapid diagnoses and dispatches to appropriate medical centers. OBJECTIVE In this study, we aimed to develop an autonomous grading system for stroke patients. We investigated the feasibility of our new system to assess motor weakness and grade NIHSS and MRC scores of 4 limbs, similar to the clinical examinations performed by medical staff. METHODS We implemented an automatic grading system composed of a measuring unit with wearable sensors and a grading unit with optimized machine learning. Inertial sensors were attached to measure subtle weaknesses caused by paralysis of upper and lower limbs. We collected 60 instances of data with kinematic features of motor disorders from neurological examination and demographic information of stroke patients with NIHSS 0 or 1 and MRC 7, 8, or 9 grades in a stroke unit. Training data with 240 instances were generated using a synthetic minority oversampling technique to complement the imbalanced number of data between classes and low number of training data. We trained 2 representative machine learning algorithms, an ensemble and a support vector machine (SVM), to implement auto-NIHSS and auto-MRC grading. The optimized algorithms performed a 5-fold cross-validation and were searched by Bayes optimization in 30 trials. The trained model was tested with the 60 original hold-out instances for performance evaluation in accuracy, sensitivity, specificity, and area under the receiver operating characteristics curve (AUC). RESULTS The proposed system can grade NIHSS scores with an accuracy of 83.3% and an AUC of 0.912 using an optimized ensemble algorithm, and it can grade with an accuracy of 80.0% and an AUC of 0.860 using an optimized SVM algorithm. The auto-MRC grading achieved an accuracy of 76.7% and a mean AUC of 0.870 in SVM classification and an accuracy of 78.3% and a mean AUC of 0.877 in ensemble classification. CONCLUSIONS The automatic grading system quantifies proximal weakness in real time and assesses symptoms through automatic grading. The pilot outcomes demonstrated the feasibility of remote monitoring of motor weakness caused by stroke. The system can facilitate consistent grading with instant assessment and expedite dispatches to appropriate hospitals and treatment initiation by sharing auto-MRC and auto-NIHSS scores between prehospital and hospital responses as an objective observation.

Download Full-text

A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model

10.1101/205047 ◽

2017 ◽

Cited By ~ 1

Author(s):

Manato Akiyama ◽

Kengo Sato ◽

Yasubumi Sakakibara

Keyword(s):

Machine Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Rna Secondary Structure ◽

Prediction Accuracy ◽

Secondary Structure Prediction ◽

Training Data ◽

Support Vector ◽

Rna Secondary Structure Prediction ◽

Fine Grained

AbstractMotivation: A popular approach for predicting RNA secondary structure is the thermodynamic nearest neighbor model that finds a thermodynamically most stable secondary structure with the minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such model has been reported.Results: In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning based weighted approach. Ourfine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed.Availability: The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.Contact:[email protected]

Download Full-text

Android Malware Detection using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1011.0982s1219 ◽

2020 ◽

Vol 8 (2S12) ◽

pp. 65-70

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Machine Learning Algorithms ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

User Interest ◽

Android Malware ◽

Android Malware Detection

Machine Learning is empowering many aspects of day-to-day lives from filtering the content on social networks to suggestions of products that we may be looking for. This technology focuses on taking objects as image input to find new observations or show items based on user interest. The major discussion here is the Machine Learning techniques where we use supervised learning where the computer learns by the input data/training data and predict result based on experience. We also discuss the machine learning algorithms: Naïve Bayes Classifier, K-Nearest Neighbor, Random Forest, Decision Tress, Boosted Trees, Support Vector Machine, and use these classifiers on a dataset Malgenome and Drebin which are the Android Malware Dataset. Android is an operating system that is gaining popularity these days and with a rise in demand of these devices the rise in Android Malware. The traditional techniques methods which were used to detect malware was unable to detect unknown applications. We have run this dataset on different machine learning classifiers and have recorded the results. The experiment result provides a comparative analysis that is based on performance, accuracy, and cost.

Download Full-text

Prediction of drug-protein interaction and drug repositioning using machine learning model

10.1101/2020.07.29.218826 ◽

2020 ◽

Author(s):

Yu-Ting Lin ◽

Sheh-Yi Sheu ◽

Chen-Ching Lin

Keyword(s):

Machine Learning ◽

Betweenness Centrality ◽

Drug Repositioning ◽

Protein Network ◽

Training Data ◽

Support Vector ◽

Binding Interaction ◽

Binding Strength ◽

Edge Betweenness ◽

Protein Models

AbstractBackgroundTraditional drug development is time-consuming and expensive, while computer-aided drug repositioning can improve efficiency and productivity. In this study, we proposed a machine learning pipeline to predict the binding interaction between proteins and marketed or studied drugs. We then extended the predicted interactions to construct a protein network that could be applied to discover the potentially shared drugs between proteins and thus predict drug repositioning.MethodsBinding information between proteins and drugs from the Binding Database and the physicochemical properties of drugs from the ChEMBL database were used to build the machine learning models, i.e. support vector regression. We further measured proportionalities between proteins by the predicted binding affinity and introduced edge betweenness centrality to construct a protein similarity network for drug repositioning.ResultsAs the proof of concept, we demonstrated our machine learning approach is capable of reflecting the binding strength between drugs and the target protein. When comparing coefficients of protein models, we found proteins SYUA and TAU that may share common ligand which were not in our training data. Using the edge betweenness centrality network based on the prediction proportionality of protein models, we found a potential target, AK1C2, of aspirin and of which the binding interaction had been validated.ConclusionsOur study could not only be applied to drug repositioning by comparing protein models or searching the protein-protein network, but also to predict the binding strength once the sufficient experimental data was provided to train the protein models.

Download Full-text