The Future of PHM Could be Tiny under Cloud: Exploring Potential Application Patterns of TinyML in PHM Scenarios

Deep learning has shown impressive performance acrosshealth management and prognostics applications. Nowadays, an emerging trend of machine learning deployment on resource constraint hardware devices like micro-controllers(MCU) has aroused much attention. Given the distributed andresource constraint nature of many PHM applications, using tiny machine learning models close to data source sensors for on-device inferences would be beneficial to save both time andadditional hardware resources. Even though there has beenpast works that bring TinyML on MCUs for some PHM ap-plications, they are mainly targeting single data source usage without higher-level data incorporation with cloud computing.We study the impact of potential cooperation patterns betweenTinyML on edge and more powerful computation resources oncloud and how this would make an impact on the application patterns in data-driven prognostics. We introduce potential ap-plications where sensor readings are utilized for system health status prediction including status classification and remaining useful life regression. We find that MCUs and cloud com-puting can be adaptive to different kinds of machine learning models and combined in flexible ways for diverse requirement.Our work also shows limitations of current MCU-based deep learning in data-driven prognostics And we hope our work can

Download Full-text

Sports Ed 3.5: Establishing the value of data-driven sports development programs for universities through machine learning models

2020 The 8th International Conference on Information Technology: IoT and Smart City ◽

10.1145/3446999.3447009 ◽

2020 ◽

Author(s):

Edwin Mendoza Torralba

Keyword(s):

Machine Learning ◽

Data Driven ◽

Development Programs ◽

Learning Models ◽

Machine Learning Models

Download Full-text

Machine Learning-Based Malicious X.509 Certificates’ Detection

Applied Sciences ◽

10.3390/app11052164 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2164

Author(s):

Jiaxin Li ◽

Zhaoxin Zhang ◽

Changyong Guo

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Ensemble Learning ◽

Traffic Analysis ◽

Learning Models ◽

Detection Model ◽

Analysis Tools ◽

Average Accuracy ◽

Machine Learning Models

X.509 certificates play an important role in encrypting the transmission of data on both sides under HTTPS. With the popularization of X.509 certificates, more and more criminals leverage certificates to prevent their communications from being exposed by malicious traffic analysis tools. Phishing sites and malware are good examples. Those X.509 certificates found in phishing sites or malware are called malicious X.509 certificates. This paper applies different machine learning models, including classical machine learning models, ensemble learning models, and deep learning models, to distinguish between malicious certificates and benign certificates with Verification for Extraction (VFE). The VFE is a system we design and implement for obtaining plentiful characteristics of certificates. The result shows that ensemble learning models are the most stable and efficient models with an average accuracy of 95.9%, which outperforms many previous works. In addition, we obtain an SVM-based detection model with an accuracy of 98.2%, which is the highest accuracy. The outcome indicates the VFE is capable of capturing essential and crucial characteristics of malicious X.509 certificates.

Download Full-text

A Survey on Data-driven Network Intrusion Detection

ACM Computing Surveys ◽

10.1145/3472753 ◽

2022 ◽

Vol 54 (9) ◽

pp. 1-36

Author(s):

Dylan Chou ◽

Meng Jiang

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Real World ◽

Data Driven ◽

Network Intrusion Detection ◽

Large Network ◽

Learning Models ◽

Simulated Environments ◽

Network Intrusion ◽

Machine Learning Models

Data-driven network intrusion detection (NID) has a tendency towards minority attack classes compared to normal traffic. Many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection machine learning models by fitting machine learning models to unrepresentative “sandbox” datasets. This survey presents a taxonomy with eight main challenges and explores common datasets from 1999 to 2020. Trends are analyzed on the challenges in the past decade and future directions are proposed on expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected in real-world networks.

Download Full-text

Implementing clinical decision support for oncology advanced care planning: A systems engineering framework to optimize the usability and utility of a machine learning predictive model in clinical practice.

Journal of Clinical Oncology ◽

10.1200/jco.2020.39.28_suppl.330 ◽

2021 ◽

Vol 39 (28_suppl) ◽

pp. 330-330

Author(s):

Teja Ganta ◽

Stephanie Lehrman ◽

Rachel Pappalardo ◽

Madalene Crow ◽

Meagan Will ◽

...

Keyword(s):

Machine Learning ◽

High Risk ◽

Predictive Model ◽

Systems Engineering ◽

Care Planning ◽

Learning Models ◽

Predictive Tool ◽

Risk Of Death ◽

The Impact ◽

Machine Learning Models

330 Background: Machine learning models are well-positioned to transform cancer care delivery by providing oncologists with more accurate or accessible information to augment clinical decisions. Many machine learning projects, however, focus on model accuracy without considering the impact of using the model in real-world settings and rarely carry forward to clinical implementation. We present a human-centered systems engineering approach to address clinical problems with workflow interventions utilizing machine learning algorithms. Methods: We aimed to develop a mortality predictive tool, using a Random Forest algorithm, to identify oncology patients at high risk of death within 30 days to move advance care planning (ACP) discussions earlier in the illness trajectory. First, a project sponsor defined the clinical need and requirements of an intervention. The data scientists developed the predictive algorithm using data available in the electronic health record (EHR). A multidisciplinary workgroup was assembled including oncology physicians, advanced practice providers, nurses, social workers, chaplain, clinical informaticists, and data scientists. Meeting bi-monthly, the group utilized human-centered design (HCD) methods to understand clinical workflows and identify points of intervention. The workgroup completed a workflow redesign workshop, a 90-minute facilitated group discussion, to integrate the model in a future state workflow. An EHR (Epic) analyst built the user interface to support the intervention per the group’s requirements. The workflow was piloted in thoracic oncology and bone marrow transplant with plans to scale to other cancer clinics. Results: Our predictive model performance on test data was acceptable (sensitivity 75%, specificity 75%, F-1 score 0.71, AUC 0.82). The workgroup identified a “quality of life coordinator” who: reviews an EHR report of patients scheduled in the upcoming 7 days who have a high risk of 30-day mortality; works with the oncology team to determine ACP clinical appropriateness; documents the need for ACP; identifies potential referrals to supportive oncology, social work, or chaplain; and coordinates the oncology appointment. The oncologist receives a reminder on the day of the patient’s scheduled visit. Conclusions: This workgroup is a viable approach that can be replicated at institutions to address clinical needs and realize the full potential of machine learning models in healthcare. The next steps for this project are to address end-user feedback from the pilot, expand the intervention to other cancer disease groups, and track clinical metrics.

Download Full-text

Comparison of Machine Learning Models for Data-Driven Aircraft Icing Severity Evaluation

Journal of Aerospace Information Systems ◽

10.2514/1.i011047 ◽

2021 ◽

pp. 1-5

Author(s):

Sibo Li ◽

Roberto Paoli

Keyword(s):

Machine Learning ◽

Data Driven ◽

Aircraft Icing ◽

Learning Models ◽

Machine Learning Models

Download Full-text

A Physics-Infused Deep Learning Model for the Prediction of Refractive Indices and Its Use for the Large-Scale Screening of Organic Compound Space

10.26434/chemrxiv.8796950 ◽

2019 ◽

Author(s):

Mojtaba Haghighatlari ◽

Gaurav Vishwakarma ◽

Mohammad Atif Faiz Afzal ◽

Johannes Hachmann

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Large Scale ◽

Organic Molecules ◽

Learning Model ◽

Training Data ◽

Refractive Indices ◽

Learning Models ◽

Deep Learning Model ◽

Machine Learning Models

<div><div><div><p>We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.</p></div></div></div>

Download Full-text

Influence of social determinants of health and county vaccination rates on machine learning models to predict COVID-19 case growth in Tennessee

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2021-100439 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100439

Author(s):

Lukasz S Wylezinski ◽

Coleman R Harris ◽

Cody N Heiser ◽

Jamieson D Gray ◽

Charles F Spurlock

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Social Determinants Of Health ◽

Social Determinants ◽

Determinants Of Health ◽

Learning Models ◽

Vaccination Rates ◽

Data Framework ◽

The Impact ◽

Machine Learning Models

IntroductionThe SARS-CoV-2 (COVID-19) pandemic has exposed health disparities throughout the USA, particularly among racial and ethnic minorities. As a result, there is a need for data-driven approaches to pinpoint the unique constellation of clinical and social determinants of health (SDOH) risk factors that give rise to poor patient outcomes following infection in US communities.MethodsWe combined county-level COVID-19 testing data, COVID-19 vaccination rates and SDOH information in Tennessee. Between February and May 2021, we trained machine learning models on a semimonthly basis using these datasets to predict COVID-19 incidence in Tennessee counties. We then analyzed SDOH data features at each time point to rank the impact of each feature on model performance.ResultsOur results indicate that COVID-19 vaccination rates play a crucial role in determining future COVID-19 disease risk. Beginning in mid-March 2021, higher vaccination rates significantly correlated with lower COVID-19 case growth predictions. Further, as the relative importance of COVID-19 vaccination data features grew, demographic SDOH features such as age, race and ethnicity decreased while the impact of socioeconomic and environmental factors, including access to healthcare and transportation, increased.ConclusionIncorporating a data framework to track the evolving patterns of community-level SDOH risk factors could provide policy-makers with additional data resources to improve health equity and resilience to future public health emergencies.

Download Full-text

Application of Bioactivity Profile Based Fingerprints for Building Machine Learning Models

10.26434/chemrxiv.6969584 ◽

2018 ◽

Cited By ~ 1

Author(s):

Noé Sturm ◽

Jiangming Sun ◽

Yves Vandriessche ◽

Andreas Mayr ◽

Günter Klambauer ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Throughput ◽

Scaffold Hopping ◽

Learning Models ◽

Industrial Data ◽

Structural Descriptors ◽

Bioactivity Profile ◽

Machine Learning Models

<div>This article describes an application of high-throughput fingerprints (HTSFP) built upon industrial data accumulated over the years. </div><div>The fingerprint was used to build machine learning models (multi-task deep learning + SVM) for compound activity predictions towards a panel of 131 targets. </div><div>Quality of the predictions and the scaffold hopping potential of the HTSFP were systematically compared to traditional structural descriptors ECFP. </div><div><br></div>

Download Full-text

Predictive modelling of turbofan engine components condition using machine and deep learning methods

Eksploatacja i Niezawodnosc - Maintenance and Reliability ◽

10.17531/ein.2021.2.16 ◽

2021 ◽

Vol 23 (2) ◽

pp. 359-370

Author(s):

Michał Matuszczak ◽

Mateusz Żbikowski ◽

Andrzej Teodorczyk

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Predictive Modelling ◽

Continuous Variable ◽

Environmental Data ◽

Bayesian Optimization ◽

Aviation Industry ◽

Learning Models ◽

Turbofan Engine ◽

Machine Learning Models

The article proposes an approach based on deep and machine learning models to predict a component failure as an enhancement of condition based maintenance scheme of a turbofan engine and reviews currently used prognostics approaches in the aviation industry. Component degradation scale representing its life consumption is proposed and such collected condition data are combined with engines sensors and environmental data. With use of data manipulation techniques, a framework for models training is created and models' hyperparameters obtained through Bayesian optimization. Models predict the continuous variable representing condition based on the input. Best performed model is identified by detemining its score on the holdout set. Deep learning models achieved 0.71 MSE score (ensemble meta-model of neural networks) and outperformed significantly machine learning models with their best score at 1.75. The deep learning models shown their feasibility to predict the component condition within less than 1 unit of the error in the rank scale.

Download Full-text

Train Deep Learning Models using subsurface geological images datasets

10.5194/egusphere-egu21-6385 ◽

2021 ◽

Author(s):

Ramy Abdallah ◽

Clare E. Bond ◽

Robert W.H. Butler

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Learning Models ◽

Structural Interpretation ◽

Performance Accuracy ◽

Learning Techniques ◽

Machine Learning Models

<p>Machine learning is being presented as a new solution for a wide range of geoscience problems. Primarily machine learning has been used for 3D seismic data processing, seismic facies analysis and well log data correlation. The rapid development in technology with open-source artificial intelligence libraries and the accessibility of affordable computer graphics processing units (GPU) makes the application of machine learning in geosciences increasingly tractable. However, the application of artificial intelligence in structural interpretation workflows of subsurface datasets is still ambiguous. This study aims to use machine learning techniques to classify images of folds and fold-thrust structures. Here we show that convolutional neural networks (CNNs) as supervised deep learning techniques provide excellent algorithms to discriminate between geological image datasets. Four different datasets of images have been used to train and test the machine learning models. These four datasets are a seismic character dataset with five classes (faults, folds, salt, flat layers and basement), folds types with three classes (buckle, chevron and conjugate), fault types with three classes (normal, reverse and thrust) and fold-thrust geometries with three classes (fault bend fold, fault propagation fold and detachment fold). These image datasets are used to investigate three machine learning models. One Feedforward linear neural network model and two convolutional neural networks models (Convolution 2d layer transforms sequential model and Residual block model (ResNet with 9, 34, and 50 layers)). Validation and testing datasets forms a critical part of testing the model&#8217;s performance accuracy. The ResNet model records the highest performance accuracy score, of the machine learning models tested. Our CNN image classification model analysis provides a framework for applying machine learning to increase structural interpretation efficiency, and shows that CNN classification models can be applied effectively to geoscience problems. The study provides a starting point to apply unsupervised machine learning approaches to sub-surface structural interpretation workflows.</p>

Download Full-text