Collaborative Multi-Expert Active Learning for Mobile Health Monitoring: Architecture, Algorithms, and Evaluation

Mobile health monitoring plays a central role in the future of cyber-physical systems (CPS) for healthcare applications. Such monitoring systems need to process user data accurately. Unlike in other human-centered CPS, in healthcare CPS the user plays multiple roles at the same time: operator, actuator, physical environment and, most importantly, the target that needs to be monitored. Mobile health CPS devices therefore face highly dynamic settings, and the accuracy of the machine learning models they employ may drop dramatically every time the setting changes. Novel learning architectures that specifically address the challenges of dynamic environments are therefore needed. Using active learning and transfer learning as organizing principles, we propose a collaborative multiple-expert architecture and accompanying algorithms for the design of machine learning models that autonomously adapt to a new configuration, context, or user need. Specifically, our architecture and its constituent algorithms are designed to manage heterogeneous knowledge sources, or experts, with varying levels of confidence and type while minimizing adaptation cost. Additionally, our framework incorporates a mechanism for collaboration among experts to enrich their knowledge, which in turn decreases both the cost and the uncertainty of data labeling in future steps. We evaluate the efficacy of the architecture using two publicly available human activity datasets. We attain activity recognition accuracy of over 85% (for the first dataset) and 92% (for the second dataset) by labeling only 15% of unlabeled data.
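The abstract above does not give the algorithms themselves, so the following is only a minimal sketch of the general idea under stated assumptions: a labeling budget (15% here, matching the reported figure) is spent on the samples the current model is least sure about, and each query goes to the cheapest expert that meets a confidence threshold. The names `Expert`, `select_queries`, and `pick_expert`, the entropy criterion, and the threshold rule are all hypothetical illustrations, not the authors' method.

```python
import math
from dataclasses import dataclass


@dataclass
class Expert:
    """Hypothetical knowledge source: a person, another device, or a model."""
    name: str
    cost: float        # assumed cost of asking this expert for one label
    confidence: float  # assumed reliability of its labels, in [0, 1]


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pred_probs, budget_fraction=0.15):
    """Return indices of the most uncertain samples, limited by the labeling budget."""
    n_queries = max(1, round(len(pred_probs) * budget_fraction))
    ranked = sorted(range(len(pred_probs)),
                    key=lambda i: entropy(pred_probs[i]),
                    reverse=True)
    return ranked[:n_queries]


def pick_expert(experts, min_confidence):
    """Pick the cheapest expert that is confident enough.

    Falls back to the most confident expert if none meets the threshold.
    """
    eligible = [e for e in experts if e.confidence >= min_confidence]
    if not eligible:
        return max(experts, key=lambda e: e.confidence)
    return min(eligible, key=lambda e: e.cost)
```

In a full system the newly labeled samples would be used to retrain the model, and the loop would repeat whenever the context or configuration changes.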


Active learning for the power factor prediction in diamond-like thermoelectric materials

npj Computational Materials ◽

10.1038/s41524-020-00439-8 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Ye Sheng ◽

Yasong Wu ◽

Jiong Yang ◽

Wencong Lu ◽

Pierre Villars ◽

...

Keyword(s):

Machine Learning ◽

Active Learning ◽

Transport Properties ◽

Gradient Boosting ◽

Learning Models ◽

Material Development ◽

Materials Genome ◽

P Type ◽

Type Power ◽

Machine Learning Models

The Materials Genome Initiative requires the crossing of material calculations, machine learning, and experiments to accelerate the material development process. In recent years, data-based methods have been applied to the thermoelectric field, mostly on the transport properties. In this work, we combined data-driven machine learning and first-principles automated calculations into an active learning loop in order to predict the p-type power factors (PFs) of diamond-like pnictides and chalcogenides. Our active learning loop contains two procedures: (1) based on a high-throughput theoretical database, machine learning methods are employed to select potential candidates, and (2) the transport properties of these candidates are verified computationally. The verification data are then added to the database to improve the extrapolation abilities of the machine learning models. Different strategies for selecting candidates were tested; the Gradient Boosting Regression model with the Query by Committee strategy achieved the highest extrapolation accuracy (Pearson R = 0.95 on untrained systems). Based on the predictions of the machine learning models, binary pnictides and vacancy- or small-atom-containing chalcogenides are predicted to have large PFs. The bonding analysis reveals that the alterations of anionic bonding networks due to small atoms are beneficial to the PFs in these compounds.
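The two-step loop described in this abstract (select candidates by committee disagreement, verify them, add the results to the database) can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's code: committee members are plain callables rather than Gradient Boosting regressors, disagreement is measured as the variance of their predictions, and `committee_factory` and `verify` are hypothetical stand-ins for model training and the first-principles calculation.

```python
from statistics import pvariance


def committee_disagreement(committee, x):
    """Variance of the committee members' predictions for candidate x."""
    return pvariance([model(x) for model in committee])


def query_by_committee(committee, candidates, k):
    """Return the k candidates on which the committee disagrees most."""
    ranked = sorted(candidates,
                    key=lambda x: committee_disagreement(committee, x),
                    reverse=True)
    return ranked[:k]


def active_learning_round(committee_factory, database, candidates, verify, k):
    """One round of the loop: train, select, verify, and grow the database.

    committee_factory(database) -> list of models (hypothetical training step)
    verify(x) -> verified target value (hypothetical expensive calculation)
    Returns the candidates that remain unverified.
    """
    committee = committee_factory(database)
    picked = query_by_committee(committee, candidates, k)
    for x in picked:
        database.append((x, verify(x)))
    return [c for c in candidates if c not in picked]
```

Repeating `active_learning_round` until the budget is exhausted gives the loop; selecting where the committee disagrees is what targets the regions in which the models extrapolate poorly.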


Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning

Chemical Science ◽

10.1039/c9sc00616h ◽

2019 ◽

Vol 10 (35) ◽

pp. 8154-8163 ◽

Cited By ~ 14

Author(s):

Yao Zhang ◽

Alpha A. Lee

Keyword(s):

Machine Learning ◽

Active Learning ◽

Supervised Learning ◽

Molecular Properties ◽

Learning Models ◽

Molecular Properties Prediction ◽

Design Experiments ◽

Machine Learning Models

We report a statistically principled method to quantify the uncertainty of machine learning models for molecular property prediction. We show that this uncertainty estimate can be used to judiciously design experiments.


A Machine Learning Framework for Edge Computing to Improve Prediction Accuracy in Mobile Health Monitoring

Computational Science and Its Applications – ICCSA 2019 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-24302-9_30 ◽

2019 ◽

pp. 417-431

Author(s):

Sigdel Shree Ram ◽

Bernady Apduhan ◽

Norio Shiratori

Keyword(s):

Machine Learning ◽

Mobile Health ◽

Health Monitoring ◽

Prediction Accuracy ◽

Edge Computing ◽

Learning Framework ◽

Mobile Health Monitoring


A wearable computing platform for developing cloud-based machine learning models for health monitoring applications

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2016.7592095 ◽

2016 ◽

Cited By ~ 5

Author(s):

Shyamal Patel ◽

Ryan S. McGinnis ◽

Ikaro Silva ◽

Steve DiCristofaro ◽

Nikhil Mahadevan ◽

...

Keyword(s):

Machine Learning ◽

Health Monitoring ◽

Wearable Computing ◽

Learning Models ◽

Computing Platform ◽

Monitoring Applications ◽

Machine Learning Models


A Data Cleaning Approach for a Structural Health Monitoring System in a 75 MW Electric Arc Ferronickel Furnace

Engineering Proceedings ◽

10.3390/ecsa-7-08245 ◽

2020 ◽

Vol 2 (1) ◽

pp. 21

Author(s):

Jaiber Camacho-Olarte ◽

Julián Esteban Salomón Torres ◽

Daniel A. Garavito Jimenez ◽

Jersson X. Leon Medina ◽

Ricardo C. Gomez Vargas ◽

...

Keyword(s):

Machine Learning ◽

Data Quality ◽

Health Monitoring ◽

Data Cleaning ◽

Electric Arc ◽

Furnace Operation ◽

Arc Furnace ◽

Learning Models ◽

Operation Parameters ◽

Machine Learning Models

Within a model of scientific and technical cooperation between the smelting company Cerro Matoso S.A. (CMSA) and the Universidad Nacional de Colombia (UNAL), a project was developed to take advantage of the data obtained from a sensor network in a ferronickel electric arc furnace at CMSA in order to improve the structural health monitoring process. Through this sensor network, online data are obtained on the temperature along the refractory lining of the electric furnace, as well as on heat fluxes and the chemical characterization of the minerals at each stage of the process. These data are stored in a local database, which holds several years of historical data with valuable information for control and analysis purposes. These data reflect the behavior of the industrial process and can be used in the development of machine learning models to predict some of the electric arc furnace operation parameters, and thus improve the decision-making process. Currently, most of the data are analyzed by the experts of the structural control department, but, due to the large amount of data, the development of analytical tools is necessary to support their work. This paper proposes a data cleaning approach for improving data quality by creating a set of rules and filters based on both expert judgment and best practices in data quality. A statistical analysis was also carried out in order to detect variables with anomalies and outliers, which do not reflect real operation parameters and belong to anomalous data that should not be considered for modelling. With the proposed process, the quality of the data was improved and abnormal data were eliminated in order to consolidate a clean data set for later use in the development of machine learning models. This work contributes to understanding the data cleansing rules that must be considered in order to reflect the real behavior of the electric furnace operation for further analysis and modeling tasks.


Machine Learning and Mobile Health Monitoring Platforms: A Case Study on Research and Implementation Challenges

Journal of Healthcare Informatics Research ◽

10.1007/s41666-018-0021-1 ◽

2018 ◽

Vol 2 (1-2) ◽

pp. 179-203 ◽

Cited By ~ 1

Author(s):

Omar Boursalie ◽

Reza Samavi ◽

Thomas E. Doyle

Keyword(s):

Machine Learning ◽

Mobile Health ◽

Health Monitoring ◽

Implementation Challenges ◽

Mobile Health Monitoring


Continual Active Learning for Efficient Adaptation of Machine Learning Models to Changing Image Acquisition

Lecture Notes in Computer Science - Information Processing in Medical Imaging ◽

10.1007/978-3-030-78191-0_50 ◽

2021 ◽

pp. 649-660

Author(s):

Matthias Perkonigg ◽

Johannes Hofmanninger ◽

Georg Langs

Keyword(s):

Machine Learning ◽

Active Learning ◽

Image Acquisition ◽

Learning Models ◽

Machine Learning Models


Automated Hyper-parameter Tuning for Machine Learning Models in Machine Health Prognostics

Annual Conference of the PHM Society ◽

10.36001/phmconf.2018.v10i1.490 ◽

2018 ◽

Vol 10 (1) ◽

Author(s):

Wang-Chi Cheung ◽

Weiwen Zhang ◽

Yong Liu ◽

Feng Yang ◽

Rick-Siow-Mong Goh

Keyword(s):

Machine Learning ◽

Health Monitoring ◽

Domain Knowledge ◽

Parameter Tuning ◽

Bayesian Optimization ◽

Learning Models ◽

Effective Choice ◽

Machine Health Monitoring ◽

Machine Health ◽

Machine Learning Models

Recent studies have revealed the success of data-driven machine health monitoring, which motivates the use of machine learning models in machine health prognostic tasks. While the machine learning approach to health monitoring is gaining importance, the construction of machine learning models is often impeded by the difficulty of choosing the underlying hyper-parameter configuration (HP-config), which governs the construction of the machine learning model. Although an effective choice of HP-config can be achieved with human effort, such effort is often time-consuming and requires domain knowledge. In this paper, we consider the use of Bayesian optimization algorithms, which automate an effective choice of HP-config by solving the associated hyper-parameter optimization problem. Numerical experiments on the data from the PHM 2016 Data Challenge demonstrate the salience of the proposed automatic framework, and show improvement over both the default HP-configs in standard machine learning packages and those chosen by a human agent.
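To make the idea of Bayesian optimization over an HP-config concrete, here is a minimal, dependency-free sketch for a single hyper-parameter on a fixed grid. It is a generic textbook construction (Gaussian-process surrogate with an RBF kernel and the expected-improvement acquisition), not the framework evaluated in the paper. Every name, the kernel length scale, and the noise level are assumptions chosen for illustration, and `objective` stands in for a validation error that would normally come from training a model.

```python
import math


def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalar hyper-parameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def solve_chol(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(xs, ys, noise=1e-4):
    """Fit a GP to the evaluated points; the noise term keeps the matrix well conditioned."""
    mean_y = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_chol(L, [y - mean_y for y in ys])
    return L, alpha, mean_y


def predict(gp, xs, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    L, alpha, mean_y = gp
    k_star = [rbf(a, x_new) for a in xs]
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    var = 1.0 - sum(k * v for k, v in zip(k_star, solve_chol(L, k_star)))
    return mean, math.sqrt(max(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best (lowest) value seen so far."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def bayes_opt(objective, grid, n_init=3, n_iter=10):
    """Minimize objective over grid; returns (best_x, best_y)."""
    xs = [grid[i * (len(grid) - 1) // (n_init - 1)] for i in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        remaining = [x for x in grid if x not in xs]
        if not remaining:
            break
        gp, best = fit_gp(xs, ys), min(ys)
        x_next = max(remaining,
                     key=lambda x: expected_improvement(*predict(gp, xs, x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]
```

A call such as `bayes_opt(lambda lr: (lr - 0.3) ** 2, [i / 20 for i in range(21)])` treats the lambda as a toy validation error. A real HP-config has several dimensions and mixed types, which is where a dedicated optimization library becomes preferable to a hand-rolled surrogate.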


Clinical Text Data in Machine Learning: Systematic Review

JMIR Medical Informatics ◽

10.2196/17984 ◽

2020 ◽

Vol 8 (3) ◽

pp. e17984 ◽

Cited By ~ 8

Author(s):

Irena Spasic ◽

Goran Nenadic

Keyword(s):

Machine Learning ◽

Active Learning ◽

Supervised Machine Learning ◽

Manual Annotation ◽

Learning Approaches ◽

Learning Models ◽

Text Data ◽

Data Annotation ◽

Distant Supervision ◽

Machine Learning Models

Background: Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data. Objective: The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice. Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics. Results: The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance. Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.
