Machine Learning–Based Case Studies for Healthcare Analytics

How to train your robot with deep reinforcement learning: lessons we have learned

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as an embodied agent in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building off of these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.

Pattern discovery and disentanglement on relational datasets

Scientific Reports ◽ 10.1038/s41598-021-84869-4 ◽ 2021 ◽ Vol 11 (1)

Author(s): Andrew K. C. Wong ◽ Pei-Yuan Zhou ◽ Zahid A. Butt

Keyword(s): Machine Learning ◽ Knowledge Base ◽ Case Studies ◽ Prediction Accuracy ◽ Pattern Discovery ◽ Explicit Representation ◽ Human Cognition ◽ Source Level ◽ Low Volume

Abstract: Machine Learning has made impressive advances in many applications akin to human cognition for discernment. However, success has been limited in the areas of relational datasets, particularly for data with low volume, imbalanced groups, and mislabeled cases, with outputs that typically lack transparency and interpretability. The difficulties arise from the subtle overlapping and entanglement of functional and statistical relations at the source level. Hence, we have developed the Pattern Discovery and Disentanglement System (PDD), which is able to discover explicit patterns from data of various sizes and with imbalanced groups, and to screen out anomalies. We present herein four case studies on biomedical datasets to substantiate the efficacy of PDD. It improves prediction accuracy and facilitates transparent interpretation of discovered knowledge in an explicit representation framework, the PDD Knowledge Base, that links the sources, the patterns, and individual patients. Hence, PDD promises broad and ground-breaking applications in genomic and biomedical machine learning.

Casing Failure Using Machine Learning Algorithms: Five Case Studies

10.2118/193373-ms ◽ 2018

Author(s): C. I. Noshi ◽ S. F. Noynaert ◽ J. J. Schubert

Keyword(s): Machine Learning ◽ Case Studies ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Casing Failure

Implications of properties and quality of indoor sensor data for building machine learning applications: Two case studies in smart campuses

Building and Environment ◽ 10.1016/j.buildenv.2021.108529 ◽ 2021 ◽ pp. 108529

Author(s): Miia Lillstrang ◽ Markus Harju ◽ Guillermo del Campo ◽ Gonzalo Calderon ◽ Juha Röning ◽ ...

Keyword(s): Machine Learning ◽ Case Studies ◽ Sensor Data ◽ Machine Learning Applications

Efficient inference for agent-based models of real-world phenomena

10.1101/2021.10.04.462980 ◽ 2021

Author(s): Andreas Christ Sølvsten Jørgensen ◽ Atiyo Ghosh ◽ Marc Sturrock ◽ Vahid Shahrezaei

Keyword(s): Machine Learning ◽ Case Studies ◽ Parameter Space ◽ Real World ◽ Autonomous Agents ◽ Stochastic Simulations ◽ Model Parameters ◽ Learning Approaches ◽ Real World Applications ◽ Real World Problems

Abstract: The modelling of many real-world problems relies on computationally heavy simulations. Since statistical inference rests on repeated simulations to sample the parameter space, the high computational expense of these simulations can become a stumbling block. In this paper, we compare two ways to mitigate this issue based on machine learning methods. One approach is to construct lightweight surrogate models to substitute the simulations used in inference. Alternatively, one might altogether circumnavigate the need for Bayesian sampling schemes and directly estimate the posterior distribution. We focus on stochastic simulations that track autonomous agents and present two case studies of real-world applications: tumour growth and the spread of infectious diseases. We demonstrate that good accuracy in inference can be achieved with a relatively small number of simulations, making our machine learning approaches orders of magnitude faster than classical simulation-based methods that rely on sampling the parameter space. However, we find that while some methods generally produce more robust results than others, no algorithm offers a one-size-fits-all solution when attempting to infer model parameters from observations. Instead, one must choose the inference technique with the specific real-world application in mind. The stochastic nature of the considered real-world phenomena poses an additional challenge that can become insurmountable for some approaches. Overall, we find machine learning approaches that create direct inference machines to be promising for real-world applications. We present our findings as general guidelines for modelling practitioners.

Author summary: Computer simulations play a vital role in modern science as they are commonly used to compare theory with observations. One can thus infer the properties of an observed system by comparing the data to the predicted behaviour in different scenarios. Each of these scenarios corresponds to a simulation with slightly different settings. However, since real-world problems are highly complex, the simulations often require extensive computational resources, making direct comparisons with data challenging, if not insurmountable. It is, therefore, necessary to resort to inference methods that mitigate this issue, but it is not clear-cut what path to choose for any specific research problem. In this paper, we provide general guidelines for how to make this choice. We do so by studying examples from oncology and epidemiology and by taking advantage of developments in machine learning. More specifically, we focus on simulations that track the behaviour of autonomous agents, such as single cells or individuals. We show that the best way forward is problem-dependent and highlight the methods that yield the most robust results across the different case studies. We demonstrate that these methods are highly promising and produce reliable results in a small fraction of the time required by classic approaches that rely on comparisons between data and individual simulations. Rather than relying on a single inference technique, we recommend employing several methods and selecting the most reliable based on predetermined criteria.

Engineering AI Systems

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Artificial Intelligence Paradigms for Smart Cyber-Physical Systems ◽ 10.4018/978-1-7998-5101-1.ch001 ◽ 2021 ◽ pp. 1-19 ◽ Cited By ~ 1

Author(s): Jan Bosch ◽ Helena Holmström Olsson ◽ Ivica Crnkovic

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Data Quality ◽ Case Studies ◽ Research Agenda ◽ Production Quality ◽ Research Community ◽ Design Methods ◽ Engineering Approach ◽ Quality Design

Artificial intelligence (AI) and machine learning (ML) are being adopted increasingly broadly in industry. However, based on well over a dozen case studies, we have learned that deploying industry-strength, production-quality ML models in systems proves to be challenging. Companies experience challenges related to data quality, design methods and processes, performance of models, as well as deployment and compliance. We learned that a new, structured engineering approach is required to construct and evolve systems that contain ML/DL components. In this chapter, the authors provide a conceptualization of the typical evolution patterns that companies experience when employing ML as well as an overview of the key problems experienced by the companies that they have studied. The main contribution of the chapter is a research agenda for AI engineering that provides an overview of the key engineering challenges surrounding ML solutions and an overview of open items that need to be addressed by the research community at large.

Chapter 15. Human-Centered Concept Explanations for Neural Networks

10.3233/faia210362 ◽ 2021

Author(s): Chih-Kuan Yeh ◽ Been Kim ◽ Pradeep Ravikumar

Keyword(s): Machine Learning ◽ Neural Networks ◽ Case Studies ◽ Real World ◽ Deep Neural Networks ◽ Learning Models ◽ Real World Applications ◽ The Right ◽ Concept Activation ◽ Machine Learning Models

Understanding complex machine learning models such as deep neural networks with explanations is crucial in various applications. Many explanations stem from the model perspective, and may not necessarily effectively communicate why the model is making its predictions at the right level of abstraction. For example, providing importance weights to individual pixels in an image can only express which parts of that particular image are important to the model, but humans may prefer an explanation which explains the prediction by concept-based thinking. In this work, we review the emerging area of concept-based explanations. We start by introducing concept explanations, including the class of Concept Activation Vectors (CAV), which characterize concepts using vectors in appropriate spaces of neural activations, and discuss different properties of useful concepts and approaches to measure the usefulness of concept vectors. We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats. Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real-world applications.

Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval

Journal of Electronic Imaging ◽ 10.1117/1.3207770 ◽ 2007 ◽ Vol 18 (3) ◽ pp. 039901

Author(s): Matthieu Cord

Keyword(s): Machine Learning ◽ Case Studies ◽ Machine Learning Techniques ◽ Learning Techniques ◽ Multimedia Case Studies

Applications of supervised deep learning for seismic interpretation and inversion

The Leading Edge ◽ 10.1190/tle38070526.1 ◽ 2019 ◽ Vol 38 (7) ◽ pp. 526-533 ◽ Cited By ~ 12

Author(s): York Zheng ◽ Qie Zhang ◽ Anar Yusifov ◽ Yunzhi Shi

Keyword(s): Machine Learning ◽ Deep Learning ◽ Case Studies ◽ Seismic Inversion ◽ Seismic Interpretation ◽ Elastic Model ◽ Regression Problem ◽ First Case ◽ Seismic Image ◽ Prestack Seismic Inversion

Recent advances in machine learning and its applications in various sectors are generating a new wave of experiments and solutions to solve geophysical problems in the oil and gas industry. We present two separate case studies in which supervised deep learning is used as an alternative to conventional techniques. The first case is an example of image classification applied to seismic interpretation. A convolutional neural network (CNN) is trained to pick faults automatically in 3D seismic volumes. Every sample in the input seismic image is classified as either a nonfault or a fault with a certain dip and azimuth that are predicted simultaneously. The second case is an example of elastic model building, casting prestack seismic inversion as a machine learning regression problem. A CNN is trained to make predictions of 1D velocity and density profiles from input seismic records. In both case studies, we demonstrate that CNN models trained from synthetic data can be used to make efficient and effective predictions on field data. While results from the first example show that high-quality fault picks can be predicted from migrated seismic images, we find that it is more challenging in the prestack seismic inversion case, where constraining the subsurface geologic variations and careful preconditioning of input seismic data are important for obtaining reasonably reliable results. This observation matches our experience using conventional workflows and methods, which also respond to the improved signal-to-noise ratio after migration and stack, and for which the inherent subsurface ambiguity makes unique parameter inversion difficult.

Accurate prediction of DNA N4-methylcytosine sites via boost-learning various types of sequence features

BMC Genomics ◽ 10.1186/s12864-020-07033-8 ◽ 2020 ◽ Vol 21 (1)

Author(s): Zhixun Zhao ◽ Xiaocai Zhang ◽ Fang Chen ◽ Liang Fang ◽ Jinyan Li

Keyword(s): Machine Learning ◽ Feature Selection ◽ Case Studies ◽ Learning Algorithm ◽ State Of The Art ◽ Learning Algorithms ◽ Feature Space ◽ Sequence Features ◽ Independent Test ◽ Benchmark Datasets

Abstract

Background: DNA N4-methylcytosine (4mC) is a critical epigenetic modification and has various roles in the restriction-modification system. Due to the high cost of experimental laboratory detection, computational methods using sequence characteristics and machine learning algorithms have been explored to identify 4mC sites from DNA sequences. However, state-of-the-art methods have limited performance because of the lack of effective sequence features and the ad hoc choice of learning algorithms to cope with this problem. This paper aims to propose a new sequence feature space and a machine learning algorithm with a feature selection scheme to address the problem.

Results: The feature importance score distributions in datasets of six species are first reported and analyzed. Then the impact of the feature selection on model performance is evaluated by independent testing on benchmark datasets, where ACC and MCC measurements on the performance after feature selection increase by 2.3% to 9.7% and 0.05 to 0.19, respectively. The proposed method is compared with three state-of-the-art predictors using independent test and 10-fold cross-validations, and our method outperforms them on all datasets, especially improving the ACC by 3.02% to 7.89% and MCC by 0.06 to 0.15 in the independent test. Two detailed case studies by the proposed method have confirmed the excellent overall performance and correctly identified 24 of 26 4mC sites from the C. elegans gene, and 126 out of 137 4mC sites from the D. melanogaster gene.

Conclusions: The results show that the proposed feature space and learning algorithm with feature selection can improve the performance of DNA 4mC prediction on the benchmark datasets. The two case studies prove the effectiveness of our method in practical situations.
