Real-time analysis and forecasting of the microseismic cloud size: Physics-based models versus machine learning

Geophysics ◽

10.1190/geo2021-0094.1 ◽

2021 ◽

pp. 1-69

Author(s):

Jorge Nustes Andrade ◽

Mirko van der Baan

Keyword(s):

Machine Learning ◽

Real Time ◽

Data Augmentation ◽

Time Lag ◽

Fracture Propagation ◽

Time Step ◽

Physical Constraints ◽

Real Time Analysis ◽

Engineering Parameters ◽

Temporal Interaction

The spatiotemporal distribution of hydraulic fracturing microseismicity is complicated and depends on various mechanical and diffusional parameters. Hydraulic fracture modeling can aid in understanding fracture propagation and microseismicity. Nevertheless, the complex spatial and temporal interaction of several processes occurring within and around the fracture represents a challenge for developing real-time tools for microseismic prediction. Two approaches were developed to forecast the microseismic cloud size in real time. The first approach uses fracture propagation models to derive the cloud size directly from the microseismic observations. The second approach is based on a convolutional neural network (CNN) trained with the engineering parameters and past microseismic cloud size values. A rolling-forecasting strategy is employed to train consecutive CNN models in real time to make predictions at a specified time lag. A data augmentation technique known as double noise injection is used to ensure that the number of training examples available to the machine learning models at each time step is similar to or larger than the number of free parameters. Results show that the CNN outperforms the physics-based models in prediction quality but with a reduced prediction capability. The physics-based approach can predict growth at any time but ignores the engineering parameters. In addition, the physics-based methods lead to real-time insights into the fracturing regime, revealing whether microseismicity is most likely generated by a leak-off-dominated or a storage-dominated regime. The CNN model can forecast the cloud size only at a single future time lag while using the engineering parameters and past cloud growth as input. However, this approach does not provide a physical interpretation of the fracture propagation regime. The prediction accuracy of both methodologies varies depending on the microseismic behavior. We postulate that the CNN forecasts could be improved by including more physical constraints in the predictive model.
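
The rolling-forecasting idea in the abstract above (refit at every time step on the data seen so far, then predict at a fixed time lag) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: a one-variable least-squares line stands in for the CNN, the only input is the latest cloud size (no engineering parameters), and the augmentation is generic Gaussian jitter on inputs and targets, which may or may not match what the paper calls double noise injection. All function and parameter names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a simple stand-in for the CNN)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return 0.0, my  # degenerate case: all inputs equal, predict the mean
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def augment(xs, ys, copies, sigma, rng):
    """Append `copies` noisy versions of every (input, target) pair.

    Assumption: noise is added to both the input and the target.
    """
    ax, ay = list(xs), list(ys)
    for _ in range(copies):
        for x, y in zip(xs, ys):
            ax.append(x + rng.gauss(0.0, sigma))
            ay.append(y + rng.gauss(0.0, sigma))
    return ax, ay


def rolling_forecast(series, lag, copies=5, sigma=0.01, seed=0):
    """At each step t, refit on data observed up to t and predict series[t + lag]."""
    rng = random.Random(seed)
    forecasts = {}
    for t in range(lag + 1, len(series)):
        # Training pairs (series[i], series[i + lag]) whose target is known at time t.
        xs = [series[i] for i in range(t - lag + 1)]
        ys = [series[i + lag] for i in range(t - lag + 1)]
        ax, ay = augment(xs, ys, copies, sigma, rng)
        a, b = fit_line(ax, ay)
        forecasts[t + lag] = a * series[t] + b
    return forecasts


if __name__ == "__main__":
    # Synthetic, made-up cloud-size history purely for demonstration.
    history = [10.0 * (k + 1) ** 0.5 for k in range(30)]
    for step, value in sorted(rolling_forecast(history, lag=3).items())[:5]:
        print(step, round(value, 2))
```

The augmentation multiplies the number of training pairs by `copies + 1`, which is the role the abstract assigns to data augmentation: keeping the training set at least comparable in size to the number of free parameters when few observations exist early in the treatment.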

A scalable machine learning online service for big data real-time analysis

2014 IEEE Symposium on Computational Intelligence in Big Data (CIBD) ◽

10.1109/cibd.2014.7011537 ◽

2014 ◽

Cited By ~ 13

Author(s):

Alejandro Baldominos ◽

Esperanza Albacete ◽

Yago Saez ◽

Pedro Isasi

Keyword(s):

Machine Learning ◽

Big Data ◽

Real Time ◽

Time Analysis ◽

Online Service ◽

Real Time Analysis

Machine Learning Assisted Intraoperative Assessment of Brain Tumor Margins Using HRMAS NMR Spectroscopy

10.1101/2020.02.24.20026955 ◽

2020 ◽

Author(s):

Doruk Cakmakci ◽

Emin Onur Karakaslar ◽

Elisa Ruhland ◽

Marie-Pierre Chenard ◽

Francois Proust ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Magic Angle Spinning ◽

Magic Angle ◽

Full Spectrum ◽

Real Time Analysis ◽

Targeted Analysis ◽

Hrmas Nmr ◽

Angle Spinning ◽

And Control

Complete resection of the tumor is important for survival in glioma patients. Even if gross total resection is achieved, left-over micro-scale tissue in the excision cavity risks recurrence. The High Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) technique can distinguish healthy and malignant tissue efficiently using peak intensities of biomarker metabolites. The method is fast, sensitive and can work with small and unprocessed samples, which makes it a good fit for real-time analysis during surgery. However, only a targeted analysis for the existence of known tumor biomarkers can be made, and this requires a technician with a chemistry background and a pathologist with knowledge of tumor metabolism to be present during surgery. Here, we show that we can accurately perform this analysis in real time and can analyze the full spectrum in an untargeted fashion using machine learning. We work on a new and large HRMAS NMR dataset of glioma and control samples (n = 568), which are also labeled with a quantitative pathology analysis. Our results show that a random forest based approach can distinguish samples with tumor cells and controls accurately and effectively with a mean AUC of 85.6% and AUPR of 93.4%. We also show that we can further distinguish benign and malignant samples with a mean AUC of 87.1% and AUPR of 96.1%. We analyze the feature (peak) importance for classification to interpret the results of the classifier. We validate that known malignancy biomarkers such as creatine and 2-hydroxyglutarate play an important role in distinguishing tumor and normal cells and suggest new biomarker regions. The code is released at http://github.com/ciceklab/HRMAS_NC.
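
The abstract above reports AUC and AUPR. As a reminder of what those two numbers measure, here is a minimal sketch that computes both from binary labels and classifier scores. It is not taken from the authors' released code; the function names are hypothetical, and average precision is used as one common estimator of the area under the precision-recall curve.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values taken at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


if __name__ == "__main__":
    # Toy labels and scores, invented for illustration only.
    y_true = [1, 0, 1, 1, 0]
    y_score = [0.9, 0.8, 0.7, 0.6, 0.2]
    print(round(roc_auc(y_true, y_score), 4))            # 0.6667
    print(round(average_precision(y_true, y_score), 4))  # 0.8056
```

AUPR depends on class balance while AUC does not, which is one reason the two can differ noticeably on the same classifier.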

Attackdet: Combining web data parsing and real-time analysis with machine learning

Journal of Advances in Technology and Engineering Research ◽

10.20474/jater-6.1.4 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Zeydin Pala ◽

Musa Şana

Keyword(s):

Machine Learning ◽

Real Time ◽

Time Analysis ◽

Web Data ◽

Real Time Analysis

Enabling Rapid Classification of Social Media Communications During Crises

International Journal of Information Systems for Crisis Response and Management ◽

10.4018/ijiscram.2016070101 ◽

2016 ◽

Vol 8 (3) ◽

pp. 1-17 ◽

Cited By ~ 2

Author(s):

Muhammad Imran ◽

Prasenjit Mitra ◽

Jaideep Srivastava

Keyword(s):

Machine Learning ◽

Social Media ◽

Real Time ◽

Poor Performance ◽

Crisis Response ◽

Classification Performance ◽

Sudden Onset ◽

Supervised Machine Learning ◽

Online Information ◽

Real Time Analysis

The use of social media platforms such as Twitter by affected people during crises is considered a vital source of information for crisis response. However, rapid crisis response requires real-time analysis of online information. When a disaster happens, among other data processing techniques, supervised machine learning can help classify online information in real time. However, scarcity of labeled data causes poor performance in machine training. Often, labeled data from past events are available. Can past labeled data be reused to train classifiers? We study the usefulness of labeled data from past events. We observe the performance of our classifiers trained using different combinations of training sets obtained from past disasters. Moreover, we propose two approaches (target labeling and active learning) to boost the classification performance of a learning scheme. We perform extensive experimentation on real crisis datasets and show the utility of past labeled data for training machine learning classifiers to process sudden-onset crisis-related data in real time.
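
The abstract above names active learning as one way to boost classification performance but does not say which query strategy is used. A common choice, assumed here purely for illustration, is uncertainty sampling: ask a human to label the messages the current classifier is least sure about. The function name and the probability values below are hypothetical.

```python
def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` items whose predicted probability is closest to 0.5.

    `probabilities` holds the current classifier's P(class = 1) for each
    unlabeled message; values near 0.5 are the most uncertain.
    """
    order = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return order[:budget]


if __name__ == "__main__":
    # Invented scores for five unlabeled messages.
    scores = [0.95, 0.52, 0.10, 0.45, 0.70]
    print(select_for_labeling(scores, budget=2))  # [1, 3]
```

After the selected items are labeled, they would be added to the training set and the classifier retrained, repeating until the labeling budget is spent.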

PhysOnline: An Open Source Machine Learning Pipeline for Real-Time Analysis of Streaming Physiological Waveform

IEEE Journal of Biomedical and Health Informatics ◽

10.1109/jbhi.2018.2832610 ◽

2019 ◽

Vol 23 (1) ◽

pp. 59-65 ◽

Cited By ~ 9

Author(s):

Jacob R. Sutton ◽

Ruhi Mahajan ◽

Oguz Akbilgic ◽

Rishikesan Kamaleswaran

Keyword(s):

Machine Learning ◽

Open Source ◽

Real Time ◽

Time Analysis ◽

Real Time Analysis

Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models (Preprint)

10.2196/preprints.20285 ◽

2020 ◽

Author(s):

Canelle Poirier ◽

Dianbo Liu ◽

Leonardo Clemente ◽

Xiyu Ding ◽

Matteo Chinazzi ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

News Media ◽

Data Augmentation ◽

Mechanistic Model ◽

Digital Data ◽

Mechanistic Models ◽

Search Activity ◽

Current Time ◽

Aid Decision

BACKGROUND: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. OBJECTIVE: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. METHODS: Our method uses the following as inputs: (a) official health reports, (b) COVID-19–related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks. RESULTS: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces. CONCLUSIONS: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention.

Mobile Cloud-Based Framework for Health Monitoring with Real-Time Analysis Using Machine Learning Algorithms

10.1007/978-981-16-4244-9_14 ◽

2021 ◽

pp. 173-183

Author(s):

Suman Mohanty ◽

Ravi Anand ◽

Ambarish Dutta ◽

Venktesh Kumar ◽

Utsav Kumar ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Health Monitoring ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Mobile Cloud ◽

Time Analysis ◽

Real Time Analysis

34: Real-time analysis of data using machine learning model significantly improves prediction of unplanned cesarean deliveries

American Journal of Obstetrics and Gynecology ◽

10.1016/j.ajog.2019.11.050 ◽

2020 ◽

Vol 222 (1) ◽

pp. S29

Author(s):

Yishai Sompolinsky ◽

Joshua Guedalia ◽

Amihai Rottenstreich ◽

Michal Novoselsky Persky ◽

Gabriel Levin ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Learning Model ◽

Time Analysis ◽

Real Time Analysis ◽

Cesarean Deliveries ◽

Machine Learning Model

Machine learning assisted intraoperative assessment of brain tumor margins using HRMAS NMR spectroscopy

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008184 ◽

2020 ◽

Vol 16 (11) ◽

pp. e1008184

Author(s):

Doruk Cakmakci ◽

Emin Onur Karakaslar ◽

Elisa Ruhland ◽

Marie-Pierre Chenard ◽

Francois Proust ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Magic Angle Spinning ◽

Magic Angle ◽

Full Spectrum ◽

Real Time Analysis ◽

Targeted Analysis ◽

Hrmas Nmr ◽

Angle Spinning ◽

And Control

Complete resection of the tumor is important for survival in glioma patients. Even if gross total resection is achieved, left-over micro-scale tissue in the excision cavity risks recurrence. The High Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) technique can distinguish healthy and malignant tissue efficiently using peak intensities of biomarker metabolites. The method is fast, sensitive and can work with small and unprocessed samples, which makes it a good fit for real-time analysis during surgery. However, only a targeted analysis for the existence of known tumor biomarkers can be made, and this requires a technician with a chemistry background and a pathologist with knowledge of tumor metabolism to be present during surgery. Here, we show that we can accurately perform this analysis in real time and can analyze the full spectrum in an untargeted fashion using machine learning. We work on a new and large HRMAS NMR dataset of glioma and control samples (n = 565), which are also labeled with a quantitative pathology analysis. Our results show that a random forest based approach can distinguish samples with tumor cells and controls accurately and effectively with a median AUC of 85.6% and AUPR of 93.4%. We also show that we can further distinguish benign and malignant samples with a median AUC of 87.1% and AUPR of 96.1%. We analyze the feature (peak) importance for classification to interpret the results of the classifier. We validate that known malignancy biomarkers such as creatine and 2-hydroxyglutarate play an important role in distinguishing tumor and normal cells and suggest new biomarker regions. The code is released at http://github.com/ciceklab/HRMAS_NC.

Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models

Journal of Medical Internet Research ◽

10.2196/20285 ◽

2020 ◽

Vol 22 (8) ◽

pp. e20285

Author(s):

Dianbo Liu ◽

Leonardo Clemente ◽

Canelle Poirier ◽

Xiyu Ding ◽

Matteo Chinazzi ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

News Media ◽

Data Augmentation ◽

Mechanistic Model ◽

Digital Data ◽

Mechanistic Models ◽

Search Activity ◽

Current Time ◽

Aid Decision

Background: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. Objective: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. Methods: Our method uses the following as inputs: (a) official health reports, (b) COVID-19–related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks. Results: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces. Conclusions: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention.
