CalBehav: A Machine Learning-Based Personalized Calendar Behavioral Model Using Time-Series Smartphone Data

2019 ◽  
Vol 63 (7) ◽  
pp. 1109-1123 ◽  
Author(s):  
Iqbal H Sarker ◽  
Alan Colman ◽  
Jun Han ◽  
A S M Kayes ◽  
Paul Watters

Abstract The electronic calendar is nowadays a valuable resource for managing our daily appointments or schedules, also known as events, which range from professional to highly personal. Researchers have studied various types of calendar events to predict smartphone user behavior for incoming mobile communications. However, these studies typically do not take into account behavioral variations between individuals. In the real world, smartphone users can differ widely from each other in how they respond to incoming communications during their scheduled events. Moreover, an individual user may respond to incoming communications differently in different contexts, depending on what type of event is scheduled in her personal calendar. Thus, a static calendar-based behavioral model for individual smartphone users does not necessarily reflect their behavior toward incoming communications. In this paper, we present a machine learning-based context-aware model, named ‘CalBehav’ for short, that is personalized and dynamically identifies an individual’s dominant behavior for their scheduled events using logged time-series smartphone data. Experimental results based on real calendar and phone-log datasets show that this data-driven personalized model is more effective for intelligently managing incoming mobile communications than existing calendar-based approaches.
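A minimal per-user sketch of the kind of context-aware classifier described above: it maps calendar and phone-log context to a dominant response behavior. All feature names, labels, and records are illustrative placeholders, not the authors' CalBehav pipeline.

```python
# Sketch only: a per-user behavioral classifier on hypothetical calendar/phone-log data.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical logged records for a single user: scheduled event type, weekday,
# hour of day, and how the user actually handled the incoming call.
log = pd.DataFrame({
    "event_type": ["meeting", "meeting", "lecture", "gym", "dinner", "meeting"],
    "weekday":    ["Mon", "Tue", "Wed", "Wed", "Fri", "Thu"],
    "hour":       [10, 14, 9, 18, 20, 11],
    "response":   ["decline", "decline", "decline", "answer", "answer", "decline"],
})

encoder = OrdinalEncoder()
X = log[["event_type", "weekday", "hour"]].copy()
X[["event_type", "weekday"]] = encoder.fit_transform(X[["event_type", "weekday"]])
y = log["response"]

# A shallow decision tree keeps the learned per-user rules interpretable.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Predict the dominant behavior for a newly scheduled event.
new_event = pd.DataFrame({"event_type": ["meeting"], "weekday": ["Fri"], "hour": [15]})
new_event[["event_type", "weekday"]] = encoder.transform(new_event[["event_type", "weekday"]])
print(model.predict(new_event))
```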

2022 ◽  
Vol 54 (9) ◽  
pp. 1-36
Author(s):  
Dylan Chou ◽  
Meng Jiang

Data-driven network intrusion detection (NID) suffers from a scarcity of minority attack classes relative to normal traffic, and many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection machine learning models by fitting them to unrepresentative “sandbox” datasets. This survey presents a taxonomy of eight main challenges and explores common datasets from 1999 to 2020. Trends in these challenges over the past decade are analyzed, and future directions are proposed: expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected from real-world networks.


2021 ◽  
Author(s):  
Stefano Olgiati ◽  
Nima Heidari ◽  
Davide Meloni ◽  
Federico Pirovano ◽  
Ali Noorani ◽  
...  

Background Quantum computing (QC) and quantum machine learning (QML) are promising experimental technologies that can improve precision medicine applications by reducing the computational complexity of algorithms driven by big, unstructured, real-world data. The clinical problem in knee osteoarthritis is that, although some novel therapies are safe and effective, the response is variable, and defining the characteristics of an individual who will respond remains a challenge. In this paper we tested a quantum neural network (QNN) application to support precision, data-driven clinical decisions for selecting personalized treatments for advanced knee osteoarthritis.

Methods Following patients' consent and Research Ethics Committee approval, we collected clinico-demographic data before and after treatment from 170 patients eligible for knee arthroplasty (Kellgren-Lawrence grade ≥ 3, OKS ≤ 27, age ≥ 64 and idiopathic aetiology of arthritis) treated over a 2-year period with a single injection of microfragmented fat. Gender classes were balanced (76 M, 94 F) to mitigate gender bias. A patient with an improvement of ≥ 7 OKS points was considered a Responder. We trained our QNN classifier on a randomly selected training subset of 113 patients to classify Responders from non-Responders (73 R, 40 NR) in pain and function at 1 year. Outliers were hidden from the training dataset but not from the validation set.

Results We tested our QNN classifier on a randomly selected test subset of 57 patients (34 R, 23 NR) including outliers. The No Information Rate was equal to 0.59. Our application correctly classified 28 Responders out of 34 and 6 non-Responders out of 23 (sensitivity = 0.82, specificity = 0.26, F1 statistic = 0.71). The positive (LR+) and negative (LR-) likelihood ratios were 1.11 and 0.68, respectively. The diagnostic odds ratio (DOR) was equal to 2.

Conclusions Preliminary results on a small validation dataset show that quantum machine learning applied to data-driven clinical decisions for the personalized treatment of advanced knee osteoarthritis is a promising technology for reducing computational complexity and improving prognostic performance. Our results require further research validation with larger, real-world unstructured datasets, and clinical validation with an AI clinical trial to test model efficacy, safety, clinical significance and relevance at a public health level.
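The reported test-set metrics follow from the confusion-matrix counts given above (28/34 Responders and 6/23 non-Responders correctly classified). The short check below assumes only those counts; it is not the authors' evaluation code.

```python
# Recompute the reported metrics from the test-set confusion-matrix counts.
tp, fn = 28, 34 - 28          # Responders correctly / incorrectly classified
tn, fp = 6, 23 - 6            # non-Responders correctly / incorrectly classified

sensitivity = tp / (tp + fn)                                           # 28/34 ≈ 0.82
specificity = tn / (tn + fp)                                           # 6/23  ≈ 0.26
precision   = tp / (tp + fp)                                           # 28/45 ≈ 0.62
f1          = 2 * precision * sensitivity / (precision + sensitivity)  # ≈ 0.71
lr_pos      = sensitivity / (1 - specificity)                          # ≈ 1.11
lr_neg      = (1 - sensitivity) / specificity                          # ≈ 0.68
dor         = lr_pos / lr_neg                                          # ≈ 1.65 (reported as 2)
nir         = (tp + fn) / (tp + fn + tn + fp)                          # No Information Rate ≈ 0.59-0.60

print(sensitivity, specificity, f1, lr_pos, lr_neg, dor, nir)
```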


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Irfan Haider Shakri

Purpose The purpose of this study is to compare five data-driven ML techniques for predicting the time series of Bitcoin returns, namely the alternating model tree, random forest (RF), multiple linear regression, multi-layer perceptron regression and M5 tree algorithms.

Design/methodology/approach The data used to forecast Bitcoin returns range from 8 July 2010 to 30 Aug 2020. The study uses several predictors of Bitcoin returns, including economic policy uncertainty, the equity market volatility index, S&P returns, USD/EURO exchange rates, and oil and gold prices, volatilities and returns. Five statistical indexes are computed, namely the correlation coefficient, mean absolute error, root mean square error, relative absolute error and root relative squared error; the results of these metrics are used to develop a colour intensity ranking.

Findings Among the machine learning (ML) techniques used in this study, the RF model has shown superior predictive ability for estimating Bitcoin returns.

Originality/value This study is the first of its kind to use and compare ML models for the prediction of Bitcoin returns. Further studies can be carried out using additional cryptocurrencies and other data-driven ML models.
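A hedged sketch of the evaluation described above: fit a random forest regressor on placeholder predictors and score it with the five statistical indexes named in the abstract. The synthetic data and feature set are assumptions, not the study's dataset.

```python
# Sketch: compute the five indexes for a random forest on synthetic return data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
# Placeholder predictors standing in for EPU, equity market volatility, S&P returns, etc.
X = rng.normal(size=(n, 5))
y = 0.3 * X[:, 0] - 0.2 * X[:, 3] + rng.normal(scale=0.5, size=n)  # stand-in "Bitcoin returns"

split = int(0.8 * n)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:split], y[:split])
pred, true = model.predict(X[split:]), y[split:]

err = true - pred
cc   = np.corrcoef(true, pred)[0, 1]                                   # correlation coefficient
mae  = np.mean(np.abs(err))                                            # mean absolute error
rmse = np.sqrt(np.mean(err ** 2))                                      # root mean square error
rae  = np.sum(np.abs(err)) / np.sum(np.abs(true - true.mean()))        # relative absolute error
rrse = np.sqrt(np.sum(err ** 2) / np.sum((true - true.mean()) ** 2))   # root relative squared error
print(cc, mae, rmse, rae, rrse)
```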


Author(s):  
Georgia A. Papacharalampous ◽  
Hristos Tyralis ◽  
Demetris Koutsoyiannis

We perform an extensive comparison between 11 stochastic and 9 machine learning methods regarding their multi-step ahead forecasting properties by conducting 12 large-scale computational experiments. Each of these experiments uses 2,000 time series generated by linear stationary stochastic processes. We conduct each simulation experiment twice: the first time using time series of 110 values and the second time using time series of 310 values. Additionally, we conduct 92 real-world case studies using mean monthly streamflow time series and focus in particular on one of them to reinforce the findings and highlight important facts. We quantify the performance of the methods using 18 metrics. The results indicate that the machine learning methods do not differ dramatically from the stochastic ones, while none of the methods under comparison is uniformly better or worse than the rest. However, there are methods that are regularly better or worse than others according to specific metrics.
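A small sketch of one such comparison under assumed settings: a stochastic AR(1) model versus a random forest on lagged values, both producing multi-step ahead forecasts of a simulated stationary series and scored by RMSE. The series length, lag order, and horizon are illustrative, not the paper's configuration.

```python
# Sketch: multi-step ahead forecasts from one stochastic and one ML method.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(1)
series = ArmaProcess(ar=[1, -0.7]).generate_sample(nsample=310)  # AR(1) process, phi = 0.7
train, test = series[:300], series[300:]                         # 10-step-ahead horizon

# Stochastic method: AR(1) fitted by maximum likelihood, direct multi-step forecast.
ar_forecast = ARIMA(train, order=(1, 0, 0)).fit().forecast(steps=len(test))

# Machine learning method: random forest on lagged values, applied recursively.
lags = 5
X = np.column_stack([train[i:len(train) - lags + i] for i in range(lags)])
y = train[lags:]
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
history = list(train[-lags:])
rf_forecast = []
for _ in range(len(test)):
    step = rf.predict(np.array(history[-lags:]).reshape(1, -1))[0]
    rf_forecast.append(step)
    history.append(step)

rmse = lambda f: np.sqrt(np.mean((test - np.asarray(f)) ** 2))
print("AR(1) RMSE:", rmse(ar_forecast), "| RF RMSE:", rmse(rf_forecast))
```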


2019 ◽  
Vol 252 ◽  
pp. 06006
Author(s):  
Andrzej Puchalski ◽  
Iwona Komorska

Data-driven diagnostic methods make it possible to obtain a statistical model of a time series and to identify deviations of recorded data from the pattern of the monitored system. Statistical analysis of time series of mechanical vibrations creates a new quality in the monitoring of rotating machines. Most real vibration signals exhibit nonlinear properties that are well described by scaling exponents. Multifractal analysis, which relies mainly on assessing local singularity exponents, has become a popular tool for the statistical analysis of empirical data. There are many methods for studying time series in terms of their fractality; after comparing their computational complexity, the wavelet leaders algorithm was chosen. Using the Wavelet Leaders Multifractal Formalism, multifractal parameters were estimated and taken as diagnostic features in a pattern recognition procedure using machine learning methods. The classification was performed using a neural network, the k-nearest neighbours algorithm and a support vector machine. The article presents the results of vibration acceleration tests in a demonstration transmission system that allows simulation of assembly errors and teeth wear.
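The pattern-recognition stage can be sketched as below, assuming the multifractal parameters have already been estimated from the vibration signals (here replaced by random placeholders); the three classifiers match those named in the abstract.

```python
# Sketch: classify fault conditions from precomputed (placeholder) multifractal features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Placeholder feature matrix: 120 vibration records x 4 multifractal parameters
# (e.g., spectrum width, dominant Hoelder exponent), with labels
# 0 = healthy, 1 = assembly error, 2 = teeth wear.
features = rng.normal(size=(120, 4))
labels = rng.integers(0, 3, size=120)

classifiers = {
    "neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "k-nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    "support vector machine": SVC(kernel="rbf", C=1.0),
}
for name, clf in classifiers.items():
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), features, labels, cv=5)
    print(name, scores.mean())
```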


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5739
Author(s):  
Narjes Davari ◽  
Bruno Veloso ◽  
Gustavo de Assis Costa ◽  
Pedro Mota Pereira ◽  
Rita P. Ribeiro ◽  
...  

In the last few years, many works have addressed Predictive Maintenance (PdM) using Machine Learning (ML) and Deep Learning (DL) solutions, especially the latter. The monitoring and logging of industrial equipment events, such as temporal behavior and fault events (i.e., anomaly detection in time series), can be obtained from records generated by sensors installed in different parts of an industrial plant. However, such progress is still incipient: many challenges remain, and the performance of applications depends on the appropriate choice of method. This article presents a survey of existing ML and DL techniques for handling PdM in the railway industry. The survey discusses the main approaches for this specific application within a taxonomy defined by the type of task, employed methods, evaluation metrics, the specific equipment or process, and datasets. Lastly, we conclude and outline some suggestions for future research.


Author(s):  
Parisa Kordjamshidi ◽  
Dan Roth ◽  
Kristian Kersting

Data-driven approaches are becoming dominant problem-solving techniques in many areas of research and industry. Unfortunately, current technologies do not make such techniques easy to use, either for application experts who are not fluent in machine learning or for machine learning experts who aim to test ideas on real-world data and need to evaluate them as part of an end-to-end system. We review key efforts made by various AI communities to provide languages for high-level abstractions over the learning and reasoning techniques needed for designing complex AI systems. We classify the existing frameworks based on the types of techniques as well as the data and knowledge representations they use, provide a comparative study of how they address the challenges of programming real-world applications, and highlight some shortcomings and future directions.


2022 ◽  
Vol 16 (1) ◽  
pp. e0010056
Author(s):  
Emmanuelle Sylvestre ◽  
Clarisse Joachim ◽  
Elsa Cécilia-Joseph ◽  
Guillaume Bouzillé ◽  
Boris Campillo-Gimenez ◽  
...  

Background Traditionally, dengue surveillance is based on case reporting to a central health agency. However, the delay between a case and its notification can limit the system's responsiveness. Machine learning methods have been developed to reduce reporting delays and to predict outbreaks based on non-traditional and non-clinical data sources. The aim of this systematic review was to identify studies that used real-world data, Big Data and/or machine learning methods to monitor and predict dengue-related outcomes.

Methodology/Principal findings We performed a search in PubMed, Scopus, Web of Science and the grey literature between January 1, 2000 and August 31, 2020. The review (ID: CRD42020172472) focused on data-driven studies; reviews, randomized controlled trials and descriptive studies were not included. Among the 119 studies included, 67% were published between 2016 and 2020, and 39% used at least one novel data stream. The aim of the included studies was to predict a dengue-related outcome (55%), assess the validity of data sources for dengue surveillance (23%), or both (22%). Most studies (60%) used a machine learning approach. Studies on dengue prediction compared different prediction models or identified significant predictors among several covariates in a model. The most significant predictors were rainfall (43%), temperature (41%), and humidity (25%). The two models with the highest performance were neural networks and decision trees (52%), followed by support vector machines (17%). We cannot rule out a selection bias in our study because of our two main limitations: we did not include preprints and could not obtain the opinion of other international experts.

Conclusions/Significance Combining real-world data and Big Data with machine learning methods is a promising approach to improve dengue prediction and monitoring. Future studies should focus on how to better integrate all available data sources and methods to improve the response and dengue management by stakeholders.


2020 ◽  
Author(s):  
Andrea Cominola ◽  
Marie-Philine Becker ◽  
Riccardo Taormina

As several cities all over the world face the exacerbating challenges posed by climate change, population growth, and urbanization, it becomes clear that increased water security and more resilient urban water systems can be achieved by optimizing the use of water resources and minimizing losses and inefficient usage. In the literature, there is growing evidence of the potential of demand management programs to complement supply-side interventions and foster more efficient water use behaviors. A new boost to demand management is offered by the ongoing digitalization of the water utility sector, which facilitates accurate measuring and estimation of urban water demands down to the scale of the individual end uses of residential water consumers (e.g., showering, watering). These high-resolution data can play a pivotal role in supporting demand-side management programs, fostering more efficient and sustainable water uses, and prompting the detection of anomalous behaviors (e.g., leakages, faulty meters). The problem of deriving individual end-use consumption traces from the composite signal recorded by single-point meters installed at the inlet of each household has been studied for nearly 30 years in the electricity field (Non-Intrusive Load Monitoring). Conversely, the similar disaggregation problem in the water sector, here called Non-Intrusive Water Monitoring (NIWM), is still a very open research challenge. Most state-of-the-art end-use disaggregation algorithms still need intrusive calibration or time-consuming expert-based manual processing. Moreover, the limited availability of large-scale open datasets with end-use ground truth data has so far greatly limited the development and benchmarking of NIWM methods.

In this work, we comparatively test the suitability of different machine learning algorithms to perform NIWM. First, we formulate the NIWM problem both as a regression problem, where water consumption traces are processed as continuous time series, and as a classification problem, where individual water use events are associated with one or more end-use labels. Second, a number of algorithms based on the latest trends in Artificial Intelligence and Machine Learning are tested on both synthetic and real-world data, including state-of-the-art tree-based and Deep Learning methods. Synthetic water end-use time series generated with the STREaM stochastic simulation model are considered for algorithm testing, along with labelled real-world data from the Residential End Uses of Water, Version 2, database by the Water Research Foundation. Finally, the performance of the different NIWM algorithms is comparatively assessed with metrics that include (i) NIWM accuracy, (ii) computational cost, and (iii) amount of needed training data.
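A minimal sketch of the classification formulation of NIWM mentioned above: each extracted water-use event gets one end-use label. The event features (volume, duration, peak flow, hour of day), labels, and data are illustrative assumptions, not the STREaM or Residential End Uses of Water pipeline.

```python
# Sketch: classify water-use events into end-use labels with a tree-based model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 600
# Placeholder event features: total volume (L), duration (s), peak flow (L/min), hour of day.
X = np.column_stack([
    rng.gamma(2.0, 10.0, n),   # volume
    rng.gamma(2.0, 60.0, n),   # duration
    rng.gamma(2.0, 3.0, n),    # peak flow
    rng.integers(0, 24, n),    # hour of day
])
labels = rng.choice(["shower", "toilet", "faucet", "irrigation"], size=n)

X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```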

