FOCUS ON ARTIFICIAL INTELLIGENCE FOR PREDICTING THE OUTFLOW OF CLIENTS FROM ON-LINE EDUCATION SITES

2021 ◽  
Vol 6 (166) ◽  
pp. 2-7
Author(s):  
V. Bredikhin ◽  
T. Senchuk ◽  
K. Stuzhuk

The article examines the forecasting of customer churn (outflow), which is especially important for companies with subscription-based business models. The churn rate is a critical metric for companies with subscription and transactional business models that rely on regular payments (banks, telecom operators, SaaS services, etc.). The types and main causes of customer churn were considered, along with the parameters needed to build a predictive model using machine learning algorithms. This produced hypotheses about why customers leave sites that provide on-line, course-based training. To build the forecasting model, the behavioural characteristics of students, their motivation, and the structure of the courses themselves were studied. The large array of collected data was analysed across many parameters, and relationships between student behaviour, course structure, and course completion were identified. A forecasting model was built, its accuracy was improved, and the results were integrated into a customer-churn prediction module. The final feature list included more than 100 parameters, divided into 6 blocks. The predictive model was based on the Weibull distribution, since client behaviour can be treated as a kind of survival model. To estimate the probability of customer churn based on the considered hypotheses, a recurrent neural network with an LSTM layer was developed, using the negative log-likelihood of the Weibull distribution as the loss function.
In conclusion, the article proposes building a stable, proactive educational business in which decisions are made on the basis of data rather than intuition alone, leading to a clearer and better-grounded understanding of how to improve the educational product.
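The loss described above can be made concrete. The sketch below is a minimal NumPy version of the Weibull negative log-likelihood for time-to-churn, with right-censored customers (those who have not churned yet) contributing only the survival term; the function name and vectorised layout are illustrative, not the authors' implementation.

```python
import numpy as np

def weibull_nll(t, censored, alpha, beta):
    """Negative log-likelihood of a Weibull time-to-event model.

    t        -- observed times (time to churn, or time observed so far)
    censored -- True where the customer had not churned yet
    alpha    -- Weibull scale parameter (as predicted, e.g., by an LSTM)
    beta     -- Weibull shape parameter
    """
    t = np.asarray(t, dtype=float)
    cens = np.asarray(censored, dtype=bool)
    hazard = (t / alpha) ** beta                       # cumulative hazard H(t)
    log_f = np.log(beta / alpha) + (beta - 1.0) * np.log(t / alpha) - hazard
    log_s = -hazard                                    # log survival S(t)
    return -np.where(cens, log_s, log_f).mean()
```

In the article's setup, the network emits the Weibull parameters per customer and a quantity of this form is minimised as the training loss.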

Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 405
Author(s):  
Marcos Lupión ◽  
Javier Medina-Quero ◽  
Juan F. Sanjuan ◽  
Pilar M. Ortigosa

Activity Recognition (AR) is an active research topic focused on detecting human actions and behaviours in smart environments. In this work, we present the on-line activity recognition platform DOLARS (Distributed On-line Activity Recognition System), where data from heterogeneous sensors, including binary, wearable and location sensors, are evaluated in real time. Different descriptors and metrics from the heterogeneous sensor data are integrated into a common feature vector, whose extraction is carried out by a sliding-window approach under real-time conditions. DOLARS provides a distributed architecture where: (i) stages for processing data in AR are deployed in distributed nodes; (ii) temporal cache modules compute metrics which aggregate sensor data for computing feature vectors in an efficient way; (iii) publish-subscribe models are integrated both to spread data from sensors and to orchestrate the nodes (communication and replication) for computing AR; and (iv) machine learning algorithms are used to classify and recognize the activities. A successful case study of daily activity recognition carried out in the Smart Lab of the University of Almería (UAL) is presented in this paper. The results show encouraging performance in recognizing sequences of activities and demonstrate the need for distributed architectures to achieve real-time recognition.
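The sliding-window extraction can be sketched in a few lines. Assuming time-stamped events of the form `(timestamp, sensor_id, value)` arriving in order, a window aggregates per-sensor counts, means and last readings into one feature vector; all names here are illustrative and not part of the DOLARS API.

```python
def sliding_features(events, window, now):
    """Aggregate heterogeneous sensor events inside a sliding time window.

    events -- iterable of (timestamp, sensor_id, value), ordered by time
    window -- window length, in the same time units as the timestamps
    now    -- right edge of the window
    """
    recent = [(ts, sid, v) for ts, sid, v in events if now - window <= ts <= now]
    feats = {}
    for sid in sorted({sid for _, sid, _ in events}):
        vals = [v for _, s, v in recent if s == sid]
        feats[f"{sid}_count"] = len(vals)
        feats[f"{sid}_mean"] = sum(vals) / len(vals) if vals else 0.0
        feats[f"{sid}_last"] = vals[-1] if vals else 0.0   # most recent reading
    return feats
```

In the platform, distributed temporal-cache nodes would maintain such aggregates incrementally rather than rescanning the event list.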


2021 ◽  
Vol 17 ◽  
Author(s):  
Hui Zhang ◽  
Qidong Liu ◽  
Xiaoru Sun ◽  
Yaru Xu ◽  
Yiling Fang ◽  
...  

Background: The pathophysiology of Alzheimer's disease (AD) is still not fully understood. Objective: This study aimed to explore the differentially expressed key genes in AD and build a predictive model for diagnosis and treatment. Methods: Gene expression data from the entorhinal cortex of AD, asymptomatic AD, and control samples from the GEO database were analyzed to explore the relevant pathways and key genes in the progression of AD. Differentially expressed genes between AD and the other two groups in each module were selected to identify biological mechanisms of AD through KEGG and PPI network analysis in Metascape. Genes with a high connectivity degree in the PPI network were then selected to build a predictive model using different machine learning algorithms, and model performance was assessed with five-fold cross-validation to select the best-fitting model. Results: A total of 20 co-expression gene clusters were identified after the network was constructed. Module 1 (black) and module 2 (royal blue) were the most positively and negatively correlated with AD, respectively. In total, 565 genes in module 1 and 215 genes in module 2 overlapped with the two lists of differentially expressed genes. They were enriched in the G protein-coupled receptor signaling pathway, immune-related processes, and other pathways. Eleven genes were selected by lasso logistic regression and were considered to play an important role in identifying AD samples. The model built with these 11 genes using the support vector machine algorithm showed the best performance. Conclusion: These results shed light on the diagnosis and treatment of AD.
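The gene-screening step can be illustrated with a minimal stand-in: L1-regularised (lasso) logistic regression fitted by proximal gradient descent (ISTA), where features whose coefficients shrink to zero are discarded. This is a sketch under synthetic data, not the study's code; in the paper, the genes surviving the lasso step feed the SVM classifier that performed best under five-fold cross-validation.

```python
import numpy as np

def lasso_logistic(X, y, lam=0.05, lr=0.1, steps=500):
    """L1-penalised logistic regression via ISTA (gradient step followed by
    soft-thresholding); features left with zero weight are screened out."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))               # predicted probabilities
        w = w - lr * (X.T @ (p - y)) / len(y)          # logistic-loss gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w
```

Features with nonzero coefficients (`np.flatnonzero(w)`) play the role of the screened genes passed downstream.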


Author(s):  
Rithesh Pakkala P. ◽  
Prakhyath Rai ◽  
Shamantha Rai Bellipady

This chapter provides insight into pattern recognition by illustrating various approaches and frameworks which aid prognostic reasoning facilitated by feature selection and feature extraction. The chapter focuses on analyzing syntactic and statistical approaches to pattern recognition. Typically, a large set of features has an impact on the performance of the predictive model, so redundant and noisy pieces of data need to be eliminated before any predictive model is developed. The selection of features is independent of any machine learning algorithm. The content-rich information obtained after the elimination of noisy patterns, such as stop words and missing values, is then used for further prediction. The refinement and extraction of relevant features yields performance enhancements in subsequent prediction and analysis.
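The noise-elimination step described above, dropping stop words and missing values before any model is built, can be sketched as a simple filter; the stop-word list here is a tiny illustrative subset, not a real lexicon.

```python
STOP_WORDS = {"the", "a", "an", "is", "of"}   # illustrative subset only

def clean_tokens(doc):
    """Remove missing values and stop words from a tokenised document,
    leaving the content-rich tokens used for feature extraction."""
    return [t.lower() for t in doc
            if t is not None and t.lower() not in STOP_WORDS]
```

Because this filtering looks only at the data and never at a classifier, it matches the chapter's point that feature selection can be independent of the learning algorithm.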


2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Fan Yang ◽  
Hu Ren ◽  
Zhili Hu

Maximum likelihood estimation is a widely used approach to parameter estimation. However, conventional algorithms make the estimation procedure for the three-parameter Weibull distribution difficult. Therefore, this paper proposes an evolutionary strategy to explore good solutions based on the maximum likelihood method. The maximization of the likelihood function is converted into an optimization problem, and an evolutionary algorithm is employed to obtain the optimal parameters of the likelihood function. Examples are presented to demonstrate the proposed method. The results show that the proposed method is suitable for parameter estimation of the three-parameter Weibull distribution.
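The approach can be sketched with a toy evolution strategy over the three-parameter Weibull log-likelihood (scale a, shape b, location c, valid only when c lies below the smallest observation). This is a minimal illustration under a heuristic starting point, not the paper's exact evolutionary algorithm.

```python
import numpy as np

def weibull3_loglik(params, t):
    """Log-likelihood of the three-parameter Weibull distribution."""
    a, b, c = params                       # scale, shape, location
    if a <= 0 or b <= 0 or c >= t.min():
        return -np.inf                     # infeasible candidate
    z = (t - c) / a
    return np.sum(np.log(b / a) + (b - 1.0) * np.log(z) - z ** b)

def es_fit(t, pop=40, gens=200, sigma=0.2, seed=0):
    """Simple (1+lambda)-style evolution strategy: mutate the incumbent
    with Gaussian noise, keep any candidate with a higher likelihood."""
    rng = np.random.default_rng(seed)
    best = np.array([t.std(), 1.5, t.min() - 1.0])     # heuristic start
    best_ll = weibull3_loglik(best, t)
    for _ in range(gens):
        offspring = best + rng.normal(scale=sigma, size=(pop, 3)) * (np.abs(best) + 1e-3)
        for cand in offspring:
            ll = weibull3_loglik(cand, t)
            if ll > best_ll:
                best, best_ll = cand, ll
    return best
```

Infeasible candidates score negative infinity and are simply never accepted, which keeps the constraint handling trivial.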


2021 ◽  
Vol 13 (3) ◽  
pp. 23-34
Author(s):  
Chandrakant D. Patel ◽  
◽  
Jayesh M. Patel

With the large quantity of information available on-line, it is essential to retrieve correct information for a user query, and a large amount of data is available in digital form in multiple languages. Various approaches attempt to increase the effectiveness of on-line information retrieval, but the standard approach matches documents in the corpus against the query word by word. This approach is very time-intensive and may miss many related documents that are equally important. To avoid these issues, stemming has been used extensively in Information Retrieval Systems (IRS) to increase retrieval accuracy across languages. This paper addresses stemming for Web Page Categorization in the Gujarati language, deriving stem words using the GUJSTER algorithm [1]. The GUJSTER algorithm is based on morphological rules used to derive the root or stem word from inflected words of the same class. In particular, we consider the influence of the extracted stem or root words on the quality of web page classification using supervised machine learning algorithms. This research work focuses on the analysis of Web Page Categorization (WPC) for the Gujarati language and verifies the influence of a stemming algorithm in a WPC application, improving accuracy from 63% to 98% with supervised machine learning models using a standard split of 80% for training and 20% for testing.
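A rule-based stemmer of this kind reduces to longest-suffix stripping with a minimum stem length. The sketch below uses a few transliterated Gujarati case suffixes purely as placeholders; the real GUJSTER rule set operates on Gujarati script and is not reproduced here.

```python
SUFFIX_RULES = ["no", "ni", "nu", "ma", "thi"]   # placeholder transliterations

def rule_stem(word, min_stem=3):
    """Strip the longest matching inflectional suffix while keeping at
    least `min_stem` characters, in the spirit of rule-based stemming."""
    for suffix in sorted(SUFFIX_RULES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word
```

Stemmed tokens would then be vectorised and passed to the supervised classifiers used for web-page categorization.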


Author(s):  
Makoto Iwasaki ◽  
Junya Kanda ◽  
Yasuyuki Arai ◽  
Tadakazu Kondo ◽  
Takayuki Ishikawa ◽  
...  

Graft-versus-host-disease-free, relapse-free survival (GRFS) is a useful composite endpoint that measures survival without relapse or significant morbidity after allogeneic hematopoietic stem cell transplantation (allo-HSCT). We aimed to develop a novel analytical method that appropriately handles right-censored data and competing risks to understand the risk for GRFS and each of its components. This was a retrospective data-mining study of a cohort of 2207 adult patients who underwent their first allo-HSCT within the Kyoto Stem Cell Transplantation Group (KSCTG), a multi-institutional joint research group of 17 transplantation centers in Japan. The primary endpoint was GRFS. A stacked ensemble of Cox proportional hazards regression and seven machine learning algorithms was applied to develop a prediction model. The median patient age was 48 years. For GRFS, the stacked ensemble model achieved better predictive accuracy, evaluated by C-index, than other state-of-the-art competing-risk models (ensemble model: 0.670; Cox-PH: 0.668; Random Survival Forest: 0.660; Dynamic DeepHit: 0.646). The probability of GRFS after 2 years was 30.54% for the high-risk group and 40.69% for the low-risk group (hazard ratio [HR] versus the low-risk group: 2.127; 95% CI: 1.19-3.80). We thus developed a novel predictive model for survival analysis that showed superior risk stratification to existing methods using a stacked ensemble of multiple machine learning algorithms.
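The C-index used to compare the models is Harrell's concordance index: among comparable patient pairs, the fraction in which the patient who failed earlier was assigned the higher predicted risk. A minimal sketch (quadratic in the number of patients, fine for illustration):

```python
def c_index(times, events, scores):
    """Harrell's concordance index.

    times  -- observed times
    events -- 1 if the event occurred, 0 if censored
    scores -- predicted risks (higher = earlier expected event)
    """
    concordant = ties = comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:   # i failed before j
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect risk ordering, which puts the reported 0.646-0.670 range in context.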


2021 ◽  
Author(s):  
Christopher Duckworth ◽  
Francis P Chmiel ◽  
Dan K. Burns ◽  
Zlatko D Zlatev ◽  
Neil M White ◽  
...  

Supervised machine learning algorithms deployed in acute healthcare settings use data describing historical episodes to predict clinical outcomes. Clinical settings are dynamic environments, and the underlying data distributions characterising episodes can change with time (a phenomenon known as data drift), as can the relationship between episode characteristics and associated clinical outcomes (so-called concept drift). We demonstrate how explainable machine learning can be used to monitor data drift in a predictive model deployed within a hospital emergency department. We use the COVID-19 pandemic, which brought a severe change in operational circumstances, as an exemplar cause of data drift. We present a machine learning classifier trained on pre-COVID-19 data to identify patients at high risk of admission to hospital during an emergency department attendance. We evaluate our model's performance on attendances occurring pre-pandemic (AUROC 0.856, 95% CI [0.852, 0.859]) and during the COVID-19 pandemic (AUROC 0.826, 95% CI [0.814, 0.837]). We demonstrate two benefits of explainable machine learning (SHAP) for models deployed in healthcare settings: (1) by tracking the variation in a feature's SHAP value relative to its global importance, a complementary measure of data drift is obtained which highlights the need to retrain a predictive model; (2) by observing the relative changes in feature importance, emergent health risks can be identified.
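The first use of explainability, tracking a feature's share of total SHAP importance over time, can be sketched given SHAP matrices already computed by any explainer (one row per attendance, one column per feature); the function name and normalisation are illustrative, not the study's code.

```python
import numpy as np

def importance_shift(shap_pre, shap_post):
    """Change in each feature's share of mean |SHAP| between two periods.

    A large shift flags drift in how the model relies on that feature
    and, per the article, the need to consider retraining.
    """
    pre = np.abs(np.asarray(shap_pre, dtype=float)).mean(axis=0)
    post = np.abs(np.asarray(shap_post, dtype=float)).mean(axis=0)
    return post / post.sum() - pre / pre.sum()
```

Monitoring this vector over rolling windows of attendances would surface both data drift and the emergent-risk signals described above.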


2020 ◽  
Author(s):  
Nida Fatima

Background: Preoperative prognostication of clinical and surgical outcomes in patients with neurosurgical diseases can improve risk stratification and thus guide targeted treatment to minimize adverse events. The author therefore aims to highlight the development and validation of predictive models of neurosurgical outcomes through machine learning algorithms using logistic regression. Methods: Logistic regression (enter, backward, and forward) and the least absolute shrinkage and selection operator (LASSO) method for selecting variables from the chosen database can lead to multiple candidate models. The final model, with its set of predictive variables, must be selected based on clinical knowledge and numerical results. Results: The predictive model that performs best on discrimination, calibration, the Brier score, and decision curve analysis should be selected to develop the machine learning algorithms. Logistic regression should be compared with the LASSO model. Usually, for large databases, the predictive model selected through logistic regression gives a higher Area Under the Curve (AUC) than the LASSO model. The predictive probability derived from the best model can be uploaded to an open-access web application that patients and surgeons can easily use to make risk assessments worldwide. Conclusions: Machine learning algorithms provide promising results for the prediction of outcomes following cranial and spinal surgery. These algorithms can provide useful factors for patient counselling, assessing perioperative risk factors, and predicting post-operative outcomes after neurosurgery.
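The AUC used to compare candidate models has a simple rank interpretation: the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative one. A minimal sketch of that computation:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparisons (ties count half).

    labels -- 1 for the outcome of interest, 0 otherwise
    scores -- predicted probabilities from, e.g., a logistic or LASSO model
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Comparing this value across the enter/backward/forward logistic models and the LASSO model is the discrimination check the author describes.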

